As you may have noticed, we've had some intermittent stability and response time issues over the last few days. We believe we have the problem solved - or at least quarantined.
We have one particularly high-volume user who submits a very wide range of content types to us. Due to some errors in the way they were using the API and some errors in the way we were handling errors (whew..) - we were seeing system utilization that was off the charts.
We've moved that user to their own little quarantine island until we get things worked out with them. As soon as we did this - the remainder of our servers paused for a moment, took a deep breath, and then went back to almost idle - where they belong.
So - things are looking good. We'll continue to keep a close eye on the system and make sure things stay settled down.
We learned a few things about how to debug these types of errors and should be faster at tracking them down in the future.
Thanks all for your patience.
Tom