
Amazon Apologizes For EC2 Cloud Service Downtime


The Huffington Post   First Posted: 04/29/11 10:58 AM ET Updated: 06/29/11 06:12 AM ET

It's been more than a week since Amazon's Elastic Compute Cloud (EC2) service went down, taking sites like Reddit, Quora and Foursquare with it. Amazon is finally issuing an apology and an in-depth explanation of how and why the cloud service failed so spectacularly.

The issues began at 12:47 AM PT on April 21st, when an incorrect network change occurred on Amazon's Elastic Block Store (EBS). NewEnterprise's Arik Hesseldahl notes that Amazon said it will "increase automation" to prevent the mistake from happening again. He says, "From that statement I gather that it was a human-caused mistake that was then exacerbated by the way the cloud system was designed to work."

For those affected by the downtime, Amazon will be providing an automatic 10-day credit "equal to 100% of their usage of EBS Volumes, EC2 Instances and RDS database instances that were running in the affected Availability Zone." Additionally, Amazon promises improved communication in the case of future downtime.

Here's the apology, and check out the full technical explanation here.

Last, but certainly not least, we want to apologize. We know how critical our services are to our customers’ businesses and we will do everything we can to learn from this event and use it to drive improvement across our services. As with any significant operational issue, we will spend many hours over the coming days and weeks improving our understanding of the details of the various parts of this event and determining how to make changes to improve our services and processes.

Comments (8)
03:33 PM on 04/29/2011
I had not known about this until just now.

Earlier today I went to Amazon and was going to make a purchase. It said I could get $10.00 off if I would take their credit card. I don't usually do that, but they didn't have other choices to pay like paypal. I didn't want to put my regular card online.

I typed in all the info and it kept asking about how much our annual income was and it would not accept what I typed in.

I ended up getting furious and getting on amazon's chat and griping to amazon about the way it went down. I received 3 emails from Amazon. They told me I had applied for the wrong credit card to get the $10 off. I told them to cancel my order and their card. I don't know what they have done since.

I don't know if it is their techs or what, but that experience wasted an hour of my life and I am worried what else it has done.
Marionette
04:18 PM on 04/29/2011
every time someone applies for a credit card and then cancels it, a kitten dies..

lol, j/k
10:29 PM on 04/29/2011
FYI, your experience (though not fun) has nothing to do with this story. The downtime was for Amazon's Web Services that other sites use - not amazon.com itself.
Stoopid American
Trooth, justice, and the American way ...
01:51 PM on 04/29/2011
A cautionary tale for anyone who builds, maintains, or operates complex systems.
04:43 PM on 04/29/2011
They don't even need to be complex. Jeff Bezos came from D. E. Shaw, the company that started Juno Online Services, an early provider of free email. I've still got a Juno account because it's apparently *impossible* to cancel it (or to change the password), because my account originated with an installed program on my own PC, instead of as a webmail account. To this day, Juno doesn't use a secure login, and in the past the login page included a search field that took focus from the password field, resulting in an internet search for your password.

Until recently, the clever folks at Amazon truncated passwords after 8 characters, and treated upper case letters the same as lower case letters.

Those seem like fairly basic and elementary (and simple) tasks for anyone running any kind of e-commerce site, but perhaps that's just not part of the corporate culture at Amazon.
Steven Travis
Really, do you need one?
01:35 PM on 04/29/2011
This is what happens (and will happen more in the future) when you trust your information to an outside vendor. You better be willing to live with the downtime...
single malt
I can't spell. I blame msn.
05:36 AM on 04/30/2011
Actually, Amazon will likely study this, learn from it and become better. Cloud-based hosting is still fairly new in terms of how it is being used by Amazon, so growing pains should be expected. The real advantage of it is that you can scale much more easily than with traditional servers. Hosting your own stuff doesn't guarantee you more up-time either. That depends on how good your engineers are.
12:49 PM on 04/29/2011
NewEnterprise's Arik Hesseldahl notes that Amazon said it will "increase automation" to prevent the mistake from happening again.
He says, "From that statement I gather that it was a human-caused mistake that was then exacerbated by the way the cloud system was designed to work."

I say, "From that statement, I gather that somebody will lose their job."