A few days ago, Amazon’s Elastic Compute Cloud (or EC2 to you and me) had a catastrophic failure. The world kept turning, but unfortunately, the third-party collaboration/SVN tool we use was hosted in the affected East USA zone, where the issue started on Thursday. It wasn’t until late Sunday night that we regained access to our SVN server.
That’s effectively four days of downtime, during which our Alfred development was at a standstill just when we had huge plans for the next release. The third party was kept in the dark by Amazon as much as we were, leaving all of us twiddling our thumbs waiting for things to get moving.
In the past year, Twitter’s uptime has improved and it has become even more of an essential tool to many people than before. Increasingly, Twitter is in fact being seen as a source of up-to-the-second information and news, with the newly redesigned homepage further driving the point home.
Yet, at the time of writing, Amazon hasn’t used the awscloud account to update customers on the status of the outage or the reasons behind it. There are plenty of theories floating around, from hardware failure to Amazon’s Cloud Player becoming too popular too soon, but we’ve not heard an official word.
It isn’t for lack of smart cookies at Amazon either; knowing a few of them, I’m baffled as to why no one felt it was worth using Twitter as a channel for communication. The AWS Health Dashboard was updated fairly frequently, but only with obscure, meaningless status updates and no background information.
Many organisations dismiss Twitter as a social network made for sharing what you’ve had for breakfast, but in times of crisis, it can truly come into its own. As far back as 2007, emergency services used Twitter to disseminate information and help the population when fires raged across Southern California. The Los Angeles Fire Department, as well as news outlets, tweeted updates to help people get to safety or stay away from affected areas.
More recently, Japan’s phone networks were overloaded after the earthquakes in March – with NTT DoCoMo restricting up to 80 per cent of voice calls, especially in Tokyo – but Twitter, Facebook, Mixi and Skype were lifelines for those hiding under desks during the seemingly never-ending earthquake.
While the EC2 debacle was nowhere near as life-threatening as an earthquake, it was the perfect opportunity to post short, simple updates on Twitter, letting those directly and indirectly affected know that Amazon wasn’t asleep at the switch.
My confidence in cloud computing has been dented less by the outage itself than by the feeling of helplessness Amazon caused by giving us no clue what was happening! I wonder if we’ll ever find out why they chose to be so uncommunicative, and whether they’ll improve if there’s a “next time”.