Yesterday (29th Feb 2012) our website at www.kynetix.com was down for most of the day because the Windows Azure platform, where the site is hosted in the cloud, suffered a major outage.
The outage, which affected the Windows Azure platform across the world, has been attributed to a leap year bug. It’s ironic that, 12 years after the damp squib that was the Year 2000 changeover, the Azure platform suffered from a bug that should have been easily preventable.
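To illustrate why this class of bug is so easy to introduce and so cheap to prevent, here is a minimal Python sketch. It is not the Azure code, whose details are not covered here; it only shows the general trap of adding a year to 29th February. The helper name `add_years` is hypothetical and exists purely for illustration.

```python
import calendar
from datetime import date


def add_years(d: date, years: int) -> date:
    """Return the date `years` later, clamping to the month's last valid day.

    Hypothetical helper for illustration: 29 Feb becomes 28 Feb when the
    target year is not a leap year.
    """
    target_year = d.year + years
    # Last valid day of the same month in the target year (28 or 29 for February).
    last_day = calendar.monthrange(target_year, d.month)[1]
    return d.replace(year=target_year, day=min(d.day, last_day))


leap_day = date(2012, 2, 29)

# Naive approach: bump the year and keep the day. This raises ValueError
# because 29 Feb 2013 does not exist.
try:
    leap_day.replace(year=leap_day.year + 1)
    naive_ok = True
except ValueError:
    naive_ok = False

# Safe approach: clamp the day to what the target month allows.
safe_result = add_years(leap_day, 1)
```

The naive version works on 365 days out of every 366 and so passes casual testing, which is exactly why a test case for 29th February is worth having.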
The Windows Azure outage has, of course, played into the hands of the cloud computing cynics and will give them plenty of fuel for their arguments. However, I won’t be joining them, even after an outage like this, and we will continue to host our site on the Windows Azure platform.
This outage will certainly knock the credibility of Microsoft as a cloud platform provider but it should not be seen as proof that cloud computing is doomed to fail. On the contrary, I think this outage will only improve the Azure platform in particular and cloud computing generally.
Every serious outage like this leads to improvements and to tighter processes that make a repeat highly unlikely, certainly for the same reason. It will also have severely tested Microsoft’s capability to respond to such a critical issue, and you can bet that Steve Ballmer will ensure that a full and thorough post-mortem is held and improvements are implemented.
I doubt that other cloud computing vendors will be gloating much about this outage. Every platform vendor knows that they are just one line of code away from having the same problem. In fact, most of Microsoft’s major competitors (Amazon, Google etc.) have suffered their own outages, so it can happen to anybody.
I would venture that there is not a single large enterprise that hasn’t had some form of internal service outage. From power cuts and loss of internet connectivity through to crashed servers, nobody can guarantee 100% uptime.
As further proof, you only need to look at that beacon of the financial markets, the London Stock Exchange (LSE), which has had a number of outages over the last few years.
Cloud computing cynics who justify their position on the back of this outage are akin to people who are afraid of flying and point to a plane crash to justify their fear. Yes, crashes occasionally happen, but each one leads to improvements in safety and reliability for the future. A plane crash doesn’t stop people flying, and I don’t think this outage will stop the growth of cloud computing.