helldesigner December 23, 2012 at 19:55

The major cloud service outages of 2012 and what we can learn from them

According to a recent report by the IWGCR (International Working Group on Cloud Computing Resiliency), cloud computing services are unavailable for an average of 7.5 hours per year. Companies that run their applications and services partly or entirely in the cloud were hit several times this year. Let's look at the biggest cloud service outages of 2012.
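To put the 7.5-hour figure in perspective, it can be converted into an availability percentage. The snippet below is a minimal sketch that assumes a non-leap year of 8,760 hours; the function names are illustrative and do not come from the IWGCR report.

```python
HOURS_PER_YEAR = 365 * 24  # 8760; assumption: a non-leap year


def availability_percent(downtime_hours: float,
                         period_hours: float = HOURS_PER_YEAR) -> float:
    """Return availability over the period as a percentage."""
    if not 0 <= downtime_hours <= period_hours:
        raise ValueError("downtime must be between 0 and the period length")
    return 100.0 * (1.0 - downtime_hours / period_hours)


def allowed_downtime_hours(availability_pct: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the downtime budget implied by an availability target."""
    return period_hours * (1.0 - availability_pct / 100.0)


if __name__ == "__main__":
    # Average yearly downtime quoted in the report
    print(f"{availability_percent(7.5):.3f}%")          # 99.914%
    # Downtime budget for a "three nines" (99.9%) target
    print(f"{allowed_downtime_hours(99.9):.2f} hours")  # 8.76 hours
```

In other words, 7.5 hours of downtime per year is roughly 99.91% availability, just inside a "three nines" budget of 8.76 hours.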

Microsoft Windows Azure
The largest and most widespread Windows Azure outage came in February: it affected all geographic regions, and full recovery took more than 24 hours. Microsoft said the outage was caused by a software bug in its date and time calculation for a leap year. The incident drew an angry reaction from users, who expected more information about the problem and more communication from Microsoft.
In July, Azure was unavailable again, this time in Western Europe, for 2.5 hours. The cause was a misconfigured network device, which prevented users from connecting.
Late in the fall, Office 365 suffered another outage, during which millions of user mailboxes were unavailable.
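The article says only that the February bug involved a leap-year date calculation. As an illustration of how this class of bug typically arises (not Microsoft's actual code), the sketch below computes "the same day next year" by incrementing the year, which fails for February 29. The function names are hypothetical.

```python
from datetime import date


def naive_one_year_later(d: date) -> date:
    """Buggy pattern: bump the year and keep month and day unchanged.

    Raises ValueError for February 29, because the following year
    has no such date.
    """
    return d.replace(year=d.year + 1)


def safe_one_year_later(d: date) -> date:
    """Same idea, but falls back to February 28 when the day does not exist."""
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        # Only February 29 can fail here; use the last day of February instead.
        return d.replace(year=d.year + 1, day=28)


if __name__ == "__main__":
    leap_day = date(2012, 2, 29)
    print(safe_one_year_later(leap_day))  # 2013-02-28
    try:
        naive_one_year_later(leap_day)
    except ValueError as exc:
        print("naive version failed:", exc)
```

Code like the naive version works on every other day of the year, which is why such defects can go unnoticed until a leap day arrives.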

Amazon Web Services
A power outage at Amazon Web Services in June cut users off from essential services for 6 hours. The affected services were Amazon Elastic Compute Cloud, Amazon Relational Database Service and AWS Elastic Beanstalk in the US East region, hosted in a data center in Virginia. Cloud management companies and PaaS providers were affected as well, including Stratalux, Digitaria, Heroku and the PaaS service provided by Salesforce.com. Popular sites such as Netflix, Pinterest, Reddit, Foursquare and Instagram also suffered from this outage.
Less than a month later, a second AWS outage occurred, after which one major customer publicly announced that it would stop using Amazon's services and look for alternatives.

Apple iCloud
In September, a large number of iCloud users could not access their mailboxes. The problem lay in the central iCloud service, so it affected Mac OS users, iOS device users and users of the iCloud.com web interface alike.

Google Gmail
This year, Google's Gmail service went down more often than in the past. The first outage occurred in April and lasted one hour. It affected fewer than 10 percent of users, and the main cause was a misconfiguration during a routine system upgrade. A second outage occurred in June and hit fewer than 1.5 percent of users.

Conclusion
Despite all the precautions that cloud service providers take, outages occur regularly for various reasons, such as human error, technical failure or natural disaster. But this is no reason to abandon the cloud. All of these risks can be managed with a comprehensive disaster recovery plan and a matching resiliency plan. Over time, the reliability of cloud platforms will grow, and fault tolerance will approach 100% (in reality, not just in the marketing department's claims). Sooner or later, hosting on physical servers will be replaced by the cloud.

Original article: www.rickscloud.com/major-cloud-outages-of-2012-to-learn-from
Posted by Rick Blaisdell
