929RA — Cloud IT Failures Emphasize Need for Expectation Management
What Is Happening? — This week Amazon Web Services (AWS) experienced two service outages in facilities on different continents. Not surprisingly, the causes, durations, and impacts of the outages have drawn considerable media and analyst attention, along with a fair amount of hyperbole.
Saugatuck continues to believe that Cloud IT as a phenomenon, and Cloud IT providers and services in general, can and should be relied upon to deliver critical IT-as-a-service. Cloud services and providers tend to be built for, and managed with, greater reliability and security than the vast majority of traditional data centers and on-premises systems.
But we are surprised to see that the “conventional wisdom” among users appears to have become that Cloud-based services are somehow impervious to, or at least less prone to, the types of failures and outages suffered by typical private data center infrastructures. Saugatuck’s ongoing research indicates that current and prospective users of Cloud IT continue to express surprisingly high levels of expectation and complacency when it comes to outsourcing critical IT to Cloud providers.
While the specifics of the outages are important, Saugatuck considers it more important for Cloud IT users to treat the outages as highly visible prompts to “level set” their expectations about Cloud offerings and to sharpen their focus on Service Level Agreements.
Why Is It Happening? — The outages experienced by AWS this week do not prove, or even suggest, that Cloud IT is fragile or somehow not ready for production workloads. In fact, typical vendors of Cloud IT offerings invest heavily in skilled staffing, IT infrastructure, and site facilities with the goal of delivering highly available services. The result is that, despite rapid growth in usage, Cloud services (including those of AWS) are some of the most reliable IT infrastructures in the world.
However, as history has taught us repeatedly (consider, e.g., the ‘unsinkable’ Titanic), human constructs are not impervious to failure (Saugatuck Lens 360 Blog, The Last Word: Clouds Fail, So Plan and Manage Accordingly). This is particularly true for something as complex as the infrastructure underlying any Cloud IT offering. Recall that when a service depends on every one of a group of components, and those components fail independently, the availability of the group is the product of all of the individual component availabilities. For example, the overall availability of 5 components, each with 99 percent availability, is: 0.99 x 0.99 x 0.99 x 0.99 x 0.99 = 0.951, or about 95 percent.
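The availability arithmetic can be checked with a minimal Python sketch. It assumes the components sit in series (the service is up only when every component is up) and fail independently. The function names and the 8,760-hour year used to convert availability into downtime are illustrative assumptions, not figures from Saugatuck's research.

```python
from math import prod


def series_availability(availabilities):
    """Availability of a chain in which every component must be up.

    Assumes independent failures, so the individual availabilities multiply.
    """
    return prod(availabilities)


def annual_downtime_hours(availability, hours_per_year=8760):
    """Expected hours of downtime per year (assumes a 365-day year)."""
    return (1 - availability) * hours_per_year


# Five components, each 99 percent available, as in the example above.
overall = series_availability([0.99] * 5)

print(f"Overall availability: {overall:.4f}")                            # 0.9510
print(f"Expected downtime: {annual_downtime_hours(overall):.0f} h/yr")   # 429 h/yr
```

Under these assumptions, one 99 percent component implies roughly 88 hours of downtime a year, while the chain of five implies roughly 429. That gap is one reason to read a Service Level Agreement against the whole service chain and not against any single component.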
Quiz any experienced enterprise IT leader, and they will understand this complexity and inter-relationship, and the effects on IT reliability and availability. They “get” the complexities of Cloud IT, in other words. And yet they, and their associated business executives and leaders, continue to maintain extremely high, possibly unrealistic, expectations of Cloud IT and its providers. Saugatuck attributes this to four factors. . .