Cloud capacity planning not just about capex -- it's an SLA issue, too

Do you know how to play Tetris? Good -- then you're halfway to understanding the rigors of cloud capacity planning. Learn the rest in this free chapter download.

In the eyes of customers, the cloud offers a seemingly limitless pool of computing and storage resources. In that fantasy, cloud providers never run out of capacity and effortlessly match nonstop cloud bursting. But the real challenges of cloud capacity planning are no match for that myth.

This excerpt from the twelfth chapter of Cloud Computing: Automating the Virtualized Data Center, "Cloud Capacity Management," lays out the basic challenges of and requirements for managing capacity in an Infrastructure as a Service (IaaS) platform. The chapter, which can be downloaded for free courtesy of Cisco Press, also covers how to model and manage capacity, the importance of demand management and the role of the procurement process in the cloud.

Cloud capacity planning: A game of Tetris

As discussed throughout this book, the cloud consumer views the cloud as an infinite set of resources that can be consumed as required. This clearly is not the case from the provider's point of view, nor is it a sustainable business model. The cloud provider, then, must create the illusion of infinite resources while continually optimizing and growing the underlying cloud platform in line with capital expenditure (capex) and service-level agreement (SLA) targets.

Thanks to falling hardware prices, there is often debate within IT organizations regarding the necessity of a formal capacity management process. This debate is based on the premise that capacity management is purely a capex issue; however, with the introduction of the cloud, capacity planning and management also becomes an SLA issue.

Poor cloud capacity planning and management may cause service providers to suffer SLA breaches due to oversubscribed equipment. A lack of resources to support the bursting of existing services or the instantiation of new services may also cause SLA violations. The self-service nature of the cloud and the increased use of virtualization and automation make cloud capacity planning and management a complex task, as providers can no longer overprovision their infrastructure to support peak usage. That's because, in the cloud, peak usage will vary dramatically -- both among different customers and for a single customer's various workloads. To visualize this, think of a typical data center as the video game Tetris, as illustrated in Figure 12-1.

A finite number of colored shapes (workloads) fall down into a clearly defined area (infrastructure point of delivery [POD]), and the position and orientation of each shape are optimized to fit that area. If the optimization process is successful, more shapes fall down; if not, they build up until they fill the hole. Now, increase the speed and use an infinite number of different shapes and colors (as workloads will have different resource dimensions), and you have a true cloud. Somewhere in the middle is the cloud that is available today from most providers. There isn't an infinite number of workloads, but the demand is still much higher than in a typical data center.
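The Tetris analogy can be made concrete with a small placement sketch. The Python below is only an illustration and is not taken from the book: the `Workload` and `Pod` classes, the two resource dimensions (CPU and memory), the sample figures and the first-fit-decreasing heuristic are all assumptions chosen to show how workloads of different "shapes" compete for a finite POD.

```python
from dataclasses import dataclass, field


@dataclass
class Workload:
    """A hypothetical workload 'shape' with two resource dimensions."""
    name: str
    cpu: int      # vCPUs requested (illustrative unit)
    ram_gb: int   # memory requested, in GB


@dataclass
class Pod:
    """A hypothetical POD with a fixed amount of remaining capacity."""
    name: str
    cpu_free: int
    ram_free_gb: int
    placed: list = field(default_factory=list)

    def try_place(self, w: Workload) -> bool:
        # A workload fits only if EVERY dimension fits.
        if w.cpu <= self.cpu_free and w.ram_gb <= self.ram_free_gb:
            self.cpu_free -= w.cpu
            self.ram_free_gb -= w.ram_gb
            self.placed.append(w.name)
            return True
        return False


def place_workloads(workloads, pods):
    """First-fit decreasing (an assumed heuristic): take the largest
    workloads first and put each into the first POD that has room.
    Returns the names of workloads that could not be placed."""
    unplaced = []
    ordered = sorted(workloads, key=lambda w: (w.cpu, w.ram_gb), reverse=True)
    for w in ordered:
        # any() stops at the first POD that accepts the workload.
        if not any(pod.try_place(w) for pod in pods):
            unplaced.append(w.name)
    return unplaced


if __name__ == "__main__":
    # Made-up capacities and demands, for illustration only.
    pods = [Pod("pod-a", 16, 64), Pod("pod-b", 8, 32)]
    workloads = [
        Workload("web", 4, 8),
        Workload("db", 8, 48),
        Workload("batch", 12, 16),
        Workload("cache", 2, 24),
        Workload("etl", 8, 16),
    ]
    print("unplaced:", place_workloads(workloads, pods))
    for pod in pods:
        print(pod.name, pod.placed, pod.cpu_free, pod.ram_free_gb)
```

With these made-up numbers, "db" and "cache" are left unplaced even though pod-a still has 40 GB of memory free, because its CPU is fully consumed. That stranded capacity is the gap at the bottom of the Tetris well: a shape that fits in one dimension but not the other cannot be placed.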

Capacity management, as defined by Information Technology Infrastructure Library Version 3 (ITIL V3), is seen as an ongoing process that starts in service strategy with an understanding of what is needed. It then evolves into a process of designing capacity management into the services, ensuring that all capacity gates have been met before the service is operational, and managing and optimizing the capacity throughout the service lifetime. You can view cloud capacity planning and management as the process of balancing supply against demand across a number of different dimensions -- the main two being cost and SLA.

Referring to Table 12-1, you can see that Forrester defines a number of maturity levels for capacity management. We think it's clear from the preceding passage that cloud providers need to have a maturity of at least the Defined level, which mandates a cloud capacity planning model of some sort.

Excerpted from Cloud Computing: Automating the Virtualized Data Center by Venkata Josyula, Malcolm Orr and Greg Page (ISBN: 978-1-58720-434-0). Copyright 2012, Cisco Press. All rights reserved.


About the book:

Cloud Computing brings together the realistic, start-to-finish guidance that cloud providers need to plan, implement and manage cloud solution architectures for tomorrow's virtualized data centers. It introduces cloud "newcomers" to essential concepts and offers experienced operations professionals detailed guidance on delivering IaaS, Platform as a Service (PaaS) and Software as a Service (SaaS).

This book's replicable solutions and fully tested best practices will help enterprises, service providers, consultants and Cisco partners meet the challenge of provisioning end-to-end cloud infrastructures. Cloud Computing will help cloud providers:

  • Review the key concepts needed to successfully deploy clouds and cloud-based services;
  • Transition common enterprise design patterns and use cases to the cloud;
  • Master architectural principles and infrastructure designs for "real-time" managed IT services;
  • Understand the Cisco approach to cloud-related technologies, systems and services;
  • Develop a cloud management architecture using ITIL, TMF, and ITU-TMN standards;
  • Implement best practices for cloud service provisioning, activation and management;
  • Automate cloud infrastructure to simplify service delivery, monitoring and assurance;
  • Choose and implement the right billing/chargeback approaches for your business;
  • Design and build IaaS services, from start to finish;
  • Manage the unique capacity challenges associated with sporadic, real-time demand;
  • Provide a consistent and optimal cloud user experience.

About the authors:

Venkata (Josh) Josyula, Ph.D., CCIE No. 13518, is a distinguished services engineer in Cisco Services Technology Group (CSTG) and advises Cisco customers on OSS/BSS architecture and solutions.

Malcolm Orr, solutions architect for Cisco's Services Technology Solutions, advises telecom and enterprise clients on architecting, building and operating OSS/BSS and cloud management stacks. He is Cisco's lead architect for several Tier 1 public cloud projects.

Greg Page has spent the last 11 years with Cisco in technical consulting roles relating to data center architecture/technology and service provider security. He is now exclusively focused on developing cloud/IaaS solutions with service providers and systems integrator partners.

This was last published in February 2012
