Tuesday, July 21, 2009

Between Elasticity and Scalability

A follow-up on an interesting question asked in the Cloud Computing Forum.

Definition of Scalability

"Scalability" is the ability to meet an increasing workload requirement by incrementally adding a proportional amount of resource capacity. Scalability is therefore quantified by a measure of "linearity": the ratio of resources to workload stays constant as the workload grows. This linearity allows an enterprise to plan its resource budget according to its foreseeable workload growth.

In reality, the range of linearity is not infinite. It is important to understand where the linearity reaches its limit and whether the workload will grow beyond that limit.
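
To make the idea of a linearity limit concrete, here is a minimal sketch in Python. The function name `linearity_limit`, the 10% tolerance and the measurements are all made up for illustration; the only thing taken from the definition above is that the resource : workload ratio should stay constant.

```python
def linearity_limit(measurements, tolerance=0.10):
    """Return the largest measured workload up to which the
    resource : workload ratio stays within `tolerance` of the
    first (baseline) ratio. Assumes measurements are sorted by workload."""
    baseline = measurements[0][1] / measurements[0][0]
    limit = None
    for workload, resources in measurements:
        ratio = resources / workload
        if abs(ratio - baseline) / baseline > tolerance:
            break  # linearity is lost from this point on
        limit = workload
    return limit


# Hypothetical measurements: (requests per second, servers needed)
measurements = [(1000, 10), (2000, 20), (4000, 41), (8000, 100), (16000, 260)]
print(linearity_limit(measurements))
```

With these invented numbers the ratio holds up to 4000 requests per second and then drifts, so the planning question becomes whether the workload is expected to grow past that point.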

A "scalable" design can push the limit of that linearity very far, far enough that you don't need to worry about the workload ever getting beyond it. A scalable architecture is typically built on a repeatable basic infrastructure with minimal coupling, so that growth can be achieved simply by repeatedly buying the same set of basic hardware.

Traditionally, people design for scalability to make sure that their operating cost grows linearly with the workload. They usually don't worry about "removing" resources, nor do they worry about whether the resource is fully utilized, because purchased resources are already a sunk cost. At least for now, "scalability" is not about managing a fluctuating workload and has no time dimension. But I expect that mindset to change as people move to cloud computing, where operating cost depends on usage. There is a financial motivation to decommission resources once they are no longer used.

Many existing distributed applications have a very scalable architecture, but very few of them are designed with extreme elasticity in mind. They usually expect the underlying infrastructure to be relatively stable with a low churn rate (machine crashes are the only cause of infrastructure changes, and they are relatively infrequent), so they are willing to accept higher degradation when churn does happen.

Definition of Elasticity

On the other hand, "elasticity" is the ability to instantly commission and decommission large amounts of resource capacity on the fly, with charges based purely on actual resource usage. "Elasticity" is quantitatively measured by
  • Speed of commissioning / decommissioning
  • Max amount of resources that can be brought in
  • Granularity of usage accounting
It is important to understand that even though you get "elasticity" automatically when your application is deployed in the cloud, you DO NOT get "scalability" (which is a characteristic of your application architecture). A non-scalable in-house application deployed into the cloud won't magically become scalable. Instead, you will just receive a huge bill from your cloud provider at month end. "Elasticity" is a double-edged sword: it lets bad things happen fast.
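
A rough way to see the "huge bill" effect is to model a non-scalable application as one where every extra server adds less capacity than the previous one. The sketch below is only an illustration: the contention formula, the per-server capacity, the hourly rate and all function names are assumptions of mine, not measurements from any real provider.

```python
def capacity(servers, per_server=100.0, contention=0.0):
    """Total throughput of `servers` nodes. contention=0 is perfectly
    linear; contention>0 models coupling between nodes (an assumed model)."""
    return per_server * servers / (1.0 + contention * (servers - 1))


def servers_needed(workload, contention, limit=10_000):
    """Smallest number of servers that can carry `workload`, or None if
    the architecture cannot reach it within `limit` servers."""
    for n in range(1, limit + 1):
        if capacity(n, contention=contention) >= workload:
            return n
    return None


def monthly_bill(servers, hourly_rate=0.10, hours=720):
    """Usage-based charge for keeping `servers` running for a month."""
    return servers * hourly_rate * hours


workload = 900  # hypothetical requests per second
for label, contention in [("scalable", 0.0), ("non-scalable", 0.05)]:
    n = servers_needed(workload, contention)
    print(label, n, monthly_bill(n))
```

Under these assumptions the non-scalable version needs noticeably more servers for the same workload, and past a certain workload `servers_needed` returns None: the cloud will happily rent you more machines, but they no longer help.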

"Elasticity" is a pretty new feature to our industry and we are in the process of discovering how we can fully utilize it. Most of existing distributed applications are built with scalability but not extreme elasticity in mind. I think an ideal app should have both "scalability" and "elasticity" built in, but I haven't seen much yet. Therefore, although most of existing applications can run in the cloud without modification, I don't think they have fully exploit the benefits of the cloud yet.

"Statefulness" and "Elasticity" are fundamentally in conflict with each other. Data transfer rate has put a limit of the speed of commissioning/decommissioning, which is a measure of elasticity. In other words, I believe smart data distribution algorithm can reduce this friction by reducing the amount of data transfer during commissioning. This is an area of research that I'd love to see.

New Design Paradigm for Cloud Applications

In order to fully exploit the benefits of the cloud, a new set of design considerations is necessary, with both scalability and elasticity in mind. There is plenty of coverage of how to do a scalable design, which I won't repeat here. Let's focus on "design for elasticity".

  1. Make your application as stateless as possible (but don't just dump all state somewhere else)
  2. Use a smart data migration / balancing algorithm that minimizes data movement when the infrastructure grows and shrinks elastically. A typical way to do this is to use asynchronous data transfer so that your application can start serving immediately (even before the data transfer completes)
  3. Use a smart scheduler that controls the growth and shrinkage of the environment. Such a scheduling algorithm takes the usage charge and the performance impact during the settling period into consideration when deciding whether a change to the environment should be initiated. Underneath, there is also a monitoring mechanism that collects the necessary metrics, performs the analytics and feeds the results back to the scheduler to fine-tune its earlier policy decisions.
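
The scheduler in point 3 could look roughly like the sketch below. Everything in it is an assumption for illustration: the thresholds, the cost figures, the names `Metrics`, `Policy`, `decide` and `tune`, and the simple rule that a node is only removed when the expected saving outweighs the estimated cost of the settling period.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    utilization: float  # average utilization across nodes, 0.0 to 1.0
    nodes: int          # number of nodes currently running


@dataclass
class Policy:
    high_water: float = 0.80        # add a node above this utilization
    low_water: float = 0.30         # consider removing a node below this
    node_hourly_cost: float = 0.10  # usage charge per node per hour
    settle_cost: float = 0.50       # estimated cost of degradation while data rebalances
    horizon_hours: float = 4.0      # how long the current load level is expected to last


def decide(metrics, policy):
    """Return +1 (add a node), -1 (remove a node) or 0 (leave as is)."""
    if metrics.utilization > policy.high_water:
        return +1
    if metrics.utilization < policy.low_water and metrics.nodes > 1:
        savings = policy.node_hourly_cost * policy.horizon_hours
        if savings > policy.settle_cost:
            return -1
    return 0


def tune(policy, observed_settle_cost, alpha=0.2):
    """Feedback step: blend the settle cost observed by monitoring
    into the policy using an exponential moving average."""
    policy.settle_cost = (1 - alpha) * policy.settle_cost + alpha * observed_settle_cost


policy = Policy(horizon_hours=10.0)
print(decide(Metrics(utilization=0.90, nodes=4), policy))
print(decide(Metrics(utilization=0.20, nodes=4), policy))
```

The point of the `tune` step is the feedback loop: if monitoring shows that shrinking the cluster hurts more than expected, the scheduler becomes more reluctant to do it next time.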


Anonymous said...

Good point about not worrying about "whether the resource is fully utilized".

Indeed, I would say that before we had mature virtualization technologies, it was nearly impossible to make use of idle resources.

Tinniam V Ganesh said...

If there is a need to share global reads and writes between instances, what is the technique? Sometimes we cannot avoid stateful designs, can we?