Monday, August 17, 2009

Multi-tenancy in cloud computing

A follow-up on an interesting discussion in the Cloud Computing discussion group. What is a tenant? Is multi-tenancy an important feature of the cloud? Who are the participants in the cloud ecosystem, and what are their roles?

Participants in the cloud
In my model, a "SaaS provider" is the organization that provides a domain-specific SaaS app to its users (e.g. SmugMug for photo sharing). In this case, the SaaS consumer is any individual who has a SmugMug account. The SaaS provider may choose an infrastructure provider (e.g. Amazon) to host its SaaS app. In this example, SmugMug is a SaaS provider and an infrastructure consumer at the same time.

Definition of a Tenant
Now, who is the "tenant" in this picture? I think Amazon will consider SmugMug a tenant. But I doubt SmugMug will consider its individual users tenants.

But what if SmugMug offers a service to car manufacturers so they can store, organize and process their photos, which then show up on the car manufacturer's website? Will SmugMug consider BMW a tenant? I think the answer is "yes". So maybe the definition of a tenant is "my user who has her own users".

You can see that a value chain can be built up. Except for the start and end points of this value chain, everyone is a "tenant" of its service provider.
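The definition above can be sketched as a small check over the value chain. This is only an illustration: the `SERVES` mapping and the `is_tenant` helper are hypothetical names, and the "website visitor" at the end of the chain is my assumption about who BMW's own users would be.

```python
# Hypothetical value chain: each provider maps to the users it serves.
SERVES = {
    "Amazon": ["SmugMug"],
    "SmugMug": ["BMW"],
    "BMW": ["website visitor"],  # assumed end user of the car maker's site
}

def is_tenant(party, serves=SERVES):
    """A tenant is 'my user who has her own users'."""
    # The party must be served by some provider ...
    has_provider = any(party in users for users in serves.values())
    # ... and must itself serve at least one user.
    has_own_users = bool(serves.get(party))
    return has_provider and has_own_users

tenants = [p for p in ["Amazon", "SmugMug", "BMW", "website visitor"] if is_tenant(p)]
```

Under these assumptions only the middle links (SmugMug and BMW) qualify; Amazon has no provider in the chain and the website visitor has no users of its own.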

Now that we have defined what a "tenant" is, what does "multi-tenancy" mean? In my opinion, "multi-tenancy" is for the benefit of the service provider, who can manage resource utilization more efficiently; it is not to the tenant's advantage at all. In the hypothetical example I gave above, would BMW prefer a multi-tenant environment from SmugMug? My guess is that BMW would in fact worry if its data sat together with its competitors' data in a shared infrastructure. I bet they would prefer an environment that is isolated as much as possible.

While "multi-tenancy" indicates that some infrastructure is shared, the layer at which things are shared can make a big difference. For example, Amazon AWS is multi-tenant at the hardware level in that its users may be sharing a physical machine. On the other hand, is multi-tenant at the DB level in that its users are sharing data in the same DB tables. And Amazon is relying on the hypervisor to provide the isolation between tenants while is relying on a query rewriter to do the same.
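To make the DB-level case more concrete, here is a minimal sketch of isolation by query rewriting over a shared table. It is a toy under stated assumptions, not any vendor's actual mechanism: the `photos` table, its `tenant_id` column, the tenant names, and the `rewrite` and `run_as_tenant` helpers are all hypothetical, and the rewriter only handles a plain `FROM photos` clause. A real rewriter would parse the SQL rather than pattern-match it.

```python
import re
import sqlite3

def make_shared_db():
    """Build one shared table that holds every tenant's rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE photos (tenant_id TEXT, name TEXT)")
    conn.executemany(
        "INSERT INTO photos VALUES (?, ?)",
        [("bmw", "coupe.jpg"), ("bmw", "sedan.jpg"), ("rival", "wagon.jpg")],
    )
    return conn

def rewrite(sql):
    """Swap the shared table for a subquery limited to one tenant's rows."""
    scoped = "FROM (SELECT * FROM photos WHERE tenant_id = :tenant_id) AS photos"
    return re.sub(r"\bFROM\s+photos\b", scoped, sql, flags=re.IGNORECASE)

def run_as_tenant(conn, tenant_id, sql, params=None):
    """Run a tenant's query after rewriting it; named parameters only."""
    # tenant_id is merged last so a caller cannot override it via params.
    bound = {**(params or {}), "tenant_id": tenant_id}
    return conn.execute(rewrite(sql), bound).fetchall()

conn = make_shared_db()
bmw_rows = run_as_tenant(conn, "bmw", "SELECT name FROM photos ORDER BY name")
```

The tenant writes its query as if it owned the whole table, and the isolation comes entirely from the rewrite step. That is the contrast with the hypervisor case, where the isolation boundary sits below the operating system rather than inside the data layer.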

While "multi-tenancy" at the highest layer basically advocates a shared-DB approach, does it enable better collaboration or sharing between tenants? I don't think so. I think all we need is an authentication model in which spontaneous workgroups can be formed and membership can be verified. Then it is just a matter of the requesting tenant presenting its membership to another tenant when making a SaaS service call. What I mean is that tenants use an SOA approach to access each other's data, rather than directly accessing a shared DB.
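As a rough sketch of what "presenting membership on a service call" could look like, the example below signs a membership claim with an HMAC and has the called service verify it before returning data. Everything here is an assumption for illustration: the shared `SECRET`, the `launch-campaign` workgroup, and the `issue_membership`, `verify_membership` and `list_campaign_photos` functions are hypothetical. A real deployment would more likely use public-key signatures from an identity service, so that verifiers cannot mint tokens themselves.

```python
import hashlib
import hmac

SECRET = b"demo-secret"  # hypothetical key held by the identity service

def _sign(claim, secret):
    return hmac.new(secret, claim.encode(), hashlib.sha256).hexdigest()

def issue_membership(tenant, group, secret=SECRET):
    """Issue a token stating that `tenant` belongs to workgroup `group`."""
    # Toy format: names must not contain the '|' separator.
    claim = f"{tenant}|{group}"
    return f"{claim}|{_sign(claim, secret)}"

def verify_membership(token, group, secret=SECRET):
    """Check the signature and that the claim is for the expected group."""
    parts = token.split("|")
    if len(parts) != 3:
        return False
    tenant, claimed_group, sig = parts
    expected = _sign(f"{tenant}|{claimed_group}", secret)
    # Constant-time comparison on bytes avoids timing leaks.
    return hmac.compare_digest(sig.encode(), expected.encode()) and claimed_group == group

def list_campaign_photos(token):
    """A service call on the other tenant's side, gated by membership."""
    if not verify_membership(token, "launch-campaign"):
        raise PermissionError("caller is not a member of the workgroup")
    return ["coupe.jpg"]

photos = list_campaign_photos(issue_membership("bmw", "launch-campaign"))
```

The point of the design is that the data never leaves the owning tenant's service boundary; the caller proves membership and gets back only what the service chooses to expose.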

1 comment:

rasputnik said...

Another model seems to be the one adopted by Heroku or the Google App Engine: users essentially package their apps/workloads as discrete bundles which can co-exist on the service provider's servers (that is, Heroku's EC2 instances).

There's been a lot of debate about how Java on GAE is not fully JEE compliant; to my mind, most of the difference is due to the requirement to host disparate customer applications safely.