How it works ?
Memcached is organized as a farm of N servers. The storage model can be considered as a huge HashTable partitioned among these N servers.
Every API request takes a "key" parameter. There is a 2-step process at the client lib ...
- Given the key, locate the server
- Forward the request to that server
API
Memcached provide a HashTable-like interface, so it has ...
- get(key)
- set(key, value)
- get_multi(["k1", "k2", "k3"])
- get_multi(["user1:k1", "user1:k2", "user1:k3"]) -- This request just go to the server hosting all keys of "user1:*"
For updating data, Memcached provides a number of variations.
- set(key, value, expiration) -- Memcached guarantees the item will never be staying in the cache once the expiration time is reached. (Note that it is possible that the item being kicked out before expiration due to cache full)
- add(key, value, expiration) -- Success only when no entry of the key exist.
- replace(key, value, expiration) -- Success only when an entry of the key already exist.
When one of the server crashes, all entries owned by that server is lost. Higher resilience can be achieved by storing redundant copies of data in different servers. Memcached has no support for data replication. This has to be taken care by the application (or client lib).
Note that the default server hashing algorithm doesn't handle the growth and shrink of the number of servers very well. When the number of servers changes, the ownership equation (key mod N) will all be wrong. In other words, if the crashed server needs to be taken out from the pool, the total number of servers will be decreased by one and all the existing entries needs to be redistributed to different server. Effectively, the whole cache (among all server) is invalidated even when just one server crashes.
So one approach to address this problem is to retain the number of Memcached servers across system crashes. We can have a monitor server to detect the heartbeat of all Memcached server and in case any crashes is detected, start a new server with the same IP address as the dead server. In this case, although the new server will still lost all the entries and has to repopulate the cache, the ownership of the keys are unchanged and data within the surviving node doesn't need to be redistributed.
Another approach is to run logical servers within a farm of physical machines. When a physical machine crashes, its logical servers will be re-start in the surviving physical machines. In other words, the number of logical servers is unchanged even when crashes happens. This logical server approach is also good when the underlying physical machines has different memory capacity. We can start more Memcached process in the machine with more memory and proportionally spread the cache according to memory capacity.
We also can use a more sophisticated technique called "consistent hashing", which localize the ownership changes to just the neighbor of the crashed server. Under this schema, each server is assigned with an id under the same key space. The ownership of a key is determined by the closest server whose key is the first one encountered when walking in the anti-clockwise direction. When a server crashes, its immediate upstream neighbor server (walking along the anti-clockwise direction) will adopt the key ownership of the dead server, while all other servers has the same ownership of key range unchanged.
Atomicity
Each request to Memcached is atomic by itself. But there is no direct support for atomicity across multiple requests. However, App can implement its own locking mechanism by using the "add()" operation provide by Memcached as follows ...
success = add("lock", null, 5.seconds)
if success
set("key1", value1)
set("key2", value2)
cache.delete("lock")
else
raise UpdateException.new("fail to get lock")
end
Memcached also support a "check-and-set" mechanism that can be used for optimistic concurrency control. The basic idea is to get a version stamp when getting an object and pass that version stamp in the set method. The system will verify the version stamp to make sure the entry hasn't been modified by something else or otherwise, fail the update.
data, stamp = get_stamp(key)
...
check_and_set(key, value1, stamp)
What Memcached doesn't do ?
Memcached's design goal is centered at performance and scalability. By design, it doesn't deal with the following concerns.
- Authentication for client request
- Data replication between servers for fault resilience
- Key > 250 chars
- Large object > 1MB
- Storing collection objects
First of all, think carefully about whether you really need to have data replication at the cache level, given that cache data should always be able to recreated from the original source (although at a higher cost).
The main purpose of using a cache is for "performance" reason. If your system cannot tolerate data lost at the cache level, rethink your design !
Although Memcached doesn't provide data replication, it can easily be done by the client lib or at the application level, based on a similar idea described below.
At the client side, we can use multiple keys to represent different copies of the same data. A monotonically increasing version number is also attached with the data. This version number is used to identify the most up-to-date copy and will be incremented for each update.
When doing update, we update all the copies of the same data via different keys.
def reliable_set(key, versioned_value)
key_array = [key+':1', key+':2', key+':3']
new_value = versioned_value.value
new_version = versioned_value.version + 1
new_versioned_value =
combine(new_value, new_version)
for k in key_array
set(k, new_versioned_value)
end
end
For reading the data from cache, use "multi-get" for multiple keys (one for each copy) and return the copy which has the latest version. If any discrepancy is detected (ie: some copies have a lacking version, or some copies are missing), start a background thread to write the latest version back to all copies.def reliable_get(key)
key_array = [key+':1', key+':2', key+':3']
value_array = get_multi(key_array)
latest_version = 0
latest_value = nil
need_fix = false
for v in value_array
if (v.version > latest_verson)
if (!need_fix) && (latest_version > 0)
need_fix = true
end
latest_version = v.version
latest_value = v.value
end
end
versioned_value =
combine(latest_value, latest_version)
if need_fix
Thread.new do
reliable_set(key, versioned_value)
end
end
return versioned_value
end
When we delete the data, we don't actually remove it. Instead, we mark the data as deleted but keep it in the cache and let it expire.User Throttling
An interesting use case other than caching is to throttle user that is too active. Basically you want to disallow user request that is too frequent.
user = get(userId)
if user != null
disallow request and warn user
else
add(userId, anything, inactive_period)
handle request
end
9 comments:
Note that as of 1.4.2 you *can* store objects greater than 1MB, but that's still discouraged (most people who try that are doing the wrong thing).
Also, as of 1.4.3 (which will be in RC in a few hours), the server can be configured to use SASL authentication and refuse clients that don't authenticate. The clients you have deployed today won't have this support, but the functionality should be trivial for your client vendor to add.
Thanks for comment !
I just add a section on data replication. Love to hear your comment.
I think your replication section is pretty well on. I like to ask people if they replicate their L1 or L2 processor caches. :)
But there are clients that do the replication for you without so much work at the application level (though yours is a good idea, I think).
Thanks for comment.
Yes, if the client lib provides replication, that's even better. But even doing at the App level is not too much of work, you have seen pretty much all the code already.
Do I misunderstand the throttling example or did you mean to write:
if user != null
?
You are right ! I correct it.
Thanks for pointing it out
very nicely explained article, keep it up!
Hi Ricky,
As the last point in the section -'What Memcached doesn't do?', you have mentioned that memcached can't store collection objects. Do you mean Java Collection Objects like Lists, Sets, HashMaps etc?
hello,
nice article, i know 2 years ago, but this could be interesting for somebody.
there is a path for memcached 1.2.x, add multi master replication feature
http://repcached.lab.klab.org/
thx, faha
Post a Comment