How it works ?
Memcached is organized as a farm of N servers. The storage model can be considered as a huge HashTable partitioned among these N servers.
Every API request takes a "key" parameter. There is a 2-step process at the client lib ...
- Given the key, locate the server
- Forward the request to that server
Memcached provide a HashTable-like interface, so it has ...
- set(key, value)
- get_multi(["k1", "k2", "k3"])
- get_multi(["user1:k1", "user1:k2", "user1:k3"]) -- This request just go to the server hosting all keys of "user1:*"
For updating data, Memcached provides a number of variations.
- set(key, value, expiration) -- Memcached guarantees the item will never be staying in the cache once the expiration time is reached. (Note that it is possible that the item being kicked out before expiration due to cache full)
- add(key, value, expiration) -- Success only when no entry of the key exist.
- replace(key, value, expiration) -- Success only when an entry of the key already exist.
When one of the server crashes, all entries owned by that server is lost. Higher resilience can be achieved by storing redundant copies of data in different servers. Memcached has no support for data replication. This has to be taken care by the application (or client lib).
Note that the default server hashing algorithm doesn't handle the growth and shrink of the number of servers very well. When the number of servers changes, the ownership equation (key mod N) will all be wrong. In other words, if the crashed server needs to be taken out from the pool, the total number of servers will be decreased by one and all the existing entries needs to be redistributed to different server. Effectively, the whole cache (among all server) is invalidated even when just one server crashes.
So one approach to address this problem is to retain the number of Memcached servers across system crashes. We can have a monitor server to detect the heartbeat of all Memcached server and in case any crashes is detected, start a new server with the same IP address as the dead server. In this case, although the new server will still lost all the entries and has to repopulate the cache, the ownership of the keys are unchanged and data within the surviving node doesn't need to be redistributed.
Another approach is to run logical servers within a farm of physical machines. When a physical machine crashes, its logical servers will be re-start in the surviving physical machines. In other words, the number of logical servers is unchanged even when crashes happens. This logical server approach is also good when the underlying physical machines has different memory capacity. We can start more Memcached process in the machine with more memory and proportionally spread the cache according to memory capacity.
We also can use a more sophisticated technique called "consistent hashing", which localize the ownership changes to just the neighbor of the crashed server. Under this schema, each server is assigned with an id under the same key space. The ownership of a key is determined by the closest server whose key is the first one encountered when walking in the anti-clockwise direction. When a server crashes, its immediate upstream neighbor server (walking along the anti-clockwise direction) will adopt the key ownership of the dead server, while all other servers has the same ownership of key range unchanged.
Each request to Memcached is atomic by itself. But there is no direct support for atomicity across multiple requests. However, App can implement its own locking mechanism by using the "add()" operation provide by Memcached as follows ...
success = add("lock", null, 5.seconds) if success set("key1", value1) set("key2", value2) cache.delete("lock") else raise UpdateException.new("fail to get lock") end
Memcached also support a "check-and-set" mechanism that can be used for optimistic concurrency control. The basic idea is to get a version stamp when getting an object and pass that version stamp in the set method. The system will verify the version stamp to make sure the entry hasn't been modified by something else or otherwise, fail the update.
data, stamp = get_stamp(key) ... check_and_set(key, value1, stamp)
What Memcached doesn't do ?
Memcached's design goal is centered at performance and scalability. By design, it doesn't deal with the following concerns.
- Authentication for client request
- Data replication between servers for fault resilience
- Key > 250 chars
- Large object > 1MB
- Storing collection objects
First of all, think carefully about whether you really need to have data replication at the cache level, given that cache data should always be able to recreated from the original source (although at a higher cost).
The main purpose of using a cache is for "performance" reason. If your system cannot tolerate data lost at the cache level, rethink your design !
Although Memcached doesn't provide data replication, it can easily be done by the client lib or at the application level, based on a similar idea described below.
At the client side, we can use multiple keys to represent different copies of the same data. A monotonically increasing version number is also attached with the data. This version number is used to identify the most up-to-date copy and will be incremented for each update.
When doing update, we update all the copies of the same data via different keys.
For reading the data from cache, use "multi-get" for multiple keys (one for each copy) and return the copy which has the latest version. If any discrepancy is detected (ie: some copies have a lacking version, or some copies are missing), start a background thread to write the latest version back to all copies.
def reliable_set(key, versioned_value) key_array = [key+':1', key+':2', key+':3'] new_value = versioned_value.value new_version = versioned_value.version + 1 new_versioned_value = combine(new_value, new_version) for k in key_array set(k, new_versioned_value) end end
When we delete the data, we don't actually remove it. Instead, we mark the data as deleted but keep it in the cache and let it expire.
def reliable_get(key) key_array = [key+':1', key+':2', key+':3'] value_array = get_multi(key_array) latest_version = 0 latest_value = nil need_fix = false for v in value_array if (v.version > latest_verson) if (!need_fix) && (latest_version > 0) need_fix = true end latest_version = v.version latest_value = v.value end end versioned_value = combine(latest_value, latest_version) if need_fix Thread.new do reliable_set(key, versioned_value) end end return versioned_value end
An interesting use case other than caching is to throttle user that is too active. Basically you want to disallow user request that is too frequent.
user = get(userId) if user != null disallow request and warn user else add(userId, anything, inactive_period) handle request end