By now everybody who has ever worked on a distributed system has heard of the CAP theorem. Simply put, it states that you have to choose between being (C)onsistent or being (A)vailable. The P, standing for (P)artition tolerant, is not really a choice for a distributed system (see the first fallacy of distributed computing: "the network is reliable").
So that leaves us in a peculiar state: we can’t deliver what our customers want. It does give us a great excuse for creating inconsistent or unavailable software, but I am not sure if that’s much of an upside. But is the situation really that dire?
In proving Brewer’s conjecture and thereby making it a theorem, Seth Gilbert and Nancy Lynch had to define what it means to be consistent. And that is a very strict definition indeed. Loosening the definition of ‘being consistent’ can circumvent the CAP triangle. This is exactly what conflict-free replicated data types (or CRDTs for short) intend to do.
What’s a CRDT?
A CRDT is one of a family of data structures that allow you to store and retrieve data in a distributed way. The data structures in Java’s Collection Framework are fast and efficient per-JVM structures. But they are not distributed; that is something you have to coordinate yourself. CRDTs are distributed: they work over multiple replicas (that’s what the (R)eplicated in CRDT stands for). And they deliver a form of consistency that is actually quite usable. Maybe not as strict as you would like, but hey, blame CAP’s mathematics for that impossibility.
So what kind of consistency does a CRDT deliver? Well, they promise strong eventual consistency and monotonicity. Let’s have a look at what that means.
Strong eventual consistency
Eventual consistency is a loosely defined term that means that replicas of a certain data structure don’t always have to be consistent with each other, but that at some point, somehow, they will be. Strong eventual consistency is more precisely defined. It states that two replicas that have received exactly the same updates are in the same state, regardless of the order in which those updates arrived. That’s actually quite nice and feels ‘natural’: a replica that hasn’t received all updates naturally isn’t, well, up-to-date. And since the updates don’t have to come in order, we can accept updates on all replicas and don’t have to care too much about timing and synchronisation (hence the conflict-free part of CRDT). Of course, we are only “really consistent” once no new updates arrive for a while. If that never happens, well, you are out of luck. But still, up to some update back in time, the replicas will be consistent.
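The order-independence is easiest to see with a grow-only set, which we’ll meet again later: merging by set union gives the same result no matter how the updates are interleaved. A minimal sketch (class and variable names are just illustrative):

```java
import java.util.HashSet;
import java.util.Set;

public class OrderDemo {
    public static void main(String[] args) {
        Set<String> replica1 = new HashSet<>();
        Set<String> replica2 = new HashSet<>();

        // The same two updates arrive at each replica, in different orders.
        replica1.add("x");
        replica1.add("y");

        replica2.add("y");
        replica2.add("x");

        // Both replicas have received the same updates, so strong eventual
        // consistency demands they are in the same state - and they are.
        System.out.println(replica1.equals(replica2)); // prints "true"
    }
}
```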
Monotonicity
This is the other nice property of a CRDT. It guarantees that if you read multiple times from a replica, you’ll never go back in time. That may seem obvious, but for distributed systems it’s not necessarily guaranteed. For example, with master-slave replication you typically update the master and read from a slave. If that’s hidden from your client, you could easily go back in time: update the master and immediately read from the slave, and the slave may be behind in updates, so from your client’s perspective you went back in time.
Ok, cool. Give me an example!
One of the simplest CRDTs is a global tick counter. Imagine that you want to keep track of how often the ads on your website are viewed (a tick). Since that might be a massive amount and it is critical for your business, you want to distribute that. A GCounter data structure can do that for you. Each instance of the GCounter can receive ticks, and they’ll distribute their counts amongst all instances. Since they’re CRDTs, you get the strong eventual consistency and monotonicity guarantees.
The magic works like this: each instance keeps track of all instances’ counts. In other words, each instance has a Map&lt;Instance, Integer&gt;. The total count (over all instances) is thus simply the sum of all values in that map. If an instance receives a new tick, it increases the counter for its own Instance in the map. And each time something changes in that map, the instance publishes its map to all other instances. The other instances then merge it into their own maps by taking the element-wise maximum of the two. Since an instance only ever increases its own entry, taking the maximum per entry never loses a tick, and because the merge is commutative, associative and idempotent, the replicas converge no matter in which order (or how often) the maps arrive.
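The description above can be sketched in a few lines of Java. This is a minimal single-process illustration, not a production implementation; the class and method names are made up for this example, and real libraries would add networking and serialization:

```java
import java.util.HashMap;
import java.util.Map;

// A minimal GCounter sketch. Each replica is identified by a replicaId
// and only ever increments its own entry in the map.
class GCounter {
    private final String replicaId;
    private final Map<String, Integer> counts = new HashMap<>();

    GCounter(String replicaId) {
        this.replicaId = replicaId;
    }

    // Local update: bump our own slot only.
    void increment() {
        counts.merge(replicaId, 1, Integer::sum);
    }

    // Total count = sum of all replicas' slots.
    int value() {
        return counts.values().stream().mapToInt(Integer::intValue).sum();
    }

    // Merge another replica's map by element-wise maximum. This operation
    // is commutative, associative and idempotent, so replicas converge
    // regardless of the order or repetition of merges.
    void merge(GCounter other) {
        other.counts.forEach((id, c) -> counts.merge(id, c, Math::max));
    }
}
```

Two replicas that tick independently and then exchange their maps end up agreeing on the total, whichever direction the merges happen in.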
I want more!
Ok, a global tick counter doesn’t seem too impressive. Even if it has strong eventual consistency and monotonicity, there has to be more! And yes, there is more! For example, if you add a second Map to the implementation, you can do up- and down-ticks. That’s called a PNCounter. But you can also create Sets that guarantee uniqueness of objects. For example, there’s the GSet (grow-only set), the 2PSet (with adds and removes, remove-wins), the Observed-Removed Set (with multiple adds and removes), the Last-Writer-Wins Set (with ‘timestamps’), the AWORSet (add-wins observed-removed set), the RWORSet (remove-wins observed-removed set), the MVRegister and the SUSet. And probably, by the time you read this, plenty more.
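The PNCounter really is just the second-Map trick applied to the GCounter above. A hedged sketch (again, names are illustrative, not from any particular library): one map counts increments, the other counts decrements, and both merge by element-wise maximum, so the counter inherits the GCounter’s convergence guarantees:

```java
import java.util.HashMap;
import java.util.Map;

// A minimal PNCounter sketch: two GCounter-style maps, one for
// increments (p) and one for decrements (n).
class PNCounter {
    private final String replicaId;
    private final Map<String, Integer> p = new HashMap<>();
    private final Map<String, Integer> n = new HashMap<>();

    PNCounter(String replicaId) {
        this.replicaId = replicaId;
    }

    void increment() {
        p.merge(replicaId, 1, Integer::sum);
    }

    void decrement() {
        n.merge(replicaId, 1, Integer::sum);
    }

    // Value = total increments minus total decrements.
    int value() {
        return p.values().stream().mapToInt(Integer::intValue).sum()
             - n.values().stream().mapToInt(Integer::intValue).sum();
    }

    // Merge each map by element-wise maximum, exactly like a GCounter.
    void merge(PNCounter other) {
        other.p.forEach((id, c) -> p.merge(id, c, Math::max));
        other.n.forEach((id, c) -> n.merge(id, c, Math::max));
    }
}
```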
You can find more on CRDTs in this paper, and actual implementations at these links: