Conflict-Free Replicated Data Types
CRDTs (Conflict-Free Replicated Data Types) are data structures designed so that replicas can be updated independently and merged without conflicts.
The problem they solve: in a distributed system, you often want multiple nodes to update data without coordinating first. But then how do you merge those updates? What if two nodes update the same thing differently?
CRDTs make merging deterministic by construction. The math guarantees that all replicas converge to the same state, regardless of the order updates arrive. No coordination needed.
Two flavors:
Operation-based (op-based): You broadcast operations, not state. Operations need to be commutative - order can’t matter. Think “increment counter” rather than “set counter to 5”.
State-based: You periodically send your whole state to other replicas, which merge it with theirs. The merge function has to be commutative, associative, and idempotent (merging twice is the same as merging once).
Common CRDT types:
- G-Counter: A counter that only goes up. Each node tracks its own count, totals are summed.
- PN-Counter: Can increment and decrement - essentially two G-Counters.
- LWW-Register: Last-writer-wins, using timestamps to resolve conflicts.
- OR-Set: A set where you can add and remove elements. Uses unique tags to handle add/remove conflicts.
CRDTs show up in collaborative editing (Google Docs uses something similar), distributed databases (Riak, Redis CRDB), and anywhere you want availability over strong consistency.
The tradeoff: CRDTs give you eventual consistency without coordination, but you’re limited to operations that can merge cleanly. Not everything fits the model, and the ones that do can have subtle semantics (what does “delete” mean if another replica just added the same thing?).