Synchronization in Distributed Systems
Imagine an orchestra with no conductor, where each musician is in a different room and can only communicate by passing notes under the door. Notes might arrive late, in the wrong order, or not at all. Now play a symphony.
That’s synchronization in distributed systems. You have independent nodes that need to coordinate - agree on time, agree on state, agree on who does what - and the only tool they have is unreliable message passing.
The meta-problem
Ordering is one piece of the puzzle. But synchronization is the bigger challenge that ordering, consensus, consistency, and replication are all facets of. Every time you have multiple nodes and they need to behave as something resembling a coherent system, you’re doing synchronization.
The constraints are always the same. No shared memory - nodes can’t peek at each other’s state, so everything they know about each other comes through messages that are stale by the time they arrive. No shared clock - I wrote about this in the ordering post. Unreliable communication - messages get lost, delayed, duplicated, reordered. And partial failure - some nodes crash while others keep running, and you often can’t tell whether a node is dead or just slow. A 10-second network delay looks exactly like a crashed node until the response finally shows up.
These aren’t problems you solve once. They’re the water you swim in.
The design space
There’s a fundamental tension at the core of all synchronization: coordination makes systems correct, but it also makes them slow. Every approach to synchronization is a different answer to the question “how much coordination can we afford?”
On one end you have logical clocks - lightweight, no central coordination, just some metadata piggybacking on messages you were already sending. They track causality, which is enough for ordering events. But that’s all they do. They don’t help nodes agree on values or make decisions together. They’re the cheapest possible synchronization, and for plenty of systems, they’re enough.
When you need actual agreement - multiple nodes committing to the same value - you need consensus. Paxos, Raft, ZAB. The core idea is always: propose a value, get a majority to accept it, now everyone agrees. Raft is the one most people encounter first because it was explicitly designed to be understandable (Paxos is notoriously hard to reason about, which is itself an interesting lesson in how systems get adopted). The cost is real though. Every consensus round is multiple network hops. In the normal case, Raft needs the leader to propose, followers to acknowledge, the leader to commit, followers to apply. When the leader fails, you pay even more to elect a new one. For a coordination service like etcd where you’re doing tens of thousands of operations, not millions, this is fine. For a high-throughput data store, it’s a bottleneck.
Two-phase commit is what you reach for when a single transaction needs to span multiple nodes. The coordinator asks everyone “can you commit?” and waits for unanimous yes. Simple, and it works well enough that it’s been the backbone of distributed databases for decades. The failure mode is specific and ugly though: if the coordinator crashes between the two phases, every participant is stuck holding locks, waiting for a decision that might never come. They can’t commit (the coordinator might have decided to abort) and they can’t abort (the coordinator might have decided to commit). They just wait. Three-phase commit exists to fix this, but in practice most systems use two-phase commit with a coordinator failover mechanism instead.
Gossip protocols sit at the opposite end of the spectrum. No coordination at all. Each node periodically picks a random peer and exchanges state. Repeat. Information spreads like a rumor - exponentially fast in theory, unpredictably in practice. The appeal is that it scales effortlessly. Each node does O(1) work per round regardless of cluster size, and no single node matters. Cassandra uses gossip for cluster membership. Consul uses it for health checking. The catch is that “eventually consistent” can mean anything from milliseconds to minutes depending on cluster size, network conditions, and what phase of the moon it is. You get convergence, but you can’t make promises about when.
And then there are CRDTs, which sidestep the whole tension. Instead of coordinating writes, you design data structures where all operations commute. Any node can write independently. When they sync - in any order, whenever - they converge. The constraint is that not everything can be a CRDT. Counters and sets work. Arbitrary application logic usually doesn’t.
Where real systems land
No real system uses just one of these. The thing that stands out when you look at production systems is how they layer approaches.
DynamoDB and Cassandra use gossip for membership and failure detection, sloppy quorums for reads and writes, and vector clocks or last-writer-wins for conflict resolution. The philosophy: optimize for availability and throughput, accept that reads might be stale, give developers tools for conflict resolution. This works because their typical workloads - session stores, user profiles, time-series data - can tolerate brief inconsistency without anyone noticing.
Zookeeper and etcd go the other direction. Consensus for everything. Strong consistency. Linearizable reads. Every write goes through the leader and gets replicated to a majority before it’s acknowledged. This makes them slow for high-throughput data, but that’s not what they’re for. They’re coordination services - leader election, distributed locks, configuration management. Being wrong is worse than being slow. A distributed lock that sometimes lies about who holds it is not a lock at all.
Spanner is Google’s attempt to have both. GPS receivers and atomic clocks in every data center, giving bounded uncertainty intervals for physical time. With that, Spanner can assign globally meaningful timestamps to transactions and get strong consistency with less coordination than traditional consensus. CockroachDB uses hybrid logical clocks to approximate this without the specialized hardware - a pragmatic tradeoff that works surprisingly well.
Drawing lines
Every system I’ve worked on that got synchronization wrong made the same mistake: applying the same consistency model everywhere. Strong consistency for everything is expensive and slow. Weak consistency for everything leads to bugs that surface at 3am as data corruption you can’t reproduce.
A shopping cart can be eventually consistent. If two replicas briefly disagree about what’s in the cart, the user will sort it out at checkout. A payment ledger can’t. An analytics pipeline can tolerate hours of propagation delay. A distributed lock has to be consistent or it’s worthless - which is to say, it’s not a lock.
The hard part isn’t picking the synchronization mechanism. It’s deciding where to draw the lines between “this must be correct” and “this can be approximate.” That decision requires understanding the domain at least as well as you understand the distributed systems theory, and in my experience it’s where most of the interesting engineering lives.