Heads up: posts on this site are drafted by Claude and fact-checked by Codex. Both can still get things wrong — read with care and verify anything load-bearing before relying on it.

Why 'eventually consistent' became acceptable

For decades, anything weaker than strict consistency was a bug. Then the internet got big enough that strict consistency stopped being affordable — and a generation of engineers learned to live with the gap.

Data · Intermediate · Apr 29, 2026

Why it exists

If you opened a database textbook in 1995 and asked “is it okay for two reads of the same key to return different values?”, the textbook would say no, that’s a bug, please fix it. Databases were single boxes, transactions were ACID, and the entire point of the system was that callers never had to think about the inside.

Then the web happened. Specifically: shopping carts, social feeds, and inboxes that had to stay up across multiple data centers, often on different continents, while serving millions of concurrent users. The old answer — “one authoritative box, everyone talks to it, transactions serialize” — stopped working at that scale, for two stubborn reasons:

  1. Latency. A coordinated write across continents is bounded below by the speed of light. A round trip from Virginia to Frankfurt is ~80–90 ms on a good day. If your “add to cart” needs a quorum across both, every click inherits that floor.
  2. Partitions. Wide-area links fail. Switches reboot, fibers get cut by anchors, BGP misroutes. If your design says “we cannot serve writes during a partition,” then a transatlantic hiccup takes the site down.

Eventual consistency is the deliberate trade: replicas are allowed to disagree for a while, but they’re guaranteed to converge to the same state once writes stop. You give up “every read sees the latest write” in exchange for staying available, fast, and tolerant of the network being the network.

Why it matters now

You almost certainly depend on something eventually consistent today, even if your primary database is Postgres:

  • DNS — record changes propagate to resolvers over minutes to hours; two clients can resolve the same name differently in the meantime.
  • CDN edge caches — after you update an asset, some edges keep serving the stale copy until TTLs expire or a purge propagates.
  • Async read replicas — a Postgres streaming replica lags the primary, so a read routed to it can miss a write you just made.
  • Dynamo-style stores and most distributed caches, which accept writes at any node and reconcile later.

Knowing when “eventually consistent” is fine and when it’s a bug is now a core skill for engineers shipping anything distributed — which is almost everything.

The short answer

eventual consistency = replicas may disagree now + guaranteed to converge later

You’re trading “all readers see the same value at the same instant” for “all readers will agree once writes settle.” Inside that gap lives a lot of practical engineering: how long is the window, what does the application show during it, and what happens when two writes race?
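That gap is easy to see in a toy model. The sketch below is purely illustrative (the replica class and key names are invented): a write is acked by one replica, a read against the other returns a stale value, and only an explicit replication step closes the window.

```python
class Replica:
    """A toy replica: just a key-value map. Real stores add logs, versions, etc."""
    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key)

def replicate(src, dst):
    """Asynchronous propagation, modeled here as an explicit copy step."""
    dst.data.update(src.data)

us_east = Replica()
eu_west = Replica()

# The write lands on one replica and the client is acked immediately.
us_east.data["cart:42"] = ["book"]

# Until replication runs, the other replica serves a stale read.
stale = eu_west.read("cart:42")   # None -- the write hasn't arrived yet
replicate(us_east, eu_west)
fresh = eu_west.read("cart:42")   # ["book"] -- the replicas have converged
```

Everything interesting about eventual consistency lives between those two reads: how long the window is, and what the application shows inside it.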

How it works

The mechanism is less mysterious than the marketing suggests. A typical eventually-consistent store:

  1. Accepts a write at any replica (or a primary in a region) and acks the client immediately, before the write has reached every replica.
  2. Propagates the write asynchronously — gossip protocols, replication logs, or an anti-entropy background job that compares replicas and ships missing updates.
  3. Resolves conflicts when two replicas independently accept writes to the same key. Common strategies:
    • LWW (last-writer-wins) — simplest, but the losing concurrent write is silently discarded.
    • Vector clocks surface the conflict and let the application merge.
    • CRDTs make merges automatic and commutative for specific shapes (counters, sets, sequences).
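CRDTs are the most concrete of the three, so a sketch helps. A grow-only counter (G-Counter) keeps one slot per replica and merges by taking the element-wise max; because max is commutative, associative, and idempotent, replicas can merge in any order, any number of times, and still agree. This is a minimal illustration, not any particular library’s API:

```python
class GCounter:
    """Grow-only counter CRDT: one slot per replica, merge = element-wise max."""
    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}

    def increment(self, n=1):
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Max per slot is commutative, associative, and idempotent, so replicas
        # converge regardless of merge order or repeated delivery.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)

a, b = GCounter("a"), GCounter("b")
a.increment(3)   # increments seen only by replica a
b.increment(2)   # increments seen only by replica b

a.merge(b)
b.merge(a)
assert a.value() == b.value() == 5   # both replicas agree without coordination
```

Decrements need a slightly richer shape (a PN-Counter, two G-Counters back to back), which is why CRDTs work per data shape rather than for arbitrary objects.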

The convergence guarantee is real but conditional: it assumes writes eventually stop (or at least the rate of new writes is below the rate at which replicas can sync). In practice the window is usually milliseconds to seconds; under partition it can be minutes or hours.
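A toy anti-entropy loop makes that conditional guarantee visible: once writes stop, repeated random pairwise syncs drive every replica to the same state. The sketch below is illustrative only — it uses last-writer-wins on a version number and a seeded RNG so the run is reproducible:

```python
import random

random.seed(0)  # deterministic for the example

def anti_entropy_round(replicas):
    """One background pass: each replica pairs with a random peer and both
    keep the newer (version, value) entry for every key either side holds."""
    for a in replicas:
        b = random.choice(replicas)
        for key in set(a) | set(b):
            va = a.get(key, (0, None))
            vb = b.get(key, (0, None))
            a[key] = b[key] = max(va, vb)  # tuples compare version-first

# Three replicas that accepted different writes while out of touch.
replicas = [
    {"user:1": (1, "alice")},
    {"user:1": (2, "alice-renamed")},   # later version, seen only here
    {},                                  # never saw the key at all
]

rounds = 0
while len({tuple(sorted(r.items())) for r in replicas}) > 1:
    anti_entropy_round(replicas)
    rounds += 1

# Writes stopped, so every replica now holds the latest version.
assert all(r["user:1"] == (2, "alice-renamed") for r in replicas)
```

The caveat in the prose shows up directly: the loop only terminates because no new writes arrive while it runs.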

The CAP framing — useful, often misquoted

Most engineers have heard of CAP. The accurate statement, from Eric Brewer’s original conjecture (later proved by Gilbert and Lynch in 2002): during a network partition, a distributed system can either remain available or remain strongly consistent, not both. When the network is healthy, you can have both, which is the case ~99% of the time for most systems.

The pop-culture version — “pick two of CAP” — is misleading. Partitions aren’t optional, so the real choice is what to do when one happens. Systems that pick A under partition (Dynamo, Cassandra, most caches) are the ones we call eventually consistent. Systems that pick C (Spanner, etcd, ZooKeeper) will refuse writes — or refuse reads — rather than diverge.

The PACELC refinement

CAP only describes partition behavior. Daniel Abadi’s PACELC adds the missing case: even when there’s no partition, you still trade latency against consistency. A strongly consistent write that needs a quorum across regions is fundamentally slower than a local-only write. So even healthy systems make this choice every time you tune replication factor or read consistency level.
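That knob-turning has a crisp arithmetic core in Dynamo-style quorum systems: with N replicas, a write acknowledged by W of them, and a read consulting R of them, choosing R + W > N forces every read quorum to overlap the latest write quorum, so reads see the newest value. A sketch of the rule:

```python
def quorum_is_strong(n, r, w):
    """With n replicas, writes acked by w and reads consulting r,
    r + w > n guarantees read and write quorums overlap."""
    return r + w > n

# Typical settings for n = 3 replicas, Cassandra/Dynamo style:
assert quorum_is_strong(3, r=2, w=2)        # QUORUM/QUORUM: consistent, slower
assert not quorum_is_strong(3, r=1, w=1)    # ONE/ONE: fast, reads may be stale
assert quorum_is_strong(3, r=3, w=1)        # read-all/write-one: slow reads
```

Every row is a PACELC decision: the non-overlapping configurations buy latency by accepting staleness even when the network is perfectly healthy.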

Show the seams

If a curious engineer takes one thing away: eventual consistency wasn’t a surrender. It was an honest acknowledgment that the network is a real physical thing with real physical limits, and that for many workloads, “shows the right answer eventually” is what users actually need — they just don’t want to wait.

Going deeper