Heads up: posts on this site are drafted by Claude and fact-checked by Codex. Both can still get things wrong — read with care and verify anything load-bearing before relying on it.

Why Postgres uses MVCC

Readers shouldn't have to wait for writers, and writers shouldn't have to wait for readers — so Postgres keeps multiple versions of every row.

Data · intermediate · Apr 29, 2026

Why it exists

Two people open the same row at the same time. One is reading it for a report; the other is updating it. In the simplest possible database, the writer takes a lock, and the reader waits. Or the reader takes a lock, and the writer waits. Either way, the system serializes around contention — and on a busy table, that contention dominates everything else.

The frustration is that the reader doesn’t actually need the latest value. They need a consistent value: a snapshot of the database that doesn’t shift under their feet mid-query. The writer, similarly, doesn’t need exclusive access to the universe — they just need to know whether their change conflicts with someone else’s.

MVCC exists to break that lock-contention bottleneck. Instead of one row that everyone fights over, the database keeps multiple versions of the row and hands each transaction the version that was current when it started. Readers never block writers. Writers never block readers. The database scales with cores instead of choking on locks.

Why it matters now

Almost every modern relational database engineers reach for — Postgres, InnoDB inside MySQL, Oracle, SQL Server’s snapshot isolation mode, CockroachDB, YugabyteDB — uses some flavor of MVCC. If you’ve ever wondered why SELECT in Postgres doesn’t seem to take row locks, why VACUUM is a thing you have to think about, or why your “long-running analytics query” is suddenly bloating your table — that’s all MVCC.

It also matters because MVCC is the reason Postgres can give you repeatable-read and serializable isolation without grinding to a halt under load. The “snapshot” that those isolation levels promise isn’t conceptual — it’s literally a set of row versions the engine keeps on disk on your behalf.

The short answer

MVCC = rows-as-versions + visibility rules per transaction

When you write a row in Postgres, the old version isn’t overwritten. A new version is appended, tagged with the transaction that created it. Each transaction gets a snapshot — effectively, a list of “which transactions had committed when I started” — and at read time, Postgres walks the versions and shows you the one that was visible to your snapshot.

So readers and writers don’t fight: they’re literally looking at different copies.

How it works

Every row version in Postgres carries two hidden system columns:

xmin: the ID of the transaction that created this version.
xmax: the ID of the transaction that deleted or superseded it (zero while the version is still live).

An UPDATE is, mechanically, an INSERT of a new row version plus a stamp on the old one's xmax. Both versions sit in the table at the same time.
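To make that concrete, here is a toy model of the heap in Python. It is illustrative only: real Postgres tuple headers carry more fields than xmin and xmax, and the names `insert`, `update`, and `TupleVersion` are mine, not Postgres's.

```python
# Toy model of Postgres row versioning. Illustrative sketch only:
# real tuple headers carry more than xmin/xmax.
from dataclasses import dataclass

@dataclass
class TupleVersion:
    xmin: int   # txid that created this version
    xmax: int   # txid that superseded it (0 = still live)
    data: dict

heap = []

def insert(txid, data):
    heap.append(TupleVersion(xmin=txid, xmax=0, data=data))

def update(txid, old, new_data):
    # An UPDATE never overwrites in place: it stamps xmax on the
    # old version and appends a fresh one.
    old.xmax = txid
    insert(txid, new_data)

insert(100, {"id": 1, "name": "alice"})
update(105, heap[0], {"id": 1, "name": "alicia"})

# Both versions now coexist in the heap:
# heap[0]: xmin=100, xmax=105  (old version, superseded)
# heap[1]: xmin=105, xmax=0    (new version, live)
```

Nothing is deleted at update time; deciding which of the two coexisting versions a given transaction sees is the visibility check's job.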

When transaction T runs a SELECT, Postgres asks, for each candidate row version, a question that’s roughly:

Was xmin already committed when T started, and is xmax either zero or set by a transaction that hadn't committed when T started?

If yes, T sees that version. If no, it skips to the next one. That’s the “visibility check,” and it’s how a reader gets a stable snapshot without taking any locks on the rows.
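That question can be sketched as a predicate. This is a deliberate simplification: it models the snapshot as a plain set of committed txids, whereas real Postgres consults commit status in pg_xact, hint bits, and subtransaction state.

```python
# Hedged sketch of the visibility rule from the text: a version is
# visible to a snapshot if its creator had committed when the
# snapshot was taken and its deleter (if any) had not.

def visible(version_xmin, version_xmax, snapshot_committed):
    """snapshot_committed: set of txids committed when T started."""
    created_visible = version_xmin in snapshot_committed
    not_yet_deleted = (version_xmax == 0
                       or version_xmax not in snapshot_committed)
    return created_visible and not_yet_deleted

# Snapshot taken after txid 100 committed but before 105 did:
snap = {100}
assert visible(100, 105, snap)       # old version: still visible
assert not visible(105, 0, snap)     # new version: not visible yet
```

Note that the same pair of versions gives a different answer to a later snapshot that includes txid 105, which is exactly how two concurrent transactions can see different rows without blocking each other.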

The price you pay: dead row versions accumulate. In Postgres, the background process that reclaims them is autovacuum. If autovacuum can’t keep up — or if a long-running transaction holds open a snapshot that “could still need” old versions — the table grows even when the logical row count is flat. That’s table bloat, and it’s the most common operational pain MVCC creates.
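The reclamation rule can be sketched the same way. This is my simplification of the idea, not autovacuum's actual algorithm: a version is reclaimable once its deleter is older than the oldest snapshot any active transaction could still be holding (roughly what Postgres calls the xmin horizon).

```python
# Sketch of dead-version reclamation. Assumed simplification: a
# version can go once its xmax is behind every open snapshot's
# horizon. A long-running transaction pins the horizon, so nothing
# behind it is reclaimed -- that is how bloat accumulates.

heap = [
    {"xmin": 100, "xmax": 105, "data": "old"},   # superseded
    {"xmin": 105, "xmax": 0,   "data": "new"},   # live
]

def reclaim(heap, oldest_active_xid):
    # Keep live versions, plus dead ones some snapshot could still need.
    return [v for v in heap
            if v["xmax"] == 0 or v["xmax"] >= oldest_active_xid]

# A long-running transaction from txid 103 is still open, so the
# superseded version must be kept:
print(len(reclaim(heap, 103)))   # 2
# Once it finishes and the horizon moves past 105, it can go:
print(len(reclaim(heap, 106)))   # 1
```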

There’s also the transaction ID wraparound problem — Postgres’s transaction IDs are 32-bit, and old row versions need to be either frozen or cleaned up before the counter laps them. This is the kind of thing that shows up at 3 a.m. on tables that haven’t been vacuumed in months.
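To see why lapping the counter is dangerous, here is a rough model of wraparound comparison. The function below approximates the behavior of Postgres's TransactionIdPrecedes; the exact internals differ, but the modular-arithmetic consequence is the same.

```python
# Roughly how 32-bit txids compare under wraparound: each id treats
# half the 2^32 space as "past" and half as "future". An old,
# unfrozen xmin eventually flips from past to future when the
# counter laps it -- and its row versions vanish from view.

def precedes(a, b):
    # True if txid a is "before" b in modulo-2^32 comparison
    # (approximating Postgres's TransactionIdPrecedes).
    return ((a - b) & 0xFFFFFFFF) > 0x7FFFFFFF

assert precedes(100, 200)   # normal case: 100 is in the past
# After ~2 billion more transactions, txid 100 no longer reads as
# "past" relative to the current counter unless the row was frozen:
assert not precedes(100, 100 + 2**31 + 1)
```

Freezing replaces an old xmin with a permanent "always in the past" marker, which is why anti-wraparound vacuums are not optional.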

Show the seams

A few things the textbook version of MVCC tends to gloss over:

I don’t have a confident date for when MVCC first entered Postgres specifically — the design traces back to its Berkeley POSTGRES roots in the 1980s, but I haven’t re-sourced the exact release where the modern semantics solidified, so I won’t pin a year here.

Going deeper