Why deadlocks need four conditions
A deadlock feels like bad luck, but it can only happen when four specific conditions all hold at once — and breaking any one makes it impossible.
Why it exists
Two people reach a narrow doorway from opposite sides at the same moment. Each steps half-in, then stops — neither will back out, because the other might take the gap. They stand there. Nothing is broken; both are behaving “correctly.” They’re just stuck, forever, waiting on each other.
That’s a deadlock, and it’s the bug that makes concurrent programs feel cursed. Your service runs fine for weeks, then one night two threads freeze, requests pile up, and the stack traces show everyone politely waiting — no crash, no error, no obvious culprit. It looks like cosmic bad luck: a one-in-a-million timing fluke.
It isn’t luck. A deadlock is not a random event that sometimes happens when threads share resources. It is a precise structural situation, and it can only occur when four specific conditions are all true simultaneously. These are usually called the Coffman conditions. Miss any one of the four, and a deadlock is not unlikely — it is impossible. That’s the satisfying part: deadlock has a closed-form cause, so it has a closed-form cure.
Why it matters now
Every system that lets more than one thing run at once and share something — threads sharing a mutex, transactions sharing database rows, microservices sharing a connection pool, distributed nodes sharing a lock service — can deadlock. Databases take this seriously enough to ship a deadlock detector: in PostgreSQL, once a transaction has waited on a lock longer than deadlock_timeout, the system checks for a cycle of transactions waiting on each other’s locks and aborts one of them to break it. The reason that detector exists, and the reason it works, is the four-condition structure below. Knowing the four conditions turns “our service hangs sometimes, restart it” into “condition X holds here — break it on purpose.”
The short answer
deadlock = mutual exclusion + hold-and-wait + no preemption + circular wait
A deadlock happens exactly when: resources can’t be shared (mutual exclusion), a holder can request more while keeping what it has (hold-and-wait), nobody can forcibly take a resource away (no preemption), and there’s a closed loop of “A waits on B waits on … waits on A” (circular wait). All four are necessary. In the common case of single-instance resources — one lock, one holder — they’re also sufficient: a wait-for cycle means everyone in it is stuck. (With pools of interchangeable resources the cycle test gets weaker, but the four conditions still all have to hold.) Either way, you don’t fix a deadlock by getting lucky with timing — you fix it by designing one of the four conditions out.
How it works
Take the doorway again, but in code. Thread A grabs lock R1 and then needs R2. Thread B grabs lock R2 and then needs R1. Each holds one, each waits for the other. Draw the “who is waiting on whom” graph and you get a cycle:
```mermaid
flowchart LR
    A[Thread A] -->|holds| R1[Lock R1]
    A -->|waits for| R2[Lock R2]
    B[Thread B] -->|holds| R2
    B -->|waits for| R1
```
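The same cycle can be reproduced as a short runnable sketch. This is Python with illustrative names; a barrier forces the unlucky interleaving every time, and a timeout is added purely so the demo terminates instead of hanging:

```python
import threading

r1, r2 = threading.Lock(), threading.Lock()
both_holding = threading.Barrier(2)   # both threads hold their first lock
both_done = threading.Barrier(2)      # both have finished trying for the second
results = {}

def worker(name, first, second):
    with first:                        # grab one lock and hold it...
        both_holding.wait()            # ...until the other thread holds its lock too
        # Now try the other lock. Without the timeout this blocks forever:
        # the A-waits-for-B, B-waits-for-A cycle from the diagram.
        got_it = second.acquire(timeout=0.5)
        results[name] = got_it
        if got_it:
            second.release()
        both_done.wait()               # keep holding until both attempts finish

a = threading.Thread(target=worker, args=("A", r1, r2))
b = threading.Thread(target=worker, args=("B", r2, r1))  # opposite order
a.start(); b.start(); a.join(); b.join()
print(results)  # both False: each thread timed out waiting for the other's lock
```

Strip the timeout and the second barrier and you have the real bug: two threads frozen mid-doorway, exactly as in the graph.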
Follow the arrows: A waits for R2, which B holds; B waits for R1, which A holds. The loop never opens. Now walk the four conditions against this picture and watch each one become a place to attack:
- Mutual exclusion — a lock can be held by only one thread at a time. If R1 and R2 were freely shareable (say, read-only data), there’d be nothing to wait for. Break it by not locking: use immutable data, or per-thread copies.
- Hold-and-wait — a thread keeps R1 while blocking on R2. Break it by requiring threads to grab all the locks they’ll need at once, or to release everything before re-requesting. No partial holds, no half-in-the-doorway.
- No preemption — once A has R1, nothing can yank it back. Break it by allowing preemption: a thread that can’t get its next lock must drop the ones it holds and retry. (This is roughly what a database deadlock detector forces — it preempts by aborting one transaction.)
- Circular wait — the loop itself. Break it by imposing a global order on resources: every thread must acquire locks in increasing order, say always R1 before R2. Then a cycle can’t form, because a cycle would require some thread to hold a higher-numbered lock while waiting on a lower-numbered one — which the rule forbids.
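The preemption-style cure can be sketched in a few lines: a thread that can't get its next lock drops what it holds, backs off, and retries from scratch. A minimal sketch with illustrative names (the random backoff breaks the symmetry so the retry loop converges):

```python
import random
import threading
import time

r1, r2 = threading.Lock(), threading.Lock()
done = []

def acquire_both(first, second):
    """Take both locks without ever holding one while blocked on the other."""
    while True:
        first.acquire()
        if second.acquire(timeout=0.05):     # got both: done
            return
        first.release()                      # preempt ourselves: drop everything...
        time.sleep(random.uniform(0, 0.01))  # ...back off, then retry from scratch

def worker(first, second):
    acquire_both(first, second)
    done.append(True)
    second.release()
    first.release()

# Opposite acquisition orders -- the deadlock-prone pattern -- yet both finish,
# because no thread ever holds a lock while blocked indefinitely on another.
a = threading.Thread(target=worker, args=(r1, r2))
b = threading.Thread(target=worker, args=(r2, r1))
a.start(); b.start(); a.join(); b.join()
```

Note the trade-off: this buys deadlock-freedom at the cost of possible retries, which is why the next paragraph's lock-ordering convention is usually preferred when you can impose one.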
The lock-ordering trick is a common move, because it’s cheap and local: you don’t need a detector, a timeout, or a retry loop, just a convention. The deeper point is that all four cures are the same move — make one of the four conditions false. There’s no fifth option, and no amount of careful timing counts as a cure, because timing was never the cause.
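The ordering convention itself fits in a few lines. A minimal sketch, assuming each lock can be given a stable global rank (here just a list index): callers may ask for locks in any order, but acquisition always happens in ascending rank, so no cycle can form:

```python
import threading

# Give every lock a global rank; the rule: always acquire in ascending rank.
locks = [threading.Lock() for _ in range(2)]
R1, R2 = 0, 1
counter = 0

def acquire_in_order(*ranks):
    """Acquire the requested locks in ascending rank, whatever order was asked."""
    ordered = sorted(ranks)
    for r in ordered:
        locks[r].acquire()
    return ordered

def release(ordered):
    for r in reversed(ordered):
        locks[r].release()

def worker(ranks):
    global counter
    for _ in range(1000):
        held = acquire_in_order(*ranks)  # even callers asking in opposite order...
        counter += 1                     # ...actually lock R1 before R2 every time
        release(held)

# One thread requests (R1, R2), the other (R2, R1) -- the deadlock-prone
# pattern -- but sorting by rank makes the circular-wait condition false.
a = threading.Thread(target=worker, args=((R1, R2),))
b = threading.Thread(target=worker, args=((R2, R1),))
a.start(); b.start(); a.join(); b.join()
```

In real codebases the "rank" is usually just a documented convention (lock the account with the lower ID first, lock the parent before the child) rather than a helper function.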
One honest seam: “necessary and sufficient” is about the structure, not about whether the bad interleaving actually occurs. A program can have all four conditions latent in its design and run for years because the unlucky schedule never lands. That’s why deadlocks feel like luck — the structure is always there, and luck only decides when it bites. Removing a condition removes the structure, so luck stops mattering.
Famous related terms
- Livelock — livelock ≈ deadlock + motion — threads aren’t blocked, they’re actively responding to each other (both step aside, both step back, repeat), so they make no progress despite never sitting still.
- Race condition — race = shared state + unsynchronized access + order-dependent outcome — the bug locks are meant to prevent; ironically, adding locks to fix races is what introduces the deadlock risk.
- Lock ordering / lock hierarchy — lock ordering = global rank on locks + "acquire in rank order" rule — the standard way to kill the circular-wait condition by construction.
- Deadlock detection — detection = wait-for graph + cycle search + victim abort — what databases like PostgreSQL do instead of preventing deadlocks: let them happen, find the cycle, break it.
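That last idea is small enough to sketch. For single-instance resources, "X waits on the holder of the lock X wants" gives each waiter at most one outgoing edge, so detection is just following edges until a node repeats. A toy sketch (names illustrative; a real database runs this over its lock tables):

```python
def find_cycle(wait_for):
    """wait_for maps each blocked waiter to the single holder it waits on.
    Returns the members of a wait-for cycle, or None if there is none."""
    for start in wait_for:
        seen = []
        node = start
        while node in wait_for:       # follow the chain of waits
            if node in seen:
                return seen[seen.index(node):]   # chain closed: deadlock
            seen.append(node)
            node = wait_for[node]
    return None                       # every chain ends at a non-waiter

# Transaction A waits on B, B waits on C, C waits on A: a three-way deadlock.
cycle = find_cycle({"A": "B", "B": "C", "C": "A"})
print(cycle)        # ['A', 'B', 'C']
victim = cycle[0]   # break the loop by aborting one participant
```

This is the shape of what runs after PostgreSQL's deadlock_timeout expires: build the graph from who-waits-on-whom, search for a cycle, abort a victim so everyone else can proceed.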
Going deeper
- Coffman, Elphick, and Shoshani, “System Deadlocks” (ACM Computing Surveys, 1971) — the survey paper that lays out the four conditions; go here for the precise original statement rather than a paraphrase. (The four are widely called the “Coffman conditions” after this paper; I haven’t verified who first attached that name.)
- Operating Systems: Three Easy Pieces, the “Deadlock” chapter (free online) — a clear, example-driven walk through each condition and the practical ways to break it, including lock ordering.