Why garbage collectors pause your program
A tracing collector can't safely move or free an object while your code is mid-read — so it freezes the program to get a consistent snapshot, and generational collection is one common trick for keeping the freeze short.
Why it exists
You’re playing a game, or scrolling a feed, and every few seconds it hitches — a tiny stutter, a dropped frame, a moment where the input lag spikes and then clears. Nothing crashed. The program didn’t do more work. It just… stopped, briefly, and then carried on. If the runtime is garbage-collected — Java, Go, C#, JavaScript, Python — one possible culprit is the garbage collector doing its job.
Here’s the problem it’s solving. Your program allocates objects constantly and almost never explicitly frees them — that’s the whole point of a managed language. So something has to figure out which objects are dead (nothing points to them anymore) and reclaim their memory. The dominant approach is a tracing collector: it walks the graph of live objects, starting from the roots, follows every pointer, marks everything it reaches as alive, and treats the rest as garbage. (The other major family, reference counting, works differently — more on that at the end. This post is about tracing collectors, which is where the dramatic pauses live.)
But the program is also running. Your code is following those same pointers, reading fields, writing new pointers into objects. If the collector is reading the object graph at the same moment your code is rewriting it, the collector sees a graph that never actually existed — half-old, half-new — and can free an object you’re about to use, or miss an object you just made reachable. That’s a use-after-free or a corrupted heap, the exact bugs managed languages promised to abolish.
The simplest fix that is obviously correct: stop the program. Freeze every application thread (the runtime calls these threads the mutators), take your snapshot, do the work, resume. That freeze is the stop-the-world pause. It exists because a consistent view of the heap and a running program are fundamentally in tension — and correctness wins.
Why it matters now
A pause is easy to live with when “the program” is a batch job — total throughput is what you’re measuring, and a freeze every so often barely moves it. It’s much harder to live with when the program is a request server with a p99 latency budget, a game holding a 16-millisecond frame deadline, or a trading system where a 50-millisecond hiccup is a real loss. A GC pause barely moves your average — it wrecks your tail. One unlucky request in a thousand eats the full pause, and that’s the number your users and your SLOs actually feel.
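To see the mean-versus-tail effect concretely, here is a tiny simulation with made-up numbers (a 5 ms baseline, a 50 ms pause landing on one request in a hundred); the figures are illustrative, not from any real workload:

```python
# Synthetic latency distribution: 1000 requests at a steady 5 ms,
# except every 100th request also eats a 50 ms stop-the-world pause.
latencies = [5.0] * 1000
for i in range(0, 1000, 100):
    latencies[i] += 50.0

def percentile(samples, p):
    """Nearest-rank percentile: the value p% of samples fall at or below."""
    ordered = sorted(samples)
    return ordered[min(len(ordered) - 1, int(p / 100 * len(ordered)))]

mean = sum(latencies) / len(latencies)
p99 = percentile(latencies, 99)
print(f"mean={mean:.2f} ms  p99={p99:.0f} ms")  # mean=5.50 ms  p99=55 ms
```

The pause adds ten percent to the mean but multiplies the p99 by eleven: the "wrecks your tail" effect in miniature.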
That tension is why so much modern runtime engineering is, specifically, pause engineering. Go made low pause time an explicit design goal. The JVM ships multiple collectors — G1, and the newer low-latency ZGC and Shenandoah — that exist largely to shrink or break up the stop-the-world window. None of them eliminate it entirely. The interesting question was never “can we avoid pausing” — it’s “how short, how predictable, and how much throughput do we trade to get there.”
The short answer
stop-the-world pause = "freeze all mutator threads" + "so the collector sees a heap that isn't changing under it"
You can’t safely move or free an object while another thread might be reading it. The bluntest way to guarantee that is to make sure no thread is reading anything. The collector freezes the mutators, gets a stable snapshot, does the dangerous part, and resumes them. The clever machinery in modern tracing collectors — generations, concurrent marking, read/write barriers — is an effort to make that frozen window smaller, rarer, or more predictable, without giving up the correctness the freeze buys.
How it works
Start with the naive collector and watch where the pause comes from.
- Mark. From the roots, walk every reachable pointer and mark each object live.
- Sweep (or compact). Reclaim everything unmarked; optionally move the survivors so they sit contiguously and defeat fragmentation.

With no barriers and nothing stopped, the mark phase is wrong the instant a mutator rewrites a pointer mid-walk. The compaction phase is even more delicate: if the collector moves an object, every pointer to it must be updated, and a mutator must not dereference the old address in the gap between “object copied” and “pointers fixed.” Both phases want the world stopped.
So why isn’t every pause enormous? Because of one empirical observation about how many programs behave:
The generational hypothesis: most objects die young. In many workloads, the bulk of allocations — loop temporaries, request-scoped buffers, intermediate strings — become garbage very quickly. A smaller set (caches, long-lived state) survives a long time.
Generational GC turns that observation into a strategy. Split the heap into a young generation and an old generation. New objects are born in the young gen. Most of them die there. So collect the young gen often — and a young-gen collection is cheap, because you only trace and copy the handful of survivors, then declare the entire rest of that space free in one move. Objects that survive a few young-gen collections get promoted (tenured) into the old gen, which you collect rarely.
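A minimal sketch of that strategy, with invented names (`Obj`, `minor_gc`, `PROMOTE_AFTER`) and with reachability handed in directly instead of traced:

```python
PROMOTE_AFTER = 2          # survive this many minor GCs, then tenure

class Obj:
    def __init__(self):
        self.age = 0       # minor collections survived so far

def minor_gc(young, old, reachable):
    """Collect only the young generation: keep survivors, promote the
    old-enough ones, reclaim everything else in one move. `reachable`
    is handed in directly here; a real runtime traces it (plus the
    remembered set of old->young pointers)."""
    survivors = []
    for obj in young:
        if obj in reachable:
            obj.age += 1
            if obj.age >= PROMOTE_AFTER:
                old.append(obj)          # tenured into the old generation
            else:
                survivors.append(obj)
    return survivors                     # the rest of young is now free

old = []
keeper = Obj()
young = [keeper] + [Obj() for _ in range(1000)]  # 1000 short-lived temps

young = minor_gc(young, old, {keeper})   # temps die cheaply; keeper ages to 1
young = minor_gc(young, old, {keeper})   # keeper hits age 2: promoted
print(len(young), len(old))              # 0 1
```

Each minor collection touched only the one survivor; the thousand dead temporaries cost nothing to reclaim.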
Not every collector takes this bet. Go’s collector, notably, is non-generational and non-compacting — it relies on concurrent marking instead (more below). Generational GC is a common pause-reduction strategy, not a universal one.
The payoff: instead of one giant pause to trace the whole heap, you get frequent tiny pauses over a small region, and the expensive full-heap collection happens rarely. Same total correctness, with the pause cost spread into small predictable chunks.
There’s a catch, and it’s where the seam shows. To collect the young gen alone, you need its roots — and an old-gen object holding a pointer into the young gen is a root you’d otherwise miss. The runtime can’t afford to scan the whole old gen to find those pointers; that would erase the savings. So it tracks them as they’re created, using a write barrier: a small piece of code injected on pointer-writes in your program that records when an old-gen object starts pointing into the young gen. The runtime stores that information in side structures — a card table marks which regions of memory contain such pointers; a remembered set is the collector’s record of those cross-region references. You pay a few instructions on pointer stores so that young-gen collections stay cheap. The pause didn’t vanish — part of its cost was amortized into the mutator.
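One way the barrier-plus-remembered-set bookkeeping can be sketched (invented structure; `Heap` and `write_ref` are hypothetical names, and set membership stands in for real address-range checks):

```python
# Sketch of a generational write barrier. Every pointer store goes
# through write_ref, which records old->young pointers in a remembered
# set so a minor collection can use them as extra roots without
# scanning the whole old generation.

class Heap:
    def __init__(self):
        self.young, self.old = set(), set()
        self.remembered = set()          # old objects holding young pointers

    def write_ref(self, holder, field, target):
        """The barrier: a few extra instructions on every pointer store."""
        setattr(holder, field, target)
        if holder in self.old and target in self.young:
            self.remembered.add(holder)  # recorded now, read at minor GC

class Node:
    def __init__(self):
        self.child = None

heap = Heap()
cache = Node(); heap.old.add(cache)      # long-lived, tenured object
fresh = Node(); heap.young.add(fresh)    # freshly allocated
temp = Node();  heap.young.add(temp)

heap.write_ref(cache, "child", fresh)    # old -> young: recorded
heap.write_ref(fresh, "child", temp)     # young -> young: nothing to record
print(len(heap.remembered))              # 1: only the cross-gen store
```

Only the old-to-young store pays the recording cost; that is the amortized tax the paragraph above describes.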
Shrinking the pause further
Generational GC makes pauses small but doesn’t remove them. The next moves:
- Concurrent marking. Do most of the graph walk while the mutators run, then stop the world only briefly at the boundaries of the mark phase to set up and to reconcile what changed mid-walk. G1, ZGC, Shenandoah, and Go’s collector all mark concurrently. It needs a write barrier so the concurrent marker doesn’t miss a pointer the mutator stores into an object that’s already been scanned — the same family of tricks as the generational write barrier above, aimed at a different problem.
- Concurrent compaction. Moving objects while the program runs is the hard one, because a mutator could dereference a pointer to an object that’s halfway moved. ZGC and Shenandoah relocate objects concurrently and use barriers — ZGC a load barrier with colored pointers, Shenandoah barriers around object access — so that an access to a relocating object is steered to the right copy. The exact mechanisms differ per collector; the shared idea is “intercept the access, fix up the pointer.”
- Parallelism. Use many collector threads during the stop-the-world phase so the freeze, while it lasts, is as short as the hardware allows.
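The concurrent-marking hazard in the first bullet can be sketched with a toy tri-color model (illustrative only; the Dijkstra-style insertion barrier shown is one of several real barrier designs):

```python
# Toy tri-color concurrent marking: black = scanned, grey = queued,
# white = unseen. Without a write barrier, a pointer stored into an
# already-black object after the marker scanned it would be missed.

grey, black = [], set()    # worklist of objects; ids of scanned objects

def shade(obj):
    """Queue obj for scanning unless the marker already handled it."""
    if id(obj) not in black and all(o is not obj for o in grey):
        grey.append(obj)

def write_barrier(holder, field, target):
    """Dijkstra-style insertion barrier: shade whatever the mutator
    stores, so a pointer written into a black object is not lost."""
    holder[field] = target
    if target is not None:
        shade(target)

def drain():
    """The marker: scan grey objects until the worklist is empty."""
    while grey:
        obj = grey.pop()
        black.add(id(obj))               # obj is now scanned (black)
        for child in obj.values():
            if child is not None:
                shade(child)

root = {"ref": None}
shade(root)
drain()                                  # marker scans root: root is black

hidden = {"ref": None}                   # white: only the mutator holds it
write_barrier(root, "ref", hidden)       # store into a black object mid-mark
drain()                                  # marker finishes the shaded work
print(id(hidden) in black)               # True: the barrier kept it alive
```

Replace `write_barrier` with a plain store and `hidden` stays white: the marker never revisits `root`, and a live object gets collected.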
Every one of these trades something. Concurrent work competes with your program for CPU and memory bandwidth, so you lose throughput to gain predictability. Write barriers add a small cost to pointer writes; the load/read barriers in concurrent-compacting collectors add cost to pointer reads. There is no free collector — only different points on the pause-versus-throughput curve, and the right pick depends on whether you’re running a batch job or a latency-bound service.
Famous related terms
- Mark-and-sweep — mark-and-sweep = trace live objects from roots + reclaim everything unmarked — the baseline collector whose mark phase is where the classic pause lives.
- Generational hypothesis — generational hypothesis ≈ "most objects die young" — the empirical bet that makes frequent cheap young-gen collections worth it.
- Write barrier — write barrier = code on pointer-writes + records cross-generation or concurrent-mark pointers — the tax that keeps young-gen and concurrent collections cheap.
- Reference counting — reference counting = per-object live-pointer count + free at zero — the other major GC family; no global tracing pause, but plain refcounting can’t reclaim reference cycles, so runtimes like CPython bolt on a separate cycle collector — and the counting itself isn’t free.
- MVCC — MVCC = keep multiple versions + readers never block writers — databases dodge the analogous “don’t mutate what someone’s reading” problem with versioning instead of a freeze; see Postgres MVCC.
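The reference-counting entry, including its cycle problem, can be sketched like this (a toy model with invented names, not CPython's actual mechanism):

```python
# Toy reference counting: each handle adjusts a count, and an object is
# freed the instant its count hits zero -- no tracing, no global pause.
# The cycle at the end shows the known gap: two objects pointing at
# each other never reach zero.

freed = []

class Ref:
    def __init__(self, name):
        self.name, self.count, self.other = name, 0, None

def inc(obj):
    if obj is not None:
        obj.count += 1

def dec(obj):
    if obj is not None:
        obj.count -= 1
        if obj.count == 0:
            freed.append(obj.name)       # free at zero, immediately
            other, obj.other = obj.other, None
            dec(other)                   # freeing obj drops its outgoing ref

a, b = Ref("a"), Ref("b")
inc(a); inc(b)                           # one external handle each
dec(a)                                   # drop the handle: freed at once
print(freed)                             # ['a']

x, y = Ref("x"), Ref("y")
inc(x); inc(y)                           # external handles
x.other, y.other = y, x                  # cycle: x <-> y
inc(y); inc(x)                           # counts for the cycle's own refs
dec(x); dec(y)                           # drop both external handles
print(freed)                             # still ['a']: the cycle leaks
```

After the last two `dec` calls nothing outside the cycle can reach `x` or `y`, yet each still holds the other's count at one, which is why CPython needs its separate cycle collector.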
Going deeper
- The Garbage Collection Handbook (Jones, Hellyer, Moss) — the canonical reference, for the precise taxonomy of marking, copying, generational, and concurrent collectors.
- The Go GC guide and the Go 1.5 GC blog post — first-party writing on a concurrent, non-generational, non-compacting collector and the latency-versus-throughput knobs it exposes.
- The ZGC project page and JEP 333 — for how a JVM collector uses colored pointers and load barriers to relocate objects concurrently, and what that buys in pause time.