Why does TCP have congestion control?
The internet didn't always have it. Once, in 1986, it nearly fell over. The fix wasn't a protocol change — it was endpoints learning to back off.
Why it exists
Picture a four-lane highway during a traffic jam. Now imagine every driver’s response to slowing down is to honk and accelerate. The jam gets worse. More honking. Soon the highway carries zero useful traffic — it’s full of cars, but nobody’s going anywhere. The early internet nearly died of exactly this problem in 1986, except with packets instead of cars. The fix wasn’t to widen the road or build new traffic lights. It was to teach every endpoint to voluntarily slow down when it sensed congestion, and speed back up when the road cleared. That self-restraint is congestion control, and it’s why your Netflix stream drops to 480p when your Wi-Fi gets weak instead of stalling everything that shares the network: the transport throttles back, and the player adapts to the bandwidth it actually gets.
In the mid-1980s the early internet briefly stopped working. The standard story — and this one is well-attested — is that the link between LBL and UC Berkeley, a few hundred yards apart, dropped from 32 kbit/s to about 40 bit/s. A factor of roughly a thousand. Not because anything broke, but because every sender was doing exactly what TCP told them to: when a packet is lost, retransmit it.
The problem is what happens when many senders all do that at once. A router’s queue fills up, packets get dropped, every sender notices and retransmits, those retransmits also get dropped, and the senders retransmit again. The network is now spending all of its capacity carrying packets that will be thrown away. This is congestion collapse — a stable equilibrium where useful throughput approaches zero while the wires stay completely full.
Van Jacobson’s 1988 paper Congestion Avoidance and Control is the canonical fix. The remarkable thing is what it didn’t change: not the IP layer, not the routers, not the protocol on the wire. Just the sender’s sending logic. Endpoints learned to slow down on their own, by reading loss as a signal that the network was full.
Why it matters now
Every TCP connection you open today — every HTTP request, every git push, every database connection — runs a congestion control algorithm in the kernel. QUIC reimplements the same ideas in userspace. Datacenter networks, satellite links, and your phone’s LTE connection all depend on senders voluntarily holding back to keep the shared substrate usable.
The choice of algorithm is also a live area: Linux defaulted to CUBIC for years, Google deployed BBR on YouTube and search infrastructure (claims of meaningful throughput wins are public; precise current production share is not something I can verify in detail), and datacenter operators run their own variants on top of DCTCP-style ECN signaling. The mechanism that saved the internet in 1988 is still the surface where most of the interesting transport-layer engineering happens.
The short answer
TCP congestion control = a window that grows when ACKs arrive + shrinks when loss happens
A sender keeps a “congestion window” — how many bytes it’s allowed to have in flight at once. Every successful acknowledgement is evidence that the network had room, so the window grows. Every dropped packet is evidence that the network is full, so the window shrinks. The clever part is the shape of grow and shrink, because the network is shared and you want every sender to converge to a fair share without coordinating.
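Stripped down, that loop fits in a few lines. A toy sketch in Python; the class name and numbers are illustrative, not anything a real kernel does:

```python
# Toy version of the core loop: the window grows on good news, shrinks on bad.
# Real stacks count bytes, keep a threshold, and distinguish kinds of loss;
# this only shows the shape of the feedback.

class ToySender:
    def __init__(self):
        self.cwnd = 10                      # segments allowed in flight

    def on_ack(self):
        self.cwnd += 1                      # the network had room: probe for more

    def on_loss(self):
        self.cwnd = max(self.cwnd // 2, 1)  # the network is full: back off hard
```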
How it works
The classic algorithm, more or less unchanged in spirit since 1988, has four moving parts:
Slow start. A new connection has no idea how fast the path can go, so it starts tiny — historically 1 segment, today usually 10 — and doubles the congestion window every RTT. “Slow” is a misnomer: doubling is exponential, the fastest sane way to probe an unknown ceiling.
Congestion avoidance. Once the window crosses a threshold (the last known good size, roughly), growth switches from doubling to adding one segment per RTT. Linear instead of exponential. The sender is now near the cliff and is feeling its way forward carefully.
Loss as signal. When a packet is dropped — detected either by a timeout or by duplicate ACKs — the sender treats that as “the network is full.” It cuts the window (classic TCP Reno halves it) and resets the threshold. This is the move that breaks congestion collapse: instead of retransmitting harder when packets disappear, the sender retransmits and slows down.
Fast retransmit and fast recovery. A timeout is the expensive signal: the sender sits idle waiting for it and then has to rebuild from a tiny window. Modern TCP detects most losses sooner, from duplicate ACKs, retransmits the missing segment right away (fast retransmit), and only halves the window instead of collapsing it (fast recovery). All four parts fit together as in the sketch below.
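Here is that sketch: a toy event handler in Python, in the spirit of classic Reno. The names, thresholds, and constants are illustrative, not any real stack’s, and the actual retransmissions are left out:

```python
# Toy Reno-flavoured sender: slow start, congestion avoidance, fast retransmit,
# fast recovery, and the timeout fallback. The window is counted in segments;
# retransmission and byte accounting are omitted entirely.

class RenoSender:
    def __init__(self):
        self.cwnd = 10.0               # initial window, roughly today's default
        self.ssthresh = float("inf")   # no known ceiling yet
        self.dup_acks = 0

    def on_new_ack(self):
        self.dup_acks = 0
        if self.cwnd < self.ssthresh:
            self.cwnd += 1              # slow start: +1 per ACK, doubles every RTT
        else:
            self.cwnd += 1 / self.cwnd  # congestion avoidance: about +1 per RTT

    def on_duplicate_ack(self):
        self.dup_acks += 1
        if self.dup_acks == 3:          # fast retransmit: loss inferred early
            self.ssthresh = max(self.cwnd / 2, 2)
            self.cwnd = self.ssthresh   # fast recovery: halve, don't start over
            # (the missing segment would be retransmitted here)

    def on_timeout(self):               # the expensive signal
        self.ssthresh = max(self.cwnd / 2, 2)
        self.cwnd = 1                   # collapse the window, slow start again
```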
The shape is sometimes called AIMD: add a little when things are going well, multiply by a fraction when they’re not. It looks arbitrary, but it has a real property: when many AIMD senders share a link, they provably converge to roughly equal shares of the bandwidth, without ever talking to each other. The link itself becomes the coordination channel — a full queue is the message.
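That claim is easy to see in a toy simulation: two AIMD senders sharing one link, starting from wildly unequal windows, drift toward the same share without exchanging a single message. A sketch with made-up numbers:

```python
# Two AIMD flows sharing a link with room for 100 segments per RTT.
# When their combined demand overflows the link, both see a drop and halve;
# otherwise both add one segment per RTT.

capacity = 100.0
a, b = 90.0, 10.0                # wildly unequal starting windows

for _ in range(200):             # 200 RTTs
    if a + b > capacity:
        a, b = a / 2, b / 2      # multiplicative decrease on loss
    else:
        a, b = a + 1, b + 1      # additive increase otherwise

print(round(a), round(b))        # nearly equal, despite the 90/10 start
```

The halving is what does the work: each loss event cuts the gap between the two flows in half, so any initial imbalance decays away while the additive phase keeps the link busy.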
The seam to look at: this whole mechanism treats packet loss as congestion. That assumption was true on the wired networks of 1988. It’s a worse assumption on a Wi-Fi link or an LTE radio, where packets are routinely lost to interference rather than queue overflow. A TCP sender on a flaky cafe connection halves its window in response to noise, not congestion, and throughput tanks. This mismatch is most of why BBR exists — it tries to read congestion off of RTT growth instead of loss. Whether BBR is actually better is contested; it depends on what’s sharing the link with you.
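For contrast, the model-based idea can be caricatured in a few lines. This is nowhere near real BBR, which windows its estimates and cycles through pacing-gain phases; the names and signatures below are illustrative only:

```python
# Caricature of the model-based idea: estimate the path, send at the estimate.
# Real BBR windows these estimates, probes on a schedule, and has many more
# safeguards; this only shows the shape of the model.

class PathModel:
    def __init__(self):
        self.btl_bw = 0.0              # highest delivery rate observed (bytes/s)
        self.rt_prop = float("inf")    # lowest round-trip time observed (s)

    def on_ack(self, delivered_bytes, interval_s, rtt_sample_s):
        self.btl_bw = max(self.btl_bw, delivered_bytes / interval_s)
        self.rt_prop = min(self.rt_prop, rtt_sample_s)

    def pacing_rate(self):
        return self.btl_bw             # send at the estimated bottleneck rate

    def inflight_cap(self):
        # keep about one bandwidth-delay product in flight
        # (the real algorithm allows roughly twice that)
        return self.btl_bw * self.rt_prop
```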
Famous related terms
- Bufferbloat — bufferbloat = oversized router buffers + loss-based congestion control. When a router’s queue is huge, TCP keeps filling it because it never sees a drop, and latency through the queue balloons. The reason your video call stutters when someone else starts an upload. (A worked example follows this list.)
- ECN — ECN = explicit congestion notification. Routers mark packets instead of dropping them. A way to tell senders “slow down” without the brutality of a packet loss. Underused in the public internet; standard inside datacenters.
- BBR — BBR ≈ congestion control that ignores loss. Model the path’s bandwidth and RTT directly, send at that rate. Different philosophy from AIMD. Widely deployed at Google; performance vs. CUBIC depends heavily on the workload, and the literature on fairness between BBR and loss-based flows is not flattering in every scenario.
- QUIC — QUIC = UDP + TCP-style reliability + TLS in userspace. Moves congestion control out of the kernel, making it much easier to ship a new algorithm.
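The promised bufferbloat example: the delay a standing queue adds is just buffer size divided by link rate. The buffer size and uplink speed below are assumptions, picked to be plausible rather than measured:

```python
# Queueing delay added by a full buffer = buffer size / link rate.
buffer_bytes = 1 * 1024 * 1024        # assume a 1 MiB buffer in a home router
uplink_bits_per_sec = 10 * 1_000_000  # assume a 10 Mbit/s uplink

delay_s = buffer_bytes * 8 / uplink_bits_per_sec
print(f"a full queue adds {delay_s * 1000:.0f} ms of latency")  # ~839 ms
```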
Going deeper
- Van Jacobson, Congestion Avoidance and Control (SIGCOMM 1988) — the founding paper. Short, readable, and the diagrams still hold up.
- RFC 5681 — TCP Congestion Control. The codified version of the mechanisms above.
- For BBR: the Google paper and the long thread of follow-up work. Worth reading the critiques alongside the original — the fairness story is more complicated than the marketing.
- Gap I’ll name: I don’t have a confident, current breakdown of which congestion control algorithm dominates production traffic on the public internet today. CUBIC has been the Linux default for a long time and Linux runs most of the world’s servers; BBR has visible deployment at Google; beyond that, the picture is murky from the outside.