Heads up: posts on this site are drafted by Claude and fact-checked by Codex. Both can still get things wrong — read with care and verify anything load-bearing before relying on it.

Why do CDNs exist when we already have fast servers?

Your origin server can be the fastest box on Earth and your users in São Paulo will still hate it. CDNs exist because the speed of light, not your CPU, is the bottleneck.

Networking intro · Apr 29, 2026

Why it exists

Picture the simplest possible web setup: one server in a Virginia data center, serving the world. Page loads great if you’re in New York. It’s fine in London. It’s noticeably slow in Sydney. It’s painful in Lagos.

You can’t fix this by buying a faster server. The Sydney user’s laptop and your Virginia server are on opposite sides of the planet, and a packet has to physically traverse fiber across an ocean. The physics floor — light through glass over the great-circle distance — is on the order of 150 ms round trip; real-world routes through real cables typically come in higher than that, before any of your code runs. Modern protocols make multiple round trips to set up a connection (TCP plus TLS), and a typical web page kicks off dozens more for its assets. Latency stacks.
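
You can put a number on that floor with a few lines of arithmetic. The sketch below is a back-of-envelope calculation, not a measurement: the city coordinates are rough, the route is an idealized great circle, and "two-thirds of c in fiber" is the usual approximation for light in glass.

```python
# Back-of-envelope physics floor: great-circle distance from northern Virginia to
# Sydney, traversed at ~2/3 the speed of light (typical for fiber), doubled for a
# round trip. Coordinates are rough; real cable routes are longer, never shorter.
import math

def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine distance between two (latitude, longitude) points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

virginia = (38.9, -77.4)   # roughly Ashburn, VA
sydney = (-33.9, 151.2)    # roughly Sydney

distance_km = great_circle_km(*virginia, *sydney)
fiber_km_per_ms = 200.0    # ~200,000 km/s, about two-thirds of c
rtt_ms = 2 * distance_km / fiber_km_per_ms
print(f"~{distance_km:,.0f} km each way, physics-floor RTT ~{rtt_ms:.0f} ms")
```

That lands a little over 150 ms, and it is a floor: before TCP, before TLS, before your application does anything at all.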

A CDN solves a problem your origin physically cannot: it puts a copy of your stuff near the user, so the round trip is short. The CPU was never the bottleneck. Geography was.

Why it matters now

A large fraction of the traffic you experience as “the web” is fronted by a CDN, even when it doesn’t look like one — static assets, video, software updates, API responses, edge-rendered pages, package registry tarballs, container image layers, model weights. The exact share depends on what you measure (HTML requests, third-party requests, total bytes); the HTTP Archive Web Almanac publishes yearly numbers and they vary considerably by category, but the direction is consistent: heavy CDN involvement, especially for third-party and asset traffic.

For engineers in the AI era specifically:

The short answer

CDN = global fleet of caches + smart routing to the nearest one

A CDN is many servers, in many cities, each holding a copy of your content, with a system out front that steers each user to a nearby copy. The user talks to a machine 20 ms away instead of 200 ms away, and your origin only sees a trickle of cache misses.
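
The arithmetic behind "20 ms instead of 200 ms" compounds fast. The toy below makes that concrete; the round-trip counts are invented for illustration, and real browsers overlap much of this work with parallel connections and multiplexing, so take the scaling rather than the exact totals.

```python
# Toy comparison: the same pile of round trips against a nearby edge vs. a faraway
# origin. The counts are illustrative, not measurements.
SETUP_RTTS = 3    # connection setup (TCP + TLS), order of magnitude
ASSET_RTTS = 25   # follow-on requests for scripts, styles, images, fonts, ...

for label, rtt_ms in (("nearby POP, 20 ms", 20), ("distant origin, 200 ms", 200)):
    waiting_ms = (SETUP_RTTS + ASSET_RTTS) * rtt_ms
    print(f"{label}: roughly {waiting_ms:,} ms of pure network waiting")
```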

How it works

Three pieces, working together.

1. Points of presence (POPs). A CDN operates servers in dozens to hundreds of locations worldwide — major exchange points, ISP facilities, metro data centers. Each POP runs cache nodes. When a user requests https://example.com/logo.png, the goal is to serve logo.png from the closest POP that has it.
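
You can often see which POP answered you from the response headers. A minimal sketch, assuming the requests library and a placeholder URL; the header names are provider-specific (Fastly's X-Served-By names the cache node, Cloudflare's CF-Ray ends in an airport code, CloudFront exposes X-Amz-Cf-Pop), so only whichever ones are present get printed.

```python
# Minimal sketch: print response headers that commonly reveal which POP served the
# request. Header names vary by provider; print only the ones that are present.
import requests

URL = "https://www.example.com/logo.png"  # placeholder; any CDN-fronted URL works

resp = requests.get(URL, timeout=10)
for name in ("x-served-by", "cf-ray", "x-amz-cf-pop", "via", "server"):
    if name in resp.headers:
        print(f"{name}: {resp.headers[name]}")
```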

2. Steering the user to the nearest POP. Two mechanisms dominate, often combined:

Anycast. Every POP announces the same IP addresses, and BGP routing delivers each user’s packets to the topologically nearest POP that announces them.

DNS-based steering. The CDN’s authoritative nameserver answers each lookup with the address of a nearby, healthy POP, judged from the resolver’s location (or the client’s subnet, when it’s passed along).

Many CDNs combine these — for example, anycast at the network edge plus DNS-based steering, and internal routing to push the request to a POP that’s warm and healthy. The exact mix varies by provider; “all CDNs use anycast” is not true.
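
One way to poke at the steering from wherever you happen to be sitting: resolve a CDN-fronted hostname and time a TCP connect to the address you were handed, which is roughly one round trip to the POP you were steered to. The hostname below is a placeholder.

```python
# Minimal sketch: resolve a hostname, then time a TCP handshake to the returned
# address; that handshake is roughly one round trip to the POP you were steered to.
import socket
import time

HOST = "www.example.com"  # placeholder; use any CDN-fronted hostname
PORT = 443

# With DNS-based steering, this answer can itself differ by where you ask from.
addr = socket.getaddrinfo(HOST, PORT, proto=socket.IPPROTO_TCP)[0][4][0]

start = time.perf_counter()
with socket.create_connection((addr, PORT), timeout=5):
    connect_ms = (time.perf_counter() - start) * 1000

print(f"{HOST} -> {addr}, TCP connect in {connect_ms:.1f} ms")
```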

3. Caching, with rules. When the chosen POP gets the request, it checks its cache:

Hit: the object is there and still fresh, so it’s served straight from the POP and the origin never hears about it.

Miss: the POP fetches the object from a parent cache or the origin, stores it if the response allows, and serves it; later requests at that POP are hits until the copy expires or is purged.

What’s cacheable, and for how long, is controlled by HTTP cache headers plus CDN-specific config. Three different concerns, often muddled together:

Whether a response may be stored at all, and by whom (Cache-Control: public, private, and no-store decide if a shared cache like a CDN can keep it).

How long a stored copy counts as fresh (max-age for browsers, s-maxage or the CDN’s own TTL config for the edge).

How a copy gets invalidated before it would naturally expire (revalidation with ETag or Last-Modified, or an explicit purge through the CDN).
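
As a concrete illustration of the freshness part, here’s a stripped-down version of the decision a shared cache makes from those headers. It’s a sketch: it ignores no-store, private, must-revalidate, Vary, and request directives, all of which a real cache has to honor.

```python
# Stripped-down freshness check for a shared cache: parse Cache-Control, prefer
# s-maxage over max-age (the shared-cache directive wins), compare against Age.
def is_fresh(cache_control: str, age_seconds: int) -> bool:
    directives = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        directives[name.lower()] = value
    ttl = directives.get("s-maxage") or directives.get("max-age")
    if ttl is None:
        return False  # no explicit lifetime; real caches may fall back to heuristics
    return age_seconds < int(ttl)

print(is_fresh("public, max-age=300, s-maxage=60", age_seconds=90))  # False: stale at the edge
print(is_fresh("public, max-age=300", age_seconds=90))               # True: still fresh
```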

Static files are easy. Dynamic responses get harder, which is where features like stale-while-revalidate, surrogate keys for targeted purges, and edge-computed personalization come in.
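
To make those features concrete: stale-while-revalidate is standard Cache-Control (RFC 5861), while Surrogate-Key is a Fastly-style header for tagging objects so a later purge can hit a whole group at once; other CDNs spell the same idea as cache tags. The tag values below are made up.

```python
# Sketch of the headers an origin might attach to a cacheable dynamic response.
# The Surrogate-Key values are hypothetical tags: purging "product-123" would
# invalidate every response carrying that tag without touching the rest of the cache.
headers = {
    # fresh at shared caches for 60 s; for 5 min after that, serve stale while refetching
    "Cache-Control": "public, s-maxage=60, stale-while-revalidate=300",
    # Fastly-style grouping tags for targeted purges (other CDNs call these cache tags)
    "Surrogate-Key": "product-123 catalog",
    # validator so a background refetch can come back as 304 Not Modified
    "ETag": '"v42"',
}
```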

A non-trivial part of running a CDN is the cache hierarchy itself: edge POPs in front of regional shields in front of origin, so a viral asset doesn’t stampede the origin even on cold cache.
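
Here’s a toy, single-process sketch of that shielding idea, assuming nothing about any particular CDN’s internals: when many requests miss on the same key at once, only the first one reaches the origin and the rest wait for its result.

```python
# Toy shield cache with request coalescing: one origin fetch per cold key, no matter
# how many concurrent requests pile up. Real CDNs do this across machines and layers
# and add TTLs and error handling; this is just the core idea.
import threading
import time

class Shield:
    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin
        self.cache = {}       # key -> cached value (no expiry in this toy)
        self.inflight = {}    # key -> Event signalling "a fetch is already underway"
        self.lock = threading.Lock()

    def get(self, key):
        with self.lock:
            if key in self.cache:          # warm: answer without touching the origin
                return self.cache[key]
            if key in self.inflight:       # cold, but someone else is already fetching
                event, leader = self.inflight[key], False
            else:                          # cold and we're first: we do the fetch
                event = self.inflight[key] = threading.Event()
                leader = True
        if leader:
            value = self.fetch_from_origin(key)   # the one origin hit for this key
            with self.lock:
                self.cache[key] = value
                del self.inflight[key]
            event.set()                    # wake everyone who piled up behind us
            return value
        event.wait()                       # follower: wait for the leader's result
        with self.lock:
            return self.cache[key]

if __name__ == "__main__":
    origin_hits = []

    def slow_origin(key):
        origin_hits.append(key)
        time.sleep(0.1)                    # pretend the origin is an ocean away
        return f"contents of {key}"

    shield = Shield(slow_origin)
    threads = [threading.Thread(target=shield.get, args=("logo.png",)) for _ in range(50)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(f"50 concurrent requests, {len(origin_hits)} origin fetch(es)")
```

The edge-in-front-of-shield-in-front-of-origin hierarchy is the same trick applied at another tier.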

Show the seams

A few things the simple story glosses over:

Going deeper