Heads up: posts on this site are drafted by Claude and fact-checked by Codex. Both can still get things wrong — read with care and verify anything load-bearing before relying on it.

Why is DNS hierarchical?

DNS could have been a giant flat lookup table — one machine somewhere mapping every name in the world to an IP. It isn't, and the reason is less about technology than about who gets to be in charge of what.

Networking intermediate Apr 29, 2026

Why it exists

In the early 1980s the entire internet’s name-to-address mapping lived in a single text file called HOSTS.TXT, maintained by SRI-NIC and copied around by every host that wanted to know what mit-multics resolved to. This worked when “the internet” was a few hundred machines run by people who knew each other. It stopped working once the network outgrew that.

Three things broke at once:

  1. Name collisions. Only one machine in the world could be called vax. With one global file, every new host had to ask a central authority before picking a name.
  2. Update lag. Every host pulled HOSTS.TXT periodically. Adding a machine meant waiting for the file to propagate, and the file kept getting bigger.
  3. A single point of administrative failure. SRI-NIC was the bottleneck for every name change anywhere on the network. Worse, it was the wrong group of humans to be deciding what a university lab in Berlin should call its server.

DNS was the redesign. The technical pieces — caching, UDP queries, resource records — are interesting, but the load-bearing idea is the shape of the namespace. It’s a tree. And the tree exists primarily so that authority can be delegated.

Why it matters now

Many modern systems that have to name things at scale end up with the same shape: Kubernetes namespaces, Java packages, npm scopes, AWS resource ARNs, S3 key prefixes, even file systems. The reason is the same reason DNS is hierarchical, and once you see it, you see it everywhere: hierarchy is how you split naming from control without losing either.

For an engineer this matters in concrete ways. When you register example.com, you’re not buying a row in some giant database — you’re being delegated authority over the entire subtree below example.com. Nobody at the root needs to know or approve when you create api.staging.example.com. That delegation is also what lets DNS scale, what lets caching work, and what determines whose key signs what under DNSSEC.

The short answer

DNS = tree of names + delegation of authority at every edge + caching

DNS is hierarchical because the names are the cheap part. The expensive part is deciding who’s allowed to say what something.com resolves to, and the tree is the data structure that lets that authority be handed off cleanly at every level. Caching then makes the whole thing fast enough to be invisible.

How it works

Read a domain name right-to-left. mail.example.com is really com → example → mail, with an invisible root . at the very end. Each dot is a hand-off point.
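
The right-to-left reading can be sketched in a few lines of Python. This is only an illustration: the function names labels_from_root and ancestors are made up for this post, and the code ignores real-world details such as case folding and escaped dots.

```python
def labels_from_root(name: str) -> list[str]:
    """Return the labels of a domain name in root-first order."""
    # A trailing dot names the root explicitly; strip it so both forms behave alike.
    trimmed = name.rstrip(".")
    if not trimmed:
        return []  # the root itself has no labels
    return list(reversed(trimmed.split(".")))


def ancestors(name: str) -> list[str]:
    """List every hand-off point on the path from the root down to the name."""
    labels = labels_from_root(name)
    path = ["."]
    for depth in range(1, len(labels) + 1):
        # Rebuild each suffix in the usual written (leaf-first) order.
        path.append(".".join(reversed(labels[:depth])) + ".")
    return path


print(labels_from_root("mail.example.com"))  # ['com', 'example', 'mail']
print(ancestors("mail.example.com"))         # ['.', 'com.', 'example.com.', 'mail.example.com.']
```

Each entry in the ancestors list is a place where a delegation could exist, though in practice not every dot is a zone boundary.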

The root is tiny on purpose. The root zone (the contents of .) only needs to know one thing per TLD: where to find the name servers for com, for org, for uk, and so on. That’s a small list. It’s served by the 13 root server letters, each of which is actually many machines spread around the world via anycast. The root zone changes slowly and resolvers cache it heavily, so this works.

Each TLD operator runs its own zone. The .com operator (Verisign) doesn’t have to know your domain’s IP — it only has to know which name servers you’ve declared as authoritative. When a resolver asks .com “where is example.com?”, it gets back a referral: “I don’t know, but ns1.example.com does.”

You run the leaves. Or your DNS provider does. Either way, the authoritative servers for example.com are the only place in the world where the answer for mail.example.com actually lives. Nobody upstream stores it. That’s delegation: each level only knows enough to point further down.

A typical resolution looks like this:

client → resolver:           "mail.example.com?"
resolver → root:             "mail.example.com?"
root → resolver:             "ask the .com servers, here they are"
resolver → .com:             "mail.example.com?"
.com → resolver:             "ask example.com's nameservers, here they are"
resolver → example.com NS:   "mail.example.com?"
example.com NS → resolver:   "203.0.113.42"
resolver → client:           "203.0.113.42"
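
The same walk can be simulated with a toy in-memory model. This is a sketch under loud assumptions: ZONES, ask, and resolve are hypothetical names invented here, there is no networking, and real resolvers deal with glue records, CNAMEs, retries, and much more. What it does show is the core loop: ask, get a referral, ask one level down.

```python
# Toy data: each "server" only knows its own zone. Names and address are illustrative.
ZONES = {
    ".": {"delegations": ["com."], "records": {}},
    "com.": {"delegations": ["example.com."], "records": {}},
    "example.com.": {
        "delegations": [],
        "records": {"mail.example.com.": "203.0.113.42"},
    },
}


def ask(zone: str, qname: str):
    """Simulate one query to the servers for `zone`.

    Returns ("answer", address), ("referral", child_zone), or ("nxdomain", None).
    """
    data = ZONES[zone]
    if qname in data["records"]:
        return ("answer", data["records"][qname])
    # Otherwise look for a delegated child zone that contains the query name.
    for child in data["delegations"]:
        if qname == child or qname.endswith("." + child):
            return ("referral", child)
    return ("nxdomain", None)


def resolve(qname: str, max_hops: int = 10):
    """Start at the root and follow referrals. Returns (address, zones asked)."""
    zone = "."
    trace = []
    for _ in range(max_hops):  # hop limit guards against malformed toy data
        trace.append(zone)
        kind, value = ask(zone, qname)
        if kind == "answer":
            return value, trace
        if kind == "referral":
            zone = value
            continue
        return None, trace
    return None, trace


address, trace = resolve("mail.example.com.")
print(address)  # 203.0.113.42
print(trace)    # ['.', 'com.', 'example.com.']
```

Note that the root and com. entries hold no address for mail.example.com. at all; they only hold pointers further down.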

In practice the resolver rarely walks the whole chain. It caches the referrals and the final answer, then serves everyone in your office from that cache until the records expire. Each record carries a TTL, which is the operator’s instruction to caches: “you may reuse this answer for up to N seconds, then ask again.”
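
A resolver-style cache can be sketched as a dictionary whose entries expire. The class name TtlCache and its methods are hypothetical, chosen for this illustration; the clock is injectable only so that the behaviour can be checked without actually waiting.

```python
import time


class TtlCache:
    """Minimal resolver-style cache: an entry is usable until its TTL runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so the example can run without sleeping
        self._entries = {}    # name -> (value, expiry time)

    def put(self, name: str, value: str, ttl_seconds: float) -> None:
        self._entries[name] = (value, self._clock() + ttl_seconds)

    def get(self, name: str):
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # expired: the caller must ask upstream again
            return None
        return value


# Usage with a fake clock so time can be advanced by hand.
now = [0.0]
cache = TtlCache(clock=lambda: now[0])
cache.put("mail.example.com.", "203.0.113.42", ttl_seconds=300)
now[0] = 299.0
print(cache.get("mail.example.com."))  # 203.0.113.42
now[0] = 300.0
print(cache.get("mail.example.com."))  # None
```

The design choice worth noticing is that the zone operator sets the TTL, not the cache: a short TTL buys fast changes at the cost of more queries reaching the authoritative servers.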

Why a tree, not a hash table?

You could imagine an alternative universe with a flat namespace and a distributed hash table — example.com is just a key, and some peer-to-peer protocol stores the value. People have built this; it doesn’t replace DNS, and the reason is mostly social, not technical.

A flat namespace forces a single answer to “who decides which names are allowed?” A tree lets that question be answered differently in every subtree. ICANN decides which TLDs exist. Verisign decides who gets a .com. You decide what lives under your .com. Your team lead decides what lives under team.example.com. The same data structure that organizes the names also organizes the politics, and that turns out to be the thing you can’t skip.
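
One way to make "answered differently in every subtree" concrete is a longest-suffix lookup: given a name, find the deepest zone that contains it. The sketch below assumes a hypothetical ZONE_OWNERS table built from the examples in the paragraph above; the function name governing_zone is likewise invented for illustration.

```python
# Hypothetical table: zone name -> who runs that zone (taken from the prose above).
ZONE_OWNERS = {
    ".": "ICANN",
    "com.": "Verisign",
    "example.com.": "you",
    "team.example.com.": "your team lead",
}


def governing_zone(name: str) -> str:
    """Return the deepest zone in ZONE_OWNERS that contains `name`."""
    fqdn = name if name.endswith(".") else name + "."
    labels = fqdn.rstrip(".").split(".")
    # Try the full name first, then drop one leftmost label at a time.
    for start in range(len(labels)):
        candidate = ".".join(labels[start:]) + "."
        if candidate in ZONE_OWNERS:
            return candidate
    return "."  # everything ultimately sits under the root


for name in ["ci.team.example.com", "api.staging.example.com", "example.org"]:
    zone = governing_zone(name)
    print(name, "->", zone, "->", ZONE_OWNERS[zone])
```

A flat namespace has no equivalent of this lookup, because there is no suffix to match on: every name would fall through to the same single owner.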

Show the seams

A few places the clean story doesn’t quite hold:

Going deeper