Why is DNS hierarchical?
DNS could have been a giant flat lookup table — one machine somewhere mapping every name in the world to an IP. It isn't, and the reason is less about technology than about who gets to be in charge of what.
Why it exists
In the early 1980s the entire internet’s name-to-address mapping lived in a
single text file called HOSTS.TXT, maintained by the
SRI-NIC
and copied around by every host that wanted to know what mit-multics resolved
to. This worked when “the internet” was a few hundred machines run by people
who knew each other. It stopped working the moment it didn’t.
Three things broke at once:
- Name collisions. Only one machine in the world could be called vax. With one global file, every new host had to ask a central authority before picking a name.
- Update lag. Every host pulled HOSTS.TXT periodically. Adding a machine meant waiting for the file to propagate, and the file kept getting bigger.
- A single point of administrative failure. SRI-NIC was the bottleneck for every name change anywhere on the network. Worse, it was the wrong group of humans to be deciding what a university lab in Berlin should call its server.
DNS was the redesign. The technical pieces — caching, UDP queries, resource records — are interesting, but the load-bearing idea is the shape of the namespace. It’s a tree. And the tree exists primarily so that authority can be delegated.
Why it matters now
Every modern system that has to name things at scale ends up with the same shape: Kubernetes namespaces, Java packages, npm scopes, AWS resource ARNs, S3 bucket subpaths, even file systems. The reason is the same reason DNS is hierarchical, and once you see it once, you see it everywhere: hierarchy is how you split naming from control without losing either.
For an engineer this matters in concrete ways. When you register
example.com, you’re not buying a row in some giant database — you’re being
delegated authority over the entire subtree below example.com. Nobody at the
root needs to know or approve when you create api.staging.example.com. That
delegation is also what lets DNS scale, what lets caching work, and what
determines whose key signs what under
DNSSEC.
The short answer
DNS = tree of names + delegation of authority at every edge + caching
DNS is hierarchical because the names are the cheap part. The expensive part
is deciding who’s allowed to say what something.com resolves to, and the tree
is the data structure that lets that authority be handed off cleanly at every
level. Caching then makes the whole thing fast enough to be invisible.
How it works
Read a domain name right-to-left. mail.example.com is really
com → example → mail, with an invisible root . at the very end. Each dot
is a hand-off point.
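The right-to-left reading can be made concrete. A toy sketch (the function name is mine, not part of any DNS library):

```python
def delegation_path(name: str) -> list[str]:
    """Split a domain name into its hand-off points, top of the tree first.

    'mail.example.com' reads as root -> com -> example -> mail:
    each label is delegated by the zone to its right.
    """
    labels = name.rstrip(".").split(".")   # drop the invisible root dot
    return list(reversed(labels))          # right-to-left = root first

print(delegation_path("mail.example.com"))  # ['com', 'example', 'mail']
```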
The root is tiny on purpose. The root zone — the contents of . — only
needs to know one thing per
TLD:
where to find the name servers for com, for org, for uk, and so on.
That’s a small list. It’s served by the
13 root server letters,
each of which is actually many machines spread around the world via
anycast.
The root almost never changes, so this works.
Each TLD operator runs its own zone. The .com operator (Verisign) doesn’t
have to know your domain’s IP — it only has to know which name servers you’ve
declared as authoritative. When a resolver asks .com “where is
example.com?”, it gets back a referral: “I don’t know, but ns1.example.com
does.”
You run the leaves. Or your DNS provider does. Either way, the
authoritative servers for example.com are the only place in the world where
the answer for mail.example.com actually lives. Nobody upstream stores it.
That’s delegation: each level only knows enough to point further down.
A typical resolution looks like this:
client → resolver: "mail.example.com?"
resolver → root: "mail.example.com?"
root → resolver: "ask the .com servers, here they are"
resolver → .com: "mail.example.com?"
.com → resolver: "ask example.com's nameservers, here they are"
resolver → example.com NS: "mail.example.com?"
example.com NS → resolver: "203.0.113.42"
resolver → client: "203.0.113.42"
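The walk above can be sketched as a loop over an in-memory delegation tree. The zone contents here are invented for illustration; a real resolver sends queries over the network, and a referral carries nameserver names and addresses, not just the child zone's name.

```python
# Toy model of iterative resolution. Note that no level stores the final
# answer except the leaf zone: each level only points further down.
ZONES = {
    ".":           {"com": ("referral", "com")},                      # root knows the TLDs
    "com":         {"example.com": ("referral", "example.com")},      # .com knows your NS
    "example.com": {"mail.example.com": ("answer", "203.0.113.42")},  # only the leaf has data
}

def resolve(name: str) -> str:
    """Walk the delegation tree from the root until some zone answers."""
    zone = "."
    while True:
        for suffix, (kind, value) in ZONES[zone].items():
            if name == suffix or name.endswith("." + suffix):
                if kind == "answer":
                    return value   # authoritative answer: stop
                zone = value       # referral: descend one level
                break
        else:
            raise LookupError(f"no delegation covers {name}")

print(resolve("mail.example.com"))  # 203.0.113.42
```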
In practice the resolver did this once, weeks ago, and has been serving the answer from cache for everyone in your office since. Each record carries a TTL, which is the operator's promise to every cache downstream: "you may serve this answer for up to N seconds before asking again."
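The caching that makes the full walk rare is a simple mechanism. A minimal sketch, where the fetch function stands in for the root-to-leaf walk and all names are hypothetical:

```python
import time

class TtlCache:
    """Cache answers for at most `ttl` seconds, per the operator's promise."""
    def __init__(self):
        self._store = {}  # name -> (answer, expiry timestamp)

    def get(self, name, fetch, ttl):
        answer, expiry = self._store.get(name, (None, 0.0))
        if time.monotonic() < expiry:
            return answer               # still within the TTL: serve from cache
        answer = fetch(name)            # expired or missing: do the full walk
        self._store[name] = (answer, time.monotonic() + ttl)
        return answer

calls = []
def full_walk(name):                    # stand-in for the iterative resolution
    calls.append(name)
    return "203.0.113.42"

cache = TtlCache()
cache.get("mail.example.com", full_walk, ttl=300)
cache.get("mail.example.com", full_walk, ttl=300)
print(len(calls))  # 1 -- the second lookup never left the cache
```

Lowering the TTL shrinks the window between a change and the world seeing it, at the cost of more walks hitting your authoritative servers.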
Why a tree, not a hash table?
You could imagine an alternative universe with a flat namespace and a
distributed hash table — example.com is just a key, and some peer-to-peer
protocol stores the value. People have built this; it doesn’t replace DNS, and
the reason is mostly social, not technical.
A flat namespace forces a single answer to “who decides which names are
allowed?” A tree lets that question be answered differently in every subtree.
ICANN decides which TLDs exist. Verisign decides who gets a .com. You decide
what lives under your .com. Your team lead decides what lives under
team.example.com. The same data structure that organizes the names also
organizes the politics, and that turns out to be the thing you can’t skip.
Show the seams
A few places the clean story doesn’t quite hold:
- The root isn’t really the root. ICANN coordinates the root zone, but the content (which TLDs exist and who runs them) is the result of policy processes, contracts, and in some cases government involvement for country-code TLDs. The “tree” has a political base, not a technical one.
- Caching makes propagation messy. “DNS changes can take 24–48 hours to propagate” is shorthand for “every cache between you and the world has its own TTL countdown.” Lower TTLs trade load on your servers for faster changes. There isn’t a clean answer; it’s a tuning knob.
- The “13 root servers” number is a historical artifact. It’s 13 named identities (A-root through M-root), originally because of UDP packet size limits in the early DNS protocol. Each identity is now backed by hundreds of physical instances via anycast.
- DNSSEC is the part that makes the hierarchy cryptographic, not just administrative. Each zone signs its records, and the parent zone signs a hash of the child’s key. The chain of trust runs from the root key downward, mirroring the delegation tree. Adoption is uneven and I don’t have a current global number I’d defend — coverage varies a lot by TLD and by organization.
- Modern resolution often skips most of this. Your laptop usually asks a recursive resolver (your ISP’s, or 1.1.1.1, or 8.8.8.8) which has cached half the internet. The full root-to-leaf walk is rare in practice but is the fallback the system is designed around.
Famous related terms
- TLD — the rightmost label in a domain name: .com, .org, .uk. The first level of delegation under the root.
- Authoritative vs recursive server — authoritative means “I own this zone”; recursive means “I’ll go ask on your behalf and cache the answer.” These are the two roles every DNS deployment splits into.
- TTL — the number of seconds a record may be cached; the operator’s contract with every resolver downstream.
- Anycast — same IP, many machines, routing picks the nearest; how a “single” root server is actually hundreds.
- DNSSEC — DNS records plus signatures rooted at the root key; turns the administrative tree into a cryptographic one.
Going deeper
- RFC 1034 and RFC 1035 — Paul Mockapetris’s original DNS specifications from 1987. Short, readable, and the why shows through more clearly than in any later document.
- “The Design of the Domain Name System” (Mockapetris and Dunlap, SIGCOMM 1988) — the design rationale paper, which is more explicit than the RFCs about why hierarchy was the central choice.
- The IANA root zone database — the actual list of TLDs and their operators, which makes the political layer concrete in a way the protocol docs don’t.