Why Spectre still isn't fully patched

Eight years after disclosure, new Spectre-class vulnerabilities keep landing. The reason isn't sloppy patching — it's that the attack exploits the same speculation that makes modern CPUs fast in the first place.

Security intermediate May 4, 2026

Why it exists

If you’ve ever wondered why a six-year-old laptop benchmarks worse on the same CPU than it did the day you bought it, part of the answer is mundane (thermal paste dries, fans clog) — but a measurable chunk is software. Since January 2018, every mainstream OS, browser, and hypervisor has been steadily piling on mitigations for a family of CPU bugs that started with two papers named Spectre and Meltdown. The patches cost performance — sometimes a few percent, sometimes much more for system-call-heavy workloads — and they keep coming. In just the last couple of years there’s been Inception, Downfall, Reptar, GhostRace, and a steady drip of variant disclosures.

The interesting question isn’t what Spectre is. It’s why the industry can’t just fix it and move on the way it does with normal CVEs.

The short answer is that Spectre doesn’t exploit a bug in the usual sense — a typo in a memcpy, a missing bounds check, a parser that trusts its input. It exploits the intended behavior of every fast CPU built since the late 1990s. Modern processors are fast largely because they don’t wait around: they guess what comes next, run it, and throw the work away if the guess was wrong. That guessing — speculative execution — is worth roughly an order of magnitude in single-thread performance. Spectre is the discovery that the “throw the work away” step is incomplete: the architectural state is reverted, but microarchitectural side effects (which cache lines got loaded, which branch predictor entries got updated) leak into the world, and an attacker can read them through timing.

You can’t fix Spectre by removing the bug, because the bug is the optimization. You can only narrow the channels through which the leak happens, one variant at a time, and pay for it in throughput.

Why it matters now

Spectre-class issues are the reason cloud providers behave strangely about SMT / hyperthreading on shared-tenant hardware — some have disabled it for certain workloads, others restrict cross-tenant pairings, because two threads on the same core sharing branch predictors and caches is exactly the topology these attacks exploit. They’re also why browser JavaScript engines lost access to high-resolution timers and SharedArrayBuffer was temporarily disabled in 2018: the threat model assumes attacker code is already running on your machine (in a tab, in a Lambda function, in a JS sandbox) and is just trying to read memory it shouldn’t be able to.

Most importantly, the meta-story keeps repeating. Researchers find a new way to coerce the CPU into speculating across a security boundary; vendors ship a microcode update or a compiler flag; six months later someone finds the next variant. The 2024 disclosures (e.g. GhostRace, BHI variants) were not fundamentally new physics — they were new paths through the same physics. Until CPUs are designed from the ground up with speculation isolated by security domain, the trickle is structural.

The short answer

Spectre = speculative execution + a microarchitectural side channel that survives the rollback

The CPU runs ahead of the program, executing instructions it isn’t yet sure are needed. If it guessed wrong, it discards the registers and pipeline state — but the cache, branch predictor, and other shared microarchitectural buffers keep the fingerprints of what it touched. An attacker tricks the CPU into speculating through a memory access it shouldn’t make (out-of-bounds, across a privilege boundary), then reads those fingerprints through timing. The “fix” is to plug the channels one by one, because the speculation itself is too valuable to remove.

How it works

The original Spectre variant (Kocher et al., 2018, arXiv:1801.01203) is the cleanest illustration. Imagine a kernel function that takes an integer index x from userspace and does:

if (x < array1_size) {
    y = array2[array1[x] * 4096];
}

The bounds check looks safe. With ASLR and the bounds check enforced, an out-of-range x returns immediately. But the CPU’s branch predictor doesn’t know the law — it learns from history. If you train it by calling this function repeatedly with valid x, it starts predicting “the branch is taken” by default. Now you call it once with an out-of-range x. The CPU speculates that the branch is taken, dereferences array1[x] even though x is out of bounds — that’s a kernel memory read, performed speculatively — uses the byte it found as an index into array2, and starts loading the corresponding cache line. Then the bounds check resolves, the speculation is squashed, registers are restored.

But the cache line is still loaded. The attacker times accesses to every page of array2; the one that comes back fast is the one that got speculatively touched, and its index reveals the secret byte. Repeat one byte at a time, and you’ve read kernel memory from a userspace process that was never given permission.

That’s Spectre v1: bounds-check bypass via the branch predictor.

Why “fix it” is harder than it sounds

Conceptually you’d patch this by either (a) not speculating across security boundaries, or (b) reverting the cache state when you revert the registers. Both are expensive.

Don’t speculate across boundaries. This is what Intel’s IBRS, IBPB, and STIBP microcode features attempt — flush or partition predictor state at trust boundaries. Each one has a measurable cost on syscalls, context switches, and SMT pairs. Compiler-side mitigations like retpoline rewrite indirect branches into a pattern the branch predictor can’t influence; that costs cycles per call.
Revert cache state on misprediction. Hardware-level proposals exist (e.g. SafeSpec, InvisiSpec from the academic literature), but no shipping x86 CPU implements full speculative-cache isolation. The engineering cost — tracking which cache lines were loaded speculatively, undoing them on rollback, doing this within the cycle budget that made speculation worthwhile in the first place — is severe.
Disable speculation entirely. This works and is essentially what some sensitive paths in the kernel do (e.g. with lfence after bounds checks). Performance loss for general workloads would be catastrophic — modern CPU performance is largely the speculation.

So the actual mitigation portfolio is a patchwork: KPTI to address Meltdown (a related but separate Intel-specific bug where speculation crossed the user/kernel boundary even before privilege checks), retpoline for indirect branches, microcode updates for predictor barriers, browser-level reductions in timer precision, hypervisor-level core scheduling so VMs don’t share SMT siblings, and per-variant fixes as researchers find new channels.

Why new variants keep landing

The original Spectre paper named two variants and explicitly anticipated more. Researchers immediately started cataloguing the speculation primitives a CPU offers (indirect branches, return stacks, store-to-load forwarding, gather instructions, TSX transactional aborts) and the side channels available (L1/L2/L3 cache, TLB, store buffers, line-fill buffers, register file ports, even AVX2 register state). Cross every primitive with every channel and you get a matrix; each cell is a potential paper. The post-2018 disclosures — Foreshadow, MDS/RIDL/Fallout/ZombieLoad, LVI, Retbleed, Downfall (gather-data sampling), Inception (AMD return-stack injection), Reptar (Intel rep movsb corner case), GhostRace (race conditions in speculation) — are mostly cells in that matrix, not new physics.

The pattern is: someone proves that this speculative path leaks through that buffer; vendors add a microcode or compiler mitigation specific to that path; the matrix shrinks but the rest of it is still live.

What about hardware-fixed CPUs?

Vendors have been quietly redesigning. Intel’s post-2019 cores included silicon-level fixes for Meltdown and L1TF; AMD Zen 2/3/4 made architectural choices that immunized them against several variants but introduced others (Zenbleed, Inception). ARM’s newer cores have similar mixed records. The trend is toward “more of the obvious leaks closed in silicon,” not toward “a new architecture without speculation.” Nobody is willing to give up an order of magnitude of single-thread performance, and in any case the closer you look the more the channels multiply — speculative execution interacts with caches, predictors, prefetchers, memory ordering, and SMT in ways that don’t have a single chokepoint to defend.

What I’m not sure about

I’m being deliberately fuzzy on exact performance numbers. Published figures vary wildly — single-digit percent for compute-bound workloads, 20%+ for syscall-heavy workloads on early patches, less on newer hardware with silicon mitigations — and the right number depends on CPU generation, kernel version, which mitigations are enabled, and the workload. If you need a real number for a real decision, benchmark the actual machine.

I’m also compressing a lot of cross-vendor history. Intel, AMD, ARM, IBM POWER, and Apple Silicon each have different mitigation stacks and different vulnerability matrices. The shape of the story — speculation is the bug, the channel is microarchitectural state, the fix is per-variant — is the same; the specifics vary.

Meltdown — Meltdown ≈ Spectre's cousin that ignored privilege checks during speculation — Intel-specific, cleaner to fix (Kernel Page-Table Isolation moves the kernel out of the user page table when not in use). Disclosed alongside Spectre in January 2018.
KPTI — KPTI = separate page tables for user and kernel + a more expensive context switch — the canonical Meltdown mitigation; the syscall-cost story is largely about KPTI.
Retpoline — retpoline = indirect call rewritten as a ret-trampoline the predictor can't poison — Google’s compiler-side mitigation for Spectre v2.
MDS — the family covering RIDL, Fallout, ZombieLoad — leaks through internal CPU buffers rather than the cache.
ASLR — orthogonal randomization-based defense; Spectre-class info leaks defeat KASLR for free, which is one reason ASLR’s “force a leak” cost-raising story has weakened over time.

Going deeper

Kocher et al., Spectre Attacks: Exploiting Speculative Execution (2018) — the original paper. Read this for the precise primitive; the variant zoo since is mostly elaborations.
Lipp et al., Meltdown: Reading Kernel Memory from User Space (2018) — the companion paper; useful for understanding why Meltdown was easier to kill than Spectre.
Daniel Gruss’s lab at TU Graz keeps disclosing new variants; their group page is a good firehose if you want to follow the matrix as it fills in.
For the architecture side, Mark Hill and colleagues’ Spectre and Meltdown: Lessons for Computer Architects is a readable retrospective on what speculation-by-default got the field into.