05 — Spectre and Meltdown

Technical Overview

Spectre and Meltdown (disclosed January 2018) are among the most significant hardware security vulnerabilities in commodity computing. They exploit microarchitectural optimizations, namely speculative and out-of-order execution combined with shared caches, to let unprivileged code read memory it has no permission to access. Unlike a software bug that is fixed by patching one codebase, these vulnerabilities follow from how modern out-of-order, speculative CPUs are designed.

The original disclosure covered three variants, each with its own CVE (CVE-2017-5753, CVE-2017-5715, CVE-2017-5754), but the underlying class of attacks, transient execution attacks, has spawned dozens of follow-on vulnerabilities that continue to be discovered years later.

Prerequisites

CPU microarchitecture: pipeline, out-of-order execution, speculative execution.
x86-64 memory management: page tables, TLB, privilege rings.
Cache architecture: LLC, cache sets, cache lines.
Timing side-channels: the concept of measuring time differences to infer secret data.

Core Content

Meltdown (CVE-2017-5754)

Vulnerability class: Rogue Data Cache Load (RDCL).

What it enables: A user-space process can read arbitrary kernel memory.

Meltdown Attack Mechanism

Before Meltdown, the prevailing assumption was: if a user-space process tries to read from a kernel virtual address, the CPU raises a page fault immediately, and the read never occurs. Meltdown showed this assumption was wrong.

CPU Execution Pipeline (out-of-order)

Instruction 1: mov rax, [kernel_address]  ; SHOULD fault (ring check)
    │
    ├── Permission check begins...
    │   (but out-of-order execution continues speculatively)
    │
    └── Data from kernel_address is TRANSIENTLY forwarded to dependent instructions

Instruction 2: and rax, 0xFF            ; use only low 8 bits
Instruction 3: shl rax, 12              ; rax * 4096
Instruction 4: mov rbx, [probe_array + rax]  ; SPECULATIVELY access probe_array

--- (Permission check completes: fault!) ---
Architectural rollback: results of instructions 1-4 are discarded (rax is never committed)
Exception raised: SIGSEGV

BUT: probe_array[secret_byte * 4096] is STILL IN CACHE
     (microarchitectural state is not rolled back fully)

THEN: attacker measures access time to probe_array[i*4096] for i in 0..255
      The one that's fast (cache hit) reveals secret_byte

This is the Flush+Reload technique: before the attack, flush all probe array lines (CLFLUSH); after the transient sequence, time an access to each line. A cache hit (< 100 cycles) identifies the index that was touched, and that index is the value of the secret byte.
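The hit/miss cutoff depends on the hardware, so it is usually calibrated rather than hard-coded. The following is a minimal sketch, assuming cycle counts for known-cached and known-flushed accesses have already been collected. The helper name `pick_threshold` and the sample values are illustrative and not taken from any published PoC.

```python
from statistics import median


def pick_threshold(hit_cycles, miss_cycles):
    """Return a cutoff halfway between the median cached and median flushed latency."""
    hit_med = median(hit_cycles)
    miss_med = median(miss_cycles)
    if hit_med >= miss_med:
        raise ValueError("cached accesses are not faster; the channel is unusable")
    return (hit_med + miss_med) / 2


# Illustrative numbers only; real values must be measured on the target CPU.
hits = [38, 41, 40, 44, 39]
misses = [212, 230, 205, 480, 221]
threshold = pick_threshold(hits, misses)
print(threshold)  # 130.5
```

Medians are used instead of means so that a single outlier (such as the 480-cycle sample above, which could come from an interrupt) does not shift the cutoff.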

Meltdown Attack Diagram

1. FLUSH probe_array (256 × 4KB pages)

2. Execute transient instruction sequence:

   try:
     secret = *(kernel_ptr)    ← will fault, but speculatively executes
     probe_array[secret * 4096]  ← caches this line before fault

   except SIGSEGV:
     pass  (use signal handler to suppress fault)

3. RELOAD timing measurement:

   for i in 0..255:
     t_start = rdtsc()
     _ = probe_array[i * 4096]   ← measure access time
     t_end = rdtsc()

     if t_end - t_start < THRESHOLD:
       # cache hit → secret byte = i
       print(f"kernel byte = {i}")

4. Repeat for each byte of kernel memory to be read
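The decision logic of the RELOAD step can be sketched separately from the timing code. This is a hedged illustration that works on already-measured timings; `recover_byte` and `majority_vote` are hypothetical helper names, and the simulated timings are made up. Because a single attempt often fails or is noisy, practical attacks repeat the sequence and vote.

```python
from collections import Counter


def recover_byte(timings, threshold):
    """Return the single index in 0..255 that looks cached, or None if ambiguous."""
    if len(timings) != 256:
        raise ValueError("expected one timing per possible byte value")
    hits = [i for i, t in enumerate(timings) if t < threshold]
    return hits[0] if len(hits) == 1 else None


def majority_vote(attempts):
    """Combine repeated attempts, ignoring failed ones (None)."""
    votes = Counter(a for a in attempts if a is not None)
    return votes.most_common(1)[0][0] if votes else None


# Simulated reload timings (cycles): every line misses except index 0x41.
timings = [220] * 256
timings[0x41] = 45
print(recover_byte(timings, threshold=130))  # 65
```

Returning None when zero or several lines look cached keeps a noisy round from contributing a wrong guess to the vote.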

KPTI (Kernel Page Table Isolation)

Mitigation for Meltdown: maintain two page table sets per process.

User PGD (used when executing in user space): does NOT contain kernel virtual addresses (except for minimal trampolines needed for syscall entry).
Kernel PGD (used when executing in kernel space): contains all kernel mappings.

Switching between them requires loading CR3 (the page table base register) on every syscall entry/exit.

User mode:  CR3 → User PGD (no kernel mappings)
                  │
                  │ syscall / interrupt
                  ▼
           [trampoline page — mapped in both PGDs]
                  │
                  │ swapgs + CR3 load → Kernel PGD
                  ▼
Kernel mode: CR3 → Kernel PGD (full mappings)
                  │
                  │ syscall return
                  ▼
           [trampoline page]
                  │
                  │ CR3 → User PGD
                  ▼
User mode: (no kernel mappings visible)

Why Meltdown can't work with KPTI: the kernel address is not mapped in the user PGD, so a load from it has no valid translation. The page walk ends in a not-present fault and there is no physical address to fetch data from, so nothing secret-dependent is forwarded or cached. The cache side-channel has nothing to measure.
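The effect of splitting the page tables can be illustrated with a toy model in which each PGD is a dictionary from virtual page to a (frame, user-accessible) pair. All addresses and the `transient_load_possible` helper are hypothetical; this sketch only captures the point that a transient load needs some valid translation, even a supervisor-only one.

```python
# Hypothetical addresses, chosen only to look like typical x86-64 layouts.
USER_VA = 0x0000_5555_0000_0000
KERNEL_VA = 0xFFFF_FFFF_8100_0000
TRAMPOLINE_VA = 0xFFFF_FE00_0000_0000

# Each toy "PGD" maps a virtual page to (physical frame, user_accessible).
kernel_pgd = {
    USER_VA: (0x1000, True),
    KERNEL_VA: (0x2000, False),
    TRAMPOLINE_VA: (0x3000, False),
}
user_pgd = {
    USER_VA: (0x1000, True),
    TRAMPOLINE_VA: (0x3000, False),  # minimal entry code stays mapped
}


def transient_load_possible(pgd, vaddr):
    """Meltdown needs a valid translation; the permission bit is checked too late."""
    return vaddr in pgd


# Without KPTI, user mode runs on the full table, so the kernel page is reachable.
print(transient_load_possible(kernel_pgd, KERNEL_VA))  # True
# With KPTI, user mode runs on the stripped table and the translation is absent.
print(transient_load_possible(user_pgd, KERNEL_VA))    # False
```

The trampoline page remains reachable in both tables, which is why it is kept as small as possible and holds no secrets.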

KPTI Performance Cost

Every syscall requires two CR3 loads (one on entry, one on exit). A CR3 load flushes the TLB unless PCID is used. On CPUs with PCID (and ideally INVPCID, available from Intel Haswell), KPTI overhead is 3–10%. Without PCID, overhead can be 20–30% for syscall-heavy workloads.

# Check if KPTI is active
dmesg | grep -i "page tables isolation"
cat /sys/devices/system/cpu/vulnerabilities/meltdown

# Verify PCID support (reduces KPTI overhead)
grep pcid /proc/cpuinfo

# Measure KPTI overhead
perf stat -e syscalls:sys_enter_write ./syscall_benchmark
# Compare with pti=off or nopti (boot parameters, DO NOT use in production)

Workloads most affected:
- Redis (many small commands = many syscalls): 15–20% throughput reduction.
- PostgreSQL (many small queries): 10–15%.
- OS-heavy benchmarks (SPEC CINT): 3–7%.
- CPU-bound compute: < 1%.
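The spread between syscall-heavy and CPU-bound workloads follows from simple arithmetic: the added cost is paid once per kernel entry. A rough sketch, where the function name and both input figures (syscall rate, extra nanoseconds per syscall) are hypothetical values chosen for illustration rather than measurements:

```python
def syscall_overhead_fraction(syscalls_per_sec, extra_ns_per_syscall):
    """Fraction of one core's time consumed by the added per-syscall cost."""
    return syscalls_per_sec * extra_ns_per_syscall / 1e9


# Hypothetical inputs: 400 ns of added cost per syscall (two CR3 loads plus TLB refills).
for label, rate in [("key-value server", 500_000), ("compute job", 5_000)]:
    fraction = syscall_overhead_fraction(rate, 400)
    print(f"{label}: {fraction:.1%} of a core")
```

With these assumed inputs the key-value server loses about 20% of a core and the compute job about 0.2%, which matches the shape of the list above even though the real per-syscall cost depends on the CPU and on TLB behavior.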

Spectre (CVE-2017-5753, CVE-2017-5715)

Vulnerability class: Bounds Check Bypass and Branch Target Injection.

What it enables: A process can leak memory from another process or from the kernel without relying on a deferred permission check on a privileged load (Meltdown's requirement). Instead, the victim's own code is steered into speculatively accessing the secret. Spectre is more general and harder to mitigate.

Spectre exploits the branch predictor—the CPU's mechanism for guessing which branch will be taken before the condition is evaluated.

Spectre v1: Bounds Check Bypass

The classic "out-of-bounds speculative array read":

// Victim code (kernel or library):
if (x < array1_size) {           // bounds check
    y = array2[array1[x] * 256]; // dependent load
}

If the attacker first trains the branch with in-bounds values and then supplies x = attacker_controlled_index (>= array1_size):

CPU predicts the if will be TRUE (branch history manipulation).
Speculatively executes array1[attacker_index]—reads OOB memory.
Speculatively accesses array2[secret_byte * 256]—caches this line.
Actual check fails: speculative results rolled back.
Attacker measures array2[i*256] timing → recovers secret_byte.

The victim's bounds check is bypassed speculatively. The attacker can read any memory accessible to the victim process (including kernel data if the victim is the kernel, or another process's data via shared libraries).
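How the attacker picks x can be shown with plain address arithmetic. This sketch assumes array1 is a byte array and that indices wrap modulo 2^64 like size_t on x86-64; the addresses and the helper names `malicious_index` and `probe_offset` are hypothetical.

```python
WORD = 1 << 64  # size_t arithmetic wraps modulo 2**64 on x86-64


def malicious_index(array1_base, target_addr):
    """Index x such that &array1[x] == target_addr for a byte array."""
    return (target_addr - array1_base) % WORD


def probe_offset(secret_byte, stride=256):
    """Offset into array2 touched by the dependent load in the victim snippet."""
    return secret_byte * stride


# Hypothetical layout: the secret sits 0x3000 bytes past array1.
array1_base = 0x0000_5555_0000_1000
secret_addr = 0x0000_5555_0000_4000
x = malicious_index(array1_base, secret_addr)
print(hex(x))                   # 0x3000
print(hex(probe_offset(0x41)))  # 0x4100
```

The modulo matters when the target lies below array1: the index wraps to a very large unsigned value that still fails the bounds check architecturally but resolves to the intended address during speculation.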

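Mitigation for v1: array_index_nospec() (Linux kernel) clamps the index with a branchless mask, so even a mispredicted bounds check cannot produce an out-of-bounds address:

// Linux kernel mitigation
if (x < array1_size) {
    x = array_index_nospec(x, array1_size);  // x is forced to 0 if x >= array1_size
    y = array2[array1[x] * 256];
}

// Implementation: AND x with a mask that is all-ones when x < array1_size
// and zero otherwise. The mask is computed arithmetically (no branch),
// so the branch predictor cannot bypass it.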

Also: LFENCE instruction after the bounds check serializes the pipeline, preventing speculative execution past the fence. Overhead: 0.5–3% for kernel code with many bounds checks.
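The masking arithmetic behind array_index_nospec() can be modeled with explicit 64-bit wraparound. This sketch is patterned loosely on the kernel's generic C fallback; the Python function names are illustrative, architecture-specific kernel versions use different instruction sequences, and the model assumes 0 < size <= 2^63.

```python
BITS = 64
MASK64 = (1 << BITS) - 1


def index_mask_nospec(index, size):
    """All-ones if index < size, else zero, computed without a comparison.

    Assumes 0 < size <= 2**63 and 0 <= index < 2**64.
    """
    # The top bit of v is set exactly when index >= size (size - 1 - index
    # goes negative) or when index itself has its top bit set.
    v = (index | (size - 1 - index)) & MASK64
    top_bit_clear = ((~v) & MASK64) >> (BITS - 1)  # 1 if in bounds, else 0
    return (-top_bit_clear) & MASK64               # 1 -> all ones, 0 -> zero


def index_nospec(index, size):
    """Clamp an out-of-bounds index to 0 using only data dependencies."""
    return index & index_mask_nospec(index, size)


print(index_nospec(3, 10))   # 3
print(index_nospec(10, 10))  # 0
```

Because the clamped index depends on the mask through data flow rather than control flow, a mispredicted branch cannot skip it; the worst case is a speculative access to element 0.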

Spectre v2: Branch Target Injection (BTI)

More dangerous: the attacker trains the Branch Target Buffer (BTB) in the CPU to predict a specific target address for an indirect branch. When the victim makes the indirect branch, the CPU speculatively executes at the attacker-chosen address, typically a "gadget" in the victim's own code that leaks a secret.

Attacker:
  Repeatedly call a function at virtual address VA
  → trains BTB: "indirect branch at PC X → jumps to VA"

Victim (kernel):
  Executes:  jmp [function_pointer]  ; PC = X
  BTB says: this branch goes to VA (attacker's training)
  CPU speculatively executes at VA (an attacker-chosen gadget in the victim's address space)
  → leaks memory via Flush+Reload on probe array

  (Actual branch resolves to correct address,
   BTB-trained speculation is flushed,
   but cache state already reflects secret)
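The aliasing that makes cross-context training possible can be shown with a toy predictor. Real BTB indexing and tagging are largely undocumented and vary by microarchitecture, so the 12-bit index, the class name `ToyBTB`, and all addresses below are assumptions made for illustration only.

```python
BTB_INDEX_BITS = 12  # assumption: entries are selected by the low 12 bits of the branch PC


class ToyBTB:
    """Untagged branch target buffer shared by every context on the core."""

    def __init__(self):
        self.entries = {}

    def _index(self, pc):
        return pc & ((1 << BTB_INDEX_BITS) - 1)

    def train(self, pc, target):
        self.entries[self._index(pc)] = target

    def predict(self, pc):
        return self.entries.get(self._index(pc))


btb = ToyBTB()
attacker_branch_pc = 0x0000_7F00_0000_1ABC  # hypothetical user-space branch
victim_branch_pc = 0xFFFF_FFFF_8123_4ABC    # hypothetical kernel indirect branch
gadget = 0xFFFF_FFFF_8100_2000              # hypothetical gadget address

btb.train(attacker_branch_pc, gadget)       # attacker's training loop
print(hex(btb.predict(victim_branch_pc)))   # 0xffffffff81002000
```

Since the toy predictor has no notion of privilege level or address space, the victim's lookup returns the attacker's entry. IBRS, IBPB, and STIBP each restrict or clear exactly this kind of shared state.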

Retpoline Mitigation

"Return Trampoline" — replaces indirect branches (jmp [reg], call [reg]) with a construct that defeats speculative execution:

; Without retpoline: CPU speculates on BTB prediction
jmp rax

; With retpoline:
    call set_up_target
capture_spec:
    pause
    lfence
    jmp capture_spec  ; speculation from the predicted return spins here
                      ; until the real return address is resolved
set_up_target:
    mov [rsp], rax    ; overwrite return address with actual target
    ret               ; RSB predicts a return to capture_spec,
                      ; but architecturally control goes to the address in rax

The Return Stack Buffer (RSB) is more predictable than the BTB; the retpoline construct ensures speculation goes into a harmless infinite loop (PAUSE + LFENCE) rather than attacker-controlled code.

IBRS (Indirect Branch Restricted Speculation): Intel microcode update. Prevents software in a less-privileged mode from influencing branch prediction in a more-privileged mode. Higher overhead than retpoline.

IBPB (Indirect Branch Predictor Barrier): flush the branch predictor state. Used at context switches to prevent cross-process BTB training.

STIBP (Single Thread Indirect Branch Predictors): isolates branch predictors between hyperthreads on the same core.

# Check Spectre mitigations
cat /sys/devices/system/cpu/vulnerabilities/spectre_v1
cat /sys/devices/system/cpu/vulnerabilities/spectre_v2
# "Mitigation: Retpoline" or "Mitigation: Enhanced IBRS"

# Check all CPU vulnerabilities
for f in /sys/devices/system/cpu/vulnerabilities/*; do
    echo "$(basename "$f"): $(cat "$f")"; done
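For fleet-wide checks it can help to reduce each status string to a coarse state. A minimal sketch, assuming the three documented status prefixes ("Not affected", "Vulnerable", "Mitigation: ..."); other strings that some kernels report are classified as "unknown". The function names `classify` and `scan` are illustrative.

```python
from pathlib import Path

VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def classify(status):
    """Map a sysfs status line to not_affected / mitigated / vulnerable / unknown."""
    s = status.strip()
    if s == "Not affected":
        return "not_affected"
    if s.startswith("Mitigation:"):
        return "mitigated"
    if s.startswith("Vulnerable"):
        return "vulnerable"
    return "unknown"


def scan(vuln_dir=VULN_DIR):
    """Return {vulnerability_name: state}; empty if the directory is absent."""
    results = {}
    if not vuln_dir.is_dir():
        return results
    for entry in sorted(vuln_dir.iterdir()):
        try:
            results[entry.name] = classify(entry.read_text())
        except OSError:
            results[entry.name] = "unknown"
    return results


for name, state in scan().items():
    print(f"{name}: {state}")
```

On a system without this sysfs directory (non-Linux, or a very old kernel) the loop prints nothing, which should itself be treated as a finding rather than as a clean result.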

Spectre Variants

Variant	Name	Mechanism	Mitigation
v1	Bounds Check Bypass	Speculative OOB read via branch prediction	`array_index_nospec`, LFENCE
v2	Branch Target Injection	Poison BTB, speculate through attacker gadget	Retpoline, IBRS, IBPB
v3	Meltdown	Rogue Data Cache Load	KPTI
v3a	Rogue System Register Read	Read system registers via speculative execution	Microcode update
v4	Speculative Store Bypass	Load speculatively bypasses an older store to the same address and reads stale data	SSBD (Speculative Store Bypass Disable)
v5	ret2spec / RSB underflow	Exhaust Return Stack Buffer, speculate on BTB	RSB stuffing
L1TF	L1 Terminal Fault / Foreshadow	Leak L1D contents (SGX enclave, kernel, or hypervisor data) through not-present PTEs	PTE inversion, L1D flush, disable SMT
MDS	Microarchitectural Data Sampling	ZombieLoad, RIDL, Fallout: leak from CPU buffers	MD_CLEAR microcode
SRBDS	Special Register Buffer Data Sampling	Leak RDRAND output	Microcode update

Spectre Mitigation Overhead

Full Spectre mitigations (retpoline + IBRS + STIBP) on older hardware (Skylake without IBRS_ALL):

Workload              KPTI            Retpoline       KPTI + Retpoline + STIBP
                      (Meltdown)      (Spectre v2)    (cumulative total)
──────────────────────────────────────────────────────────────────────────────
Redis small GET:      -15%            -5%             -25%
PostgreSQL OLTP:      -10%            -3%             -17%
Nginx static:         -5%             -2%             -8%
CPU compute:          -1%             -1%             -2%

On Intel Ice Lake and AMD Zen 2+, hardware mitigations (Enhanced IBRS, enumerated as IBRS_ALL on Intel) reduce the retpoline overhead to near zero.

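// Per-task mitigation control (PR_SPEC_INDIRECT_BRANCH, Linux 4.20+;
// requires the spectre_v2_user=prctl or seccomp mode).
// PR_SPEC_ENABLE re-enables indirect branch speculation, i.e. it turns the
// STIBP/IBPB protection off for a performance-critical, trusted task:
prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH,
      PR_SPEC_ENABLE, 0, 0);

# There is no /proc write interface; /proc/<pid>/status only reports the state:
grep -i speculation /proc/<pid>/status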
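The current per-task state can also be queried with PR_GET_SPECULATION_CTRL, which returns a bitmask. The sketch below decodes that bitmask using the constant values from linux/prctl.h as I understand them; the ctypes wrapper is Linux-only, and `decode_spec_ctrl` and `query_indirect_branch` are illustrative names, not part of any library.

```python
import ctypes
import sys

# Values from linux/prctl.h (assumed stable kernel ABI).
PR_GET_SPECULATION_CTRL = 52
PR_SPEC_INDIRECT_BRANCH = 1
FLAGS = {
    1 << 0: "PR_SPEC_PRCTL",          # controllable per task via prctl
    1 << 1: "PR_SPEC_ENABLE",         # speculation enabled, mitigation off
    1 << 2: "PR_SPEC_DISABLE",        # speculation disabled, mitigation on
    1 << 3: "PR_SPEC_FORCE_DISABLE",  # as above, and cannot be undone
    1 << 4: "PR_SPEC_DISABLE_NOEXEC", # disabled until the next execve
}


def decode_spec_ctrl(value):
    """Turn the PR_GET_SPECULATION_CTRL return value into flag names."""
    if value == 0:
        return ["PR_SPEC_NOT_AFFECTED"]
    return [name for bit, name in FLAGS.items() if value & bit]


def query_indirect_branch():
    """Return decoded flags for the calling task, or None if unavailable."""
    if not sys.platform.startswith("linux"):
        return None
    libc = ctypes.CDLL(None, use_errno=True)
    rc = libc.prctl(
        ctypes.c_int(PR_GET_SPECULATION_CTRL),
        ctypes.c_ulong(PR_SPEC_INDIRECT_BRANCH),
        ctypes.c_ulong(0), ctypes.c_ulong(0), ctypes.c_ulong(0),
    )
    return None if rc < 0 else decode_spec_ctrl(rc)


print(decode_spec_ctrl(0b011))  # ['PR_SPEC_PRCTL', 'PR_SPEC_ENABLE']
```

Calling query_indirect_branch() before and after a PR_SET_SPECULATION_CTRL call is a simple way to confirm that the kernel accepted the change.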

Ongoing Spectre Variants

SpectreRSB (2018): Exhaust the Return Stack Buffer (RSB) using deeply-nested function calls. When RSB underflows, the CPU falls back to BTB for ret prediction—allowing BTI. Mitigation: stuff the RSB with benign entries after context switch.
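The underflow condition can be illustrated with a toy return stack. This sketch assumes a 16-entry RSB, a common but not universal size, and models it as a bounded stack; real hardware uses a circular buffer and its behavior on underflow differs between microarchitectures. `ToyRSB` and `count_underflows` are hypothetical names.

```python
RSB_DEPTH = 16  # assumption: typical capacity; the actual size varies by CPU


class ToyRSB:
    """Bounded return-address predictor stack."""

    def __init__(self, depth=RSB_DEPTH):
        self.depth = depth
        self.stack = []

    def call(self, return_addr):
        if len(self.stack) == self.depth:
            self.stack.pop(0)  # the oldest entry is overwritten
        self.stack.append(return_addr)

    def ret(self):
        """Return the predicted address, or None when the RSB has underflowed."""
        return self.stack.pop() if self.stack else None


def count_underflows(nesting, depth=RSB_DEPTH):
    """Nest `nesting` calls, unwind them all, and count unpredicted returns."""
    rsb = ToyRSB(depth)
    for addr in range(nesting):
        rsb.call(addr)
    return sum(1 for _ in range(nesting) if rsb.ret() is None)


print(count_underflows(8))   # 0
print(count_underflows(20))  # 4
```

Each underflowed return is one where an affected CPU may consult the BTB instead, which is the opening that RSB stuffing closes by refilling the stack with benign entries.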

Retbleed (CVE-2022-29900, CVE-2022-29901, 2022): On AMD Zen 1/2 and Intel Skylake-era CPUs, ret instructions under certain conditions are predicted like indirect branches, allowing BTI. Retpoline protects indirect call/jmp by converting them into ret, so it offers no protection when ret itself is the injection point. Mitigation: IBPB (expensive), unret (AMD Zen), or IBRS on kernel entry (Intel).

INCEPTION (CVE-2023-20569, 2023): AMD Zen 3/4. A combination of Training in Transient Execution (TTE) and Phantom JMPs allows arbitrary speculative code execution despite retpoline. Mitigation: flush BTB via IBPB on privilege transitions.

Historical Context

The vulnerabilities were discovered independently by multiple teams in 2017:
- Google Project Zero (Jann Horn): discovered both Meltdown and Spectre.
- Cyberus Technology / TU Graz: discovered Meltdown independently.
- Paul Kocher, with collaborators from the University of Pennsylvania, the University of Adelaide, Rambus, and others: discovered Spectre independently.

Coordinated disclosure was planned for January 9, 2018, but the embargo effectively broke when KPTI patches in the public Linux kernel tree prompted speculation and press coverage, and the announcement was moved up to January 3, 2018.

Preprints of both academic papers were released together at disclosure; they later appeared at:
- Kocher, P. et al. "Spectre Attacks: Exploiting Speculative Execution." IEEE S&P 2019.
- Lipp, M. et al. "Meltdown: Reading Kernel Memory from User Space." USENIX Security 2018.

Both papers are essential reading for anyone working in system security.

The disclosure catalyzed a new field: transient execution attacks (a subclass of microarchitectural attacks). Dozens of follow-on variants were discovered in 2018–2024.

Production Examples

Case: Cloud provider exposure to Meltdown (2018). Meltdown lets a process read the memory of the kernel it runs on, which put container hosts and paravirtualized (Xen PV) guests at risk of reading host or hypervisor memory; fully hardware-virtualized guests could not read hypervisor memory with Meltdown alone but remained exposed to Spectre. AWS, Azure, and Google Cloud patched their fleets before or within hours of public disclosure, because they had been notified under embargo. Private cloud and on-premises environments had no such head start, and many stayed unpatched for much longer.

Case: Retbleed disclosure impact (2022). Retbleed affected AMD Zen 1/2 and Intel Skylake-generation CPUs still widely deployed. The mitigation (IBPB on privilege transitions) added 14–39% overhead for I/O-intensive workloads on affected hardware. Red Hat delayed the default-enable of the mitigation for some workloads pending performance analysis. Cloud providers with large Skylake-generation fleets faced significant performance regressions or hardware upgrade pressure.

Debugging Notes

# Full vulnerability status
grep -r "" /sys/devices/system/cpu/vulnerabilities/

# Check which mitigations are active in kernel
dmesg | grep -iE "spectre|meltdown|retpoline|ibrs|page tables isolation"

# Performance impact of mitigations
# Disable mitigations (DANGEROUS — only for benchmarking)
# Boot with: mitigations=off
# Or per-vulnerability: nospectre_v2 nopti

# Check retpoline status
cat /sys/devices/system/cpu/vulnerabilities/spectre_v2
# "Mitigation: Retpoline" = software retpoline
# "Mitigation: Enhanced IBRS" = hardware mitigation (preferred, lower overhead)

# IBPB barrier cost
perf stat -e context-switches ./workload
# High context-switch rate × IBPB cost per switch = overhead

Security Implications

Spectre fundamentally changed the trusted computing model:
1. Shared CPU hardware is not secure between tenants. SMT/hyperthreading exposes side channels between threads on the same physical core. Some security-critical deployments disable SMT entirely.
2. JIT-compiled code is especially vulnerable. JavaScript engines (V8, SpiderMonkey) running in browsers could be weaponized for Spectre attacks via timer-based side channels. Response: reduce timer resolution (performance.now() resolution reduced from 5 µs to 100 µs in Chrome) and disable SharedArrayBuffer initially (later re-enabled behind cross-origin isolation).
3. The mitigation arms race is ongoing. New Spectre variants continue to be discovered. The correct engineering posture is defense-in-depth: assume mitigations are incomplete and minimize which secrets are reachable from co-tenant code.
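Why coarser timers blunt the attack can be seen by rounding timestamps the way a reduced-resolution clock would. This is a simplified sketch: the latencies are illustrative, the helper names are hypothetical, and real browsers also add jitter, which is not modeled here.

```python
def coarsen(t_ns, resolution_ns):
    """Round a timestamp down to the clock's resolution."""
    return (t_ns // resolution_ns) * resolution_ns


def observed_delta(start_ns, duration_ns, resolution_ns):
    """Elapsed time as seen through a clock with the given resolution."""
    end = coarsen(start_ns + duration_ns, resolution_ns)
    return end - coarsen(start_ns, resolution_ns)


START = 1_000_000          # arbitrary start time in ns
HIT_NS, MISS_NS = 40, 200  # illustrative cache hit and miss latencies

for resolution in (1, 5_000, 100_000):  # 1 ns, 5 µs, 100 µs
    hit = observed_delta(START, HIT_NS, resolution)
    miss = observed_delta(START, MISS_NS, resolution)
    print(f"{resolution:>7} ns clock: hit={hit} miss={miss}")
```

With these inputs both coarse clocks report 0 for a hit and a miss alike, so a single access can no longer be classified. An attacker can still amplify the difference over many accesses or build a finer clock from a counting thread, which is why SharedArrayBuffer was restricted alongside the timer change.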

Performance Implications

The practical performance impact of all mitigations on modern hardware (Intel Ice Lake+ or AMD Zen 3+) with hardware-assisted mitigations:
- KPTI with PCID (where it is still enabled): ~3–8% for syscall-intensive workloads.
- Enhanced IBRS: ~1–3% (vs. retpoline's 3–10% on older hardware).
- STIBP: ~1–5% depending on SMT usage.
- MD_CLEAR (MDS mitigation): ~1% (VERW instruction on return to user space and VM entry).

Organizations running legacy hardware (Skylake, Broadwell) face 20–40% total overhead from the mitigation stack—a significant motivator for hardware refresh cycles.

Failure Modes and Real Incidents

Browser Spectre (2018–2019): Proof-of-concept JavaScript code demonstrating cross-origin memory reads via Spectre was published within weeks of disclosure. Chrome and Firefox responded by reducing timer resolution and adding COOP/COEP cross-origin isolation headers to re-enable SharedArrayBuffer. Some websites required significant JavaScript changes to opt into the new cross-origin isolation model.

IBRS performance disaster (2018): The initial IBRS microcode for older Intel CPUs was measured to cause 50–150% overhead on some workloads, because IBRS state had to be set and cleared on every kernel/user transition. IBRS was set aside for Spectre v2 in favor of retpoline and became practical only on later hardware with cheap hardware enforcement (Enhanced IBRS, e.g. on Ice Lake).

Modern Usage

As of 2024–2026:
- Intel Ice Lake Xeon and later: Enhanced IBRS, TAA mitigations, hardware-assisted Spectre mitigations. Meltdown itself is fixed in hardware on these parts (RDCL_NO), so KPTI is no longer needed for it.
- AMD Zen 3+: IBPB on ret (Retbleed mitigation), SRSO mitigation (INCEPTION variant). Hardware RSB predictors improved.
- ARM Cortex-A/X: recent cores such as Cortex-X3 report hardware Spectre/Meltdown mitigations through the CSV2/CSV3 ID register fields.

Cloud providers have mostly moved to hardware generations where overhead is < 5% total for all mitigations.

Future Directions

Architecture-level changes: Intel's Control Flow Enforcement Technology (CET), which restricts indirect branch targets and protects return addresses, and possible future ISA changes to serialize speculation at privilege transitions.
RISC-V with explicit speculation control: RISC-V ISA discussions include explicit "fence.speculation" instructions to give software control over when speculation occurs.
Formal verification of microarchitectures: academic work on verifying that microarchitectural implementations cannot leak through timing channels (e.g., "Spectector" formal analysis tool).
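Constant-time cryptography: as a general defense, any code handling secrets should be constant-time (no secret-dependent branches or memory accesses). This removes the architectural timing channel and shrinks the set of usable Spectre gadgets, but it does not by itself stop transient execution from reaching secrets, so it complements the mitigations above rather than replacing them.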
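The difference between secret-dependent and constant-time control flow can be shown with a comparison routine. This is a small illustration rather than a hardened implementation: `leaky_equal` is a deliberately bad example, and `constant_time_equal` simply delegates to the standard library's hmac.compare_digest.

```python
import hmac


def leaky_equal(a: bytes, b: bytes) -> bool:
    """Early-exit comparison: running time depends on the first mismatching byte."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False  # secret-dependent branch
    return True


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Comparison designed so that timing does not depend on where inputs differ."""
    return hmac.compare_digest(a, b)


secret = b"correct horse battery"
guess = b"correct horse stapler"
print(leaky_equal(secret, guess), constant_time_equal(secret, guess))  # False False
```

Both functions return the same answers; only the first one reveals, through its running time and its branch history, how many leading bytes of the guess were correct.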

Exercises

Implement the Flush+Reload timing attack against a known secret value (use a process that reads a byte from a file). Measure the timing difference between a cached and uncached access. Determine the threshold that reliably distinguishes cache hits from misses on your hardware.
Study the Meltdown PoC (Lipp et al., 2018, https://meltdownattack.com/). Trace through the assembly-level attack code. Explain exactly which CPU microarchitectural optimization each instruction exploits.
Boot a Linux system with mitigations=off in the kernel cmdline (in a VM, never production). Run a syscall-heavy benchmark (Redis, PostgreSQL, or perf bench sched all). Compare throughput vs. default mitigations. Calculate the percentage overhead.
Read the Spectre paper (Kocher et al., 2019) Section 3: Spectre Attack v1. Implement the bounds-check bypass attack in C against a victim function in the same process. Measure the success rate of leaking a single secret byte.
Examine /sys/devices/system/cpu/vulnerabilities/ on your system. For each active mitigation, explain: (a) what variant it mitigates, (b) what performance overhead it incurs, and (c) whether your CPU supports the lower-overhead hardware mitigation alternative.

References

Kocher, P. et al. "Spectre Attacks: Exploiting Speculative Execution." IEEE S&P 2019. https://spectreattack.com/spectre.pdf
Lipp, M. et al. "Meltdown: Reading Kernel Memory from User Space." USENIX Security 2018. https://meltdownattack.com/meltdown.pdf
Gruss, D. et al. "KASLR is Dead: Long Live KASLR." ESSoS 2017. (Context for why KPTI was needed even before Meltdown.)
Linux kernel page table isolation (PTI) documentation: https://www.kernel.org/doc/html/latest/x86/pti.html
Retpoline: Turner, P. "Retpoline: a software construct for preventing branch-target-injection." Google, 2018.
Retbleed: Wikner, J., Razavi, K. "RETBLEED: Arbitrary Speculative Code Execution with Return Instructions." USENIX Security 2022.
INCEPTION: Trujillo, D. et al. "Inception: Exposing New Attack Surfaces with Training in Transient Execution." USENIX Security 2023.
Linux kernel Spectre mitigations: https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/spectre.html