03 — Unsafe Rust

Technical Overview

unsafe is Rust's escape hatch from the ownership model's static guarantees. It exists because some operations that are safe to perform cannot be proven safe by the borrow checker's static analysis, and because Rust must be able to interact with hardware, operating system interfaces, and C code that operate outside Rust's safety model. Understanding unsafe is essential for systems programming in Rust: knowing when it is necessary, what invariants the programmer must uphold, and how to build safe abstractions over unsafe code.

Prerequisites

Rust ownership model and borrow checker (see 02-rust-ownership-model.md)
Raw pointers vs references
C ABI and calling conventions
RAII and drop semantics
SIMD and intrinsics basics
Undefined behavior in programming language semantics

Historical Context

Every safe systems language must eventually confront the question: how do you write an OS kernel, a hardware driver, or a zero-copy network stack in a language that prevents pointer arithmetic? Rust's answer — the unsafe keyword — creates a bounded region where the programmer takes responsibility for safety invariants that the compiler cannot verify. The goal is not to eliminate unsafe code but to make it minimal, visible, and contained, so that the unsafe surface area can be audited, and the safety of the entire codebase is reduced to verifying the correctness of a small amount of unsafe code.

The `unsafe` Keyword

unsafe in Rust does two things: 1. It marks a block or function as containing operations that the compiler cannot verify as safe 2. It enables five specific operations that are otherwise forbidden in safe Rust

unsafe {
    // Allowed inside this block:
    // 1. Dereference a raw pointer
    // 2. Call an unsafe function or method
    // 3. Access or modify a mutable static variable
    // 4. Implement an unsafe trait
    // 5. Access fields of unions
}

Critical misconception to avoid: unsafe does not disable the borrow checker or turn off most of Rust's safety checks. The borrow checker continues to operate inside unsafe blocks. Type checking is still enforced. The only thing unsafe does is allow the five specific operations listed above. Everything else remains as safe as in non-unsafe code.

unsafe {
    let s = String::from("hello");
    let s2 = s;
    println!("{}", s);  // ERROR — still caught by borrow checker!
    // unsafe does NOT disable move semantics or borrow checking
}

The Five Unsafe Operations

1. Dereference a Raw Pointer

Raw pointers (*const T and *mut T) can be created in safe code but can only be dereferenced in unsafe:

let x = 42i32;
let raw_ptr = &x as *const i32;  // create raw pointer — safe
let raw_mut = &x as *const i32 as *mut i32;  // cast to mutable raw ptr — safe

unsafe {
    println!("{}", *raw_ptr);    // dereference — unsafe
    *raw_mut = 100;              // write through raw ptr — unsafe
}

Raw pointers, unlike references: - May be null - May point to freed memory - May point to unaligned memory - May alias other pointers (no aliasing guarantees) - May have any provenance (the compiler's alias analysis cannot reason about them)

The programmer asserts that the pointer is non-null, properly aligned, and points to valid memory of the correct type.

2. Call an Unsafe Function or Method

Functions marked unsafe fn require an unsafe block to call. The unsafe fn signature is a contract: the function has preconditions that cannot be statically verified, and the caller must ensure they hold.

/* Unsafe function: caller must ensure ptr is valid, aligned, and non-null */
unsafe fn get_value(ptr: *const i32) -> i32 {
    *ptr  // dereference: safe because the unsafe fn contract ensures validity
}

fn main() {
    let x = 42i32;
    let ptr = &x as *const i32;

    let val = unsafe { get_value(ptr) };  // caller ensures contract: ptr is valid
    println!("{}", val);
}

Common unsafe standard library functions: ptr::read, ptr::write, ptr::copy_nonoverlapping (like memcpy), slice::from_raw_parts, mem::transmute.

3. Access or Modify Mutable Static Variables

Global mutable state (static mut) is inherently unsafe because multiple threads can access it without synchronization, creating data races:

static mut COUNTER: u32 = 0;

fn increment() {
    unsafe {
        COUNTER += 1;  // unsafe: could be a data race in multi-threaded context
    }
}

fn get() -> u32 {
    unsafe { COUNTER }  // unsafe: reading mutable static
}

In single-threaded programs, static mut access may be safe. In multi-threaded programs, use std::sync::atomic types or Mutex<T> instead.

4. Implement an Unsafe Trait

Some traits have invariants that the compiler cannot verify. Such traits are marked unsafe trait. Implementing them requires unsafe impl:

/* Send: safe to transfer ownership to another thread */
/* Sync: safe to share references across threads */
/* Both are unsafe traits because the programmer must verify the invariants */

struct MyWrapper(*mut u8);  // contains a raw pointer

unsafe impl Send for MyWrapper { }  // programmer asserts: safe to send across threads
unsafe impl Sync for MyWrapper { }  // programmer asserts: safe to share across threads

Most types automatically implement Send and Sync if their fields do. Raw pointers and Rc<T> do not implement Send/Sync automatically, because the compiler cannot verify thread safety.

5. Access Fields of Unions

Rust unions (similar to C unions) allow multiple types to share the same memory. Reading a union field interprets the raw bytes as the requested type, which may produce invalid values for that type:

union IntOrFloat {
    i: u32,
    f: f32,
}

let u = IntOrFloat { i: 0xFFFFFFFF };
unsafe {
    println!("{}", u.f);  // interpret 0xFFFFFFFF as f32 → NaN
    // safe to read, but value might be invalid for the type
}

Unions are primarily used for C interoperability (many C structs use unions for type-safe variants).

The Unsafety Contract

When you write unsafe, you are signing a contract: "I, the programmer, assert that the invariants required for this operation to be safe are upheld, and I take responsibility for verifying them."

This contract is not checked by the compiler. If the invariants are wrong — if you dereference a null pointer, call a function with invalid arguments, or create a mutable alias — the result is undefined behavior: the compiler may generate code that does anything, including producing output that appears correct in testing but fails in production.

The invariants required for common unsafe operations:

/* ptr::read: requires the pointer to be: */
//   1. Non-null
//   2. Properly aligned for T
//   3. Pointing to a valid, initialized T
//   4. The memory it points to must not be mutated during the read
unsafe fn read_value<T>(ptr: *const T) -> T {
    // Programmer asserts all four invariants hold
    std::ptr::read(ptr)
}

/* slice::from_raw_parts: requires: */
//   1. ptr is non-null and properly aligned
//   2. ptr..ptr+len is all valid, initialized memory of type T
//   3. The memory is not mutated while the slice exists
//   4. len * mem::size_of::<T>() <= isize::MAX
unsafe fn make_slice<'a, T>(ptr: *const T, len: usize) -> &'a [T] {
    std::slice::from_raw_parts(ptr, len)
}

FFI: Calling C from Rust

FFI (Foreign Function Interface) is the primary reason unsafe exists. Rust cannot verify the safety of C code — C has no ownership model, no borrow checker, no type safety guarantees for raw pointers. Every call to a C function is unsafe.

Declaring a C Function

/* Link against libc */
extern "C" {
    fn strlen(s: *const u8) -> usize;  // C function declaration
    fn malloc(size: usize) -> *mut u8;
    fn free(ptr: *mut u8);
    fn memcpy(dst: *mut u8, src: *const u8, n: usize) -> *mut u8;
}

fn main() {
    let hello = b"hello\0";  // C string with null terminator

    let len = unsafe {
        strlen(hello.as_ptr())  // unsafe: strlen is a C function
    };

    println!("Length: {}", len);  // 5
}

Calling Rust from C

To expose Rust functions to C, use #[no_mangle] (prevent name mangling) and extern "C" (C calling convention):

#[no_mangle]
pub extern "C" fn add(a: i32, b: i32) -> i32 {
    a + b  // safe Rust, but called via C ABI
}

/* C side:
   extern int add(int a, int b);
   int result = add(2, 3);  // calls Rust
*/

Complete FFI Example: wrapping a C library

/* Safe Rust wrapper around unsafe C API */
use std::ffi::CString;
use std::os::raw::{c_char, c_int};

/* Declare the C function */
extern "C" {
    fn some_c_function(s: *const c_char, n: c_int) -> c_int;
}

/* Safe wrapper: hides all unsafe details from callers */
pub fn process_string(s: &str, n: i32) -> i32 {
    /* Convert Rust string to C string */
    let c_string = CString::new(s).expect("CString::new failed");

    /* The only unsafe block: the actual C call */
    unsafe {
        some_c_function(c_string.as_ptr(), n as c_int)
    }
    /* c_string is dropped here: CString::drop() frees the C string allocation */
}

/* Callers of process_string use entirely safe Rust */

Soundness: Safe Abstractions Over Unsafe Code

The goal of unsafe code is to build sound abstractions — public APIs that are safe to use from safe Rust, even though their implementation uses unsafe. A sound abstraction never allows safe code to cause undefined behavior.

Unsound abstraction (BAD — allows UB from safe code):

pub fn get_slice<T>(ptr: *const T, len: usize) -> &'static [T] {
    unsafe { std::slice::from_raw_parts(ptr, len) }
}
// Problem: 'static lifetime claim is a lie
// Safe code can call this with a pointer that becomes dangling
// → use-after-free possible from safe code
// This abstraction is UNSOUND

Sound abstraction (GOOD):

/* Vec::as_slice() — safe abstraction over raw pointer access */
impl<T> Vec<T> {
    pub fn as_slice(&self) -> &[T] {
        // Safety: self.buf.ptr() is valid, non-null, aligned
        //         self.len elements are initialized
        //         The lifetime of the returned slice is tied to &self,
        //         so it cannot outlive the Vec
        unsafe {
            std::slice::from_raw_parts(self.buf.ptr().as_ptr(), self.len)
        }
    }
}
// Sound: safe code cannot cause UB by calling as_slice()
// The lifetime annotation (&self → &[T] with same lifetime) ensures
// the slice cannot outlive the Vec

The soundness invariant: If safe code can trigger undefined behavior through your API, your abstraction is unsound. Unsoundness in a library is a bug — as serious as a memory safety bug, because it can propagate UB to any code that uses the library.

Common Unsafe Patterns

1. Raw Pointers for Performance (Skip Bounds Checking)

In hot loops, array bounds checking can be a bottleneck. get_unchecked() skips bounds checks:

fn sum_unchecked(slice: &[i32]) -> i32 {
    let mut total = 0;
    for i in 0..slice.len() {
        total += unsafe {
            // Safety: i < slice.len() by loop invariant
            *slice.get_unchecked(i)
        };
    }
    total
}

In practice, the LLVM optimizer can usually eliminate bounds checks automatically when the loop structure makes bounds provable. Manual get_unchecked should only be used after profiling confirms the overhead.

2. SIMD Intrinsics

CPU SIMD (Single Instruction Multiple Data) instructions are exposed via platform-specific intrinsics, all of which are unsafe:

#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{__m256i, _mm256_add_epi32, _mm256_loadu_si256};

#[cfg(target_arch = "x86_64")]
unsafe fn add_vectors_avx2(a: &[i32; 8], b: &[i32; 8]) -> [i32; 8] {
    let va = _mm256_loadu_si256(a.as_ptr() as *const __m256i);  // load 8×i32
    let vb = _mm256_loadu_si256(b.as_ptr() as *const __m256i);
    let result = _mm256_add_epi32(va, vb);                      // add 8 pairs
    let mut out = [0i32; 8];
    std::arch::x86_64::_mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, result);
    out
}
// Calling SIMD intrinsics requires unsafe: the programmer asserts
// that the CPU supports the required instruction set (AVX2 in this case)

The safe wrapper should check CPU feature availability:

pub fn add_vectors(a: &[i32; 8], b: &[i32; 8]) -> [i32; 8] {
    #[cfg(target_arch = "x86_64")]
    if is_x86_feature_detected!("avx2") {
        return unsafe { add_vectors_avx2(a, b) };
    }
    // Fallback: scalar
    let mut out = [0i32; 8];
    for i in 0..8 { out[i] = a[i] + b[i]; }
    out
}

3. Type Transmutation

mem::transmute reinterprets the bits of one type as another. Zero-cost, but completely unsafe:

/* Safe use: transmute between types with the same representation */
let x: u32 = 0x3F800000;  // IEEE 754 encoding of 1.0f32
let f: f32 = unsafe { std::mem::transmute(x) };
println!("{}", f);  // 1.0

/* Dangerous use: transmuting a pointer to an integer and back */
let s = String::from("hello");
let ptr = &s as *const String;
let addr: usize = unsafe { std::mem::transmute(ptr) };
// Now addr is the integer address of s
// Transmuting back to *const String would be sound only if s hasn't moved or been dropped

4. System Calls

Direct system calls require raw pointers and unsafe:

/* mmap via libc */
use libc::{mmap, munmap, PROT_READ, PROT_WRITE, MAP_PRIVATE, MAP_ANONYMOUS};

pub struct MmapBuffer {
    ptr: *mut u8,
    size: usize,
}

impl MmapBuffer {
    pub fn new(size: usize) -> Option<Self> {
        let ptr = unsafe {
            mmap(
                std::ptr::null_mut(),
                size,
                PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS,
                -1,
                0,
            )
        };
        if ptr == libc::MAP_FAILED { return None; }
        Some(MmapBuffer { ptr: ptr as *mut u8, size })
    }

    pub fn as_slice(&self) -> &[u8] {
        // Sound: ptr is valid (from mmap), size bytes initialized (MAP_ANONYMOUS → zeroed)
        unsafe { std::slice::from_raw_parts(self.ptr, self.size) }
    }
}

impl Drop for MmapBuffer {
    fn drop(&mut self) {
        unsafe { munmap(self.ptr as *mut libc::c_void, self.size); }
        // munmap is called exactly once when MmapBuffer drops — RAII
    }
}

Unsafe Code Review Checklist

When reviewing unsafe code (your own or others'), verify:

For each unsafe block:
[ ] What are the invariants this block requires?
[ ] Are those invariants documented (Safety: comment)?
[ ] Are the invariants provably maintained at all call sites?
[ ] Is the unsafe block as small as possible? (Minimize unsafe surface area)

For raw pointer operations:
[ ] Is the pointer non-null?
[ ] Is the pointer properly aligned for the type?
[ ] Is the memory initialized?
[ ] Is the memory still valid (not freed)?
[ ] Are aliasing rules respected? (no &T and &mut T to same location)
[ ] Is the lifetime correct? (reference does not outlive pointed-to value)

For FFI:
[ ] Does the C function's documented behavior match how it's called?
[ ] Are null checks performed before passing pointers to C?
[ ] Are C strings properly null-terminated?
[ ] Is memory ownership correctly accounted? (who allocates, who frees)
[ ] Are integer type sizes correct between Rust and C types?

For Send/Sync implementations:
[ ] Is the type actually safe to send to another thread?
[ ] Is the type actually safe to share references across threads?
[ ] Is the justification documented?

Safety comments: Every unsafe block should have a // Safety: comment explaining why the invariants hold:

// Safety: slice.len() > 0 is checked by the caller (caller guarantees non-empty slice)
//         i < slice.len() because loop invariant ensures i starts at 0 and increases by 1
let elem = unsafe { *slice.get_unchecked(i) };

Unsafe in Popular Rust Crates

tokio

Tokio (async runtime) uses unsafe for: - UnsafeCell<T> in its internal task scheduling structures (required for internal mutability in async futures) - waker_ref() — constructs a Waker from a raw pointer, requires careful lifetime management - Thread-local storage optimizations - Platform-specific I/O (io_uring, epoll) — system call wrappers

Tokio's unsafe code is extensively reviewed and tested with Miri (see below). Total unsafe lines are a small fraction of the codebase.

Rust Standard Library (std)

std uses unsafe for: - Vec<T> internals: raw pointer manipulation for push/pop/resize - String: verified-valid UTF-8 unsafe transmutation from Vec<u8> - Rc<T> and Arc<T>: reference counting with raw pointer operations - HashMap<K, V> (hashbrown): SIMD-accelerated hash table probing - All OS interface calls: read, write, mmap, etc. - Platform intrinsics for atomic operations

The standard library is considered a "trusted" unsafe foundation — its unsafe code has been extensively reviewed, formally verified (partially, via RustBelt), and tested with Miri.

MIRI: Undefined Behavior Detector for Unsafe Rust

MIRI is an interpreter for Rust's MIR (Mid-level Intermediate Representation) that detects undefined behavior in unsafe code at runtime. MIRI executes Rust code and checks every memory access, pointer operation, and type assumption:

# Install MIRI
rustup component add miri

# Run your tests under MIRI
cargo miri test

# MIRI detects:
#   - Dereference of null or dangling pointer
#   - Use of uninitialized memory
#   - Out-of-bounds pointer arithmetic
#   - Invalid values for types (e.g., bool with value 2)
#   - Violation of aliasing rules (noalias, &mut aliasing)
#   - Data races (experimental, with -Zmiri-track-alloc-id)
#   - Violation of the Stacked Borrows model

Example MIRI detection:

fn bad_deref() {
    let ptr: *const i32 = std::ptr::null();
    unsafe {
        println!("{}", *ptr);  // MIRI: error: null pointer dereference
    }
}

fn use_after_free() {
    let x = Box::new(42i32);
    let ptr: *const i32 = &*x;
    drop(x);          // free the Box
    unsafe {
        println!("{}", *ptr);  // MIRI: error: use-after-free
    }
}

MIRI is used by the Rust standard library development team to test new unsafe code, and by library authors to verify their unsafe abstractions.

MIRI's Stacked Borrows model: MIRI implements "Stacked Borrows" — a formal model of Rust's aliasing rules for pointers. When unsafe code violates aliasing (e.g., creating two &mut T to the same location), Stacked Borrows detects it.

ASCII Diagram: Unsafe Surface Area in a Rust Codebase

                 Codebase Structure
┌─────────────────────────────────────────────────────────┐
│  Application Code (safe Rust)                           │
│  ┌───────────────────────────────────────────────────┐  │
│  │ use safe_api::process(data);                      │  │
│  │ let result = collection.iter().map(f).collect();  │  │
│  └───────────────────────────────────────────────────┘  │
│                          │ calls into                    │
│  ┌───────────────────────────────────────────────────┐  │
│  │  Safe abstraction layer                           │  │
│  │  pub fn process(data: &[u8]) -> Vec<u8> {         │  │
│  │      // safe Rust using public APIs               │  │
│  │  }                                                │  │
│  └───────────────────────────────────────────────────┘  │
│                          │ wraps                         │
│  ┌───────────────────────────────────────────────────┐  │
│  │  Unsafe implementation layer (MINIMAL)            │  │
│  │  fn inner_process(ptr: *const u8, len: usize) {   │  │
│  │      unsafe {                                     │  │
│  │          // Safety: ptr valid, len < allocation   │  │
│  │          simd_process(ptr, len);                  │  │ ← unsafe surface
│  │      }                                            │  │   (auditable,
│  │  }                                                │  │    documented)
│  └───────────────────────────────────────────────────┘  │
│                          │ calls                         │
│  ┌───────────────────────────────────────────────────┐  │
│  │  OS / C library (external, trusted)               │  │
│  │  extern "C" { fn simd_intrinsic(...); }           │  │
│  └───────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────┘

Goal: minimize the unsafe surface area
      All safe code above the unsafe layer is trivially correct
      Only the unsafe layer needs careful invariant analysis

Debugging Notes

# Run under MIRI for UB detection
cargo miri test
cargo miri run

# Find all unsafe blocks in a codebase
grep -r "unsafe" --include="*.rs" . | grep -v "//.*unsafe" | grep -v "#\[.*unsafe"

# cargo-geiger: counts unsafe usage in dependencies
cargo install cargo-geiger
cargo geiger  # shows unsafe line counts per crate in dependency tree

# Sanitizers (compile-time instrumentation)
RUSTFLAGS="-Z sanitizer=address" cargo +nightly test  # ASan
RUSTFLAGS="-Z sanitizer=thread" cargo +nightly test   # TSan

# Strict provenance: opt into stricter aliasing rules
# (helps detect provenance errors in pointer arithmetic)
#![feature(strict_provenance)]
use std::ptr;
let addr: usize = ptr.addr();  // preferred over transmute for ptr→usize

Security Implications

Unsound safe abstractions are security vulnerabilities — they allow safe code to trigger UB
unsafe impl Send for *mut T must be verified: sending raw pointers across threads can cause data races
FFI memory ownership errors (double-free, not freeing, freeing wrong allocator's memory) are security vulnerabilities
mem::transmute to an invalid bit pattern for a type can produce arbitrary UB
MIRI should be run on all unsafe code as part of CI

Performance Implications

Raw pointer arithmetic: same speed as C, no overhead
get_unchecked(): eliminates bounds check (~1-3 ns per access in hot loops)
SIMD intrinsics: 4-32x speedup for data-parallel operations
FFI call overhead: ~1-10 ns per call (context switch to/from C calling convention)
MIRI: 100-1000x slower than native — use in CI, not production

Failure Modes

Unsafe Operation	If Invariant Violated	Consequence
Null pointer deref	Pointer is null	Segfault or UB
`get_unchecked` out-of-bounds	Index >= len	Memory read/write of arbitrary memory
`slice::from_raw_parts`	Uninitialized bytes	Type invariant violated → UB
`static mut` without sync	Concurrent access	Data race → UB
FFI memory ownership	Wrong allocator frees	Double-free or use-after-free
`mem::transmute`	Invalid bit pattern	Type confusion → UB

Modern Usage

Rust std lib: ~5,000 unsafe blocks in a ~500,000 line codebase (~1% of code)
Tokio: unsafe used for Waker construction, atomic operations, and I/O driver; ~0.5% of code
AWS Firecracker: minimal unsafe, primarily in KVM ioctl wrappers
Android system components in Rust: unsafe only in OS interface layers; application logic fully safe

Future Directions

Strict provenance API (stabilization): Provides ptr.with_addr(), ptr.addr() — more principled pointer-to-integer casting that preserves provenance information
Safe transmute: RFC for TransmuteFrom trait — allowing some transmutations to be safe when the compiler can verify the target type is valid for all source bit patterns
Formalization: The "Stacked Borrows" model (Neven and Dreyer) is being replaced by "Tree Borrows" — a more permissive but still sound aliasing model for MIRI
unsafe keyword audit tools: cargo-geiger, cargo-careful (-Z extra-checks) provide automated unsafe surface area measurement

Exercises

Write a safe wrapper around Vec's raw pointer API: a function split_at_unchecked<'a, T>(slice: &'a [T], mid: usize) -> (&'a [T], &'a [T]) that is faster than the safe split_at by omitting the bounds check. Document the // Safety: invariants. Test with MIRI.
Write a Rust FFI wrapper for a simple C function (e.g., libc::strlen or a function you write yourself). Ensure your wrapper: (a) takes safe Rust types as input, (b) handles null properly, (c) documents the safety contract, and (d) uses RAII to manage any C-allocated memory.
Deliberately introduce an unsound abstraction (a function that returns a dangling reference or allows a data race). Use MIRI to detect it. Fix the unsoundness.
Use cargo geiger on a medium-sized Rust project (e.g., ripgrep, bat, or a crate you use). Inspect the unsafe usage in each dependency. For the top 3 unsafe users, read the unsafe code and verify (or dispute) the soundness of the safety comments.
Implement a simple bump allocator in Rust using unsafe and a raw byte buffer. The allocator should: allocate from a [u8; 1024] backing store, return properly-aligned *mut T pointers, and panic if out of space. Implement dealloc as a no-op (bump allocators are arena-style). Test with MIRI.

References

Rust Reference. "Unsafe Rust." doc.rust-lang.org/reference/unsafe-code.html
Ralf Jung. "The Stacked Borrows Aliasing Model for Rust." PhD thesis, Saarland University, 2020.
Ralf Jung. "MIRI." github.com/rust-lang/miri — README and documentation.
Nomicon: "The Dark Arts of Advanced and Unsafe Rust Programming." doc.rust-lang.org/nomicon — The definitive guide to unsafe Rust.
Jung, Ralf et al. "Safe Systems Programming in Rust." Communications of the ACM, 2021.
Rustonomicon (informal name for Nomicon): docs.rs/nomicon — crate-formatted version.
Klabnik, Steve; Nichols, Carol. "The Rust Programming Language." Chapter 19: Unsafe Rust. No Starch Press, 2019.