Section 30: Compilers and Linkers
Purpose and Scope
This section covers the complete toolchain that transforms human-readable source code into executable binaries. Compilers and linkers are foundational infrastructure — every program running on any system passed through this pipeline. Understanding the internals enables writing faster code, debugging symbol resolution failures, reasoning about binary layout, and working with modern compiler frameworks like LLVM. Coverage spans classical compiler theory, production compiler architectures (GCC, LLVM/Clang), linker mechanics, dynamic linking, and modern compilation strategies including LTO, JIT, and AOT.
Prerequisites
- C/C++ programming at an intermediate level
- Assembly language basics (x86-64 or ARM)
- Operating system fundamentals (virtual memory, ELF format)
- Basic understanding of computer architecture (registers, stack, calling conventions)
Learning Objectives
By the end of this section, you will be able to:
- Trace a C source file through every stage of the compilation pipeline and describe each transformation
- Explain LLVM's IR-based architecture and how passes operate on the IR
- Diagnose linker errors involving symbol resolution, multiple definitions, and undefined references
- Explain how the PLT and GOT enable lazy symbol binding in shared libraries
- Describe the tradeoffs between static linking, dynamic linking, and LTO
- Compare JIT and AOT compilation strategies and identify when each is appropriate
- Read and interpret DWARF debug information structures
- Understand position-independent code generation and its performance implications
Architecture Overview
The Compilation Pipeline
Source Files (.c, .cpp, .rs)
|
v
+------+------+
| Preprocessor | #include, #define expansion, conditional compilation
+------+------+
|
v
+------+------+
| Lexer/Scanner | Source text -> token stream
| (Tokenization) | Keywords, identifiers, literals, operators
+------+------+
|
v
+------+------+
| Parser | Token stream -> Abstract Syntax Tree (AST)
| (Syntax Analysis)| Grammar rules, syntax error detection
+------+------+
|
v
+------+------+
| Semantic Analysis | Type checking, scope resolution, symbol tables
| + Type Checker | Name mangling (C++), overload resolution
+------+------+
|
v
+------+------+
| IR Generation | AST -> Intermediate Representation
| (LLVM IR / GIMPLE| SSA form, phi nodes
+------+------+
|
v
+------+------+
| IR Optimizer | Inlining, DCE, LICM, vectorization, alias analysis
| (Pass Pipeline) | Target-independent and target-dependent passes
+------+------+
|
v
+------+------+
| Code Generator | IR -> target assembly (instruction selection,
| (Instruction Sel) | register allocation, instruction scheduling)
+------+------+
|
v
Object Files (.o)
|
v
+------+------+
| Linker | Symbol resolution, relocation, section merging
+------+------+
|
v
Executable / Shared Library (.elf, .so, .dylib)
LLVM Architecture
Frontend (Clang) Middle-end (LLVM) Backend
+-----------+ +------------------+ +----------+
| C/C++/ObjC| -> IR ->| Target-Indep. | -> IR->| x86-64 |
| Rust (via | | Optimization | | ARM/AArch|
| rustc) | | Passes: | | RISC-V |
| Swift | | - mem2reg | | MIPS |
| Fortran | | - GVN | | WASM |
+-----------+ | - loop-unroll | +----------+
| - vectorize |
| - inlining |
+------------------+
|
v
LLVM IR (bitcode .bc or text .ll)
SSA form, infinite virtual registers
Typed, low-level but portable
Linker Operation
Object File 1 (.o) Object File 2 (.o) libfoo.a (archive)
+---------------+ +---------------+ +---------------+
| .text section | | .text section | | obj1.o |
| .data section | | .data section | | obj2.o |
| Symbol Table | | Symbol Table | | ... |
| Relocations | | Relocations | +---------------+
+---------------+ +---------------+
| | |
+----------------------+------------------------+
|
+-----v------+
| Linker |
| |
| 1. Symbol |
| Resolution|
| 2. Section |
| Merging |
| 3. Reloc. |
| Patching|
+-----+------+
|
+-----v------+
| Executable |
| or .so |
| (ELF) |
+------------+
Dynamic Linking: PLT/GOT Mechanism
First call to printf():
call printf@plt
|
v
+----+----+ PLT[printf]:
| jmp *GOT[printf] | --> GOT initially points back to PLT stub
+----+----+
|
v (first call only)
+----+----+
| PLT stub | --> calls _dl_runtime_resolve
+----------+
|
v
+----+----+
| ld.so | --> resolves printf in libc.so, patches GOT
+----------+
Subsequent calls:
call printf@plt --> jmp *GOT[printf] --> directly to printf in libc.so
(GOT now holds real address)
Key Concepts
- Lexical Analysis: Converting character stream to token stream; handles keywords, identifiers, string literals, and operators. Implemented as DFA or hand-written scanner.
- Parsing: Constructing the AST from token stream using LL(k), LR(1), or LALR grammars. Recursive-descent parsers dominate in production compilers for error recovery.
- Static Single Assignment (SSA): IR form where each variable is defined exactly once. Enables efficient dataflow analyses. Phi nodes merge definitions from different control flow paths.
- LLVM IR: Typed, SSA-based IR with infinite virtual registers. Target-independent. Both textual (.ll) and binary bitcode (.bc) representations exist.
- Register Allocation: Mapping virtual registers to physical registers; graph coloring or linear scan. Spilling to stack when registers are exhausted.
- Relocation: Process of patching address references in object files to point to final linked addresses. Types: R_X86_64_PC32, R_X86_64_PLT32, R_AARCH64_CALL26.
- Position-Independent Code (PIC): Code that works regardless of load address, using PC-relative addressing. Required for shared libraries.
- Link-Time Optimization (LTO): Performing IPO across translation unit boundaries by deferring optimization to link time using IR bitcode.
- JIT Compilation: Translating IR or bytecode to native code at runtime. Used in V8, HotSpot JVM, LLVM ORC JIT.
- DWARF: Debug information format embedded in ELF binaries. Describes source locations, variable types, call frames (CFI), and inline function expansion.
- MLIR: Multi-Level IR framework built on LLVM concepts; enables defining custom dialects for domain-specific compilation (ML accelerators, HPC).
Major Historical Milestones
| Year | Milestone |
|---|---|
| 1957 | FORTRAN compiler by John Backus at IBM — first optimizing compiler |
| 1960 | ALGOL 60 — formal grammar (BNF) for programming language definition |
| 1968 | Dragon Book predecessor — Aho, Ullman foundational compiler theory |
| 1973 | C compiler by Dennis Ritchie for Unix; self-hosting compiler |
| 1975 | Portable C Compiler (PCC) — architecture abstraction in compilers |
| 1984 | GCC 0.9 — Richard Stallman's GNU C Compiler, GPL-licensed |
| 1986 | GNU linker (ld) shipped with GNU binutils |
| 1991 | ELF format standardized in System V ABI |
| 1992 | SSA form formalized — Cytron et al. paper |
| 2000 | LLVM project started by Chris Lattner at UIUC |
| 2003 | GCC 3.3 introduces tree-SSA infrastructure |
| 2007 | Clang 1.0 released — LLVM C/C++ frontend |
| 2009 | LTO support added to GCC (GCC 4.5) |
| 2010 | LLVM 2.7 — mature optimization infrastructure |
| 2015 | LLVM backend used in Rust stable compiler |
| 2017 | MLIR project started at Google for TensorFlow compilation |
| 2019 | MLIR merged into LLVM monorepo |
| 2020 | LLD (LLVM linker) becomes default in many toolchains |
| 2022 | Clang/LLVM supports C23 and C++23 features |
Modern Relevance
Compiler technology sits at the intersection of hardware and software acceleration. Every ML framework (TensorFlow, PyTorch) compiles computation graphs through compiler pipelines (XLA, TorchInductor, Triton). LLVM is the common infrastructure: Rust, Swift, Julia, Kotlin/Native, and Zig all use LLVM backends. WebAssembly has revived interest in portable IR and sandboxed JIT execution. Link-time optimization is now default in production builds at major companies, yielding 5-15% binary performance improvements. The Clang/LLVM toolchain has effectively displaced GCC in mobile (Android NDK), embedded (ARM), and macOS/iOS development.
File Map
30-compilers-and-linkers/
├── 00-overview.md <- This file
├── 01-lexing-and-parsing.md
├── 02-semantic-analysis.md
├── 03-ir-and-ssa.md
├── 04-optimization-passes.md
├── 05-codegen-and-regalloc.md
├── 06-llvm-architecture.md
├── 07-gcc-internals.md
├── 08-linker-operation.md
├── 09-dynamic-linking-plt-got.md
├── 10-shared-libraries.md
├── 11-position-independent-code.md
├── 12-lto-and-wpo.md
├── 13-jit-compilation.md
├── 14-aot-compilation.md
├── 15-llvm-ir-deep-dive.md
├── 16-mlir.md
├── 17-dwarf-debug-info.md
└── 18-build-systems.md
Cross-References
- Section 06 (CPU Architecture): Instruction sets targeted by code generators; calling conventions
- Section 10 (Synchronization): Compiler memory model and fence insertion for atomic operations
- Section 11 (Memory Management): Stack frame layout, alloca, address sanitizer integration
- Section 29 (Runtime Systems): JIT compilation in language runtimes (JVM, V8, PyPy)
- Section 33 (Hardware Architecture): Microarchitecture-aware optimizations (vectorization, cache-line alignment)
- Section 37 (Browser Architecture): V8 JIT pipeline, TurboFan, Maglev, Sparkplug tiers
- Section 44 (Rust and Memory Safety): Rustc frontend over LLVM, borrow checker as semantic pass