Skip to content

Section 30: Compilers and Linkers

Purpose and Scope

This section covers the complete toolchain that transforms human-readable source code into executable binaries. Compilers and linkers are foundational infrastructure — every program running on any system passed through this pipeline. Understanding the internals enables writing faster code, debugging symbol resolution failures, reasoning about binary layout, and working with modern compiler frameworks like LLVM. Coverage spans classical compiler theory, production compiler architectures (GCC, LLVM/Clang), linker mechanics, dynamic linking, and modern compilation strategies including LTO, JIT, and AOT.


Prerequisites

  • C/C++ programming at an intermediate level
  • Assembly language basics (x86-64 or ARM)
  • Operating system fundamentals (virtual memory, ELF format)
  • Basic understanding of computer architecture (registers, stack, calling conventions)

Learning Objectives

By the end of this section, you will be able to:

  1. Trace a C source file through every stage of the compilation pipeline and describe each transformation
  2. Explain LLVM's IR-based architecture and how passes operate on the IR
  3. Diagnose linker errors involving symbol resolution, multiple definitions, and undefined references
  4. Explain how the PLT and GOT enable lazy symbol binding in shared libraries
  5. Describe the tradeoffs between static linking, dynamic linking, and LTO
  6. Compare JIT and AOT compilation strategies and identify when each is appropriate
  7. Read and interpret DWARF debug information structures
  8. Understand position-independent code generation and its performance implications

Architecture Overview

The Compilation Pipeline

Source Files (.c, .cpp, .rs)
         |
         v
  +------+------+
  |   Preprocessor   |   #include, #define expansion, conditional compilation
  +------+------+
         |
         v
  +------+------+
  |    Lexer/Scanner  |   Source text -> token stream
  |   (Tokenization)  |   Keywords, identifiers, literals, operators
  +------+------+
         |
         v
  +------+------+
  |     Parser        |   Token stream -> Abstract Syntax Tree (AST)
  |  (Syntax Analysis)|   Grammar rules, syntax error detection
  +------+------+
         |
         v
  +------+------+
  | Semantic Analysis |   Type checking, scope resolution, symbol tables
  |   + Type Checker  |   Name mangling (C++), overload resolution
  +------+------+
         |
         v
  +------+------+
  |   IR Generation   |   AST -> Intermediate Representation
  |  (LLVM IR / GIMPLE|   SSA form, phi nodes
  +------+------+
         |
         v
  +------+------+
  |  IR Optimizer     |   Inlining, DCE, LICM, vectorization, alias analysis
  | (Pass Pipeline)   |   Target-independent and target-dependent passes
  +------+------+
         |
         v
  +------+------+
  |  Code Generator   |   IR -> target assembly (instruction selection,
  | (Instruction Sel) |   register allocation, instruction scheduling)
  +------+------+
         |
         v
   Object Files (.o)
         |
         v
  +------+------+
  |    Linker         |   Symbol resolution, relocation, section merging
  +------+------+
         |
         v
  Executable / Shared Library (.elf, .so, .dylib)

LLVM Architecture

  Frontend (Clang)       Middle-end (LLVM)          Backend
  +-----------+         +------------------+        +----------+
  | C/C++/ObjC| -> IR ->| Target-Indep.    | -> IR->| x86-64   |
  | Rust (via |         | Optimization     |        | ARM/AArch|
  | rustc)    |         | Passes:          |        | RISC-V   |
  | Swift     |         |  - mem2reg       |        | MIPS     |
  | Fortran   |         |  - GVN           |        | WASM     |
  +-----------+         |  - loop-unroll   |        +----------+
                        |  - vectorize     |
                        |  - inlining      |
                        +------------------+
                               |
                               v
                           LLVM IR (bitcode .bc or text .ll)
                           SSA form, infinite virtual registers
                           Typed, low-level but portable

Linker Operation

  Object File 1 (.o)       Object File 2 (.o)       libfoo.a (archive)
  +---------------+        +---------------+        +---------------+
  | .text section |        | .text section |        | obj1.o        |
  | .data section |        | .data section |        | obj2.o        |
  | Symbol Table  |        | Symbol Table  |        | ...           |
  | Relocations   |        | Relocations   |        +---------------+
  +---------------+        +---------------+
           |                      |                        |
           +----------------------+------------------------+
                                  |
                            +-----v------+
                            |   Linker   |
                            |            |
                            | 1. Symbol  |
                            |    Resolution|
                            | 2. Section |
                            |    Merging |
                            | 3. Reloc.  |
                            |    Patching|
                            +-----+------+
                                  |
                            +-----v------+
                            | Executable |
                            | or .so     |
                            | (ELF)      |
                            +------------+

Dynamic Linking: PLT/GOT Mechanism

  First call to printf():

  call printf@plt
       |
       v
  +----+----+   PLT[printf]:
  |  jmp *GOT[printf]  |  --> GOT initially points back to PLT stub
  +----+----+
       |
       v (first call only)
  +----+----+
  | PLT stub |  --> calls _dl_runtime_resolve
  +----------+
       |
       v
  +----+----+
  |  ld.so  |  --> resolves printf in libc.so, patches GOT
  +----------+

  Subsequent calls:
  call printf@plt --> jmp *GOT[printf] --> directly to printf in libc.so
                           (GOT now holds real address)

Key Concepts

  • Lexical Analysis: Converting character stream to token stream; handles keywords, identifiers, string literals, and operators. Implemented as DFA or hand-written scanner.
  • Parsing: Constructing the AST from token stream using LL(k), LR(1), or LALR grammars. Recursive-descent parsers dominate in production compilers for error recovery.
  • Static Single Assignment (SSA): IR form where each variable is defined exactly once. Enables efficient dataflow analyses. Phi nodes merge definitions from different control flow paths.
  • LLVM IR: Typed, SSA-based IR with infinite virtual registers. Target-independent. Both textual (.ll) and binary bitcode (.bc) representations exist.
  • Register Allocation: Mapping virtual registers to physical registers; graph coloring or linear scan. Spilling to stack when registers are exhausted.
  • Relocation: Process of patching address references in object files to point to final linked addresses. Types: R_X86_64_PC32, R_X86_64_PLT32, R_AARCH64_CALL26.
  • Position-Independent Code (PIC): Code that works regardless of load address, using PC-relative addressing. Required for shared libraries.
  • Link-Time Optimization (LTO): Performing IPO across translation unit boundaries by deferring optimization to link time using IR bitcode.
  • JIT Compilation: Translating IR or bytecode to native code at runtime. Used in V8, HotSpot JVM, LLVM ORC JIT.
  • DWARF: Debug information format embedded in ELF binaries. Describes source locations, variable types, call frames (CFI), and inline function expansion.
  • MLIR: Multi-Level IR framework built on LLVM concepts; enables defining custom dialects for domain-specific compilation (ML accelerators, HPC).

Major Historical Milestones

Year Milestone
1957 FORTRAN compiler by John Backus at IBM — first optimizing compiler
1960 ALGOL 60 — formal grammar (BNF) for programming language definition
1968 Dragon Book predecessor — Aho, Ullman foundational compiler theory
1973 C compiler by Dennis Ritchie for Unix; self-hosting compiler
1975 Portable C Compiler (PCC) — architecture abstraction in compilers
1984 GCC 0.9 — Richard Stallman's GNU C Compiler, GPL-licensed
1986 GNU linker (ld) shipped with GNU binutils
1991 ELF format standardized in System V ABI
1992 SSA form formalized — Cytron et al. paper
2000 LLVM project started by Chris Lattner at UIUC
2003 GCC 3.3 introduces tree-SSA infrastructure
2007 Clang 1.0 released — LLVM C/C++ frontend
2009 LTO support added to GCC (GCC 4.5)
2010 LLVM 2.7 — mature optimization infrastructure
2015 LLVM backend used in Rust stable compiler
2017 MLIR project started at Google for TensorFlow compilation
2019 MLIR merged into LLVM monorepo
2020 LLD (LLVM linker) becomes default in many toolchains
2022 Clang/LLVM supports C23 and C++23 features

Modern Relevance

Compiler technology sits at the intersection of hardware and software acceleration. Every ML framework (TensorFlow, PyTorch) compiles computation graphs through compiler pipelines (XLA, TorchInductor, Triton). LLVM is the common infrastructure: Rust, Swift, Julia, Kotlin/Native, and Zig all use LLVM backends. WebAssembly has revived interest in portable IR and sandboxed JIT execution. Link-time optimization is now default in production builds at major companies, yielding 5-15% binary performance improvements. The Clang/LLVM toolchain has effectively displaced GCC in mobile (Android NDK), embedded (ARM), and macOS/iOS development.


File Map

30-compilers-and-linkers/
├── 00-overview.md              <- This file
├── 01-lexing-and-parsing.md
├── 02-semantic-analysis.md
├── 03-ir-and-ssa.md
├── 04-optimization-passes.md
├── 05-codegen-and-regalloc.md
├── 06-llvm-architecture.md
├── 07-gcc-internals.md
├── 08-linker-operation.md
├── 09-dynamic-linking-plt-got.md
├── 10-shared-libraries.md
├── 11-position-independent-code.md
├── 12-lto-and-wpo.md
├── 13-jit-compilation.md
├── 14-aot-compilation.md
├── 15-llvm-ir-deep-dive.md
├── 16-mlir.md
├── 17-dwarf-debug-info.md
└── 18-build-systems.md

Cross-References

  • Section 06 (CPU Architecture): Instruction sets targeted by code generators; calling conventions
  • Section 10 (Synchronization): Compiler memory model and fence insertion for atomic operations
  • Section 11 (Memory Management): Stack frame layout, alloca, address sanitizer integration
  • Section 29 (Runtime Systems): JIT compilation in language runtimes (JVM, V8, PyPy)
  • Section 33 (Hardware Architecture): Microarchitecture-aware optimizations (vectorization, cache-line alignment)
  • Section 37 (Browser Architecture): V8 JIT pipeline, TurboFan, Maglev, Sparkplug tiers
  • Section 44 (Rust and Memory Safety): Rustc frontend over LLVM, borrow checker as semantic pass