Dynamic Linking Security
Technical Overview
Dynamic linking's core mechanism — resolving function addresses at runtime via mutable GOT entries — creates a persistent attack surface in every dynamically linked process. A single arbitrary write primitive (from a heap overflow, format string bug, or use-after-free) aimed at the GOT can redirect any subsequent library call to attacker-controlled code, achieving arbitrary code execution without a separate code injection step.
This document covers the attack techniques against dynamic linking, the defense mechanisms the OS/compiler/linker provide, and the threat model for supply chain attacks involving shared libraries. Understanding these mechanics is prerequisite for writing secure systems software and for correct use of hardening flags.
Prerequisites
- Solid understanding of PLT/GOT mechanism (see 03-linkers-and-loaders.md)
- Understanding of ASLR, NX/DEP, and stack canaries as baseline mitigations
- Familiarity with ELF binary format
- Basic exploitation concepts: arbitrary read/write primitives, code execution
Dynamic Linking Security Checklist
+-------------------------+--------------------------------+
| RELRO Status            | Effect on GOT writability      |
+-------------------------+--------------------------------+
| No RELRO                | .got, .got.plt fully writable  |
|                         | GOT overwrite trivial          |
+-------------------------+--------------------------------+
| Partial RELRO (default) | .got read-only, .got.plt       |
|                         | still writable; PLT-resolved   |
|                         | entries still overwritable     |
+-------------------------+--------------------------------+
| Full RELRO              | .got.plt read-only after load  |
| (-z relro -z now)       | GOT overwrite blocked          |
+-------------------------+--------------------------------+
Linking Security Flags Checklist:
[ ] -fPIE -pie                  PIE executable (ASLR applies to base)
[ ] -Wl,-z,relro,-z,now         Full RELRO (works with or without PIE)
[ ] -Wl,-z,noexecstack          Non-executable stack
[ ] -Wl,-z,separate-code        Separate code/data segments
[ ] -D_FORTIFY_SOURCE=2         Buffer overflow detection in libc calls (needs -O1 or higher)
[ ] -fstack-protector-strong    Stack canaries
[ ] -fcf-protection=full        Intel CET (IBT + SHSTK)
[ ] -fsanitize=cfi              Clang CFI (indirect call type checks; needs -flto -fvisibility=hidden)
Core Content
GOT Overwrite Attacks
Attack scenario: An attacker has exploited a vulnerability that provides an arbitrary write primitive — they can write any 8 bytes to any writable virtual address in the process.
Target: The .got.plt section, which contains pointers to external library functions. These pointers are writable during program execution (under partial RELRO or no RELRO) and are dereferenced on every external function call.
Exploit flow:
1. Attacker determines the address of the GOT entry for free (or exit, printf, any frequently called function)
2. If ASLR is active, attacker needs a leak of a library address to compute the base. This is typically obtained via a memory disclosure bug (arbitrary read, format string %p).
3. Attacker writes the address of a ROP gadget or system() to GOT[free]
4. Next call to free(ptr) in the program calls system(ptr) instead. If ptr was controlled (e.g., free("/bin/sh")), this executes a shell.
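The redirection in steps 3 and 4 can be illustrated with a toy model. This is only a sketch: the "addresses" are made-up dictionary keys, the two functions are stand-ins for libc code, and nothing here touches real process memory.

```python
# Toy model of a PLT call through a writable GOT slot. All addresses and
# function bodies are hypothetical and exist only for illustration.

def libc_free(arg):
    return f"free({arg!r})"

def libc_system(arg):
    return f"system({arg!r})"

# Hypothetical library "addresses" mapped to the code that lives there.
CODE_AT = {0x7F0000001000: libc_free, 0x7F0000002000: libc_system}

# The GOT: one writable pointer slot per imported function.
got = {"free": 0x7F0000001000}

def call_via_plt(name, arg):
    """Mimic a PLT stub: load the pointer from the GOT slot and jump to it."""
    return CODE_AT[got[name]](arg)

before = call_via_plt("free", "/bin/sh")

# The attacker's arbitrary write replaces the pointer stored in GOT[free].
got["free"] = 0x7F0000002000

after = call_via_plt("free", "/bin/sh")

print(before)  # free('/bin/sh')
print(after)   # system('/bin/sh')
```

The call site never changes; only the data it dereferences does, which is why a write primitive alone is enough.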
This attack class was a staple of binary exploitation from roughly 2005 to 2015. Tools like pwntools resolve the relevant GOT and PLT addresses in one line each:
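# pwntools exploitation example (educational)
from pwn import *
# Load the target and set pwntools' architecture context from it
elf = context.binary = ELF('./vulnerable')
# Find the GOT entry address for 'free'
# (absolute only for a non-PIE binary; for PIE, set elf.address from a leak first)
got_free = elf.got['free']
# Find the PLT stub for 'system' (present only if the binary itself imports system)
plt_system = elf.plt['system']
# Overwrite GOT[free] with plt_system via the arbitrary write
# primitive in the vulnerability...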
Without PIE and without RELRO: the executable's GOT entries sit at fixed addresses even when ASLR randomizes the libraries, and the attacker reads them statically from readelf -r ./binary. This is why ASLR (with PIE) and RELRO are complementary: ASLR alone is bypassed with an info leak, and partial RELRO alone leaves .got.plt writable at a known address.
RELRO: RELocation Read-Only
Partial RELRO (the default in most modern Linux distribution toolchains):
- Reorders sections so that .got (non-PLT global variables via GOT) appears before .bss in the address space
- Marks .got read-only after dynamic linking is complete
- .got.plt (the PLT-used GOT entries) remains writable for lazy binding
- Result: GOT entries for global variables are protected; PLT entries are still overwritable
Full RELRO (-Wl,-z,relro,-z,now or equivalently -Wl,-z,relro -Wl,-z,now):
- Forces ld.so to resolve all PLT symbols at load time (eager binding, -z now)
- After all relocations are resolved, the entire .got.plt section is marked read-only via mprotect
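- Result: No writable GOT entries at all during program execution. A GOT overwrite faults with SIGSEGV unless the attacker can first make the page writable again (for example by calling mprotect).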
Cost of full RELRO:
- Startup latency: All PLT symbols resolved before main() runs. For programs with hundreds of library functions, this adds 1–10ms to startup. Negligible for servers; relevant for short-lived commands.
- Incompatibility with some dynamic loading patterns: If a library needs to lazily resolve its own GOT entries after startup (rare), full RELRO breaks it.
# Build with full RELRO
gcc -fPIE -pie -Wl,-z,relro,-z,now -o hardened main.c
# Verify RELRO
checksec --file=hardened
# Expected: RELRO: Full, NX: enabled, PIE: enabled
# (canary status depends on compiler defaults, since no -fstack-protector flag is passed)
# Alternative verification:
readelf -l hardened | grep GNU_RELRO
# Shows size of the RELRO-protected region
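The same classification can be done without external tools by reading the ELF program headers and dynamic section directly. The following is a minimal sketch that handles only little-endian ELF64 files; the function name relro_status is an illustrative choice, and real tools such as checksec cover more cases.

```python
import struct

PT_DYNAMIC = 2
PT_GNU_RELRO = 0x6474E552
DT_NULL, DT_BIND_NOW, DT_FLAGS, DT_FLAGS_1 = 0, 24, 30, 0x6FFFFFFB
DF_BIND_NOW, DF_1_NOW = 0x8, 0x1

def relro_status(data: bytes) -> str:
    """Classify a little-endian ELF64 image as 'none', 'partial', or 'full'."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    (e_phoff,) = struct.unpack_from("<Q", data, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 0x36)

    has_relro = False
    bind_now = False
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        # First six fields of Elf64_Phdr: type, flags, offset, vaddr, paddr, filesz
        p_type, _, p_offset, _, _, p_filesz = struct.unpack_from("<IIQQQQ", data, off)
        if p_type == PT_GNU_RELRO:
            has_relro = True
        elif p_type == PT_DYNAMIC:
            # Walk the 16-byte (tag, value) pairs of the dynamic section.
            for pos in range(p_offset, p_offset + p_filesz, 16):
                d_tag, d_val = struct.unpack_from("<qQ", data, pos)
                if d_tag == DT_NULL:
                    break
                if (d_tag == DT_BIND_NOW
                        or (d_tag == DT_FLAGS and d_val & DF_BIND_NOW)
                        or (d_tag == DT_FLAGS_1 and d_val & DF_1_NOW)):
                    bind_now = True
    if not has_relro:
        return "none"
    return "full" if bind_now else "partial"
```

Typical use would be relro_status(open("./hardened", "rb").read()). A PT_GNU_RELRO segment without any bind-now marker is reported as partial, matching the distinction drawn above.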
ASLR for Shared Libraries
Address Space Layout Randomization (ASLR) for shared libraries randomizes the base address at which each .so is mapped. The compiler must produce position-independent code (-fPIC) for this to work: code that uses relative offsets, not absolute addresses.
On Linux x86-64, ASLR provides:
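- 28 bits of entropy for the mmap base by default (set by /proc/sys/vm/mmap_rnd_bits; ASLR itself is enabled by /proc/sys/kernel/randomize_va_space=2)
- 2^28 ≈ 268 million possible positions for the mmap region; libraries are placed relative to that base, so their offsets from one another are not independently random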
ASLR is defeated by:
- Info leak: Any memory disclosure (format string, use-after-free that reads memory, leak of a pointer) that reveals a library address. Once one address in libc.so is known, the entire libc.so is located (fixed offsets from the base).
- Brute force: For 32-bit processes, mmap randomization is only 8 bits by default (configurable up to 16), so a brute force needs about 256 attempts on average and at most tens of thousands. Not practical on 64-bit.
- Heap spraying: Fill the heap with valid pointers/shellcode so that a somewhat-random jump hits exploitable code. Less effective with 64-bit ASLR.
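The info-leak and brute-force cases above come down to simple arithmetic, sketched here. The two symbol offsets are hypothetical values for an imaginary libc build, and the leaked pointer is an example; real offsets have to come from the exact libc on the target.

```python
# Hypothetical symbol offsets inside one specific libc build. Real values come
# from the target's own libc (for example via `readelf -s` or pwntools).
OFFSET_PUTS = 0x80E50
OFFSET_SYSTEM = 0x50D70

def rebase_from_leak(leaked_puts: int) -> dict:
    """Recover the libc base and system() address from one leaked pointer."""
    base = leaked_puts - OFFSET_PUTS
    # Mapped objects are page aligned; a misaligned base means wrong offsets.
    if base & 0xFFF:
        raise ValueError("base is not page aligned; offsets do not match this libc")
    return {"base": base, "system": base + OFFSET_SYSTEM}

def expected_guesses(entropy_bits: int) -> float:
    """Average guesses to hit one of 2**bits equally likely bases when the
    layout is re-randomized after every failed attempt."""
    return float(2 ** entropy_bits)

leak = 0x7F3A1C280E50  # example value of a leaked puts() pointer
result = rebase_from_leak(leak)
print(hex(result["base"]))    # 0x7f3a1c200000
print(hex(result["system"]))  # 0x7f3a1c250d70
print(expected_guesses(8))    # 256.0
print(expected_guesses(28))   # 268435456.0
```

One leaked pointer removes all of the entropy for that library, whereas guessing scales exponentially with the number of random bits.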
LD_PRELOAD Attacks
LD_PRELOAD=/path/to/evil.so ./target instructs ld.so to load evil.so before all other shared libraries, including libc.so. Any symbols defined in evil.so take precedence over library definitions, because ld.so uses the first definition found.
Attack uses:
- Hook read() / write() to exfiltrate data
- Hook strcmp() / memcmp() to log password comparisons
- Hook execve() to detect/block/redirect process execution
- Hook malloc() for heap profiling (legitimate use)
Defense mechanisms:
- setuid/setgid protection: ld.so enters secure-execution mode when the process has elevated privileges (effective UID ≠ real UID, or effective GID ≠ real GID). In that mode it ignores LD_LIBRARY_PATH and honors LD_PRELOAD only for names without slashes that resolve to set-user-ID libraries in the standard search directories. This prevents unprivileged users from injecting libraries into privileged executables.
- Static linking: A statically linked binary has no ld.so involvement, so LD_PRELOAD has no effect. Sometimes used for security-critical binaries.
- System call filtering via seccomp: While not blocking LD_PRELOAD, seccomp-bpf policies restrict which system calls the process can make. An LD_PRELOAD library can't escalate privileges if execve and network calls are blocked.
- SELinux / AppArmor: Mandatory Access Control policies can prevent a process from loading unexpected shared libraries regardless of LD_PRELOAD.
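Alongside these defenses, a defender can check whether a running process was started with LD_PRELOAD by reading its /proc/<pid>/environ. The sketch below works on the raw bytes of that file; the function names and the trusted directory prefixes are assumptions chosen for illustration.

```python
def preload_entries(environ_blob: bytes) -> list:
    """Return the libraries named by LD_PRELOAD in a /proc/<pid>/environ blob.

    The kernel exposes the environment as NUL-separated KEY=VALUE records.
    ld.so accepts both spaces and colons as separators inside LD_PRELOAD.
    """
    for record in environ_blob.split(b"\0"):
        key, sep, value = record.partition(b"=")
        if sep and key == b"LD_PRELOAD":
            text = value.decode("utf-8", "replace").replace(":", " ")
            return text.split()
    return []

def untrusted_preloads(environ_blob: bytes, trusted_dirs=("/usr/lib/", "/lib/")) -> list:
    """Flag preloaded libraries that live outside the (assumed) trusted directories."""
    return [p for p in preload_entries(environ_blob) if not p.startswith(trusted_dirs)]

sample = b"PATH=/usr/bin\0LD_PRELOAD=/tmp/evil.so /usr/lib/libfaketime.so\0HOME=/root\0"
print(preload_entries(sample))     # ['/tmp/evil.so', '/usr/lib/libfaketime.so']
print(untrusted_preloads(sample))  # ['/tmp/evil.so']
```

On a live system the input would be open(f"/proc/{pid}/environ", "rb").read(). That file reflects the environment at exec time and is readable only with sufficient privileges.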
Legitimate LD_PRELOAD uses:
- Drop-in allocators and heap profilers (tcmalloc, jemalloc)
- Call interception wrappers (fakeroot, proxychains); note that strace and ltrace rely on ptrace rather than LD_PRELOAD
- Testing/mocking (inject a mock libssl.so in tests)
- faketime (intercept clock_gettime())
Library Search Order Hijacking
When ld.so searches for a required library, it uses this order:
1. Directories in the binary's RPATH (deprecated, baked in at link time, not overridable by LD_LIBRARY_PATH; ignored when RUNPATH is present)
2. Directories in LD_LIBRARY_PATH environment variable
3. Directories in the binary's RUNPATH (modern replacement for RPATH, can be overridden by LD_LIBRARY_PATH)
4. /etc/ld.so.cache (maintained by ldconfig)
5. Default directories: /lib, /usr/lib, /lib64, /usr/lib64
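The ordering above can be written down as a small function, which makes it easier to reason about which directory wins for a given binary and environment. This is a simplified model, not ld.so's actual implementation: the cache step is a placeholder string, and the function name and parameters are illustrative.

```python
def search_dirs(rpath="", runpath="", ld_library_path="", secure_exec=False):
    """Return the directories tried, in order, for one library lookup.

    Mirrors the list above: RPATH is consulted only when RUNPATH is absent,
    and LD_LIBRARY_PATH is dropped in secure-execution (setuid/setgid) mode.
    Empty path entries (which ld.so treats as the current directory) are
    skipped here for brevity.
    """
    def split(value):
        return [d for d in value.split(":") if d]

    order = []
    if rpath and not runpath:
        order += split(rpath)
    if not secure_exec:
        order += split(ld_library_path)
    order += split(runpath)
    order.append("<ld.so.cache>")
    order += ["/lib", "/usr/lib", "/lib64", "/usr/lib64"]
    return order

# RUNPATH loses to LD_LIBRARY_PATH, RPATH does not.
print(search_dirs(runpath="/opt/app/lib", ld_library_path="/tmp/evil")[:2])
# ['/tmp/evil', '/opt/app/lib']
print(search_dirs(rpath="/opt/app/lib", ld_library_path="/tmp/evil")[:2])
# ['/opt/app/lib', '/tmp/evil']
```

The two calls show why the RPATH to RUNPATH migration changes the hijacking picture: with RUNPATH, whoever controls the environment gets the first lookup.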
Hijacking via LD_LIBRARY_PATH: An attacker who can set environment variables for a process can redirect library loads to a malicious path: place a malicious libssl.so.1.1 in /tmp/evil/ and set LD_LIBRARY_PATH=/tmp/evil. The same defense applies as for LD_PRELOAD: ld.so ignores LD_LIBRARY_PATH for setuid/setgid executables.
RPATH/RUNPATH injection in supply chain attacks: If a build system is compromised and inserts RUNPATH=/tmp into a distributed binary, and /tmp is attacker-controlled on target machines, library hijacking occurs at runtime.
Check a binary's RPATH/RUNPATH:
readelf -d ./binary | grep -E '(RPATH|RUNPATH)'
chrpath -l ./binary
# Remove RPATH entirely if not needed:
chrpath --delete ./binary
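Once the RPATH/RUNPATH string has been extracted, it can be screened for entries an unprivileged user could plausibly control. The sketch below is a heuristic only; the list of suspicious prefixes is an assumption and should be adapted to the target environment.

```python
import posixpath

def risky_entries(path_list: str,
                  bad_prefixes=("/tmp", "/var/tmp", "/dev/shm", "/home")) -> list:
    """Return RPATH/RUNPATH entries that look attacker-controllable."""
    risky = []
    for entry in path_list.split(":"):
        if entry.startswith(("$ORIGIN", "${ORIGIN}")):
            continue  # resolved relative to the binary's own directory
        if not entry.startswith("/"):
            # Relative (or empty) entries are treated as risky here because
            # they do not name a fixed location.
            risky.append(entry or "<empty>")
            continue
        normalized = posixpath.normpath(entry)
        if any(normalized == p or normalized.startswith(p + "/") for p in bad_prefixes):
            risky.append(entry)
    return risky

print(risky_entries("$ORIGIN/../lib:/tmp:build/lib:/usr/lib/app"))
# ['/tmp', 'build/lib']
```

A check like this fits naturally into a release pipeline, where a non-empty result fails the build before the binary is distributed.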
DLL Injection on Windows
The Windows analog of LD_PRELOAD is DLL injection. A DLL (Dynamic Link Library) can be injected into a running process via:
- CreateRemoteThread + LoadLibrary: Create a thread in the target process that calls LoadLibrary with the path to the attacker's DLL
- SetWindowsHookEx: Install a system-wide hook that causes Windows to load the DLL into any process that receives certain messages
- DLL search order hijacking: The Windows DLL search order checks the application directory first. If target.exe loads crypto.dll by name, an attacker who places a malicious crypto.dll in the same directory as target.exe gets it loaded ahead of the legitimate copy.
Windows defenses: Safe DLL Search Mode (enabled by default since Windows XP SP2), KnownDLLs protection (critical system DLLs are mapped from a system-managed location), code signing (loading can be restricted to validly signed DLLs), and AppLocker/WDAC policies.
Supply Chain in Dynamic Linking
Dynamic linking creates a trust dependency: your application implicitly trusts every .so it loads. Compromising a widely-used system library affects all applications that link against it.
Real-world supply chain attacks via shared libraries:
- XZ Utils backdoor (2024): A malicious co-maintainer of xz-utils inserted a backdoor into liblzma via build system manipulation. On affected systems, sshd loaded liblzma indirectly (via libsystemd) and was backdoored. The malicious code ran from an IFUNC resolver during dynamic linking, before RELRO made the GOT read-only, and redirected sshd's RSA_public_decrypt to attacker code that intercepted RSA key authentication. This is perhaps the most sophisticated supply chain attack targeting dynamic linking mechanics ever discovered.
- SolarWinds (2020): Attackers compromised the Orion build pipeline so that a trojanized, validly signed DLL shipped inside official updates. Not a dynamic linking mechanism attack, but one exploiting the same trust model.
Mitigations for supply chain in dynamic linking:
- Verify .so checksums (package manager signatures)
- Prefer static linking for critical security infrastructure
- Use containers (Docker) or VMs to isolate library versions
- Monitor ld.so calls with auditd (Linux Audit) or eBPF
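As a minimal sketch of the checksum mitigation, the following compares a library file against a SHA-256 digest taken from a trusted manifest. The function names are illustrative, and where the expected digest comes from (package database, signed manifest) is left as an assumption; package manager verification remains the primary control.

```python
import hashlib
import hmac

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large libraries do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_manifest(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with the expected value in constant time."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

A caller would pass the path of the .so and the digest recorded at packaging time, and refuse to start or alert if matches_manifest returns False.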
eBPF-based library load monitoring:
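// BPF program that logs file-backed mmap calls (how ld.so maps libraries)
// (sketch; assumes libbpf, a generated vmlinux.h, and a kernel built with BTF)
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>
char LICENSE[] SEC("license") = "GPL";
SEC("kprobe/security_mmap_file")
int BPF_KPROBE(log_mmap, struct file *file) {
    if (!file) return 0;  // anonymous mapping, no backing file
    const unsigned char *name = BPF_CORE_READ(file, f_path.dentry, d_name.name);
    bpf_printk("mmap file: %s", name);  // basename only; read via trace_pipe
    return 0;
}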
Controlling Dynamic Linking for Security
Static linking for security-critical binaries: sshd, sudo, cryptographic tools benefit from static linking to eliminate LD_PRELOAD/LD_LIBRARY_PATH as attack vectors (though the setuid protection covers most cases). The cost: binary size, no automatic security updates from library patches.
# Fully static link with musl libc (smaller than glibc static)
musl-gcc -static -o my_tool main.c
# Or with glibc static (larger, some features like NSS don't work statically)
gcc -static main.c -o my_tool
dlopen and dlsym for plugins require careful security design:
#include <dlfcn.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// Load a plugin safely
void* load_plugin(const char *path) {
    static const char trusted_dir[] = "/opt/trusted_plugins/";
    char resolved[PATH_MAX];
    // Canonicalize first so "../" components and symlinks cannot escape
    // the trusted directory
    if (!realpath(path, resolved)) {
        perror("realpath");
        return NULL;
    }
    // Verify the resolved plugin path is in a trusted directory
    if (strncmp(resolved, trusted_dir, sizeof(trusted_dir) - 1) != 0) {
        fprintf(stderr, "Refusing to load plugin from untrusted path: %s\n", resolved);
        return NULL;
    }
    // RTLD_NOW: resolve all symbols immediately (fail fast on missing symbols)
    // RTLD_LOCAL: don't pollute global symbol namespace
    void *handle = dlopen(resolved, RTLD_NOW | RTLD_LOCAL);
    if (!handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return NULL;
    }
    // Verify plugin exports expected interface
    void (*plugin_init)(void) = (void (*)(void))dlsym(handle, "plugin_init");
    if (!plugin_init) {
        fprintf(stderr, "Plugin missing plugin_init: %s\n", dlerror());
        dlclose(handle);
        return NULL;
    }
    plugin_init();
    return handle;
}
Additional dlopen security: verify the .so is owned by root and not world-writable before loading; use SELinux file contexts to restrict which processes can load which libraries.
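The ownership and permission check described above might look like the following sketch. It is written in Python for brevity; the function name is illustrative, and the check is deliberately a little stricter than the text in that it also rejects group-writable files.

```python
import os
import stat

def is_trusted_library(path: str, trusted_uid: int = 0) -> bool:
    """Accept only a regular file owned by trusted_uid that group/other cannot write."""
    try:
        info = os.stat(path)  # follows symlinks, so the target is what gets checked
    except OSError:
        return False
    if not stat.S_ISREG(info.st_mode):
        return False
    if info.st_uid != trusted_uid:
        return False
    # Assumption: group-writable is rejected too, which goes beyond the
    # "not world-writable" rule stated in the text.
    return not (info.st_mode & (stat.S_IWGRP | stat.S_IWOTH))
```

A check like this is inherently racy if the file can change between the stat and the load, so it works best when every directory on the path is also writable only by the trusted owner.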
Symbol Versioning
Symbol versioning allows multiple versions of the same function to coexist in a shared library, enabling ABI compatibility:
// In libc.so.6, multiple versions of memcpy coexist:
// memcpy@GLIBC_2.2.5 — old ABI (for binaries linked against old glibc)
// memcpy@GLIBC_2.14 — new ABI with different destination-overlapping behavior
// A binary linked against glibc 2.14+ gets the new memcpy
// A binary linked against a glibc older than 2.14 gets the old memcpy
// Both run correctly on a system with glibc 2.34
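The selection rule in these comments can be modeled as a lookup keyed by symbol name and version. This is a toy sketch, not glibc's data structures: the table contents are placeholder strings, and the resolve function is a hypothetical stand-in for what ld.so does with the version recorded in the binary.

```python
# Toy model of versioned symbol lookup. The values are labels standing in for
# the two implementations; they are not real glibc code.
EXPORTS = {
    ("memcpy", "GLIBC_2.2.5"): "old memcpy",
    ("memcpy", "GLIBC_2.14"): "new memcpy",
}
# The default version is what newly linked binaries get bound to.
DEFAULT_VERSION = {"memcpy": "GLIBC_2.14"}

def resolve(name, required_version=None):
    """Pick the implementation for the version recorded at link time."""
    version = required_version or DEFAULT_VERSION[name]
    try:
        return EXPORTS[(name, version)]
    except KeyError:
        raise LookupError(f"version {version} not found for symbol {name}") from None

print(resolve("memcpy", "GLIBC_2.2.5"))  # old memcpy
print(resolve("memcpy"))                 # new memcpy
```

A binary that asks for a version the library does not export fails at load time, which is the behavior the LookupError branch imitates.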
Version scripts for shared library versioning:
/* version.map */
MY_LIB_1.0 {
global: exported_function;
local: *; /* all other symbols hidden */
};
MY_LIB_2.0 {
global: new_exported_function;
} MY_LIB_1.0;
gcc -shared -fPIC -Wl,--version-script=version.map -o libfoo.so foo.c
Historical Context
The PLT/GOT mechanism was designed for performance (lazy binding avoids the startup cost of resolving all symbols) and not with security in mind. GOT overwrite as an exploitation technique was documented in Phrack magazine in the early 2000s. RELRO was developed as a mitigation and was added to the GNU toolchain (as the -z relro linker option) around 2004. The progression of memory corruption mitigations (ASLR → NX/DEP → RELRO → CFI) represents two decades of attacker/defender back-and-forth in binary exploitation.
The XZ Utils backdoor (2024) demonstrated that even sophisticated dynamic linking defenses can be bypassed by attacking the upstream software supply chain before the binary is built — highlighting that security of dynamic linking includes the entire build and distribution chain.
Production Examples
# Check all security properties of a binary
checksec --file=/usr/sbin/nginx
# Output example (well-hardened binary):
# RELRO STACK CANARY NX PIE
# Full Canary found NX enabled PIE enabled
# Check libraries loaded by a process at runtime
cat /proc/$(pgrep nginx | head -1)/maps | grep '\.so'
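# Audit changes to the system-wide preload file (audit rules cannot match
# environment variables, so the LD_PRELOAD variable itself is not filterable)
auditctl -w /etc/ld.so.preload -p wa -k ld_preload_file
ausearch -k ld_preload_file | tail -20
# Find running processes that were started with LD_PRELOAD set
grep -lz '^LD_PRELOAD=' /proc/[0-9]*/environ 2>/dev/null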
# Intercept all dlopen calls with eBPF (bpftrace)
# (glibc 2.34+ exports dlopen from libc.so.6; on older systems probe libdl.so.2)
bpftrace -e 'uprobe:/lib/x86_64-linux-gnu/libc.so.6:dlopen {
    printf("PID %d dlopen: %s\n", pid, str(arg0));
}'
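The /proc/<pid>/maps check above can be turned into a small parser that lists loaded shared objects and flags any mapped from unexpected locations. The sketch below takes the file's text as input; the trusted prefixes and the sample mapping lines are assumptions for illustration.

```python
def loaded_libraries(maps_text: str) -> list:
    """Return sorted, de-duplicated shared-object paths from /proc/<pid>/maps text."""
    libs = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)  # address perms offset dev inode [pathname]
        if len(fields) < 6:
            continue  # anonymous mapping
        path = fields[5].strip()
        if path.endswith(" (deleted)"):
            path = path[: -len(" (deleted)")]
        name = path.rsplit("/", 1)[-1]
        if path.startswith("/") and (name.endswith(".so") or ".so." in name):
            libs.add(path)
    return sorted(libs)

def outside_trusted(libs, trusted=("/lib/", "/lib64/", "/usr/lib/", "/usr/lib64/")):
    """Flag libraries mapped from outside the (assumed) trusted prefixes."""
    return [p for p in libs if not p.startswith(trusted)]

SAMPLE = """\
55d0c0a00000-55d0c0a02000 r--p 00000000 08:01 131 /usr/sbin/nginx
7f1c2a000000-7f1c2a028000 r--p 00000000 08:01 402 /usr/lib/x86_64-linux-gnu/libc.so.6
7f1c2a028000-7f1c2a1bd000 r-xp 00028000 08:01 402 /usr/lib/x86_64-linux-gnu/libc.so.6
7f1c2a400000-7f1c2a401000 r-xp 00000000 08:01 977 /tmp/evil.so
7ffd1b2e0000-7ffd1b301000 rw-p 00000000 00:00 0 [stack]
"""

libs = loaded_libraries(SAMPLE)
print(libs)                   # ['/tmp/evil.so', '/usr/lib/x86_64-linux-gnu/libc.so.6']
print(outside_trusted(libs))  # ['/tmp/evil.so']
```

Against a live process the input would be open(f"/proc/{pid}/maps").read(). A library marked "(deleted)" is still reported, since a replaced or unlinked .so that remains mapped is itself worth a look.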
Debugging Notes
- LD_DEBUG=all ./myapp 2>&1 | head -100 shows every step of ld.so's initialization: library loading, symbol resolution, relocation processing. Invaluable for diagnosing missing symbol or wrong library version issues.
- LD_DEBUG=libs ./myapp shows just library search paths and selections.
- pldd <pid> lists all shared libraries currently loaded in a process (glibc 2.15+).
- If a binary crashes immediately with SIGBUS or SIGSEGV before main(), check LD_DEBUG=all output for relocation failures. With full RELRO, unresolved symbols surface here at load time rather than at first call.
- Under Valgrind, LD_PRELOAD is used by Valgrind itself to interpose on allocation functions, which is why LD_PRELOAD of a custom allocator conflicts with Valgrind.
Security Implications Summary
- Dynamic linking is a trust chain: the binary trusts ld.so, which trusts libraries found on the search path, which are trusted because they're installed by the package manager (which trusts upstream maintainers).
- Every link in this chain is an attack surface: supply chain compromise at any level undermines the entire chain.
- Full RELRO + PIE + stack protector + CFI is the current best-practice hardened build configuration. It does not protect against a compromised library but prevents classic memory corruption → GOT overwrite → code execution chains.
- For highest-security deployments (HSMs, critical infrastructure): static linking + musl libc + seccomp-bpf syscall filtering + SELinux eliminates the dynamic linking attack surface at the cost of flexibility.
Performance Implications
- Full RELRO with eager binding: typically 1–10ms additional startup time for programs with 50–200 external functions. Negligible for long-running servers; may be relevant for frequently invoked CLI tools.
- CFI instrumentation overhead: ~1–3% CPU overhead on indirect-call-heavy workloads. Acceptable in most cases.
- Removing unused dynamic dependencies (use the --as-needed linker flag): reduces load time by not mapping libraries with zero used symbols.
Failure Modes
- RELRO breaks plugin system: A plugin system that uses dlopen after startup and expects to patch GOT entries may fail with full RELRO. Redesign to use function pointer tables instead of relying on GOT mutability.
- LD_PRELOAD conflict: Two LD_PRELOAD libraries both interpose on malloc. The one listed first wins; the second's interposer is never reached unless the first forwards the call. Use dlsym(RTLD_NEXT, ...) in each to chain properly.
- CFI violation in generated code: If a JIT compiler generates code that makes indirect calls through function pointers, CFI may fault because the generated code doesn't follow CFI's type-checking protocol. JIT compilers must disable CFI for their generated call sites.
Modern Usage
Clang's Control Flow Integrity (CFI) (-fsanitize=cfi) inserts type checks at compile time that run before every indirect call. If an indirect call targets a function whose type signature differs from the one the call site expects, CFI aborts the program. This blocks hijacks that redirect an indirect call to a function of the wrong type, although targets with a matching signature remain reachable. It requires LTO (-flto) and, by default, -fvisibility=hidden.
Chrome and parts of Android are built with CFI in production. Reported overhead is around 1–3% for typical workloads.
Apple's Hardened Runtime on macOS: A code-signing option that restricts runtime code execution and library loading. By default it disallows JIT-style writable-executable memory, restricts library loading to code signed by Apple or by the same Team ID, and ignores DYLD_INSERT_LIBRARIES (the macOS counterpart of LD_PRELOAD) unless an entitlement allows it. Apple requires the Hardened Runtime for notarizing software distributed outside the Mac App Store.
Future Directions
- Shadow stacks (Intel CET / ARM PAC): Hardware-enforced return address integrity. Intel's CET SHSTK (Shadow Stack) maintains a separate stack just for return addresses that ordinary store instructions cannot write. ARM Pointer Authentication (PAC) signs return addresses. Both make ROP substantially harder. On Linux, kernel IBT support arrived in 5.18 and user-space shadow stacks in 6.6; PAC ships on Apple Silicon.
- ShadowCallStack and BTI on AArch64: Available in Android builds; protecting return addresses and ensuring indirect branches target valid landing pad instructions.
- SBOM (Software Bill of Materials) for dynamic dependencies: Tools like syft and grype inventory the packages and libraries an application ships with and check their versions against vulnerability databases.
- eBPF-based LSM for library load control: BPF LSM (Linux Security Module) hooks on security_mmap_file to enforce policy on which libraries any process may load, at the kernel level, without requiring SELinux labels.
Exercises
- Build a vulnerable binary that performs a GOT[puts] overwrite via a contrived arbitrary-write primitive. Use pwntools to write the exploit. Then rebuild with full RELRO and observe that the overwrite fails (SIGSEGV on write to read-only memory).
- Write an LD_PRELOAD library that intercepts open(), checks if the opened path starts with /etc/passwd, and returns EACCES if so. Test it with cat /etc/passwd vs LD_PRELOAD=./deny_passwd.so cat /etc/passwd. Verify it has no effect on a setuid binary.
- Build a binary with partial RELRO and one with full RELRO. Use gdb to examine the writability of the .got.plt section: set a hardware watchpoint on a GOT entry and observe whether lazy binding fires on first call in the partial case but not the full RELRO case.
- Implement a secure plugin loader: the loader accepts a plugin path, verifies it is in a trusted directory, checks its filesystem permissions (owner=root, no world-write), and loads it with RTLD_NOW | RTLD_LOCAL. Demonstrate that a plugin in /tmp is rejected.
- Set up monitoring for LD_PRELOAD usage on a test system. Audit rules cannot match environment variables, so combine an auditd watch on /etc/ld.so.preload with a script that scans /proc/<pid>/environ. Write a Bash script that simulates an attacker using LD_PRELOAD to hijack read(). Verify the monitoring captures it. Then write a complementary response script that kills any process using an unauthorized LD_PRELOAD.
References
- Nergal, "The Advanced Return-into-lib(c) Exploits." Phrack 58, 2001. (GOT overwrite origins)
- Andres Freund, "backdoor in upstream xz/liblzma leading to ssh server compromise." oss-security mailing list, 2024. https://openwall.com/lists/oss-security/2024/03/29/4
- Ulrich Drepper, "How to Write Shared Libraries." https://www.akkadia.org/drepper/dsohowto.pdf — Section 2.3 on RELRO
- PaX Team, "Address Space Layout Randomization." https://pax.grsecurity.net/docs/aslr.txt
- Clang CFI documentation: https://clang.llvm.org/docs/ControlFlowIntegrity.html
- checksec.sh: https://github.com/slimm609/checksec.sh
- "Intel® CET Answers Call to Protect Against Common Malware Threats." Intel white paper, 2020.