Section 48: Research Papers — Overview
Purpose and Scope
This section is the primary literature catalog for the systems knowledge archive. It organizes the research papers and technical reports that form the intellectual foundation of modern systems software: the papers that introduced ideas now taken for granted (virtual memory, Paxos, log-structured filesystems, eBPF), the papers that documented landmark implementations (Unix, Multics, Alto, Mach), and the papers that identified attacks and defenses shaping the current security landscape (Meltdown, Spectre, Rowhammer). The catalog is organized by domain, annotated with reading difficulty and estimated reading time, and cross-referenced to the archive section where each paper is placed in context.
The reading list is curated, not exhaustive. The selection criterion is educational leverage: papers that, when read and understood, unlock the ability to understand an entire domain. Quantity is secondary to depth. A practitioner who has deeply understood thirty papers in this list understands systems better than one who has skimmed three hundred.
Prerequisites
This section requires no prerequisites for browsing. Reading individual papers requires background in the relevant domain section of the archive.
Learning Objectives
Upon completing this section, the reader will be able to:
- Navigate the primary literature of systems research across eleven domains
- Locate the original paper for any major systems concept encountered in the archive
- Apply a structured paper-reading methodology (Keshav's three-pass method)
- Identify which papers are foundational (must read), which are important context, and which are advanced
- Build a personalized reading list for their target career track
Paper Reading Methodology
KESHAV'S THREE-PASS METHOD
============================
PASS 1 (5-10 min): Skim
─────────────────────────
Read title, abstract, introduction
Read section headings
Read conclusions
Goal: understand what problem is solved and what contribution is claimed
PASS 2 (1-2 hours): Understand
───────────────────────────────
Read carefully, skip proofs and details
Note figures and graphs
Mark unclear references
Goal: grasp the main idea and evidence
PASS 3 (4-8 hours): Critique
─────────────────────────────
Re-implement the key idea mentally (or actually)
Evaluate assumptions and claims
Identify what this does NOT solve
Goal: deep understanding; ability to extend or critique
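The per-pass time ranges above can be turned into a rough planning estimate. The following is a minimal sketch, not part of Keshav's method itself: the `PlannedPaper` type and the `time_budget` helper are hypothetical names introduced for illustration, and the sketch assumes passes are cumulative (a paper read to pass 3 has also gone through passes 1 and 2).

```python
from dataclasses import dataclass

# Per-pass time ranges in minutes, copied from the three-pass summary above.
PASS_MINUTES = {
    1: (5, 10),     # skim
    2: (60, 120),   # understand
    3: (240, 480),  # critique
}


@dataclass(frozen=True)
class PlannedPaper:
    """Hypothetical record: a paper and the deepest pass planned for it."""
    title: str
    depth: int  # 1, 2, or 3


def time_budget(papers):
    """Return (low, high) total minutes for a reading plan.

    Assumption: passes are cumulative, so depth 3 costs passes 1 + 2 + 3.
    """
    low = high = 0
    for paper in papers:
        if paper.depth not in PASS_MINUTES:
            raise ValueError(f"unknown pass depth: {paper.depth}")
        for pass_number in range(1, paper.depth + 1):
            pass_low, pass_high = PASS_MINUTES[pass_number]
            low += pass_low
            high += pass_high
    return low, high


plan = [
    PlannedPaper("The UNIX Time-Sharing System", depth=3),
    PlannedPaper("Paxos Made Simple", depth=2),
    PlannedPaper("Xen and the Art of Virtualization", depth=1),
]
low, high = time_budget(plan)
print(f"{low} to {high} minutes")  # prints: 375 to 750 minutes
```

Even this three-paper plan lands between roughly six and twelve hours, which is one way to see why the catalog favors a short, deeply read list over a long, skimmed one.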
Domain 1: Foundational CS and Concurrency
| Paper | Authors | Year | Significance |
|---|---|---|---|
| Go To Statement Considered Harmful | Dijkstra | 1968 | Structured programming; discipline of control flow |
| Cooperating Sequential Processes | Dijkstra | 1968 | Semaphores; mutual exclusion; first formalization of concurrent processes |
| Time, Clocks, and the Ordering of Events | Lamport | 1978 | Lamport clocks; happened-before relation; the basis of distributed reasoning |
| Solution of a Problem in Concurrent Programming Control | Dijkstra | 1965 | Mutual exclusion; the problem that spawned synchronization theory |
| How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs | Lamport | 1979 | Sequential consistency definition; memory model foundations |
| Concurrent Reading and Writing | Lamport | 1977 | Foundational work on concurrent data structures |
Domain 2: Operating Systems Architecture
| Paper | Authors | Year | Significance |
|---|---|---|---|
| The Structure of the MULTICS Supervisor | Corbató, Vyssotsky | 1965 | Protection rings, segmentation, hierarchical file system |
| The UNIX Time-Sharing System | Ritchie, Thompson | 1974 | Unix design; simplicity as organizing principle |
| The Evolution of the Unix Time-Sharing System | Ritchie | 1979 | Design rationale; what was discarded and why |
| Reflections on Trusting Trust | Thompson | 1984 | Turing Award lecture; compiler trojan; trust in toolchains |
| The Nucleus of a Multiprogramming System | Brinch Hansen | 1970 | First microkernel concept |
| Exokernel: An Operating System Architecture for Application-Level Resource Management | Engler et al. | 1995 | Library OS; hardware resource exposure to applications |
| Improving IPC by Kernel Design | Liedtke | 1993 | Demonstration that microkernel IPC can be fast; L4 design principles |
| seL4: Formal Verification of an OS Kernel | Klein et al. | 2009 | Machine-checked proof of functional correctness |
| The Multikernel: A New OS Architecture for Scalable Multicore Systems | Baumann et al. | 2009 | Multikernel (Barrelfish); message-passing between per-core kernels |
| Dune: Safe User-level Access to Privileged CPU Features | Belay et al. | 2012 | Hardware virtualization for OS research; VT-x in user space |
| Singularity: Rethinking the Software Stack | Hunt, Larus | 2007 | Type-safe OS; SIP isolation without hardware protection |
Domain 3: Memory Management
| Paper | Authors | Year | Significance |
|---|---|---|---|
| Virtual Memory, Processes, and Sharing in MULTICS | Daley, Dennis | 1968 | First complete virtual memory design |
| The Working Set Model for Program Behavior | Denning | 1968 | Working set; page replacement theory foundation |
| Practical, Transparent Operating System Support for Superpages | Navarro et al. | 2002 | Huge pages; transparent hugepage research basis |
| Reconsidering Custom Memory Allocation | Berger et al. | 2002 | Heap allocation analysis; comparison of custom allocators |
| Scalable Kernel TCP Design and Implementation for Short-Lived Connections | Lin et al. | 2016 | NUMA-aware socket allocation |
Domain 4: Scheduling
| Paper | Authors | Year | Significance |
|---|---|---|---|
| Lottery Scheduling: Flexible Proportional-Share Resource Management | Waldspurger, Weihl | 1994 | Randomized fair scheduling; proportional share |
| Stride Scheduling: Deterministic Proportional-Share Resource Management | Waldspurger, Weihl | 1995 | Deterministic version of lottery scheduling |
| Scheduling for Reduced CPU Energy | Weiser et al. | 1994 | Dynamic voltage/frequency scaling motivation |
| Borrowed-Virtual-Time Scheduling | Duda, Cheriton | 1999 | BVT; mixed time-sharing and real-time |
| Completely Fair Scheduler Design | Molnár | 2007 | CFS Linux kernel design notes (lwn.net) |
Domain 5: Storage and Filesystems
| Paper | Authors | Year | Significance |
|---|---|---|---|
| A Fast File System for UNIX | McKusick et al. | 1984 | FFS; cylinder groups; block allocation strategy |
| The Design and Implementation of a Log-Structured File System | Rosenblum, Ousterhout | 1992 | LFS; sequential writes; segment cleaning |
| Soft Updates: A Solution to the Metadata Update Problem in File Systems | Ganger et al. | 2000 | Dependency tracking for metadata consistency |
| ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks | Mohan et al. | 1992 | WAL; the algorithm underlying most database recovery |
| F2FS: A New File System for Flash Storage | Lee et al. | 2015 | Flash-optimized filesystem design |
| Using Crash Hoare Logic for Certifying the FSCQ File System | Chen et al. | 2015 | Verified crash-safe filesystem |
| Don't Stack Your Log on My Log | Yang et al. | 2014 | Layered log systems; interaction between FS journal and device FTL |
Domain 6: Distributed Systems
| Paper | Authors | Year | Significance |
|---|---|---|---|
| The Part-Time Parliament | Lamport | 1989/1998 | Paxos consensus algorithm; the foundational distributed consensus paper |
| Paxos Made Simple | Lamport | 2001 | Accessible Paxos explanation |
| In Search of an Understandable Consensus Algorithm (Raft) | Ongaro, Ousterhout | 2014 | Raft; designed for understandability; widely implemented |
| Dynamo: Amazon's Highly Available Key-Value Store | DeCandia et al. | 2007 | Eventual consistency; consistent hashing; vector clocks in production |
| Bigtable: A Distributed Storage System for Structured Data | Chang et al. | 2006 | Wide-column store; foundation for HBase, Cassandra |
| The Google File System | Ghemawat et al. | 2003 | Large-file distributed storage; relaxed consistency; append-optimized |
| MapReduce: Simplified Data Processing on Large Clusters | Dean, Ghemawat | 2004 | Bulk data processing; fault tolerance via re-execution |
| Spanner: Google's Globally-Distributed Database | Corbett et al. | 2012 | TrueTime; external consistency; globally distributed transactions |
| Chubby: The Chubby Lock Service for Loosely-Coupled Distributed Systems | Burrows | 2006 | Distributed lock service; Paxos in practice at Google |
| Dapper: A Large-Scale Distributed Systems Tracing Infrastructure | Sigelman et al. | 2010 | Distributed tracing; trace context propagation; sampling strategy |
| Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications | Stoica et al. | 2001 | Consistent hashing ring; O(log N) routing |
| A Comprehensive Study of Convergent and Commutative Replicated Data Types | Shapiro et al. | 2011 | Conflict-free replicated data types (CRDTs); eventual consistency without conflicts |
| CAP Twelve Years Later: How the Rules Have Changed | Brewer | 2012 | CAP theorem clarification; partition detection and recovery |
| ZooKeeper: Wait-free Coordination for Internet-Scale Systems | Hunt et al. | 2010 | Coordination service; watch mechanism; ZAB protocol |
| Kafka: A Distributed Messaging System for Log Processing | Kreps et al. | 2011 | Log-as-database; durable messaging; partition-based parallelism |
Domain 7: Networking
| Paper | Authors | Year | Significance |
|---|---|---|---|
| A Protocol for Packet Network Intercommunication | Cerf, Kahn | 1974 | TCP/IP design; the foundation |
| Congestion Avoidance and Control | Jacobson | 1988 | TCP congestion control; slow start; AIMD |
| The BSD Packet Filter: A New Architecture for User-Level Packet Capture | McCanne, Jacobson | 1993 | BPF virtual machine; the ancestor of eBPF |
| netmap: a novel framework for fast packet I/O | Rizzo | 2012 | Zero-copy packet I/O; user-space packet processing |
| The eXpress Data Path: Fast Programmable Packet Processing in the Operating System Kernel | Høiland-Jørgensen et al. | 2018 | XDP; eBPF in driver receive path |
Domain 8: Security
| Paper | Authors | Year | Significance |
|---|---|---|---|
| Smashing the Stack for Fun and Profit | Aleph One | 1996 | Stack overflow exploitation techniques; brought stack smashing to a wide practitioner audience |
| On the Effectiveness of Address-Space Randomization | Shacham et al. | 2004 | ASLR limitations; brute-force derandomization on 32-bit systems |
| The Geometry of Innocent Flesh on the Bone: Return-into-libc without Function Calls (on the x86) | Shacham | 2007 | Return-oriented programming (ROP); defeating W^X/NX with existing code gadgets |
| Meltdown: Reading Kernel Memory from User Space | Lipp et al. | 2018 | Speculative execution side channel; mitigated by KPTI |
| Spectre Attacks: Exploiting Speculative Execution | Kocher et al. | 2018 | Branch predictor attacks; Spectre variants 1 and 2 |
| Rowhammer.js: A Remote Software-Induced Fault Attack | Gruss et al. | 2016 | DRAM rowhammer from JavaScript; hardware security implications |
Domain 9: Virtualization
| Paper | Authors | Year | Significance |
|---|---|---|---|
| Xen and the Art of Virtualization | Barham et al. | 2003 | Paravirtualization; Xen hypervisor design |
| A Comparison of Software and Hardware Techniques for x86 Virtualization | Adams, Agesen | 2006 | Binary translation vs. VT-x; VMware analysis |
| Firecracker: Lightweight Virtualization for Serverless Applications | Agache et al. | 2020 | MicroVM for serverless; KVM + minimal VMM in Rust |
Domain 10: Modern Systems
| Paper | Authors | Year | Significance |
|---|---|---|---|
| Scalable Kernel TCP Design and Implementation for Short-Lived Connections | Lin et al. | 2016 | NUMA socket scalability |
| The Tock Embedded Operating System | Levy et al. | 2017 | Rust OS for embedded; kernel capsules isolated by the Rust type system, processes by the MPU |
| eBPF-based Content and Service Aware Networking for Cloud-Native 5G Mobile Networks | Rajagopalan et al. | 2021 | eBPF in 5G infrastructure |
| io_uring | Axboe | 2019 | io_uring design documentation (kernel.dk/io_uring.pdf) |
| The Linux SCTP Implementation | Stewart et al. | 2001 | Protocol layering and kernel integration |
File Map
48-research-papers/
├── 00-overview.md ← This file
├── 01-foundational-papers.md
├── 02-os-papers.md
├── 03-memory-management-papers.md
├── 04-scheduling-papers.md
├── 05-storage-filesystem-papers.md
├── 06-distributed-systems-papers.md
├── 07-networking-papers.md
├── 08-security-papers.md
├── 09-virtualization-papers.md
├── 10-modern-systems-papers.md
├── 11-performance-papers.md
└── 12-paper-reading-guide.md
Cross-References
- All Sections 00-44: each section references the papers cataloged here
- Section 45 (Learning Roadmaps): paper reading lists for each career track
- Section 43 (Formal Verification): seL4, CompCert, FSCQ verification papers
- Section 44 (Rust and Memory Safety): Tock, Theseus, Rust ownership papers
- Section 40 (Failure History): Meltdown, Spectre, and fault injection papers