
02 — Unix History: Philosophy, Family Tree, and the Fragmentation Wars

Technical Overview

Unix is arguably the most influential operating system ever written, not because it was the most powerful or most commercially successful at any given moment, but because its ideas (the hierarchical filesystem, the process model, pipes, everything-is-a-file, the C implementation) were simple enough to spread and correct enough to persist. Unix began as a reaction to Multics, written on a little-used minicomputer; it spread through university source code licenses, forked into BSD and System V, provoked copyright and trade-secret litigation in the early 1990s that nearly destroyed BSD, was codified into POSIX, and ultimately inspired Linux, which now runs most of the world's servers, phones, and supercomputers. This file traces that entire arc.

Prerequisites

  • Understanding of the minicomputer era (PDP-7, PDP-11, VAX)
  • Familiarity with what a process, filesystem, and shell are
  • Awareness of the batch/time-sharing evolution

Origins: Space Travel and the PDP-7 (1969)

Bell Labs Withdraws from Multics

Bell Labs joined the Multics project in 1964 as a partner with MIT and GE. By 1969, Bell Labs management concluded that Multics would not deliver on its schedule or budget and withdrew. Ken Thompson, Dennis Ritchie, Doug McIlroy, and others who had been working on Multics were left without a computer system they liked.

Thompson had been developing a simulation game, "Space Travel" — a simulation of the solar system with a spacecraft the player could maneuver — first on Multics, then on a GE-635. He found a discarded PDP-7 in a Bell Labs corridor and decided to port Space Travel to it.

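To run Space Travel properly, Thompson needed:

  • A filesystem (for game data)
  • A way to load and run programs
  • An assembler for the PDP-7

He wrote all three. The filesystem became the core of Unix. By Thompson's later account, once the filesystem existed he allotted about a week to each remaining piece (an editor, an assembler, and a shell-like overlay for running programs): "One week, one week, and one week."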

The resulting system was called Unics — a pun on Multics, suggesting "one" rather than "many" — and eventually spelled "Unix."

The Unix Filesystem Model

Thompson's filesystem introduced the inode structure, a design sound enough that its core ideas persist in ext4, XFS, and APFS more than five decades later:

Unix Filesystem Structure:

Disk Layout:
+----------+----------+----------+------------------+
| Boot     | Superblock| Inode   | Data blocks      |
| block    | (FS meta) | table   | (file contents)  |
+----------+----------+----------+------------------+

Inode (metadata for one file or directory):
+---------------------------+
| File type (regular/dir/  |
|   symlink/device/pipe)   |
| Permissions (rwxrwxrwx)  |
| Owner UID, Group GID     |
| Size (bytes)             |
| Timestamps (atime/mtime/ |
|   ctime)                 |
| Link count               |
| Direct block pointers    |  -> 10 data blocks directly
|   [0]..[9]               |
| Single indirect pointer  |  -> block of block pointers
| Double indirect pointer  |  -> block of (block of ptrs)
| Triple indirect pointer  |  -> three levels of indirection
+---------------------------+

Directory entry:
+----------+---------------+
| inode #  | filename      |
+----------+---------------+
  (just an inode number and a name -- the inode holds everything else)

Key insight: a directory is just a file containing (inode, name) pairs
             Hard links: multiple directory entries with same inode number
             Symlinks: a file whose content is a path string

The inode's separation of metadata from filename enabled hard links (multiple names for one file), efficient stat() calls (metadata without reading file content), and clean deletion semantics (decrement the link count; free the inode and its blocks once the count reaches 0 and no process still has the file open).


Unix on the PDP-11 and C (1970–1973)

Thompson persuaded Bell Labs management to fund a PDP-11/20 by proposing a text processing system for patent documentation. The PDP-11 arrived in 1970; Unix was ported to it and significantly extended.

The C Language (1972–1973)

Dennis Ritchie had been developing B (Thompson's typeless systems language) into a typed language called C. The key additions over B:

  • Type system: int, char, float, pointers, arrays
  • Structures (struct): aggregate types with named fields
  • Efficient code generation: C's semantics mapped closely to PDP-11 hardware operations

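In 1973, Thompson and Ritchie rewrote the Unix kernel, on the order of ten thousand lines, in C, retaining only the lowest-level hardware interaction code in PDP-11 assembly. This was unusual but not unprecedented: Burroughs MCP (in ESPOL) and Multics itself (in PL/I) had already been written largely in high-level languages. What was new was doing it for a small system on inexpensive hardware, in a language lean enough to compete with assembly.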

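The result: portability. To run Unix on a new machine, you needed:

  1. A C compiler for that machine
  2. A relatively small amount of machine-specific assembly and driver code
  3. Months of porting work rather than a from-scratch rewrite

This made Unix one of the first operating systems to be ported across unrelated architectures; the first ports, to Interdata minicomputers, were completed in 1977–78.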

The Unix Philosophy (Articulated by Doug McIlroy)

McIlroy, who managed Thompson and Ritchie, articulated the principles guiding Unix design:

  1. Write programs that do one thing and do it well.
  2. Write programs to work together.
  3. Write programs to handle text streams, because that is a universal interface.

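Everything is a file: Regular files, directories, devices (/dev/tty, /dev/sda), pipes, and later sockets and process information (/proc) are all represented as file descriptors or filesystem entries and accessed via open(), read(), write(), and close(). Sockets are created with socket() rather than open(), but are read and written the same way. This uniformity meant programs written to read from stdin worked equally well reading from a file, a pipe, a network socket, or a device.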

Pipes: Thompson implemented pipes (proposed by McIlroy) — a way to connect the stdout of one program to the stdin of another. This enabled composing Unix tools:

# Count lines in all .c files modified in the last 24 hours, largest first
find . -name "*.c" -mtime -1 | xargs wc -l | sort -rn

# Find top 10 IP addresses in a web log
awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -10

# Each program does one thing; the pipeline does something complex
# No program needed to know about the others' existence

Unix v6 and Lions' Commentary (1975–1977)

Unix v6: The Canonical Source

Unix Version 6 (1975) was the version licensed to universities with source code. The source was approximately 9,000 lines of C and 750 lines of assembly — small enough for a single person to read and understand in a few weeks.

John Lions, a computer science lecturer at the University of New South Wales (Australia), annotated the entire v6 kernel source in 1977: the "Lions' Commentary on UNIX 6th Edition." It was the first (and for 20 years only) book that explained how a real OS kernel worked, line by line.

AT&T prohibited redistribution of the Lions' Commentary (the annotated source was licensed, not open). Photocopied bootleg editions circulated for two decades. The Commentary was not officially published until 1996, when AT&T's hold on Unix source code had been broken.

The Lions' Commentary taught the generation of engineers who built SunOS, AIX, HP-UX, Ultrix, IRIX, and Linux what an OS kernel actually looked like.

Unix v6 Source Distribution (approximate):
File                Lines    Purpose
-------             ------   -------
ken/sys1.c          ~700     Process system calls (exec, exit, wait, fork)
ken/sys2.c          ~600     File system calls (read, write, open, creat, close, seek, link)
ken/sys3.c          ~400     More system calls (stat, fstat, dup, mount)
ken/sys4.c          ~400     More system calls (unlink, chdir, chmod, chown, signal, kill)
ken/slp.c           ~350     Scheduler (sleep/wakeup, context switch)
ken/trap.c          ~250     Trap handling (system call dispatch, hardware traps)
ken/iget.c          ~400     Inode management
ken/fio.c           ~500     File I/O operations
ken/alloc.c         ~200     Block allocation
ken/rdwri.c         ~200     Read/write implementation
dmr/rk.c            ~250     RK05 disk driver
dmr/tty.c           ~800     Terminal driver
...                 ~9000    Total

BSD: Berkeley Takes Over Unix Development (1977–1993)

Bill Joy and the Berkeley Software Distribution

UC Berkeley had held a Unix source license since 1974. Bill Joy, who arrived as a graduate student in 1975, began improving the system, and the improvements were distributed as the Berkeley Software Distribution:

BSD Evolution:

1BSD (March 1977): Pascal compiler, Joy's ex editor, new tools
  |
  | [Bill Joy]
  v
2BSD (May 1978): Joy's vi editor, csh (C shell)
  |
  v
3BSD (December 1979): VAX port, virtual memory improvements
  |
  v
4BSD (November 1980): Job control, reliable signals, curses, more
  |
  | [DARPA funding begins; TCP/IP implementation starts]
  v
4.1BSD (June 1981): Performance improvements, auto-configuration
  |
  v
4.2BSD (August 1983): TCP/IP stack, Fast File System, IPC, sockets, symbolic links
  |           <- The release that put TCP/IP everywhere
  v
4.3BSD (June 1986): Networking and performance improvements, more
  |
  v
4.3BSD Tahoe (June 1988): Clean portability improvements
  |
  v
4.3BSD Reno (June 1990): Early IEEE POSIX compliance, NFS implementation
  |
  v
4.4BSD (June 1993): Major rearchitecture (Mach-derived VM, cleaner internals)
  | <- Encumbered release; still required an AT&T/USL source license
  v
4.4BSD-Lite (1994): Source without AT&T code, released after the settlement
  -> basis for later FreeBSD, NetBSD, and OpenBSD releases

BSD's Technical Contributions

Fast File System (FFS, Marshall Kirk McKusick, 1984): In the original Unix filesystem, a file's data blocks ended up scattered across the disk as the free list aged, and inodes were stored far from the data they described, so reading a file required long seeks. FFS introduced:

  • Cylinder groups: Disk divided into groups of cylinders; each group has its own inode table and data blocks. Files are allocated in the same cylinder group as their directory.
  • Larger blocks (4KB or 8KB vs 512B) and fragments (1KB sub-blocks for small files)
  • Rotational layout: Blocks of a file laid out with gaps to account for disk rotation time between reads (relevant for rotational disks; less so for SSDs)

The FFS paper reports file access rates up to ten times faster than the original filesystem. FFS concepts survive in ext4's block group allocation.

BSD Sockets (1983): The API for network programming that remains universal today. Key design: sockets as file descriptors — you read() and write() network connections just like files.

Virtual Memory (1979, 3BSD): BSD added demand paging to Unix on the VAX before AT&T's official VM support. In 4.4BSD this original VM system was replaced with one derived from Mach (see file 05, XNU).


System V: AT&T's Commercial Unix (1983)

As BSD was growing through university adoption, AT&T was developing its own commercial Unix lineage: System V (pronounced "System Five"). The 1982 consent decree that broke up the Bell System, effective January 1984, freed AT&T to sell computer products, including Unix.

System V Release 1 (1983): First AT&T release targeting commercial customers. Based on internal development rather than BSD.

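System V Release 2 (1984): Added shell functions and the System V Interface Definition (SVID). The "System V IPC" interfaces (shared memory, semaphores, and message queues), which originated in the Bell-internal CB UNIX and arrived with the early System V releases, are still present in Linux today.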

System V Release 3 (1987): Added STREAMS (a modular I/O subsystem designed by Dennis Ritchie), Remote File Sharing (RFS, AT&T's NFS competitor).

System V Release 4 (SVR4, 1989): The most significant release. It merged AT&T System V, BSD features, SunOS features, and Xenix (Microsoft's x86 Unix). SVR4 introduced:

  • UFS, the BSD Fast File System, as a standard filesystem
  • Virtual filesystem (VFS) switch for filesystem independence
  • POSIX compliance
  • Merged AT&T/BSD networking

SVR4 was co-developed by AT&T and Sun Microsystems. Its direct descendants include Solaris 2.x and UnixWare; HP-UX and AIX kept their earlier System V bases while adopting SVR4 features, and each vendor added proprietary extensions.


AT&T vs. BSDI/UCB Lawsuit (1992–1994): The Lawsuit That Nearly Killed BSD

Background

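By 1990, Berkeley had been rewriting BSD to remove AT&T-licensed code, working toward a fully free distribution; the result was Networking Release 2 (Net/2) in 1991. BSDI (Berkeley Software Design Inc.) was a startup selling BSD/386, a commercial x86 system built on Net/2. (386BSD was a separate, free x86 port by William Jolitz, also based on Net/2.)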

AT&T's Unix System Laboratories (USL) filed suit in 1992 claiming:

  1. BSDI's product was derived from AT&T source code
  2. BSDI was not honoring its license terms
  3. UC Berkeley had included AT&T code in BSD without proper attribution

Berkeley counter-sued, claiming AT&T had incorporated BSD code into System V without credit.

The Effect on Linux: The lawsuit created enormous legal uncertainty around BSD. Hackers who might have contributed to BSD or chosen it as a Linux alternative held back. Torvalds released Linux 0.01 in September 1991; the BSD lawsuit filed in early 1992 created a "don't touch BSD" climate that inadvertently channeled contributor energy toward Linux.

Settlement (January 1994): Negotiated between Novell (which had purchased USL) and UC Berkeley. Berkeley agreed to remove 3 files from the BSD distribution and add copyright notices to others. The resulting 4.4BSD-Lite distribution was freely redistributable. FreeBSD and NetBSD, which had started from Net/2 and 386BSD, rebased onto it; OpenBSD forked from NetBSD in 1995.

The Cost: The lawsuit delayed BSD by approximately two years. Had it not occurred, FreeBSD (not Linux) might have become the dominant open-source Unix clone. By 1994, Linux had two more years of community development and mindshare that it never lost.


POSIX: Standardizing Unix (1988)

By the mid-1980s, Unix had fractured into incompatible variants: BSD, System V, SunOS, HP-UX, AIX, Ultrix, IRIX, Xenix. A program written for BSD required source changes to run on System V (different signal semantics, different IPC mechanisms, different terminal I/O). This incompatibility fragmented the Unix market and frustrated application developers.

POSIX (Portable Operating System Interface, IEEE 1003.1, 1988) defined a standard API for "Unix-like" operating systems:

  • Process creation and control (fork, exec, wait, signals)
  • File I/O (open, read, write, close, stat, lseek)
  • Directory operations (opendir, readdir, mkdir, rmdir)
  • Terminal control (termios)
  • User and group management

POSIX did not define an OS implementation — only the interface. A program written to POSIX could compile and run on any POSIX-compliant system.

Windows NT shipped a POSIX subsystem (for government contract compliance). macOS is certified against the Single UNIX Specification, a superset of POSIX. Linux is not formally certified but is largely compliant in practice. POSIX is why C code that sticks to the standard interfaces usually compiles and runs on Linux, macOS, and FreeBSD with little or no modification.


The Unix Wars (1985–1993): Workstation Fragmentation

The "Unix wars" refer to the period when competing Unix vendors had incompatible workstation Unix variants, making software portability extremely difficult.

Unix Variants in the 1980s-90s:

Vendor      OS          Based On      Processor
------      --          --------      ---------
Sun         SunOS 4.x   4.3BSD        SPARC, 68000
Sun         Solaris     SVR4 + BSD    SPARC, x86
HP          HP-UX       SVR3/4        PA-RISC, IA-64
IBM         AIX         SVR2 + BSD    POWER, RS/6000
DEC         Ultrix      4.3BSD        MIPS, VAX
DEC         Tru64 UNIX  OSF/1         Alpha
SGI         IRIX        SVR3 + BSD    MIPS
Microsoft   Xenix       V7 + SVR3     x86, 68000
Apple       A/UX        SVR2 + BSD    68000

"Almost compatible": same Unix philosophy, same C language, 
incompatible system call numbers, different header files,
different compiler flags, different library ABIs

The commercial Unix vendors formed alliances: AT&T and Sun, later backed by Unix International (UI), promoted SVR4, while IBM, HP, DEC, and others formed the Open Software Foundation (OSF) in 1988 and produced the rival OSF/1. The irony: a system designed for portability had fragmented into a dozen incompatible variants.

Linux's rise from 1991–1998 was partly driven by this frustration. The Linux kernel implemented the POSIX interfaces and was paired with glibc (the GNU C library), offering a single, free target. Application developers could write to one API and expect it to work across Linux distributions.


Unix Family Tree

Unix Family Tree (simplified):

         AT&T UNIX (1969, PDP-7, Thompson/Ritchie)
               |
         v1-v6 (1971-1975) <---- Lions' Commentary (1977)
               |
      +--------+--------+
      |                 |
    v7 (1979)        BSD (1977, Berkeley, Joy)
      |                 |
  +---+----+    +-------+--------+--------+
  |        |    |       |        |        |
System III  1BSD  2BSD  3BSD    4BSD
(1981)    (1977)(1978)(1979)  (1980)
  |                              |
System V                       4.1BSD
(1983)                        (1981)
  |                              |
  |                            4.2BSD (1983) <- TCP/IP, FFS
  |                              |
  +--------+                   4.3BSD (1986)
  |  SVR3  |                     |
  | (1987) |                 Net/2 (1991) <- litigation-clouded
  |        |                     |
  +--------+              4.4BSD (1993) <- post-lawsuit
  |  SVR4  |             /        \
  | (1989) |       FreeBSD     NetBSD (1993)
  +--------+       (1993)        |
       |               |       OpenBSD (1995)
    Solaris          macOS/  
    HP-UX           Darwin (XNU uses FreeBSD code)
    AIX             iOS

           POSIX (1988) ----------+
                                  |
    GNU tools (1984+) --------+   |
    (Stallman: gcc,           |   |
     glibc, bash, etc.)       |   |
               |              |   |
    Minix (Tanenbaum, 1987)   |   |
               |              |   |
          Linux 0.01 (1991)   |   |
    (Torvalds, comp.os.minix) |   |
               |              |   |
          Linux 1.0 (1994) ---+---+
               |          GNU/Linux
          Linux 2.0 (1996, SMP)
               |
          Linux 2.6 (2003, enterprise)
               |
          Linux 5.x (2019+)
          [Android, servers, supercomputers]

Commercial Unix Variants: Technical Notes

Solaris (Sun Microsystems, 1992)

Sun developed SunOS 4.x (BSD-based) through the 1980s, then switched to SVR4 with Solaris 2.0. Solaris technical contributions:

  • ZFS (2005): A filesystem combining volume management, RAID, and filesystem in one layer. Features: copy-on-write, transactional semantics, checksums on all data, snapshots, send/receive. ZFS was open-sourced via OpenSolaris in 2005 and ported to Linux and FreeBSD.
  • DTrace (2005): Dynamic tracing framework for production systems, allowing safe instrumentation of running kernels and applications without rebooting. Ported to macOS, FreeBSD, and Linux.
  • SMF (Service Management Facility): Dependency-aware service management, predecessor to systemd's concept.

HP-UX (HP, 1984)

Hewlett-Packard's Unix for PA-RISC and IA-64 processors. Notable for very early support of large files, large filesystems, and journaling with the VxFS (Veritas Filesystem). HP-UX introduced kernel threads and processor affinity concepts before mainstream Linux.

AIX (IBM, 1986)

IBM's Unix for POWER processors. Notable for:

  • Logical Volume Manager (LVM): Abstraction layer between physical disks and filesystems; the concept was later adopted by Linux LVM and is now used extensively.
  • SMIT: A curses-based system management interface, introduced before GUIs were practical for administrators.
  • Strong backward compatibility: binaries built for much older AIX releases generally still run.

IRIX (SGI, 1988)

SGI's Unix for MIPS processors, targeting 3D graphics and scientific computing. IRIX innovations:

  • XFS: A 64-bit journaling filesystem with excellent large-file performance. XFS was open-sourced to Linux in 2001 and is the default filesystem in RHEL 7+.
  • Tight integration with SGI's graphics hardware, well before commodity GPU drivers existed
  • NUMA (Non-Uniform Memory Access) support, critical for large multiprocessor systems


POSIX vs. Reality: Where Standards Fail

POSIX standardized APIs but not behaviors. Edge cases that differ across implementations:

/* Standard or near-standard calls whose behavior differs by platform: */

/* Signal delivery: can a signal interrupt a system call? */
/* With SA_RESTART set in sa.sa_flags the call is restarted; without it, */
/* read() fails with EINTR. Which calls restart still varies by system. */
sigaction(SIGINT, &sa, NULL);
read(fd, buf, len);  /* check for -1 with errno == EINTR */

/* File locking: advisory vs mandatory, network vs local */
/* flock() is a BSD extension, not POSIX; POSIX locks use fcntl(F_SETLK) */
flock(fd, LOCK_EX);  /* advisory on Linux; doesn't prevent other access */

/* mmap behavior with concurrent writes: */
char *p = mmap(..., MAP_SHARED, fd, 0);
/* Concurrent writes to file: visible immediately? Depends on OS */

/* stat() on /proc: */
stat("/proc/self/fd", &s);  /* st_nlink may be 0, 2, or fd count */

POSIX conformance is a floor, not a ceiling. Code that needs to be truly portable requires careful testing on each target platform.


Production Relevance

Unix history is immediately relevant:

  1. The core system calls you use every day are in POSIX. open(), read(), fork(), exec(), pipe(), socket(): these interfaces were defined by Unix and standardized by POSIX. Understanding their design intent explains their behavior.

  2. The Unix filesystem model (inodes, directories as files, hard links, device files) is what ext4, XFS, APFS, ZFS all implement. When you debug a filesystem problem, you are debugging an inode structure.

  3. The BSD code is in macOS. The networking stack, filesystem semantics, and POSIX layer in macOS directly descend from 4.4BSD. Understanding BSD helps you understand macOS internals.

  4. System V IPC is in Linux. shmget(), semget(), msgget(): the System V IPC mechanisms from the early 1980s are still in the Linux kernel and used by databases (PostgreSQL, Oracle) for shared memory and semaphores.

  5. The fragmentation that POSIX solved is recurring with containers and cloud APIs. Docker, Kubernetes, cloud provider APIs — all show the same fragmentation pattern that Unix showed in the 1980s. POSIX emerged; what will emerge to standardize container interfaces?


Key Figures

Person                                    Contribution
------                                    ------------
Ken Thompson                              Unix origins, PDP-7, filesystem, pipes
Dennis Ritchie                            C language, Unix co-author, portability
Doug McIlroy                              Pipes concept, Unix philosophy articulation
Bill Joy                                  BSD, TCP/IP, vi, csh
Marshall Kirk McKusick                    FFS, 4.4BSD, FreeBSD
Kirk McKusick, Sam Leffler, Mike Karels   BSD networking team
John Lions                                Lions' Commentary, which taught a generation
Andrew Tanenbaum                          Minix, influenced Torvalds' Linux
Richard Stallman                          GNU tools: gcc, glibc, bash (without these, Linux would have had no tools)

Lessons Learned

  1. A good idea spreads through source code, not documentation. Unix spread because AT&T gave universities the source. BSD spread because Berkeley released it. Linux spread because Torvalds released it under GPL. Every major OS advance in this history was propelled by source code availability.

  2. Simplification is harder than generalization. Multics tried to generalize everything; Unix simplified aggressively. The simplified version won. The hardest thing in system design is deciding what to leave out.

  3. Legal battles delay technology and redirect communities. The AT&T/BSD lawsuit delayed BSD by two years and handed Linux a runway it might not otherwise have had. Patent and copyright wars in computing consistently damage the ecosystem more than they protect intellectual property.

  4. Standardization follows fragmentation, not the reverse. POSIX was created after the Unix wars made incompatibility painful. Standardization bodies rarely anticipate fragmentation; they clean it up afterward.

  5. The Unix philosophy scales. Pipes as composition, text as universal interface, small single-purpose tools — these principles scale from shell scripts to microservices. The conceptual model that Thompson designed for an 18-bit machine in 1969 still shapes distributed system design.


Exercises

  1. Examine the inode structure in a real filesystem: create a file, examine its inode with stat, debugfs, or istat. Add a hard link. Observe that both names share the same inode number. Delete one name. What happens to the inode?
  2. Implement a simple pipe-based data pipeline: write three small programs (a producer, a filter, a consumer) and connect them with Unix pipes. Measure throughput vs passing data through files.
  3. Research the BSD socket API history: find the original 4.2BSD socket implementation. How many lines of code was the original accept/listen/connect TCP implementation? Compare to the current Linux TCP implementation size.
  4. Read "The Evolution of the Unix Time-sharing System" (Ritchie, 1979, available online). What aspects of the Unix design does Ritchie identify as accidents rather than design? What would he do differently?
  5. Compile a simple POSIX program on Linux and macOS. Find at least one behavior difference (hint: try SIGCHLD handling with wait(), or O_DSYNC semantics, or mmap of a sparse file).

References

  • Ritchie, D.M. and Thompson, K. (1974). "The UNIX Time-Sharing System." CACM.
  • Ritchie, D.M. (1979). "The Evolution of the Unix Time-sharing System." Language Design and Programming Methodology.
  • Lions, J. (1977, published 1996). Lions' Commentary on UNIX 6th Edition. Peer-to-Peer Communications.
  • McKusick, M.K. et al. (1996). The Design and Implementation of the 4.4BSD Operating System. Addison-Wesley.
  • McKusick, M.K., Joy, W.N., Leffler, S.J., and Fabry, R.S. (1984). "A Fast File System for UNIX." ACM Transactions on Computer Systems.
  • Salus, P.H. (1994). A Quarter Century of UNIX. Addison-Wesley.
  • Garfinkel, S. and Spafford, G. (1993). Practical UNIX Security. O'Reilly.
  • IEEE Standard 1003.1 (POSIX). IEEE, 1988.
  • Raymond, E.S. (2003). The Art of Unix Programming. Addison-Wesley.