Section 14: Device Drivers
Purpose and Scope
Device drivers are the software translation layer between the kernel's abstract device model and the concrete behavior of hardware. This section covers the Linux driver model from first principles: how drivers are structured as kernel modules, how they integrate with the bus and power management subsystems, and the mechanics of each device class (character, block, network, platform, PCIe, USB). It extends to the specialized domains of GPU driver architecture, the audio stack (ALSA/JACK), storage driver stacks, interrupt handling (legacy, MSI, MSI-X), DMA, ACPI, Device Tree, and the practical discipline of driver debugging.
The subject demands precision: a driver bug can corrupt kernel memory directly, and incorrect DMA programming can trigger IOMMU faults or silent data corruption. Understanding driver architecture is a prerequisite to understanding how hardware actually behaves at runtime.
Prerequisites
- Section 02 (CPU Architecture): PCIe topology, interrupt delivery, DMA, MMIO
- Section 03 (OS Fundamentals): kernel modules, system calls, process/interrupt context
- Section 11 (Memory Management): DMA API, IOMMU, kmalloc/vmalloc
- Basic C programming; familiarity with Linux kernel coding style
Learning Objectives
Upon completing this section you will be able to:
- Write a minimal loadable kernel module (LKM) and explain the module lifecycle (init, exit, reference counting).
- Describe the Linux device model: bus, device, driver, class, and how sysfs exposes them.
- Implement a character device with file_operations (open, read, write, ioctl, mmap).
- Explain PCIe BAR mapping, MSI-X interrupt setup, and DMA ring buffer management.
- Describe the USB subsystem: host controller, hub topology, URB submission, and endpoint types.
- Explain how the DMA API works (dma_alloc_coherent, dma_map_sg) and when to use each variant.
- Describe interrupt handling: hardirq vs softirq vs tasklet vs workqueue and when to use each.
- Use kernel debugging tools: printk, dynamic debug, ftrace, kprobe, kmemleak, KASAN, syzkaller.
- Describe ACPI enumeration and Device Tree for embedded/ARM systems.
Architecture Overview
Hardware
┌────────────────────────────────────────────────────────────────┐
│ PCIe Device    USB Device       Platform Device     GPU        │
│ (NIC, NVMe)    (HID, Storage)   (SoC peripheral)    (discrete) │
└────────┬─────────────┬─────────────────┬────────────────┬──────┘
         │             │                 │                │
┌────────▼─────────────▼─────────────────▼────────────────▼──────┐
│                         Bus Subsystem                          │
│        pci_bus │ usb_bus │ platform_bus │ i2c_bus │ SPI        │
└────────┬──────────────┬────────────────┬───────────────────────┘
         │              │                │
┌────────▼──────────────▼────────────────▼───────────────────────┐
│                   Driver Core (driver model)                   │
│     struct device ─ struct device_driver ─ struct bus_type     │
│           probe() / remove() / suspend() / resume()            │
└─────────┬──────────────────────────────────────────────────────┘
          │
┌─────────▼──────────────────────────────────────────────────────┐
│                         Device Classes                         │
│ ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────────────┐ │
│ │ char     │  │ block    │  │ network  │  │ misc/platform    │ │
│ │ /dev/xxx │  │ blk-mq   │  │ netdev   │  │ input, sound     │ │
│ └──────────┘  └──────────┘  └──────────┘  └──────────────────┘ │
└─────────┬──────────────────────────────────────────────────────┘
          │
┌─────────▼──────────────────────────────────────────────────────┐
│                     Interrupt & DMA Layer                      │
│    IRQ domain ─ MSI/MSI-X ─ NAPI polling ─ DMA API ─ IOMMU     │
└────────────────────────────────────────────────────────────────┘
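The bus-to-driver-core handoff in the diagram can be modeled in plain userspace C. This is a sketch under stated assumptions, not the kernel's implementation: the `model_*` types only imitate the shape of an ID table and a `probe()` callback, the vendor/device IDs are made up, and the policy of moving on to the next driver after a failed probe is a simplification chosen for the model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only; all names below are hypothetical. */
struct model_driver;

struct model_device_id {
    uint16_t vendor;
    uint16_t device;
};

struct model_device {
    struct model_device_id id;
    const struct model_driver *driver;  /* NULL until a driver binds */
};

struct model_driver {
    const char *name;
    const struct model_device_id *id_table;  /* terminated by {0, 0} */
    int (*probe)(struct model_device *dev);  /* 0 on success, negative on failure */
};

/* Bus-level match: does any ID table entry equal the device's ID? */
int model_bus_match(const struct model_device *dev,
                    const struct model_driver *drv)
{
    const struct model_device_id *id;

    for (id = drv->id_table; id->vendor != 0 || id->device != 0; id++) {
        if (id->vendor == dev->id.vendor && id->device == dev->id.device)
            return 1;
    }
    return 0;
}

/* Walk the registered drivers; bind the first match whose probe() succeeds. */
const struct model_driver *
model_bus_attach(struct model_device *dev,
                 const struct model_driver *const *drivers, size_t n)
{
    size_t i;

    assert(dev != NULL && drivers != NULL);
    for (i = 0; i < n; i++) {
        if (!model_bus_match(dev, drivers[i]))
            continue;
        if (drivers[i]->probe(dev) == 0) {
            dev->driver = drivers[i];
            return drivers[i];
        }
    }
    return NULL;  /* device stays unbound */
}

/* Two example drivers claiming the same made-up ID. */
int model_probe_ok(struct model_device *dev)   { (void)dev; return 0; }
int model_probe_fail(struct model_device *dev) { (void)dev; return -1; }

const struct model_device_id model_ids[] = { { 0x1234, 0x0001 }, { 0, 0 } };
const struct model_driver model_broken_drv = { "broken", model_ids, model_probe_fail };
const struct model_driver model_nic_drv    = { "nic",    model_ids, model_probe_ok };
```

Passing both example drivers to `model_bus_attach()` for a device with ID `0x1234:0x0001` binds `model_nic_drv`, because the first driver matches but its probe fails; a device with any other ID stays unbound.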
Key Concepts
- Kernel Module (LKM): Loadable object file that extends the kernel at runtime; `insmod`/`modprobe` load, `rmmod` unloads; reference counted.
- Device Model: Linux's unified representation: each device is a `struct device`; each driver is a `struct device_driver`; buses match devices to drivers and then call the driver's `probe()`.
- sysfs: Virtual filesystem at /sys exposing the device model hierarchy, driver bindings, and device attributes; udev reads sysfs to create /dev nodes.
- Character Device: Device accessed as a stream (bytes); file_operations table maps system calls to driver functions; major/minor numbers identify the device.
- Block Device: Device accessed in fixed-size sectors via the block layer, which carries I/O as `struct bio` objects.
- Network Device: `struct net_device`; transmit via `ndo_start_xmit()`; receive via NAPI polling.
- Platform Device: Device without an enumerable bus (SoC peripherals); described by Device Tree or ACPI; matched by compatible string.
- PCIe Driver: Registered via `pci_driver`; `probe()` called on PCI ID match; maps BARs via `ioremap()`, sets up MSI-X.
- USB Subsystem: Host Controller Interface (xHCI/EHCI) manages USB topology; URBs (USB Request Blocks) are the I/O primitive; endpoint types: control, bulk, interrupt, isochronous.
- DMA API: `dma_alloc_coherent()` for CPU/device-coherent buffers; `dma_map_single()`/`dma_map_sg()` for streaming mappings of existing kernel buffers; IOMMU performs address translation.
- Hardirq / Softirq: Hardirq (ISR) runs at interrupt priority and must be fast; softirq deferred work runs at reduced priority; NAPI is a softirq-based receive polling mechanism.
- Tasklet: Deferred work that runs in softirq context, serialized against itself; deprecated in modern kernels (replaced by workqueue or threaded IRQ).
- Workqueue: Kernel threads that execute deferred work in process context (can sleep); `schedule_work()`.
- MSI/MSI-X: PCI message-signaled interrupts; the device writes a data value to a designated address instead of asserting a pin; MSI-X provides up to 2048 independent vectors per device.
- ACPI: Advanced Configuration and Power Interface; firmware tables describe hardware topology on x86/x64; AML bytecode executed by the kernel.
- Device Tree: Hardware description for embedded/ARM systems; FDT blob passed from bootloader; nodes describe memory ranges, IRQs, clocks.
- GPU Driver: KMS/DRM subsystem for display; GEM for memory management; command submission ring buffers; fence objects for GPU/CPU synchronization.
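The major/minor pair mentioned under Character Device is packed into a single device number. The sketch below models the kernel-internal layout (a 20-bit minor in the low bits, the major above it) in userspace C; the `model_*` names are hypothetical stand-ins for the kernel's `MKDEV`, `MAJOR`, and `MINOR` macros, and the encoding that userspace libraries use for `dev_t` is different and not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the kernel-internal device number layout: 12-bit major, 20-bit minor. */
#define MODEL_MINORBITS 20u
#define MODEL_MINORMASK ((1u << MODEL_MINORBITS) - 1u)

typedef uint32_t model_dev_t;

/* Pack a major/minor pair into one device number. */
model_dev_t model_mkdev(uint32_t major, uint32_t minor)
{
    assert(major < (1u << 12));        /* major must fit in 12 bits */
    assert(minor <= MODEL_MINORMASK);  /* minor must fit in 20 bits */
    return (major << MODEL_MINORBITS) | minor;
}

/* Extract the major number (selects the driver). */
uint32_t model_major(model_dev_t dev)
{
    return dev >> MODEL_MINORBITS;
}

/* Extract the minor number (selects the instance within that driver). */
uint32_t model_minor(model_dev_t dev)
{
    return dev & MODEL_MINORMASK;
}
```

For example, /dev/null is conventionally major 1, minor 3, so `model_mkdev(1, 3)` yields `0x00100003` in this layout.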
Major Historical Milestones
| Year | Milestone |
|---|---|
| 1991 | Linux 0.01: inline, monolithic device support; no module system |
| 1995 | Linux kernel modules (LKM) infrastructure stabilizes in 1.x |
| 1999 | USB 1.1 Linux support; UHCI/OHCI host controller drivers |
| 2001 | Linux driver model (kobject, sysfs) introduced — Greg KH, Pat Mochel |
| 2002 | NAPI merged for high-speed network receive; reduces IRQ overhead |
| 2003 | udev created; replaces static /dev with dynamic device files |
| 2003 | Linux 2.6.0: unified driver model, class system, sysfs |
| 2006 | PCI MSI support in Linux; MSI-X widely adopted by 2008 |
| 2008 | DRM/KMS merged: kernel mode setting for GPU display |
| 2009 | ARM Device Tree support begins merging into mainline |
| 2009 | Linux USB 3.0 (xHCI) support merged (2.6.31) |
| 2010 | IOMMU API unified in Linux (Intel VT-d and AMD-Vi) |
| 2012 | VFIO framework merged (Linux 3.6): safe device passthrough to userspace/VMs |
| 2014 | Threaded IRQ becomes recommended pattern for drivers |
| 2016 | RDMA subsystem refactored; mlx5 core split into core + RDMA/Eth |
| 2018 | GPU compute (OpenCL/CUDA/ROCm) via DRM render nodes |
| 2021 | Rust-for-Linux project begins (driver abstractions in Rust) |
| 2022 | Initial Rust support merged into mainline (Linux 6.1) |
Modern Relevance and Production Use Cases
NVMe drivers in the Linux kernel use per-CPU I/O queues mapped to hardware queues via blk-mq; understanding the driver-to-hardware queue mapping is essential for NUMA-aware storage performance.
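The many-to-one relationship between CPUs and hardware queues can be illustrated with a toy mapping. This is a deliberately naive userspace sketch: it assigns queues round-robin by CPU number, whereas the kernel's real mapping also accounts for CPU topology and interrupt affinity. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: fill map[cpu] with the hardware queue that CPU submits to.
 * Assumption: plain round-robin; topology and NUMA locality are ignored. */
void model_map_queues(unsigned int *map, size_t nr_cpus,
                      unsigned int nr_hw_queues)
{
    size_t cpu;

    assert(map != NULL && nr_hw_queues > 0);
    for (cpu = 0; cpu < nr_cpus; cpu++)
        map[cpu] = (unsigned int)(cpu % nr_hw_queues);
}

/* Count how many CPUs share one hardware queue; more sharers means
 * more contention on that queue's submission path. */
size_t model_queue_sharers(const unsigned int *map, size_t nr_cpus,
                           unsigned int queue)
{
    size_t cpu, count = 0;

    assert(map != NULL);
    for (cpu = 0; cpu < nr_cpus; cpu++) {
        if (map[cpu] == queue)
            count++;
    }
    return count;
}
```

With 8 CPUs and 4 hardware queues, each queue ends up shared by two CPUs; with as many queues as CPUs, every CPU gets a private queue. A NUMA-aware mapping would additionally try to keep the CPUs that share a queue on the same node, which this model does not attempt.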
RDMA NIC drivers (mlx5, efa for AWS Nitro) expose kernel-bypass data paths to userspace through the RDMA verbs interface; ML training jobs on GPU clusters depend on correctly configured RoCE or InfiniBand driver settings.
GPU drivers (i915 for Intel, amdgpu for AMD, nouveau for open-source NVIDIA support) manage command submission, fence synchronization, and the display pipeline; production ML inference depends on correct GPU memory management in the driver.
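The Android kernel carries Android-specific drivers on top of the standard driver model (binder IPC, and historically the ion memory allocator, since superseded by DMA-BUF heaps), with userspace HALs such as gralloc layered above them; understanding the driver model is the foundation for Android hardware support.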
Rust-for-Linux is actively introducing Rust driver abstractions; understanding the C driver model illuminates what the Rust abstractions are replacing and why.
File Map
| File | Description |
|---|---|
| `01-kernel-modules.md` | LKM structure, module_init/exit, parameters, signing |
| `02-driver-model.md` | kobject, kset, device/driver/bus, probe/remove lifecycle |
| `03-sysfs-udev.md` | sysfs layout, attribute groups, udev rules, devtmpfs |
| `04-char-devices.md` | cdev, file_operations, major/minor, ioctl conventions |
| `05-block-devices.md` | gendisk, blk-mq integration, bio submission |
| `06-network-devices.md` | net_device, NAPI, ethtool, netdev queues |
| `07-platform-devices.md` | platform_driver, resource enumeration, devm_ managed resources |
| `08-pcie-driver-model.md` | pci_driver, BAR mapping, reset, error recovery (AER) |
| `09-usb-subsystem.md` | URB, endpoint types, USB classes, hub driver |
| `10-gpu-driver-architecture.md` | DRM/KMS, GEM, fence, render nodes, command submission |
| `11-alsa-jack-audio.md` | ALSA PCM/control, JACK low-latency, sound card abstraction |
| `12-storage-driver-stack.md` | SCSI mid-layer, libata, NVMe driver queue mapping |
| `13-dma-api.md` | Coherent vs streaming DMA, scatter-gather, bounce buffers |
| `14-interrupt-handling.md` | IRQ domain, request_irq, threaded IRQ, softirq, workqueue |
| `15-msi-msix.md` | MSI vs MSI-X, vector allocation, affinity, IRQ balancing |
| `16-acpi-device-tree.md` | ACPI tables (DSDT/SSDT), AML, FDT, compatible strings |
| `17-power-management.md` | Runtime PM, system suspend/resume, wakeup sources |
| `18-driver-debugging.md` | dynamic_debug, ftrace, kprobes, KASAN, kmemleak, syzkaller |
Cross-References
- Section 02 (CPU Architecture): PCIe topology, MSI interrupt routing, MMIO, IOMMU
- Section 03 (OS Fundamentals): kernel module loading, process vs interrupt context
- Section 11 (Memory Management): DMA API, IOMMU, kmalloc for driver buffers
- Section 12 (Storage Systems): block layer — storage drivers plug into blk-mq
- Section 13 (Filesystems): MTD layer for flash drivers, char device for raw flash
- Section 15 (Networking): network driver NAPI, XDP in driver context
- Section 19 (Virtualization): VFIO passthrough, virtio driver model