04 — Signals

Technical Overview

Signals are the simplest inter-process communication mechanism in Unix: a one-way, asynchronous notification that something has happened. They carry no data beyond their number, can be sent from the kernel (hardware exception, timer expiry, child death) or from another process, and their delivery interrupts normal program flow in a way that demands careful, defensive programming. Despite their age — signals date to the earliest UNIX — they remain pervasive in every Linux system, and misunderstanding them is a consistent source of production bugs.

Prerequisites

01-process-concept.md: task_struct, signal fields, process state
03-process-lifecycle.md: SIGCHLD, zombie reaping
Basic C: function pointers, volatile, setjmp/longjmp awareness

Core Content

Signal Taxonomy

Linux supports two classes of signals:

Standard POSIX signals (1–31): fixed semantics, non-queuing (if the same signal is pending twice before delivery, it delivers once).

Real-time signals (32–64 in the kernel; glibc reserves the first two for its threading implementation, so SIGRTMIN is typically 34): queued (multiple instances stack), carry a value (si_value in siginfo_t), delivered lowest number first.
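The difference between coalescing and queuing can be observed directly. The following is a minimal sketch, assuming a single-threaded Linux process; the helper name count_deliveries is made up for illustration. It raises a signal several times while it is blocked, then unblocks it and counts handler invocations.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <string.h>

static volatile sig_atomic_t deliveries;

/* The handler only counts how often it is invoked. */
static void count_delivery(int sig)
{
    (void)sig;
    deliveries++;
}

/* Raise `sig` `times` times while it is blocked, then unblock it.
 * Returns how many times the handler ran, or -1 on setup failure. */
int count_deliveries(int sig, int times)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask;
    int seen;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = count_delivery;
    sigemptyset(&sa.sa_mask);
    if (sigaction(sig, &sa, &old_sa) == -1)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, sig);
    if (sigprocmask(SIG_BLOCK, &block, &old_mask) == -1)
        return -1;

    deliveries = 0;
    for (int i = 0; i < times; i++)
        raise(sig);                         /* stays pending: sig is blocked */

    sigprocmask(SIG_UNBLOCK, &block, NULL); /* pending instances are delivered here */
    seen = deliveries;

    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(sig, &old_sa, NULL);
    return seen;
}
```

On Linux, count_deliveries(SIGUSR1, 3) should report a single delivery, while count_deliveries(SIGRTMIN, 3) should report three.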

Key standard signals and their default actions (numbers are for x86 and ARM; several differ on Alpha, MIPS, and SPARC):

Signal	Number	Default action	Common trigger
`SIGHUP`	1	Terminate	Terminal hangup; also: reload config
`SIGINT`	2	Terminate	Ctrl+C from terminal
`SIGQUIT`	3	Core dump	Ctrl+\ from terminal
`SIGILL`	4	Core dump	Illegal CPU instruction
`SIGTRAP`	5	Core dump	Breakpoint / ptrace step
`SIGABRT`	6	Core dump	`abort()` call
`SIGBUS`	7	Core dump	Bus error (misaligned access, mmap'd file truncated)
`SIGFPE`	8	Core dump	Floating-point / integer divide-by-zero
`SIGKILL`	9	Terminate	Uncatchable, unblockable, unignorable
`SIGSEGV`	11	Core dump	Invalid memory access
`SIGPIPE`	13	Terminate	Write to broken pipe with no reader
`SIGALRM`	14	Terminate	`alarm()` timer expiry
`SIGTERM`	15	Terminate	Standard termination request (graceful shutdown)
`SIGCHLD`	17	Ignore	Child stopped or terminated
`SIGCONT`	18	Continue	Resume if stopped
`SIGSTOP`	19	Stop	Uncatchable, unblockable, unignorable
`SIGTSTP`	20	Stop	Ctrl+Z from terminal (catchable)
`SIGUSR1`	10	Terminate	User-defined (convention: reload/reopen logs)
`SIGUSR2`	12	Terminate	User-defined
`SIGWINCH`	28	Ignore	Terminal window size changed

SIGKILL (9) and SIGSTOP (19) are the two signals that cannot be caught, blocked, or ignored. This is intentional: they give the kernel and administrators unconditional control over any process.
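The kernel enforces this at the API boundary. As a small sketch (the helper names can_install_handler and can_block are illustrative, not part of any library): sigaction() rejects SIGKILL and SIGSTOP with EINVAL, and sigprocmask() silently drops them from the requested mask.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static void noop_handler(int sig) { (void)sig; }

/* Returns 1 if a handler can be installed for `sig`, 0 if the kernel
 * refuses with EINVAL, -1 on any other error. The previous disposition
 * is restored before returning. */
int can_install_handler(int sig)
{
    struct sigaction sa, old;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = noop_handler;
    sigemptyset(&sa.sa_mask);

    if (sigaction(sig, &sa, &old) == -1)
        return errno == EINVAL ? 0 : -1;

    sigaction(sig, &old, NULL);
    return 1;
}

/* Returns 1 if `sig` ends up in the thread's mask after a SIG_BLOCK
 * request, 0 if the kernel ignored the request, -1 on error. */
int can_block(int sig)
{
    sigset_t want, old_mask, now;
    int blocked;

    sigemptyset(&want);
    sigaddset(&want, sig);
    if (sigprocmask(SIG_BLOCK, &want, &old_mask) == -1)
        return -1;
    sigprocmask(SIG_BLOCK, NULL, &now);     /* NULL set: query only */
    blocked = sigismember(&now, sig);
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    return blocked;
}
```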

Signal Delivery Internals

Signal delivery involves two phases: generation (the signal is sent) and delivery (the process actually handles it). Between the two, the signal is pending.

Signal generation path:
─────────────────────────────────────────────────────────────────────
Source                          Kernel action
─────────────────────────────────────────────────────────────────────
kill(pid, sig)                  send_signal() → sigaddset(&pending, sig)
                                               set TIF_SIGPENDING
hardware exception              do_trap() → force_sig()
timer expiry (SIGALRM)          hrtimer_interrupt → send_signal()
terminal Ctrl+C                 tty driver → kill_pgrp(SIGINT)
child exit                      do_exit() → do_notify_parent() → SIGCHLD
─────────────────────────────────────────────────────────────────────

Signal delivery path (simplified):
─────────────────────────────────────────────────────────────────────
  interrupt/syscall return
         │
         ▼
  exit_to_user_mode_loop()
         │
         ├─ TIF_SIGPENDING set?
         │    yes → get_signal()
         │              │
         │              ├─ iterate pending signals
         │              │     skip if blocked (in task->blocked mask)
         │              │     skip if SIG_IGN
         │              │
         │              ├─ SIG_DFL action?
         │              │     terminate → do_group_exit()
         │              │     core dump → do_coredump()
         │              │     stop → do_signal_stop()
         │              │     ignore → dequeue, continue
         │              │
         │              └─ custom handler (sigaction)?
         │                    → setup_rt_frame() — build sigframe on user stack
         │                       set PC = handler address
         │                       set SP = sigframe
         │                       return to user space running handler
         │
         └─ no pending signals → return normally
─────────────────────────────────────────────────────────────────────

The kernel builds a signal frame (rt_sigframe) on the user-mode stack. It contains a copy of the interrupted CPU state and signal mask (ucontext_t), which the sigreturn(2) syscall uses to restore the interrupted context after the handler returns.
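Part of the saved context is visible from user space. The sketch below (function and variable names are illustrative) uses the third argument of an SA_SIGINFO handler, which points at the ucontext_t in the frame, to compare the mask in force while the handler runs with the mask that sigreturn will restore.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <ucontext.h>

/* Results recorded by the handler: 1 = blocked, 0 = not blocked. */
static volatile sig_atomic_t blocked_while_handling = -1;
static volatile sig_atomic_t blocked_in_saved_context = -1;

static void inspect_frame(int sig, siginfo_t *info, void *ctx)
{
    const ucontext_t *uc = ctx;   /* saved state stored in the signal frame */
    sigset_t current;

    (void)info;
    sigprocmask(SIG_BLOCK, NULL, &current);        /* NULL set: query only */
    blocked_while_handling = sigismember(&current, sig);
    blocked_in_saved_context = sigismember(&uc->uc_sigmask, sig);
}

/* Install the handler for `sig`, deliver it once, then restore the
 * previous disposition and mask. Returns 0 on success. */
int run_frame_demo(int sig)
{
    struct sigaction sa, old_sa;
    sigset_t unblock, old_mask;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = inspect_frame;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(sig, &sa, &old_sa) == -1)
        return -1;

    sigemptyset(&unblock);
    sigaddset(&unblock, sig);
    sigprocmask(SIG_UNBLOCK, &unblock, &old_mask);

    raise(sig);   /* the handler has returned by the time raise() returns */

    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(sig, &old_sa, NULL);
    return 0;
}
```

After run_frame_demo(SIGUSR1), the signal is reported as blocked inside the handler (no SA_NODEFER was requested) but as unblocked in the saved context, which is the mask the thread gets back after sigreturn.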

sigaction(): The Correct Way to Install Handlers

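signal() is the historical interface, and its semantics vary between UNIX implementations (whether the handler is reset to SIG_DFL on delivery, and whether interrupted system calls are restarted). Always use sigaction():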

struct sigaction {
    void     (*sa_handler)(int);          // simple handler
    void     (*sa_sigaction)(int, siginfo_t *, void *); // if SA_SIGINFO
    sigset_t   sa_mask;     // signals to block during handler
    int        sa_flags;
    void     (*sa_restorer)(void);        // internal, do not use
};

Key sa_flags:

Flag	Effect
`SA_RESTART`	Automatically restart syscalls interrupted by this signal instead of returning `EINTR`. Essential for library code that cannot handle `EINTR`
`SA_SIGINFO`	Use `sa_sigaction` (3-arg) instead of `sa_handler`; handler receives `siginfo_t` with details
`SA_NODEFER`	Do not automatically block the signal during its own handler (allows reentrancy)
`SA_RESETHAND`	Reset the handler to `SIG_DFL` after first delivery (one-shot)
`SA_NOCLDWAIT`	(SIGCHLD only) Do not create zombies; auto-reap children
`SA_NOCLDSTOP`	(SIGCHLD only) Do not deliver SIGCHLD when children stop/continue
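A minimal sketch of installing a handler with explicit flags follows. The helper names install_handler and handler_is_default are hypothetical, and blocking SIGINT and SIGTERM via sa_mask is only an example of how sa_mask is filled in.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <string.h>

static volatile sig_atomic_t got_signal;

static void on_signal(int sig) { got_signal = sig; }

/* Install `handler` for `sig` with the given sa_flags. While the handler
 * runs, SIGINT and SIGTERM are blocked in addition to `sig` itself. */
int install_handler(int sig, void (*handler)(int), int flags)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sa.sa_flags = flags;
    sigemptyset(&sa.sa_mask);
    sigaddset(&sa.sa_mask, SIGINT);
    sigaddset(&sa.sa_mask, SIGTERM);
    return sigaction(sig, &sa, NULL);
}

/* Returns 1 if the current disposition of `sig` is SIG_DFL, 0 if not,
 * -1 on error. Passing NULL as the new action only queries. */
int handler_is_default(int sig)
{
    struct sigaction cur;

    if (sigaction(sig, NULL, &cur) == -1)
        return -1;
    return cur.sa_handler == SIG_DFL;
}
```

With SA_RESETHAND in the flags, the handler runs once and handler_is_default() reports the default disposition afterwards, so a second delivery would take the default action.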

siginfo_t fields for SA_SIGINFO handlers:

siginfo_t {
    int      si_signo;   // signal number
    int      si_errno;
    int      si_code;    // SI_USER, SI_KERNEL, SI_TKILL, CLD_EXITED, ...
    pid_t    si_pid;     // sending process PID
    uid_t    si_uid;     // sending process real UID
    void    *si_addr;    // faulting address (SIGSEGV, SIGBUS, SIGILL)
    int      si_status;  // exit status or signal (SIGCHLD)
    union sigval si_value; // RT signal value
}
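As an illustration of reading these fields, the sketch below installs a three-argument handler and records who sent the signal. The names record_sender and capture_siginfo are made up for this example. It assumes raise() is implemented with tgkill(), as in glibc, in which case si_code is expected to be SI_TKILL.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t seen_signo, seen_code, seen_pid;

/* Three-argument handler: copy the interesting siginfo_t fields out. */
static void record_sender(int sig, siginfo_t *info, void *ctx)
{
    (void)ctx;
    seen_signo = sig;
    seen_code = info->si_code;        /* how the signal was generated */
    seen_pid = (int)info->si_pid;     /* sender PID for user-sent signals */
}

/* Install the handler for `sig`, send the signal to the calling thread,
 * then restore the previous disposition. Returns 0 on success. */
int capture_siginfo(int sig)
{
    struct sigaction sa, old_sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = record_sender;  /* note: sa_sigaction, not sa_handler */
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(sig, &sa, &old_sa) == -1)
        return -1;

    raise(sig);                       /* handler runs before raise() returns */

    sigaction(sig, &old_sa, NULL);
    return 0;
}
```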

Signal Masks: Blocking Signals

Each task has a blocked sigset in its task_struct. Signals in the blocked set are not delivered while blocked; they remain pending (thread-directed signals in task_struct->pending, process-directed signals in the shared signal_struct->shared_pending) until unblocked.

sigset_t mask, old_mask;
sigemptyset(&mask);
sigaddset(&mask, SIGINT);
sigaddset(&mask, SIGTERM);

// Block SIGINT and SIGTERM in this thread:
sigprocmask(SIG_BLOCK, &mask, &old_mask);

// ... critical section ...

// Restore previous mask (unblocks SIGINT/SIGTERM):
sigprocmask(SIG_SETMASK, &old_mask, NULL);

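The signal mask is per-thread. In a multi-threaded program use pthread_sigmask(); POSIX leaves sigprocmask() unspecified there, although on Linux it changes the calling thread's mask. The POSIX rule: signals sent to the process (via kill(pid, sig)) are delivered to an arbitrary thread that has the signal unblocked. Signals sent to a specific thread (via tgkill or pthread_kill) are queued for that thread only; if it has the signal blocked, the signal stays pending until that thread unblocks it (SIGKILL and SIGSTOP cannot be blocked and always act on the whole process).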

EINTR and SA_RESTART

Many blocking syscalls (read, write, accept, nanosleep, wait) return -1 with errno = EINTR if interrupted by a signal before completion. This is not an error — it is the mechanism by which signals interrupt long operations.

Handling strategies:

1. SA_RESTART: the kernel automatically restarts the syscall. Not all syscalls restart (see man 7 signal, "Interruption of system calls..."). nanosleep and pause never restart.
2. Manual retry loop:

ssize_t r;
do {
    r = read(fd, buf, len);
} while (r == -1 && errno == EINTR);

3. signalfd(): block signals with sigprocmask, then read them from a file descriptor. No EINTR is possible because the signal is consumed via read() rather than interrupting it. See below.
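The manual retry loop is often wrapped in a helper that also copes with short reads. A minimal sketch follows; the name read_full is hypothetical and not a libc function.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to `len` bytes from `fd`, retrying on EINTR and continuing
 * after short reads. Returns the number of bytes read (less than `len`
 * only at end of file), or -1 on a real error. */
ssize_t read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t r = read(fd, p + done, len - done);
        if (r == -1) {
            if (errno == EINTR)
                continue;        /* interrupted before any data: retry */
            return -1;           /* genuine error, errno is set */
        }
        if (r == 0)
            break;               /* end of file */
        done += (size_t)r;
    }
    return (ssize_t)done;
}
```

A caller can then treat -1 as a real failure without inspecting errno for EINTR at every call site.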

kill(), tkill(), tgkill()

kill(pid, sig)         — send sig to process (thread group) pid
                         if pid == 0: send to every process in the caller's process group
                         if pid == -1: send to every process the caller may signal (except PID 1 and, on Linux, the caller itself)
                         if pid < -1: send to process group |pid|
tkill(tid, sig)        — send to specific thread by TID (deprecated, use tgkill)
tgkill(tgid, tid, sig) — send to thread tid within thread group tgid (safe: checks tgid)
raise(sig)             — send to calling thread (= tgkill(getpid(), gettid(), sig))

tgkill is the correct way to signal a specific POSIX thread: it validates that the TID belongs to the expected thread group, so the signal cannot hit an unrelated process if the thread has exited and its TID has been recycled.
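The thread-group check can be exercised with signal 0, which performs the lookup and permission checks without delivering anything. This sketch calls the raw syscalls because glibc only gained gettid() and tgkill() wrappers in version 2.30; the helper names current_tid and send_to_thread are illustrative.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Kernel thread ID of the calling thread. */
pid_t current_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* Send `sig` to thread `tid`, but only if it belongs to thread group
 * `tgid`. Returns 0 on success, -1 with errno set otherwise (ESRCH when
 * the tid does not exist or is not a member of tgid). */
int send_to_thread(pid_t tgid, pid_t tid, int sig)
{
    return (int)syscall(SYS_tgkill, tgid, tid, sig);
}
```

For example, send_to_thread(getpid(), current_tid(), 0) succeeds, while passing a thread group ID that does not own the thread fails with ESRCH instead of signalling the wrong target.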

Signal Safety: async-signal-safe Functions

Signal handlers execute asynchronously with respect to the main program. If the main program is inside malloc() holding the allocator lock when the signal arrives, and the handler also calls malloc(), the result is a deadlock.

POSIX defines a list of async-signal-safe functions that can be safely called from a signal handler. They do not use non-reentrant locks:

Safe:                           NOT safe:
─────────────────────────────── ──────────────────────────────────
write(2)                        printf (uses FILE* lock)
send(2), recv(2)                malloc, free (allocator lock)
read(2)                         syslog (mutex)
open(2), close(2)               exit() (atexit handlers, stdio flush)
kill(2), raise(2)               any C++ exception handling
sigprocmask(2)                  pthread_mutex_lock
_exit(2)                        strtok (static buffer)
getpid(2), gettid(2)            sprintf (in some implementations)
sem_post(3)                     openlog, closelog
clock_gettime(2)                getenv (races with setenv/putenv)
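When a handler has to report something, the usual substitute for printf is to format by hand and call write(2). A minimal sketch, with hypothetical helper names write_signal_notice and notice_handler:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write "signal N\n" to `fd` using only write(2): no stdio, no heap
 * allocation, no locks. Returns the number of bytes written or -1. */
ssize_t write_signal_notice(int fd, int sig)
{
    char buf[32] = "signal ";
    char digits[12];
    size_t len = 7;                 /* length of "signal " */
    int n = 0;

    if (sig < 0)
        sig = 0;
    do {                            /* produce decimal digits, least significant first */
        digits[n++] = (char)('0' + sig % 10);
        sig /= 10;
    } while (sig > 0);
    while (n > 0)                   /* copy them back in the right order */
        buf[len++] = digits[--n];
    buf[len++] = '\n';
    return write(fd, buf, len);
}

/* Handler that reports the signal on stderr. errno is saved and restored
 * so the interrupted code never sees a value changed by write(). */
void notice_handler(int sig)
{
    int saved_errno = errno;

    write_signal_notice(STDERR_FILENO, sig);
    errno = saved_errno;
}
```

Saving and restoring errno matters because the handler may interrupt code that is about to inspect errno after its own failed call.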

The canonical safe signal handler pattern: set a volatile flag and return.

static volatile sig_atomic_t shutdown_requested = 0;

static void sigterm_handler(int sig) {
    shutdown_requested = 1;  // safe: sig_atomic_t write is atomic
}
// Main loop:
while (!shutdown_requested) {
    // do work ...
}

signalfd(): Synchronous Signal Handling

signalfd(2) (Linux 2.6.22) allows a process to receive signals as data on a file descriptor, enabling signal handling within select/poll/epoll event loops without the async-signal-safe constraints:

sigset_t mask;
sigemptyset(&mask);
sigaddset(&mask, SIGTERM);
sigaddset(&mask, SIGINT);
sigaddset(&mask, SIGCHLD);

// Block signals so they don't fire traditional handlers:
sigprocmask(SIG_BLOCK, &mask, NULL);

// Create signal fd:
int sfd = signalfd(-1, &mask, SFD_CLOEXEC | SFD_NONBLOCK);

// Now add sfd to your epoll instance and handle like any fd:
// read(sfd, &fdsi, sizeof(fdsi)) returns struct signalfd_siginfo
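A complete round trip, without the event loop, might look like the sketch below. The function name receive_via_signalfd is made up for illustration; it blocks one signal, raises it, and reads it back from the descriptor.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/signalfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Block `sig`, raise it, and consume it through a signalfd.
 * Returns the signal number read from the descriptor, or -1 on error. */
int receive_via_signalfd(int sig)
{
    sigset_t mask, old_mask;
    struct signalfd_siginfo fdsi;
    int result = -1;
    int sfd;

    sigemptyset(&mask);
    sigaddset(&mask, sig);
    if (sigprocmask(SIG_BLOCK, &mask, &old_mask) == -1)
        return -1;

    sfd = signalfd(-1, &mask, SFD_CLOEXEC | SFD_NONBLOCK);
    if (sfd != -1) {
        raise(sig);     /* blocked, so it becomes pending; no handler runs */

        /* Reading dequeues the pending signal and returns its details. */
        if (read(sfd, &fdsi, sizeof fdsi) == (ssize_t)sizeof fdsi)
            result = (int)fdsi.ssi_signo;
        close(sfd);
    }

    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    return result;
}
```

Because the read consumes the pending signal, restoring the old mask afterwards does not trigger the signal's default action.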

The self-pipe trick is the pre-signalfd equivalent: a pipe is created; the signal handler writes a byte to the write end (safe — write(2) is async-signal-safe); the event loop monitors the read end. signalfd supersedes this.
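For comparison, here is a compact sketch of the self-pipe trick. The names wake_handler and self_pipe_demo are illustrative, and poll() stands in for a real event loop.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

static int wake_fd = -1;            /* write end of the self-pipe */

/* Handler: push one byte (the signal number) into the pipe. */
static void wake_handler(int sig)
{
    int saved_errno = errno;
    unsigned char b = (unsigned char)sig;
    ssize_t r = write(wake_fd, &b, 1);   /* EAGAIN on a full pipe is harmless */

    (void)r;
    errno = saved_errno;
}

static int set_nonblock_cloexec(int fd)
{
    int fl = fcntl(fd, F_GETFL);

    if (fl == -1 || fcntl(fd, F_SETFL, fl | O_NONBLOCK) == -1)
        return -1;
    return fcntl(fd, F_SETFD, FD_CLOEXEC);
}

/* Deliver `sig` once and pick it up from the pipe via poll().
 * Returns the byte read (the signal number), or -1 on failure. */
int self_pipe_demo(int sig)
{
    int fds[2], result = -1;
    struct sigaction sa, old_sa;
    struct pollfd pfd;
    unsigned char b = 0;

    if (pipe(fds) == -1)
        return -1;
    if (set_nonblock_cloexec(fds[0]) == 0 && set_nonblock_cloexec(fds[1]) == 0) {
        wake_fd = fds[1];
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = wake_handler;
        sa.sa_flags = SA_RESTART;
        sigemptyset(&sa.sa_mask);
        if (sigaction(sig, &sa, &old_sa) == 0) {
            raise(sig);                      /* handler writes the wake-up byte */
            pfd.fd = fds[0];
            pfd.events = POLLIN;
            if (poll(&pfd, 1, 1000) == 1 && read(fds[0], &b, 1) == 1)
                result = b;
            sigaction(sig, &old_sa, NULL);
        }
        wake_fd = -1;
    }
    close(fds[0]);
    close(fds[1]);
    return result;
}
```

Both pipe ends are non-blocking so that a burst of signals can never make the handler block on a full pipe.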

Real-Time Signals

RT signals (SIGRTMIN (34) through SIGRTMAX (64) on Linux, adjusted for libc reservations):

Queued: multiple deliveries of the same RT signal stack up, each with its own siginfo_t payload.
Priority-ordered: lower signal numbers delivered first.
Value-carrying: sigqueue(pid, sig, val) attaches a union sigval (int or pointer) to the signal.
Used by: POSIX timers (SIGEV_SIGNAL), librt, some JVM internals (GC notifications).
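Queuing, ordering, and payloads can be seen together in a short sketch. It assumes a single-threaded Linux process; the function name drain_rt_values and the payload values 10, 20, 30 are arbitrary. It queues three RT signals while they are blocked and dequeues them synchronously with sigtimedwait().

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/* Queue SIGRTMIN+1 (value 10), then SIGRTMIN twice (values 20 and 30),
 * all while blocked, and dequeue them. The payloads are stored in `out`
 * in dequeue order; returns how many were stored. */
int drain_rt_values(int out[], int max)
{
    sigset_t set, old_mask;
    siginfo_t info;
    struct timespec zero = { 0, 0 };    /* poll: never wait */
    union sigval v;
    int n = 0;

    sigemptyset(&set);
    sigaddset(&set, SIGRTMIN);
    sigaddset(&set, SIGRTMIN + 1);
    sigprocmask(SIG_BLOCK, &set, &old_mask);

    v.sival_int = 10; sigqueue(getpid(), SIGRTMIN + 1, v);
    v.sival_int = 20; sigqueue(getpid(), SIGRTMIN, v);
    v.sival_int = 30; sigqueue(getpid(), SIGRTMIN, v);

    /* sigtimedwait() returns the signal number, or -1 once nothing in
     * `set` is pending any more. */
    while (n < max && sigtimedwait(&set, &info, &zero) > 0)
        out[n++] = info.si_value.sival_int;

    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    return n;
}
```

On Linux the values come back as 20, 30, 10: both SIGRTMIN instances first, in the order they were queued, then the higher-numbered SIGRTMIN+1.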

Historical Context

Signals originated in the First Edition of Research UNIX (1971) as a way for the kernel to notify a process of exceptional conditions. The original implementation was unreliable: if a signal arrived while the handler was already running, it was lost. 4.2BSD (1983) introduced "reliable signals" with proper masking during handlers. POSIX.1 (1988) standardized the sigaction() interface.

Real-time signals were introduced in POSIX.1b (1993) and first implemented in Linux 2.1.x. signalfd() was added by Davide Libenzi in Linux 2.6.22 (2007) as part of the broader trend toward file-descriptor-based interfaces for OS events.

Production Examples

Graceful shutdown pattern (nginx/systemd style):

# Send SIGTERM for graceful shutdown:
kill -TERM $(cat /var/run/nginx.pid)
# Send SIGHUP to reload config without restart:
kill -HUP $(cat /var/run/nginx.pid)
# Last resort:
kill -9 $(cat /var/run/nginx.pid)

Tracing signal delivery with strace:

strace -e trace=signal -p PID 2>&1
# Shows signal-related syscalls (such as rt_sigprocmask and rt_sigreturn) and signal arrivals

Catching SIGCHLD with signalfd in an event loop:

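// In epoll loop, sfd readable:
struct signalfd_siginfo fdsi;
int status;
ssize_t n = read(sfd, &fdsi, sizeof(fdsi));
if (n == (ssize_t)sizeof(fdsi) && fdsi.ssi_signo == SIGCHLD) {
    // Standard signals coalesce, so one SIGCHLD may cover several exits.
    // Reap all exited children:
    while (waitpid(-1, &status, WNOHANG) > 0) { ... }
}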

Sending signals between processes safely (tgkill):

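// Send SIGUSR1 to a specific thread in another process (glibc >= 2.30;
// on older glibc use syscall(SYS_tgkill, ...) from <sys/syscall.h>):
if (tgkill(target_pid, target_tid, SIGUSR1) == -1)
    perror("tgkill");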

Debugging Notes

EINTR storms: if SA_RESTART is not set and a high-frequency RT signal fires, every read/write returns EINTR. Use strace -c -p PID to see if EINTR dominates syscall time.
Signal handler deadlock: gdb -p PID → bt showing __lll_lock_wait inside malloc with a signal frame in the backtrace indicates a non-async-safe call from a handler. Switch to the volatile flag pattern.
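Missing SIGCHLD: if SIGCHLD is explicitly set to SIG_IGN (with signal() or sigaction(), as opposed to being left at its default disposition), children are auto-reaped, no SIGCHLD is delivered, and wait() cannot collect their status. Verify with grep SigIgn /proc/PID/status (mask bit 16, value 0x10000, corresponds to signal 17).
rt_sigtimedwait for synchronous delivery: sigwaitinfo(2), implemented on Linux by the rt_sigtimedwait syscall, waits until one of the signals in its set is pending and atomically dequeues it. The signals must be blocked in every thread beforehand. Useful for a dedicated signal-handling thread.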
Core dump not generated: check ulimit -c, kernel.core_pattern, and whether the binary is setuid (setuid binaries don't core-dump by default; controlled by fs.suid_dumpable).
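The bit arithmetic behind the SigIgn check can be automated. The sketch below uses made-up helper names (sig_in_mask, read_status_mask) and assumes the "Name:<tab>hexmask" layout that /proc/PID/status uses for SigBlk, SigIgn, SigCgt and SigPnd.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>

/* Bit (n - 1) of a /proc/PID/status signal mask corresponds to signal n.
 * Returns 1 if `sig` is set in `mask`, otherwise 0. */
int sig_in_mask(unsigned long long mask, int sig)
{
    if (sig < 1 || sig > 64)
        return 0;
    return (int)((mask >> (sig - 1)) & 1ULL);
}

/* Read one mask field (for example "SigIgn") from /proc/self/status.
 * Returns 0 and stores the value in *mask on success, -1 otherwise. */
int read_status_mask(const char *field, unsigned long long *mask)
{
    char line[256], name[32];
    unsigned long long value;
    FILE *f = fopen("/proc/self/status", "r");

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        /* Split "Name:<whitespace>hexvalue" and compare the name. */
        if (sscanf(line, "%31[^:]: %llx", name, &value) == 2 &&
            strcmp(name, field) == 0) {
            *mask = value;
            fclose(f);
            return 0;
        }
    }
    fclose(f);
    return -1;
}
```

For instance, a SigIgn value of 0000000000010000 makes sig_in_mask(mask, 17) true, meaning SIGCHLD (17 on x86 and ARM) is being ignored.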

Security Implications

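Signal spoofing: any process whose real or effective UID matches the target's real or saved set-user-ID (or that holds CAP_KILL) can send it any signal. The si_pid and si_uid fields in siginfo_t identify the sender, but the kernel fills them in only for signals sent with kill() (SI_USER) or tkill()/tgkill() (SI_TKILL); a caller of the raw rt_sigqueueinfo() syscall supplies the siginfo_t contents itself. Signals with si_code SI_KERNEL, or with a positive signal-specific code such as SEGV_MAPERR, originate in the kernel and are authoritative.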
SIGSEGV as an exploit primitive: a SIGSEGV handler that does longjmp out of the handler is technically undefined behavior, but widely used in JVMs (null pointer handling) and fuzzing harnesses. The interaction between signal stack (sigaltstack) and the restored register state via sigreturn is a historical exploit surface ("SROP" — Sigreturn-Oriented Programming).
Signal flooding as DoS: a malicious process with same-UID access can send thousands of RT signals to a target, filling the queue of pending signals. Since Linux 2.6.8 the cap is the RLIMIT_SIGPENDING resource limit (ulimit -i), counted per real user ID; once it is reached, sigqueue() fails with EAGAIN.
SIGKILL and resource cleanup: because SIGKILL cannot be caught, any state the kernel does not clean up on process exit (external state: database transactions, sessions on remote peers, lock files and PID files on disk) will be left dangling. Design systems to handle abrupt process death.
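The queue exhaustion described under signal flooding can be reproduced safely against the calling process itself. On Linux the cap on queued signals is the RLIMIT_SIGPENDING resource limit. The sketch below (the name fill_signal_queue is hypothetical) lowers the soft limit, queues a blocked RT signal until sigqueue() fails, then drains the queue and restores the limit. Because the limit is accounted per user rather than per process, the exact count depends on what else that user has queued.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/resource.h>
#include <time.h>
#include <unistd.h>

/* Lower the soft RLIMIT_SIGPENDING to `limit`, then queue SIGRTMIN to
 * ourselves until sigqueue() fails or `max_attempts` is reached.
 * Returns the number of signals successfully queued, or -1 on error. */
int fill_signal_queue(rlim_t limit, int max_attempts)
{
    struct rlimit saved, lowered;
    sigset_t set, old_mask;
    struct timespec zero = { 0, 0 };
    union sigval v;
    int queued = 0;

    if (getrlimit(RLIMIT_SIGPENDING, &saved) == -1)
        return -1;
    lowered = saved;
    if (limit < lowered.rlim_cur)
        lowered.rlim_cur = limit;           /* lowering the soft limit is always allowed */
    if (setrlimit(RLIMIT_SIGPENDING, &lowered) == -1)
        return -1;

    sigemptyset(&set);
    sigaddset(&set, SIGRTMIN);
    sigprocmask(SIG_BLOCK, &set, &old_mask); /* keep the signals pending */

    while (queued < max_attempts) {
        v.sival_int = queued;
        if (sigqueue(getpid(), SIGRTMIN, v) == -1)
            break;                          /* EAGAIN once the cap is reached */
        queued++;
    }

    /* Drain everything we queued before unblocking the signal. */
    while (sigtimedwait(&set, NULL, &zero) > 0)
        ;

    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    setrlimit(RLIMIT_SIGPENDING, &saved);
    return queued;
}
```

A receiver that cannot rule out hostile same-UID senders should treat EAGAIN from its own sigqueue() calls as an expected condition rather than a fatal error.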

Performance Implications

Signal delivery overhead: each handled signal requires the kernel to build the rt_sigframe on the user stack, return to user mode to run the handler, and then take a second kernel entry for sigreturn. Total cost: roughly 1–4 µs per signal on modern hardware.
High-frequency signals: using a timer signal as a profiling tick (as gprof does with SIGPROF) limits profiling resolution. perf uses hardware performance counters via PMU overflow interrupts instead, with much lower overhead.
SA_RESTART and latency: applications that need bounded latency (real-time, trading systems) must audit every sigaction call. SA_RESTART can cause a system call to execute for much longer than expected if it is repeatedly interrupted and restarted.
Blocking vs. signalfd: signalfd + epoll integrates signal handling with I/O in a single event loop thread, so signals are consumed as ordinary readable events instead of interrupting the loop with an asynchronous handler. For epoll-based event loops this is often the cleaner pattern.

Failure Modes

Failure	Symptom	Root cause
Deadlock in signal handler	Process hangs, gdb shows malloc lock in bt	Non-async-safe function in handler
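SIGCHLD lost	Zombie accumulation despite handler	Standard signals coalesce: several children exiting close together can produce a single SIGCHLD. Reap in a loop with `waitpid(-1, &status, WNOHANG)` until it stops returning a PID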
EINTR not handled	Spurious errors in production	Syscall returns EINTR, caller doesn't retry
Core dump missing	Crash with no diagnosis	RLIMIT_CORE=0, suid_dumpable=0, or core_pattern misconfigured
RT signal queue overflow	`sigqueue` returns EAGAIN	Queued signals exceeded `RLIMIT_SIGPENDING`; process slow to handle
Signal to wrong thread	Handler runs in unintended thread	`kill()` delivers to arbitrary thread; use `tgkill()` for specific thread
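The "SIGCHLD lost" row is the one most often hit in practice. The sketch below (the function name spawn_and_reap is hypothetical) forks several short-lived children and shows that the number of SIGCHLD notifications can be smaller than the number of children, which is why every notification is followed by a WNOHANG loop. It assumes SIGCHLD has not been set to SIG_IGN and that the process has no other children.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Fork `n` children that exit at once, then reap them all. SIGCHLD is
 * kept blocked and consumed synchronously. Stores the number of SIGCHLD
 * notifications seen in *notifications; returns the number of children
 * reaped, or -1 on setup failure. */
int spawn_and_reap(int n, int *notifications)
{
    sigset_t chld, old_mask;
    struct timespec timeout = { 2, 0 };   /* give up after 2 s of silence */
    int reaped = 0;

    *notifications = 0;
    sigemptyset(&chld);
    sigaddset(&chld, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &chld, &old_mask) == -1)
        return -1;

    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(0);                     /* child: exit immediately */
        if (pid == -1) {
            n = i;                        /* only wait for what was started */
            break;
        }
    }

    while (reaped < n) {
        if (sigtimedwait(&chld, NULL, &timeout) == -1) {
            if (errno == EINTR)
                continue;                 /* interrupted by another handler */
            break;                        /* timed out or failed */
        }
        (*notifications)++;
        /* One SIGCHLD may stand for several exits: drain them all. */
        while (waitpid(-1, NULL, WNOHANG) > 0)
            reaped++;
    }

    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    return reaped;
}
```

With three children, all three are reaped, but the notification count can be anywhere from one to three depending on how closely together the children exit.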

Modern Usage

Systemd and SIGTERM/SIGKILL sequence: systemctl stop sends SIGTERM, waits TimeoutStopSec (default 90s), then sends SIGKILL. Services must handle SIGTERM for graceful shutdown. Use KillSignal= in the unit file to send a different initial signal to daemons that use another signal for graceful stop (e.g., KillSignal=SIGQUIT for nginx, which treats SIGQUIT as its graceful shutdown request).

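Go runtime signals: the Go runtime installs its own handlers for most signals. Synchronous signals caused by the program's own execution (SIGSEGV, SIGBUS, SIGFPE) are converted into run-time panics. By default SIGHUP, SIGINT, and SIGTERM make the program exit; they reach application code only after a call to signal.Notify in the os/signal package. Sending SIGQUIT to a Go process makes it exit with a goroutine stack dump (useful for diagnosing a hung process).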

Java JVM signals: HotSpot reserves some signals for internal use (on Linux, SIGUSR2 by default, for thread suspend/resume). Do not send SIGUSR1 or SIGUSR2 to a JVM unless you know what you're doing. Use kill -3 <JVM_PID> (SIGQUIT) for a thread dump.

Future Directions

signalfd replacement via io_uring: proposal to handle signals as io_uring completions, integrating with the async I/O submission queue for zero-syscall signal consumption in tight event loops.
Safer signal delivery ordering: the POSIX model for which thread receives a process-directed signal is deliberately vague. Proposals for explicit signal routing (always to a specific nominated thread) would eliminate a class of races.
BPF signal programs: the bpf_send_signal() kernel helper (available since Linux 5.3) lets an eBPF program send a signal to the task in whose context it is running, enabling policy-based signal delivery from BPF programs without a userspace intermediary.

Exercises

SA_RESTART audit: write a C program that installs a SIGALRM handler (using alarm(1)) without SA_RESTART. Show that read() on stdin returns EINTR. Then add SA_RESTART and verify read() no longer returns EINTR. Use strace to observe the difference in the syscall trace.
Async-signal-safe crash: write a C program that holds a pthread_mutex in the main thread and then raises SIGUSR1. The handler attempts pthread_mutex_lock(). Observe the deadlock. Fix it using the volatile sig_atomic_t flag pattern.
signalfd event loop: implement a minimal event loop using epoll that handles three fds: stdin (readable events), a timerfd firing every second, and a signalfd for SIGINT/SIGTERM. On SIGTERM, print statistics and exit cleanly.
RT signal queue: write two programs: a sender that calls sigqueue() in a tight loop sending SIGRTMIN with an incrementing value, and a receiver using sigwaitinfo() to consume them. Measure how many sends fail with EAGAIN (visible at the receiver as gaps in the value sequence) as you increase the sender's rate.
SROP awareness: research Sigreturn-Oriented Programming (SROP). Set up a test binary that installs a SIGSEGV handler and uses sigreturn() manually (bypassing the normal sigreturn trampoline). Explain why kernel mitigations like shadow stacks and SA_RESTORER validation make this attack class harder on modern kernels.

References

kernel/signal.c: send_signal(), get_signal()
arch/x86/kernel/signal.c: signal frame setup (setup_rt_frame()), sigreturn system call
include/uapi/asm-generic/signal.h: signal numbers
include/uapi/linux/signalfd.h, fs/signalfd.c: signalfd implementation
Kerrisk, The Linux Programming Interface — Chapters 20–22 (signals), 63 (signalfd)
Stevens & Rago, Advanced Programming in the UNIX Environment — Chapter 10
man 2 sigaction, man 2 sigprocmask, man 2 kill, man 2 tgkill, man 2 signalfd, man 7 signal
POSIX.1-2017: <signal.h>, async-signal-safe function list
"Sigreturn-Oriented Programming" — Erik Bosman, HitB 2014
LWN: "Signals and threads" series