04 — Signals

Technical Overview

Signals are the simplest inter-process communication mechanism in Unix: a one-way, asynchronous notification that something has happened. Standard signals carry no data beyond their number (real-time signals add a small payload), can be sent by the kernel (hardware exception, timer expiry, child death) or by another process, and interrupt normal program flow in a way that demands careful, defensive programming. Signals date to the earliest UNIX, yet they remain pervasive in every Linux system, and misunderstanding them is a consistent source of production bugs.


Prerequisites

  • 01-process-concept.md: task_struct, signal fields, process state
  • 03-process-lifecycle.md: SIGCHLD, zombie reaping
  • Basic C: function pointers, volatile, setjmp/longjmp awareness

Core Content

Signal Taxonomy

Linux supports two classes of signals:

Standard POSIX signals (1–31): fixed semantics, non-queuing (if the same signal is pending twice before delivery, it delivers once).

Real-time signals (32–64 in the kernel; glibc reserves the first few, so use SIGRTMIN through SIGRTMAX rather than raw numbers): queued (multiple instances stack), carry a value (si_value in siginfo_t), delivered in ascending numeric order.
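
The queuing difference between the two classes can be observed directly. The following is a minimal sketch, assuming Linux and a single-threaded caller; count_deliveries and the handler names are hypothetical, introduced only for this illustration. It sends n instances of SIGUSR1 and SIGRTMIN while both are blocked, then unblocks them and counts handler runs.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Handler-run counters (hypothetical names for this sketch). */
static volatile sig_atomic_t std_runs = 0;
static volatile sig_atomic_t rt_runs = 0;

static void on_std(int sig) { (void)sig; std_runs++; }
static void on_rt(int sig)  { (void)sig; rt_runs++; }

/* Send n instances of SIGUSR1 and of SIGRTMIN to ourselves while both are
 * blocked, then unblock and report how many handler runs each produced.
 * Returns 0 on success, -1 on failure. */
int count_deliveries(int n, int *std_seen, int *rt_seen)
{
    struct sigaction sa;
    sigset_t block, old;
    union sigval val;
    int i;

    std_runs = 0;
    rt_runs = 0;

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = on_std;
    if (sigaction(SIGUSR1, &sa, NULL) == -1) return -1;
    sa.sa_handler = on_rt;
    if (sigaction(SIGRTMIN, &sa, NULL) == -1) return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigaddset(&block, SIGRTMIN);
    if (sigprocmask(SIG_BLOCK, &block, &old) == -1) return -1;

    for (i = 0; i < n; i++) {
        val.sival_int = i;
        if (kill(getpid(), SIGUSR1) == -1 ||
            sigqueue(getpid(), SIGRTMIN, val) == -1) {
            sigprocmask(SIG_SETMASK, &old, NULL);
            return -1;
        }
    }

    /* On Linux every pending, now-unblocked signal is delivered before this
     * call returns to our code (POSIX only promises at least one). */
    if (sigprocmask(SIG_SETMASK, &old, NULL) == -1) return -1;

    *std_seen = (int)std_runs;
    *rt_seen = (int)rt_runs;
    return 0;
}
```

With n = 3, the standard signal is expected to run its handler once (the three instances coalesce), while the real-time signal runs its handler three times.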

Key standard signals and their default actions (numbers shown are for x86 and ARM; several differ on Alpha, MIPS, and SPARC):

Signal    Number  Default action  Common trigger
─────────────────────────────────────────────────────────────────────────────────
SIGHUP    1       Terminate       Terminal hangup; also: reload config
SIGINT    2       Terminate       Ctrl+C from terminal
SIGQUIT   3       Core dump       Ctrl+\ from terminal
SIGILL    4       Core dump       Illegal CPU instruction
SIGTRAP   5       Core dump       Breakpoint / ptrace step
SIGABRT   6       Core dump       abort() call
SIGBUS    7       Core dump       Bus error (misaligned access, mmap'd file truncated)
SIGFPE    8       Core dump       Floating-point / integer divide-by-zero
SIGKILL   9       Terminate       Uncatchable, unblockable, unignorable
SIGUSR1   10      Terminate       User-defined (convention: reload/reopen logs)
SIGSEGV   11      Core dump       Invalid memory access
SIGUSR2   12      Terminate       User-defined
SIGPIPE   13      Terminate       Write to broken pipe with no reader
SIGALRM   14      Terminate       alarm() timer expiry
SIGTERM   15      Terminate       Standard termination request (graceful shutdown)
SIGCHLD   17      Ignore          Child stopped or terminated
SIGCONT   18      Continue        Resume if stopped
SIGSTOP   19      Stop            Uncatchable, unblockable, unignorable
SIGTSTP   20      Stop            Ctrl+Z from terminal (catchable)
SIGWINCH  28      Ignore          Terminal window size changed
─────────────────────────────────────────────────────────────────────────────────

SIGKILL (9) and SIGSTOP (19) are the two signals that cannot be caught, blocked, or ignored. This is intentional: they give the kernel and administrators unconditional control over any process.
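
This restriction is visible from user space. As a small sketch (the helper names try_ignore and sigkill_stays_unblocked are hypothetical), sigaction() rejects an attempt to change the disposition of SIGKILL or SIGSTOP with EINVAL, and on Linux a request to block SIGKILL succeeds but is silently dropped from the mask:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>

/* Try to set `sig` to SIG_IGN. Returns 0 on success, or the errno value
 * reported by sigaction() on failure. The old disposition is restored. */
int try_ignore(int sig)
{
    struct sigaction sa, old;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = SIG_IGN;
    sigemptyset(&sa.sa_mask);
    if (sigaction(sig, &sa, &old) == -1)
        return errno;
    sigaction(sig, &old, NULL);
    return 0;
}

/* Ask to block SIGKILL, then read the mask back.
 * Returns 1 if SIGKILL is still unblocked, 0 if it ended up in the mask,
 * -1 if the mask could not be read. */
int sigkill_stays_unblocked(void)
{
    sigset_t set, cur;

    sigemptyset(&set);
    sigaddset(&set, SIGKILL);
    sigprocmask(SIG_BLOCK, &set, NULL);          /* no error is reported */
    if (sigprocmask(SIG_BLOCK, NULL, &cur) == -1) /* query the current mask */
        return -1;
    return sigismember(&cur, SIGKILL) == 0;
}
```

Calling try_ignore(SIGTERM) succeeds, while try_ignore(SIGKILL) and try_ignore(SIGSTOP) report EINVAL.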


Signal Delivery Internals

Signal delivery involves two phases: generation (the signal is sent) and delivery (the process actually handles it). Between the two, the signal is pending.

Signal generation path:
─────────────────────────────────────────────────────────────────────
Source                          Kernel action
─────────────────────────────────────────────────────────────────────
kill(pid, sig)                  send_signal() → sigaddset(&pending, sig)
                                               set TIF_SIGPENDING
hardware exception              do_trap() → force_sig()
timer expiry (SIGALRM)          hrtimer_interrupt → send_signal()
terminal Ctrl+C                 tty driver → kill_pgrp(SIGINT)
child exit                      do_exit() → do_notify_parent() → SIGCHLD
─────────────────────────────────────────────────────────────────────

Signal delivery path (simplified):
─────────────────────────────────────────────────────────────────────
  interrupt/syscall return
         │
         ▼
  exit_to_user_mode_loop()
         │
         ├─ TIF_SIGPENDING set?
         │    yes → get_signal()
         │              │
         │              ├─ iterate pending signals
         │              │     skip if blocked (in task->blocked mask)
         │              │     skip if SIG_IGN
         │              │
         │              ├─ SIG_DFL action?
         │              │     terminate → do_group_exit()
         │              │     core dump → do_coredump()
         │              │     stop → do_signal_stop()
         │              │     ignore → dequeue, continue
         │              │
         │              └─ custom handler (sigaction)?
         │                    → setup_rt_frame() — build sigframe on user stack
         │                       set PC = handler address
         │                       set SP = sigframe
         │                       return to user space running handler
         │
         └─ no pending signals → return normally
─────────────────────────────────────────────────────────────────────

The kernel builds a signal frame (rt_sigframe) on the user-mode stack. It contains a copy of the interrupted CPU state (ucontext_t), which the sigreturn(2) syscall uses to restore after the handler returns.


sigaction(): The Correct Way to Install Handlers

signal() is historical, and its semantics vary across systems (for example, whether the handler is reset to SIG_DFL on delivery and whether interrupted syscalls restart). Always use sigaction():

struct sigaction {
    void     (*sa_handler)(int);          // simple handler
    void     (*sa_sigaction)(int, siginfo_t *, void *); // if SA_SIGINFO
    sigset_t   sa_mask;     // signals to block during handler
    int        sa_flags;
    void     (*sa_restorer)(void);        // internal, do not use
};

Key sa_flags:

Flag          Effect
─────────────────────────────────────────────────────────────────────────────────
SA_RESTART    Automatically restart syscalls interrupted by this signal instead of returning EINTR. Essential for library code that cannot handle EINTR
SA_SIGINFO    Use sa_sigaction (3-arg) instead of sa_handler; handler receives siginfo_t with details
SA_NODEFER    Do not automatically block the signal during its own handler (allows reentrancy)
SA_RESETHAND  Reset the handler to SIG_DFL after first delivery (one-shot)
SA_NOCLDWAIT  (SIGCHLD only) Do not create zombies; auto-reap children
SA_NOCLDSTOP  (SIGCHLD only) Do not deliver SIGCHLD when children stop/continue
─────────────────────────────────────────────────────────────────────────────────

siginfo_t fields for SA_SIGINFO handlers:

siginfo_t {
    int      si_signo;   // signal number
    int      si_errno;
    int      si_code;    // SI_USER, SI_KERNEL, SI_TKILL, CLD_EXITED, ...
    pid_t    si_pid;     // sending process PID
    uid_t    si_uid;     // sending process real UID
    void    *si_addr;    // faulting address (SIGSEGV, SIGBUS, SIGILL)
    int      si_status;  // exit status or signal (SIGCHLD)
    union sigval si_value; // RT signal value
}
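
To make the SA_SIGINFO path concrete, here is a minimal sketch (function and variable names are hypothetical) that installs a three-argument handler, sends SIGUSR1 to the calling process with kill(), and reports the si_code and si_pid the handler saw. It assumes a single-threaded process, so the signal is delivered before kill() returns.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Values captured by the handler (hypothetical names for this sketch). */
static volatile sig_atomic_t seen_signo = 0;
static volatile sig_atomic_t seen_code = 0;
static volatile sig_atomic_t seen_pid = 0;

/* Three-argument handler selected by SA_SIGINFO. */
static void on_usr1(int sig, siginfo_t *info, void *ucontext)
{
    (void)ucontext;
    seen_signo = sig;
    seen_code = info->si_code;
    seen_pid = (sig_atomic_t)info->si_pid;
}

/* Install the handler, signal ourselves, and copy out what the handler saw.
 * Returns 0 on success, -1 on failure. */
int send_self_and_inspect(int *code, pid_t *sender)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_usr1;
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) == -1) return -1;

    seen_signo = 0;
    if (kill(getpid(), SIGUSR1) == -1) return -1;

    /* Single-threaded and unblocked: the handler has already run. */
    if (seen_signo != SIGUSR1) return -1;
    *code = (int)seen_code;
    *sender = (pid_t)seen_pid;
    return 0;
}
```

For a signal sent with kill(), the handler is expected to see si_code == SI_USER and si_pid equal to the sender's PID, here the process itself.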

Signal Masks: Blocking Signals

Each task has a blocked sigset in its task_struct. A blocked signal is not delivered; it stays pending (thread-directed signals in task_struct->pending, process-directed signals in signal_struct->shared_pending) until it is unblocked.

sigset_t mask, old_mask;
sigemptyset(&mask);
sigaddset(&mask, SIGINT);
sigaddset(&mask, SIGTERM);

// Block SIGINT and SIGTERM in this thread, saving the previous mask:
if (sigprocmask(SIG_BLOCK, &mask, &old_mask) == -1)
    perror("sigprocmask");

// ... critical section ...

// Restore previous mask (unblocks SIGINT/SIGTERM if they were unblocked before):
sigprocmask(SIG_SETMASK, &old_mask, NULL);

On Linux the signal mask is per-thread. POSIX leaves sigprocmask() unspecified in multi-threaded programs, so use pthread_sigmask() there. A signal sent to the process (via kill(pid, sig)) is delivered to an arbitrary thread that has it unblocked. A signal sent to a specific thread (via tgkill or pthread_kill) is delivered only to that thread; if that thread has it blocked, it stays pending for that thread until unblocked. SIGKILL and SIGSTOP cannot be blocked in either case.
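
The pending state can be inspected with sigpending(). The sketch below (observe_blocked_signal is a hypothetical helper, and a single-threaded caller is assumed) blocks SIGUSR2, sends it, checks that it is pending and that the handler has not run, then restores the mask and checks that the handler ran.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t handled = 0;

static void on_usr2(int sig) { (void)sig; handled = 1; }

/* Fills:
 *   was_pending    1 if SIGUSR2 appeared in sigpending() while blocked
 *   ran_blocked    1 if the handler ran while the signal was blocked
 *   ran_unblocked  1 if the handler had run once the mask was restored
 * Returns 0 on success, -1 on failure. */
int observe_blocked_signal(int *was_pending, int *ran_blocked, int *ran_unblocked)
{
    struct sigaction sa;
    sigset_t mask, old_mask, pending;

    handled = 0;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr2;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR2, &sa, NULL) == -1) return -1;

    sigemptyset(&mask);
    sigaddset(&mask, SIGUSR2);
    if (sigprocmask(SIG_BLOCK, &mask, &old_mask) == -1) return -1;

    if (kill(getpid(), SIGUSR2) == -1 || sigpending(&pending) == -1) {
        sigprocmask(SIG_SETMASK, &old_mask, NULL);
        return -1;
    }
    *was_pending = sigismember(&pending, SIGUSR2);
    *ran_blocked = (int)handled;

    /* Restoring the mask lets the pending signal through. */
    if (sigprocmask(SIG_SETMASK, &old_mask, NULL) == -1) return -1;
    *ran_unblocked = (int)handled;
    return 0;
}
```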


EINTR and SA_RESTART

Many blocking syscalls (read, write, accept, nanosleep, wait) return -1 with errno = EINTR if interrupted by a signal before completion. This is not an error — it is the mechanism by which signals interrupt long operations.

Handling strategies:

  1. SA_RESTART: the kernel automatically restarts the syscall. Not all syscalls restart (see man 7 signal, "Interruption of system calls..."). nanosleep and pause never restart.

  2. Manual retry loop:

     ssize_t r;
     do { r = read(fd, buf, len); } while (r == -1 && errno == EINTR);

  3. signalfd(): block signals with sigprocmask, then read them from a file descriptor. No EINTR is possible because the signal is consumed via read() rather than interrupting it. See below.
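
The effect of SA_RESTART on a blocked read() can be shown with a one-shot timer. This is a sketch under stated assumptions: read_with_alarm and on_alarm are hypothetical names, the pipe is empty when read() starts, and the 100 ms timer does not fire before read() is entered (true on any normally loaded system, but not guaranteed). The handler writes one byte into the pipe, so a restarted read() has something to return.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

static int wake_fd = -1;   /* write end of the pipe, used by the handler */

/* SIGALRM handler: write(2) is async-signal-safe; errno is preserved. */
static void on_alarm(int sig)
{
    int saved_errno = errno;
    char byte = 'x';
    ssize_t n = write(wake_fd, &byte, 1);
    (void)n;
    (void)sig;
    errno = saved_errno;
}

/* Block in read() on an empty pipe until a 100 ms SIGALRM arrives.
 * Returns 0 if read() completed (it was restarted and then saw the byte),
 * the errno value if read() failed (EINTR without SA_RESTART),
 * or -1 if setup failed. */
int read_with_alarm(int use_restart)
{
    int p[2], err = -1;
    struct sigaction sa;
    struct itimerval tv;
    char c;

    if (pipe(p) == -1) return -1;
    wake_fd = p[1];

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sa.sa_flags = use_restart ? SA_RESTART : 0;
    sigemptyset(&sa.sa_mask);

    memset(&tv, 0, sizeof tv);
    tv.it_value.tv_usec = 100000;   /* one-shot timer, 100 ms */

    if (sigaction(SIGALRM, &sa, NULL) == 0 &&
        setitimer(ITIMER_REAL, &tv, NULL) == 0) {
        ssize_t r = read(p[0], &c, 1);   /* blocks until the signal arrives */
        err = (r == -1) ? errno : 0;
    }
    close(p[0]);
    close(p[1]);
    return err;
}
```

Without SA_RESTART the call reports EINTR even though the handler has already made data available; with SA_RESTART the kernel re-issues read() after the handler returns and it completes normally.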


kill(), tkill(), tgkill()

kill(pid, sig)           send sig to process (thread group) pid
                         if pid == 0: send to every process in the caller's process group
                         if pid == -1: send to every process the caller may signal (except PID 1 and, on Linux, the caller itself)
                         if pid < -1: send to process group |pid|
tkill(tid, sig)          send to specific thread by TID (deprecated, use tgkill)
tgkill(tgid, tid, sig)   send to thread tid within thread group tgid (safe: checks tgid)
raise(sig)               send to calling thread (= tgkill(getpid(), gettid(), sig))

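tgkill is the correct way to signal a specific thread by kernel TID: it checks that the TID belongs to the expected thread group, so a recycled TID that now names a thread in another process is rejected with ESRCH. Within a single process, pthread_kill() is the portable equivalent.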
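
A minimal sketch of the difference, assuming Linux: the helper names below are hypothetical, and tgkill is invoked through syscall(2) because glibc only gained a tgkill() wrapper in version 2.30. The handler records si_code, which distinguishes a process-directed kill() (SI_USER) from a thread-directed tgkill() (SI_TKILL); the last function shows the thread-group check rejecting a mismatched tgid.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

static volatile sig_atomic_t last_code = 0;

static void record_code(int sig, siginfo_t *info, void *ucontext)
{
    (void)sig;
    (void)ucontext;
    last_code = info->si_code;
}

/* Raw-syscall wrapper around tgkill (hypothetical name). */
static int send_to_thread(pid_t tgid, pid_t tid, int sig)
{
    return (int)syscall(SYS_tgkill, tgid, tid, sig);
}

/* Send SIGUSR1 to ourselves with kill() or tgkill() and report the si_code
 * seen by the handler. Returns 0 on success, -1 on failure. */
int self_signal_code(int use_tgkill, int *code)
{
    struct sigaction sa;
    pid_t tgid = getpid();
    pid_t tid = (pid_t)syscall(SYS_gettid);
    int rc;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = record_code;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) == -1) return -1;

    rc = use_tgkill ? send_to_thread(tgid, tid, SIGUSR1)
                    : kill(tgid, SIGUSR1);
    if (rc == -1) return -1;
    *code = (int)last_code;
    return 0;
}

/* Probe (signal 0) our own TID under a thread-group ID that is not ours.
 * Returns the errno reported by tgkill, or 0 if it unexpectedly succeeded. */
int mismatched_tgid_errno(void)
{
    pid_t tid = (pid_t)syscall(SYS_gettid);

    if (send_to_thread(getpid() + 1, tid, 0) == -1)
        return errno;
    return 0;
}
```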


Signal Safety: async-signal-safe Functions

Signal handlers execute asynchronously with respect to the main program. If the main program is inside malloc() holding the allocator lock when the signal arrives, and the handler also calls malloc(), the result is a deadlock.

POSIX defines a list of async-signal-safe functions that can be safely called from a signal handler. They do not use non-reentrant locks:

Safe:                           NOT safe:
─────────────────────────────── ──────────────────────────────────
write(2)                        printf (uses FILE* lock)
send(2), recv(2)                malloc, free (allocator lock)
read(2)                         syslog (mutex)
open(2), close(2)               exit() (atexit handlers, stdio flush)
kill(2), raise(2)               any C++ exception handling
sigprocmask(2)                  pthread_mutex_lock
_exit(2)                        strtok (static buffer)
getpid(2), gettid(2)            sprintf (in some implementations)
sem_post(3)                     openlog, closelog
clock_gettime(2)                getenv (may allocate)

The canonical safe signal handler pattern: set a volatile flag and return.

static volatile sig_atomic_t shutdown_requested = 0;

static void sigterm_handler(int sig) {
    shutdown_requested = 1;  // safe: sig_atomic_t write is atomic
}
// Main loop:
while (!shutdown_requested) {
    // do work ...
}

signalfd(): Synchronous Signal Handling

signalfd(2) (Linux 2.6.22) allows a process to receive signals as data on a file descriptor, enabling signal handling within select/poll/epoll event loops without the async-signal-safe constraints:

sigset_t mask;
sigemptyset(&mask);
sigaddset(&mask, SIGTERM);
sigaddset(&mask, SIGINT);
sigaddset(&mask, SIGCHLD);

// Block signals so they don't fire traditional handlers:
sigprocmask(SIG_BLOCK, &mask, NULL);

// Create signal fd:
int sfd = signalfd(-1, &mask, SFD_CLOEXEC | SFD_NONBLOCK);

// Now add sfd to your epoll instance and handle like any fd:
// read(sfd, &fdsi, sizeof(fdsi)) returns struct signalfd_siginfo
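
A complete round trip, as a minimal sketch: read_one_signal_via_fd is a hypothetical helper that blocks one signal, creates a signalfd for it, sends the signal to the calling process, and reads it back as data. It assumes a single-threaded process and is best tried with a harmless signal such as SIGUSR1.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/signalfd.h>
#include <unistd.h>

/* Block `sig`, send it to ourselves, and consume it through a signalfd.
 * Returns the signal number read from the descriptor, or -1 on failure. */
int read_one_signal_via_fd(int sig)
{
    sigset_t mask, old;
    struct signalfd_siginfo fdsi;
    int sfd, result = -1;

    sigemptyset(&mask);
    sigaddset(&mask, sig);

    /* The signal must be blocked, otherwise its normal disposition applies. */
    if (sigprocmask(SIG_BLOCK, &mask, &old) == -1) return -1;

    sfd = signalfd(-1, &mask, SFD_CLOEXEC);
    if (sfd != -1) {
        if (kill(getpid(), sig) == 0 &&
            read(sfd, &fdsi, sizeof fdsi) == (ssize_t)sizeof fdsi)
            result = (int)fdsi.ssi_signo;   /* read() dequeued the signal */
        close(sfd);
    }

    /* Nothing is pending any more, so restoring the mask delivers nothing. */
    sigprocmask(SIG_SETMASK, &old, NULL);
    return result;
}
```

No handler is installed at any point: the signal never interrupts the program, it is simply read like any other input.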

The self-pipe trick is the pre-signalfd equivalent: a pipe is created; the signal handler writes a byte to the write end (safe — write(2) is async-signal-safe); the event loop monitors the read end. signalfd supersedes this.
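
For comparison, the self-pipe trick can be sketched as follows. The names self_pipe_install, self_pipe_wait, and wake_loop are hypothetical; pipe2() is Linux-specific (a portable version would use pipe() plus fcntl()), and the retry loop restarts the full timeout after EINTR, which a production loop would account for.

```c
#define _GNU_SOURCE            /* for pipe2() */
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static int self_pipe[2] = { -1, -1 };

/* Handler: forward the signal number as one byte. write(2) is
 * async-signal-safe; if the non-blocking pipe is full, a wake-up is
 * already queued, so the failure can be ignored. */
static void wake_loop(int sig)
{
    int saved_errno = errno;
    unsigned char byte = (unsigned char)sig;
    ssize_t n = write(self_pipe[1], &byte, 1);
    (void)n;
    errno = saved_errno;
}

/* Create the pipe once and route `sig` into it. Returns 0 or -1. */
int self_pipe_install(int sig)
{
    struct sigaction sa;

    if (self_pipe[0] == -1 &&
        pipe2(self_pipe, O_CLOEXEC | O_NONBLOCK) == -1)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = wake_loop;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    return sigaction(sig, &sa, NULL);
}

/* Event-loop side: wait up to timeout_ms for a forwarded signal.
 * Returns the signal number, 0 on timeout, or -1 on error. */
int self_pipe_wait(int timeout_ms)
{
    struct pollfd pfd;
    unsigned char byte;
    int rc;

    pfd.fd = self_pipe[0];
    pfd.events = POLLIN;
    pfd.revents = 0;
    do {
        rc = poll(&pfd, 1, timeout_ms);   /* poll() is never auto-restarted */
    } while (rc == -1 && errno == EINTR);
    if (rc <= 0) return rc;

    if (read(self_pipe[0], &byte, 1) != 1) return -1;
    return (int)byte;
}
```

In a real event loop the read end would simply be one more descriptor in the poll or epoll set alongside sockets and timers.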


Real-Time Signals

RT signals (SIGRTMIN (34) through SIGRTMAX (64) on Linux, adjusted for libc reservations):

  • Queued: multiple deliveries of the same RT signal stack up, each with its own siginfo_t payload.
  • Priority-ordered: lower signal numbers delivered first.
  • Value-carrying: sigqueue(pid, sig, val) attaches a union sigval (int or pointer) to the signal.
  • Used by: POSIX timers (SIGEV_SIGNAL), librt, some JVM internals (GC notifications).
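
The queuing, ordering, and payload properties can be seen together in a short sketch. queue_and_drain is a hypothetical helper; it assumes Linux ordering rules and a single-threaded caller. It queues one SIGRTMIN+1 followed by two SIGRTMIN signals while both are blocked, then dequeues them synchronously with sigtimedwait() using a zero timeout.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Queue three RT signals to ourselves, then drain them.
 * order[i] receives the signal's offset from SIGRTMIN and value[i] its
 * sival_int payload, in dequeue order.
 * Returns the number of signals dequeued, or -1 on failure. */
int queue_and_drain(int order[3], int value[3])
{
    sigset_t set, old;
    siginfo_t info;
    struct timespec zero = { 0, 0 };
    union sigval v;
    pid_t self = getpid();
    int n = 0, ok = 1;

    sigemptyset(&set);
    sigaddset(&set, SIGRTMIN);
    sigaddset(&set, SIGRTMIN + 1);
    if (sigprocmask(SIG_BLOCK, &set, &old) == -1) return -1;

    /* Send the higher-numbered signal first on purpose. */
    v.sival_int = 10;
    if (sigqueue(self, SIGRTMIN + 1, v) == -1) ok = 0;
    v.sival_int = 1;
    if (sigqueue(self, SIGRTMIN, v) == -1) ok = 0;
    v.sival_int = 2;
    if (sigqueue(self, SIGRTMIN, v) == -1) ok = 0;

    /* A zero timeout makes sigtimedwait() a non-blocking dequeue. */
    memset(&info, 0, sizeof info);
    while (n < 3 && sigtimedwait(&set, &info, &zero) > 0) {
        order[n] = info.si_signo - SIGRTMIN;
        value[n] = info.si_value.sival_int;
        n++;
    }

    sigprocmask(SIG_SETMASK, &old, NULL);
    return ok ? n : -1;
}
```

The two SIGRTMIN instances are expected to come out first, in the order they were sent (payloads 1 then 2), followed by SIGRTMIN+1 with payload 10.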


Historical Context

Signals originated in the first edition of Research UNIX (1971) as a way for the kernel to notify a process of exceptional conditions. The original implementation was unreliable: if a signal arrived while the handler was already running, it was lost. 4.2BSD (1983) introduced "reliable signals" with proper masking during handlers. POSIX.1 (1988) standardized the sigaction() interface.

Real-time signals were introduced in POSIX.1b (1993) and first implemented in Linux 2.1.x. signalfd() was added by Davide Libenzi in Linux 2.6.22 (2007) as part of the broader trend toward file-descriptor-based interfaces for OS events.


Production Examples

Graceful shutdown pattern (nginx/systemd style):

# Send SIGTERM for graceful shutdown:
kill -TERM $(cat /var/run/nginx.pid)
# Send SIGHUP to reload config without restart:
kill -HUP $(cat /var/run/nginx.pid)
# Last resort:
kill -9 $(cat /var/run/nginx.pid)

Tracing signal delivery with strace:

strace -e trace=signal -p PID 2>&1
# Shows signal-related syscalls (rt_sigaction, rt_sigprocmask, rt_sigreturn, kill, ...) and signal arrivals

Catching SIGCHLD with signalfd in an event loop:

// In epoll loop, sfd readable:
struct signalfd_siginfo fdsi;
int status;
ssize_t n = read(sfd, &fdsi, sizeof(fdsi));
if (n == (ssize_t)sizeof(fdsi) && fdsi.ssi_signo == SIGCHLD) {
    // One SIGCHLD may stand for several exits, so reap until none remain:
    while (waitpid(-1, &status, WNOHANG) > 0) { ... }
}

Sending signals between processes safely (tgkill):

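// Send SIGUSR1 to a specific thread in another process.
// glibc provides a tgkill() wrapper since 2.30; older versions need
// syscall(SYS_tgkill, target_pid, target_tid, SIGUSR1).
tgkill(target_pid, target_tid, SIGUSR1);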

Debugging Notes

  • EINTR storms: if SA_RESTART is not set and a high-frequency RT signal fires, every read/write returns EINTR. Use strace -c -p PID to see if EINTR dominates syscall time.
  • Signal handler deadlock: attach with gdb -p PID and run bt. A backtrace showing __lll_lock_wait inside malloc with a signal frame further up indicates a non-async-signal-safe call from a handler. Switch to the volatile flag pattern.
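  • Missing SIGCHLD: if SIGCHLD is explicitly set to SIG_IGN (as opposed to left at SIG_DFL, whose default action is also to ignore), children are auto-reaped, no SIGCHLD is delivered, and wait() fails with ECHILD once they have exited. Verify with grep SigIgn /proc/PID/status (bit 16, counting from 0, corresponds to signal 17).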
  • rt_sigtimedwait for synchronous delivery: sigwaitinfo(2) blocks until one of the masked signals is pending and atomically dequeues it — useful for a dedicated signal-handling thread.
  • Core dump not generated: check ulimit -c, kernel.core_pattern, and whether the binary is setuid (setuid binaries don't core-dump by default; controlled by fs.suid_dumpable).
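
The SIG_IGN behaviour from the "Missing SIGCHLD" note can be reproduced with a short sketch. child_auto_reaped is a hypothetical helper; it assumes Linux semantics, where waiting for an auto-reaped child blocks until the child exits and then fails with ECHILD.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits immediately and try to wait for it.
 * ignore_sigchld != 0 sets SIGCHLD to SIG_IGN first; 0 uses SIG_DFL.
 * Returns 1 if the child was auto-reaped (waitpid failed with ECHILD),
 * 0 if waitpid collected it normally, -1 on any other failure. */
int child_auto_reaped(int ignore_sigchld)
{
    struct sigaction sa, old;
    pid_t pid, r;
    int status, result;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = ignore_sigchld ? SIG_IGN : SIG_DFL;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, &old) == -1) return -1;

    pid = fork();
    if (pid == 0)
        _exit(0);                       /* child: exit right away */
    if (pid == -1) {
        sigaction(SIGCHLD, &old, NULL);
        return -1;
    }

    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);

    if (r == pid)
        result = 0;                     /* a zombie existed and was reaped */
    else if (errno == ECHILD)
        result = 1;                     /* kernel discarded the child */
    else
        result = -1;

    sigaction(SIGCHLD, &old, NULL);     /* restore previous disposition */
    return result;
}
```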

Security Implications

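  • Signal spoofing: a process can signal any process whose real or saved set-user-ID matches its own real or effective UID (or any process at all, with CAP_KILL). The kernel fills in si_pid and si_uid for signals sent via kill() (SI_USER) and tkill()/tgkill() (SI_TKILL), so they identify the sender there; for signals queued through the raw rt_sigqueueinfo(2) interface the sender supplies the siginfo_t, so those fields are not trustworthy. In a signal from another process, a positive si_code (for example SI_KERNEL) can only have been set by the kernel.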
  • SIGSEGV as an exploit primitive: a SIGSEGV handler that does longjmp out of the handler is technically undefined behavior, but widely used in JVMs (null pointer handling) and fuzzing harnesses. The interaction between signal stack (sigaltstack) and the restored register state via sigreturn is a historical exploit surface ("SROP" — Sigreturn-Oriented Programming).
  • Signal flooding as DoS: a malicious process with same-UID access can send thousands of RT signals to a target. Queued signals count against RLIMIT_SIGPENDING, a per-user limit (see ulimit -i); once it is reached, sigqueue() fails with EAGAIN.
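  • SIGKILL and resource cleanup: because SIGKILL cannot be caught, anything the kernel does not release on process exit (external state such as server-side database sessions, lock files and PID files, temporary files, System V IPC objects) is left dangling. Kernel-managed locks taken with fcntl(), lockf(), or flock() are released automatically. Design systems to handle abrupt process death.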

Performance Implications

  • Signal delivery overhead: each handled signal requires the kernel to dequeue the pending signal, build an rt_sigframe on the user stack, and return to user space to run the handler, followed by a second kernel entry for sigreturn. Total cost: roughly 1–4 µs per signal on modern hardware.
  • High-frequency signals: using a timer signal as a profiling tick (as gprof does with SIGPROF) limits profiling resolution and adds a delivery cost to every sample. perf uses hardware performance counters via PMU overflow interrupts instead, with much lower overhead.
  • SA_RESTART and latency: applications that need bounded latency (real-time, trading systems) must audit every sigaction call. SA_RESTART can cause a system call to execute for much longer than expected if it is repeatedly interrupted and restarted.
  • Handlers vs. signalfd: signalfd + epoll integrates signal handling with I/O in a single event loop thread, so signals are processed at a well-defined point in the loop instead of interrupting arbitrary code with a handler frame. For epoll-based event loops on Linux, this is often the simpler pattern.

Failure Modes

Failure                     Symptom                                     Root cause
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Deadlock in signal handler  Process hangs, gdb shows malloc lock in bt  Non-async-safe function in handler
SIGCHLD lost                Zombie accumulation despite handler         Several child exits coalesce into one SIGCHLD; fix by reaping in a loop with WNOHANG
EINTR not handled           Spurious errors in production               Syscall returns EINTR, caller doesn't retry
Core dump missing           Crash with no diagnosis                     RLIMIT_CORE=0, suid_dumpable=0, or core_pattern misconfigured
RT signal queue overflow    sigqueue returns EAGAIN                     Queued signals exceeded RLIMIT_SIGPENDING; process slow to handle
Signal to wrong thread      Handler runs in unintended thread           kill() delivers to arbitrary thread; use tgkill() for specific thread
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
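
The "SIGCHLD lost" row is worth seeing in code. In this sketch (reap_burst and on_sigchld are hypothetical names; a process with no other children is assumed), n children exit while SIGCHLD is blocked, so their notifications coalesce into a single pending signal. The handler still reaps every child because it loops on waitpid() with WNOHANG instead of calling it once.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t handler_runs = 0;
static volatile sig_atomic_t children_reaped = 0;

/* waitpid(2) is async-signal-safe; loop until no exited child remains. */
static void on_sigchld(int sig)
{
    int saved_errno = errno;
    (void)sig;
    handler_runs++;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        children_reaped++;
    errno = saved_errno;
}

/* Fork n children (1..16) that exit at once while SIGCHLD is blocked, then
 * unblock and report handler runs and children reaped. Returns 0 or -1. */
int reap_burst(int n, int *runs, int *reaped)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask;
    siginfo_t info;
    pid_t pids[16];
    int i, ok = 0;

    if (n < 1 || n > 16) return -1;
    handler_runs = 0;
    children_reaped = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, &old_sa) == -1) return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &old_mask) == -1) return -1;

    for (i = 0; i < n; i++) {
        pids[i] = fork();
        if (pids[i] == 0)
            _exit(0);
        if (pids[i] == -1)
            ok = -1;
    }

    /* WNOWAIT: wait until each child has exited without reaping it. */
    for (i = 0; i < n; i++)
        if (pids[i] > 0 &&
            waitid(P_PID, (id_t)pids[i], &info, WEXITED | WNOWAIT) == -1)
            ok = -1;

    /* The single pending SIGCHLD is delivered when the mask is restored. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);

    *runs = (int)handler_runs;
    *reaped = (int)children_reaped;
    sigaction(SIGCHLD, &old_sa, NULL);
    return ok;
}
```

With n = 3 the handler is expected to run once and reap all three children; a handler that called waitpid() only once would leave two zombies behind.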

Modern Usage

Systemd and SIGTERM/SIGKILL sequence: systemctl stop sends SIGTERM, waits TimeoutStopSec (default 90s), then sends SIGKILL. Services must handle SIGTERM for graceful shutdown. Use KillSignal= in the unit file to send a different initial signal to daemons that use another signal for graceful stop (for example, nginx treats SIGQUIT as graceful shutdown and SIGTERM as fast shutdown).

Go runtime signals: the Go runtime installs handlers for the synchronous signals SIGSEGV, SIGBUS, and SIGFPE and converts them into run-time panics. By default SIGHUP, SIGINT, and SIGTERM cause the program to exit; a program intercepts them with signal.Notify from the os/signal package. Sending SIGQUIT to a Go process makes it exit with a dump of all goroutine stacks (useful for debugging a hung process).

Java JVM signals: HotSpot reserves some signals for internal use on Linux, including SIGUSR1 and SIGUSR2. Do not send them to a JVM unless you know what you're doing. Use kill -3 <JVM_PID> (SIGQUIT) for a thread dump.


Future Directions

  • signalfd replacement via io_uring: proposal to handle signals as io_uring completions, integrating with the async I/O submission queue for zero-syscall signal consumption in tight event loops.
  • Safer signal delivery ordering: the POSIX model for which thread receives a process-directed signal is deliberately vague. Proposals for explicit signal routing (always to a specific nominated thread) would eliminate a class of races.
  • BPF signal programs: bpf_send_signal() kernel helper allows eBPF programs to send signals to processes being traced, enabling complex policy-based signal delivery from BPF programs without a userspace intermediary.

Exercises

  1. SA_RESTART audit: write a C program that installs a SIGALRM handler (using alarm(1)) without SA_RESTART. Show that read() on stdin returns EINTR. Then add SA_RESTART and verify read() no longer returns EINTR. Use strace to observe the difference in the syscall trace.

  2. Async-signal-safe crash: write a C program that holds a pthread_mutex in the main thread and then raises SIGUSR1. The handler attempts pthread_mutex_lock(). Observe the deadlock. Fix it using the volatile sig_atomic_t flag pattern.

  3. signalfd event loop: implement a minimal event loop using epoll that handles three fds: stdin (readable events), a timerfd firing every second, and a signalfd for SIGINT/SIGTERM. On SIGTERM, print statistics and exit cleanly.

  4. RT signal queue: write two programs: a sender that calls sigqueue() in a tight loop sending SIGRTMIN with an incrementing value, and a receiver using sigwaitinfo() to consume them. Measure how often sigqueue() fails with EAGAIN as you increase the sender's rate, and check whether any gaps appear in the value sequence the receiver sees.

  5. SROP awareness: research Sigreturn-Oriented Programming (SROP). Set up a test binary that installs a SIGSEGV handler and uses sigreturn() manually (bypassing the normal sigreturn trampoline). Explain why kernel mitigations like shadow stacks and SA_RESTORER validation make this attack class harder on modern kernels.


References

  • kernel/signal.c: send_signal(), get_signal()
  • arch/x86/kernel/signal.c: signal frame setup (setup_rt_frame()), sigreturn system call
  • include/uapi/asm-generic/signal.h: signal numbers
  • include/uapi/linux/signalfd.h, fs/signalfd.c: signalfd implementation
  • Kerrisk, The Linux Programming Interface — Chapters 20–22 (signals), 63 (signalfd)
  • Stevens & Rago, Advanced Programming in the UNIX Environment — Chapter 10
  • man 2 sigaction, man 2 sigprocmask, man 2 kill, man 2 tgkill, man 2 signalfd, man 7 signal
  • POSIX.1-2017: <signal.h>, async-signal-safe function list
  • "Sigreturn-Oriented Programming" — Erik Bosman, HitB 2014
  • LWN: "Signals and threads" series