02 — SELinux

Technical Overview

SELinux (Security-Enhanced Linux) is a mandatory access control (MAC) security architecture developed by the NSA and released to the Linux community in 2000, mainlined in kernel 2.6.0 (2003). It implements Type Enforcement (TE) as its core policy mechanism: every process and file has a security context (label), and a policy defines which context-to-context interactions are permitted.

The philosophical distinction from traditional Unix access control (DAC — Discretionary Access Control) is fundamental: in DAC, the owner of a resource decides who can access it. In MAC, a central policy administrator decides, and that policy cannot be overridden by any user, including root. A compromised root process constrained by SELinux policy can do only what the policy permits.

Prerequisites

Linux DAC model (user/group/other, rwx bits, setuid).
Process and file concepts (inodes, descriptors).
Linux kernel module concepts.
Basic familiarity with RPM/Debian packaging.

Core Content

DAC vs. MAC

Discretionary Access Control (traditional Unix):

file: /etc/shadow
permissions: 640, root:shadow
→ root can read/write
→ any process running as root can read /etc/shadow
→ a compromised web server running as root can read all files

Mandatory Access Control (SELinux):

file: /etc/shadow
SELinux context: system_u:object_r:shadow_t:s0
→ only domains that the policy grants read permission on shadow_t can read it
→ a web server (type httpd_t) does NOT have read permission on shadow_t
→ a compromised httpd cannot read /etc/shadow regardless of UID
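
The layering of the two checks can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how the kernel implements either check: the DAC function ignores groups and capabilities, and the single policy rule in POLICY is illustrative rather than copied from a real policy. The names File, POLICY, dac_allows_read, mac_allows_read, and can_read are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class File:
    owner_uid: int
    mode: int      # permission bits, e.g. 0o640
    setype: str    # SELinux type of the file, e.g. "shadow_t"


# Toy type-enforcement policy: (source type, target type, class) -> permissions.
# Illustrative only; a real policy contains many thousands of rules.
POLICY = {
    ("httpd_t", "httpd_sys_content_t", "file"): {"read", "open", "getattr"},
}


def dac_allows_read(uid: int, f: File) -> bool:
    """Simplified DAC: root bypasses mode bits; group bits are ignored."""
    if uid == 0:
        return True
    if uid == f.owner_uid:
        return bool(f.mode & 0o400)
    return bool(f.mode & 0o004)


def mac_allows_read(domain: str, f: File) -> bool:
    """MAC: allowed only if an explicit rule exists (default deny)."""
    return "read" in POLICY.get((domain, f.setype, "file"), set())


def can_read(uid: int, domain: str, f: File) -> bool:
    """Both layers must agree; MAC is consulted in addition to DAC."""
    return dac_allows_read(uid, f) and mac_allows_read(domain, f)


shadow = File(owner_uid=0, mode=0o640, setype="shadow_t")
print(dac_allows_read(0, shadow))        # DAC alone lets UID 0 through
print(can_read(0, "httpd_t", shadow))    # MAC still denies httpd_t
```

The point of the sketch is that UID 0 satisfies the DAC check but has no bearing on the MAC lookup, which depends only on the process's domain and the file's type.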

SELinux Architecture

System Call
    │
    ▼
Linux VFS / Network Stack / etc.
    │
    ▼
LSM Hooks (Linux Security Module)
    │  (200+ hook points: file_open, socket_connect, task_alloc, etc.)
    ▼
SELinux Security Server
    │  (evaluates policy against request)
    │
    ├── AVC (Access Vector Cache)
    │     hit: return cached decision (fast path, ~100 ns)
    │     miss: consult policy, update cache
    │
    └── Policy Database
          (loaded from /etc/selinux/ at boot)
    │
    ▼
Allow / Deny / Audit

The AVC (Access Vector Cache) is critical for performance. Policy decisions are binary (allow/deny) and cacheable by (source type, target type, object class). Cache hit rate is typically > 99% in steady state.
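
To make the caching behavior concrete, here is a minimal sketch of a cache keyed by (source type, target type, object class). It is a toy model: the class name ToyAVC and its methods are hypothetical, the real AVC is a kernel hash table keyed by security IDs with eviction, and the policy used here is a plain dictionary.

```python
class ToyAVC:
    """Caches the set of allowed permissions per (source, target, class)."""

    def __init__(self, policy):
        self.policy = policy   # (source, target, class) -> set of permissions
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def has_perm(self, source, target, tclass, perm):
        key = (source, target, tclass)
        if key in self.cache:
            self.hits += 1
        else:
            # Miss: consult the policy once, then remember the whole
            # access vector (an empty set means "nothing allowed").
            self.misses += 1
            self.cache[key] = frozenset(self.policy.get(key, ()))
        return perm in self.cache[key]

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


demo = ToyAVC({("httpd_t", "httpd_sys_content_t", "file"): {"read", "open"}})
for _ in range(1000):
    demo.has_perm("httpd_t", "httpd_sys_content_t", "file", "read")
print(demo.misses, demo.hits)   # one miss fills the cache, the rest are hits
```

Because a long-running service keeps repeating the same few (source, target, class) combinations, almost every lookup after warm-up is served from the cache, which is the intuition behind the high steady-state hit rate.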

SELinux Security Context

Every process and object has a four-field context:

user:role:type:level
  │     │     │    │
  │     │     │    └── MLS/MCS sensitivity level (s0, s0-s15:c0.c1023)
  │     │     └─────── Type (the core enforcement element)
  │     └───────────── Role (constrains which types a user can assume)
  └─────────────────── SELinux user (distinct from Linux user)

Examples:
  Process (Apache):   system_u:system_r:httpd_t:s0
  File (/var/www):    system_u:object_r:httpd_sys_content_t:s0
  File (/etc/shadow): system_u:object_r:shadow_t:s0
  Admin shell:        unconfined_u:unconfined_r:unconfined_t:s0

View contexts:

# Process contexts
ps auxZ | grep httpd

# File contexts
ls -Z /etc/shadow
# system_u:object_r:shadow_t:s0 /etc/shadow

# Your current context
id -Z
# unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
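
When scripting around these commands it helps to split a context into its four fields. The sketch below assumes an MLS/MCS-style context with all four fields present; the names Context and parse_context are hypothetical helpers, not part of any SELinux library. The level field can itself contain colons, so the split is limited to the first three separators.

```python
from typing import NamedTuple


class Context(NamedTuple):
    user: str
    role: str
    type: str
    level: str


def parse_context(text: str) -> Context:
    """Split 'user:role:type:level' into fields, keeping colons in the level."""
    parts = text.strip().split(":", 3)
    if len(parts) != 4:
        raise ValueError(f"expected user:role:type:level, got {text!r}")
    return Context(*parts)


ctx = parse_context("system_u:object_r:shadow_t:s0")
print(ctx.type)    # shadow_t
```

A three-field context (as seen on systems without MLS/MCS support) raises ValueError in this sketch; handling that case is left out deliberately.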

Type Enforcement Policy

The policy is a set of allow rules of the form:

allow source_type target_type:object_class { permissions };

Examples:

# Allow httpd to read files labeled httpd_sys_content_t
allow httpd_t httpd_sys_content_t:file { read open getattr };

# Allow httpd to connect TCP to ports labeled http_cache_port_t
allow httpd_t http_cache_port_t:tcp_socket { name_connect };

# Prevent httpd from reading /etc/shadow (shadow_t)
# (no allow rule = implicit deny)

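The compiled binary policy lives at /etc/selinux/targeted/policy/policy.NN, where NN is the policy version (for example, policy.33). SELinux is built into the kernel rather than loaded as a module; the init system loads the policy file into the kernel early in boot.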
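
The default-deny semantics of allow rules can be illustrated with a small parser and lookup. This is a sketch for the simple rule shape shown above only: it does not handle attributes, type sets, or the other constructs of the real policy language, and parse_allow, build_policy, and allowed are hypothetical names.

```python
import re

# Matches: allow <source> <target>:<class> { perm ... };  or a single perm.
ALLOW_RE = re.compile(
    r"^\s*allow\s+(\w+)\s+(\w+)\s*:\s*(\w+)\s+(?:\{([^}]*)\}|(\w+))\s*;\s*$"
)


def parse_allow(rule):
    """Return ((source, target, class), frozenset(permissions))."""
    m = ALLOW_RE.match(rule)
    if not m:
        raise ValueError(f"unsupported rule: {rule!r}")
    source, target, tclass, braced, single = m.groups()
    perms = braced.split() if braced is not None else [single]
    return (source, target, tclass), frozenset(perms)


def build_policy(rules):
    """Merge rules with the same key, as repeated allow rules are additive."""
    policy = {}
    for rule in rules:
        key, perms = parse_allow(rule)
        policy[key] = policy.get(key, frozenset()) | perms
    return policy


def allowed(policy, source, target, tclass, perm):
    """No matching rule means the access is denied."""
    return perm in policy.get((source, target, tclass), frozenset())


rules = [
    "allow httpd_t httpd_sys_content_t:file { read open getattr };",
    "allow httpd_t http_cache_port_t:tcp_socket { name_connect };",
]
policy = build_policy(rules)
print(allowed(policy, "httpd_t", "httpd_sys_content_t", "file", "read"))
print(allowed(policy, "httpd_t", "shadow_t", "file", "read"))
```

The second lookup fails without any explicit deny rule, which mirrors the "no allow rule = implicit deny" comment in the examples.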

SELinux Modes

# Check current mode
getenforce
# Enforcing | Permissive | Disabled

# Set mode temporarily
setenforce 0  # Permissive (log denials, don't enforce)
setenforce 1  # Enforcing

# Persistent mode (takes effect at the next boot), set in /etc/selinux/config:
# SELINUX: enforcing, permissive, or disabled
SELINUX=enforcing
# SELINUXTYPE: targeted (default), minimum, or mls
SELINUXTYPE=targeted

Permissive mode is invaluable for troubleshooting: SELinux logs what it would have denied but does not block anything. Use permissive mode to audit a new application before enforcing.

Targeted policy (default on RHEL/Fedora/CentOS): only specific system services (httpd, sshd, etc.) are confined. User processes run as unconfined_t, which is nearly unrestricted. This is the pragmatic balance between security and usability.

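Strict policy (historical): all processes confined, including user shells. Maximum security but requires extensive policy work for new applications. Current Fedora/RHEL releases no longer ship a separate strict policy; comparable confinement is achieved within the targeted policy by mapping Linux users to confined SELinux users such as user_u or staff_u.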
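
The difference between the modes comes down to what happens when policy says no. The sketch below models that decision and reads the mode from configuration text; it is a simplification that ignores dontaudit and auditallow rules and per-domain permissive settings. The function names parse_selinux_config and decide are hypothetical.

```python
def parse_selinux_config(text):
    """Parse KEY=value lines; blank lines and lines starting with '#' are skipped."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def decide(mode, policy_allows):
    """Return (access_permitted, denial_logged) for one access check."""
    if mode == "disabled":
        return True, False        # SELinux is not consulted at all
    if policy_allows:
        return True, False
    # Policy says no: permissive lets the access through but still logs it.
    return mode == "permissive", True


config = parse_selinux_config("# sample\nSELINUX=permissive\nSELINUXTYPE=targeted\n")
print(decide(config["SELINUX"], policy_allows=False))
```

In permissive mode the second element of the tuple is the useful part: the access succeeds, but the denial that enforcing mode would have produced is still recorded for review.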

SELinux Access Control Flow

Process (httpd_t) attempts to open /etc/shadow (shadow_t)
     │
     ▼
VFS: file_open LSM hook triggered
     │
     ▼
SELinux security server receives:
  source_sid: httpd_t
  target_sid: shadow_t
  class:      file
  permission: open
     │
     ▼
AVC lookup (httpd_t, shadow_t, file, open)
     │
     ├── AVC HIT: return cached decision
     │
     └── AVC MISS: search policy for allow rules
           "allow httpd_t shadow_t:file open;" → NOT FOUND
                    │
                    ▼
          DENY + AVC_AUDIT log entry:
          avc:  denied  { open } for pid=1234 comm="httpd"
                name="shadow" dev="sda1"
                scontext=system_u:system_r:httpd_t:s0
                tcontext=system_u:object_r:shadow_t:s0
                tclass=file permissive=0
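
Denial records like the one above are regular enough to parse mechanically. The following sketch extracts the permission list and the key=value fields from a single message; it assumes the layout shown in this section and is not a replacement for ausearch or sealert. The function name parse_avc is hypothetical.

```python
import re


def parse_avc(message):
    """Return a dict of fields from one AVC message (may span several lines)."""
    text = " ".join(message.split())   # collapse line breaks and extra spaces
    m = re.search(r"avc:\s*(denied|granted)\s*\{([^}]*)\}", text)
    if not m:
        raise ValueError("not an AVC message")
    fields = {"result": m.group(1), "perms": m.group(2).split()}
    # Values are either double-quoted strings or bare tokens.
    for key, quoted, bare in re.findall(r'(\w+)=(?:"([^"]*)"|(\S+))', text):
        fields[key] = quoted if quoted else bare
    return fields


sample = '''avc:  denied  { open } for pid=1234 comm="httpd"
      name="shadow" dev="sda1"
      scontext=system_u:system_r:httpd_t:s0
      tcontext=system_u:object_r:shadow_t:s0
      tclass=file permissive=0'''
record = parse_avc(sample)
print(record["perms"], record["scontext"].split(":")[2], record["tclass"])
```

The three values printed at the end (permission, source type, object class), together with the target type, are exactly the tuple the AVC lookup is keyed on in the flow above.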

SELinux Policy Modules

Policies are organized as modules (.pp — policy package). Write and load a new module:

# 1. Generate a candidate module from logged denials
#    (-M writes my_module.te and a compiled my_module.pp)
audit2allow -a -M my_module

# 2. Review the generated source before installing anything
#    (CRITICAL: the rules often allow more than the application needs)
cat my_module.te

# 3. If you edited my_module.te, recompile it
make -f /usr/share/selinux/devel/Makefile my_module.pp

# 4. Install module
semodule -i my_module.pp

# 5. Verify
semodule -l | grep my_module

audit2allow caution: the generated rules grant everything that was denied, including denials caused by mislabeled files or by an attack in progress, so they are often broader than the application needs. Always review and tighten. Never blindly apply audit2allow -a output: this is a common mistake that nullifies SELinux protection.
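
Part of that review can be automated. The sketch below scans the text of a generated .te file and reports allow rules whose target type is on a watch list. The watch list SENSITIVE_TYPES is a hypothetical starting point chosen for illustration, not an authoritative set, and the regular expression only covers the simple rule shape audit2allow typically emits. A clean result does not mean the module is safe.

```python
import re

# Hypothetical watch list; extend it for your environment.
SENSITIVE_TYPES = frozenset({"shadow_t", "passwd_file_t", "sshd_key_t"})

RULE_RE = re.compile(r"^\s*allow\s+(\w+)\s+(\w+)\s*:\s*(\w+)\s+(?:\{[^}]*\}|\w+)\s*;")


def risky_rules(te_text, sensitive=SENSITIVE_TYPES):
    """Return the allow rules in te_text that target a watched type."""
    findings = []
    for line in te_text.splitlines():
        m = RULE_RE.match(line)
        if m and m.group(2) in sensitive:
            findings.append(line.strip())
    return findings


generated = """
#============= httpd_t ==============
allow httpd_t shadow_t:file { open read };
allow httpd_t httpd_sys_content_t:file getattr;
"""
for rule in risky_rules(generated):
    print("REVIEW:", rule)
```

A rule flagged this way usually points to a mislabeled file or a compromised process rather than a legitimate need, so the right response is to investigate the denial, not to install the module.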

Custom File Context

When deploying an application in a non-standard location:

# Check current context
ls -Z /opt/myapp/conf/

# Set context permanently (survives relabeling)
semanage fcontext -a -t httpd_config_t '/opt/myapp/conf(/.*)?'

# Apply to existing files
restorecon -Rv /opt/myapp/conf/

# Verify
ls -Z /opt/myapp/conf/
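
File-context specifications are regular expressions matched against the whole path, which is why the pattern above ends in (/.*)? to cover both the directory and everything beneath it. The sketch below checks that behavior with Python's re.fullmatch as a stand-in; the real matching is done by the SELinux userspace libraries, and the helper name spec_matches is hypothetical.

```python
import re


def spec_matches(spec, path):
    """True if the file-context regex matches the entire path."""
    return re.fullmatch(spec, path) is not None


SPEC = r"/opt/myapp/conf(/.*)?"
for candidate in (
    "/opt/myapp/conf",
    "/opt/myapp/conf/app.ini",
    "/opt/myapp/conf.bak",
    "/opt/myapp/config",
):
    print(candidate, spec_matches(SPEC, candidate))
```

The last two paths do not match, so sibling files such as conf.bak keep whatever default label their parent directory implies.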

Boolean Policies

SELinux ships with toggleable booleans for common use cases:

# List available booleans
getsebool -a | grep httpd

# Allow httpd to make network connections
setsebool -P httpd_can_network_connect on

# Allow httpd to read home directories
setsebool -P httpd_read_user_content on

# Allow login and authentication programs to query LDAP through nsswitch
setsebool -P authlogin_nsswitch_use_ldap on

Booleans avoid writing custom policy modules for common configurations.
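
For configuration audits, the output of getsebool -a can be turned into a dictionary and compared against an expected baseline. The sketch assumes the usual "name --> on/off" line format; parse_getsebool and drift are hypothetical helper names, and the sample text is made up for illustration.

```python
def parse_getsebool(output):
    """Map boolean name -> True/False from 'name --> on|off' lines."""
    booleans = {}
    for line in output.splitlines():
        name, sep, state = line.partition("-->")
        if sep:
            booleans[name.strip()] = state.strip() == "on"
    return booleans


def drift(actual, expected):
    """Return the names whose actual state differs from the expected one."""
    return sorted(n for n, want in expected.items() if actual.get(n) != want)


sample_output = """httpd_can_network_connect --> off
httpd_read_user_content --> on
"""
state = parse_getsebool(sample_output)
print(drift(state, {"httpd_can_network_connect": False,
                    "httpd_read_user_content": False}))
```

Running a check like this from configuration management catches booleans that were switched on during troubleshooting and never switched back.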

SELinux for Containers

Container isolation relies on Linux namespaces and cgroups—but these do not prevent container breakout via kernel vulnerabilities. SELinux provides an additional layer:

Containers in Podman/Docker run as container_t by default.
SELinux policy restricts container_t from accessing host files, host processes, and sensitive kernel interfaces.
Even a root process inside a container is confined to container_t permissions.

# Verify Docker uses SELinux
docker info | grep -i selinux

# Run container with custom SELinux label
podman run --security-opt label=type:container_t myimage

# Check container process context
ps auxZ | grep container
# system_u:system_r:container_t:s0:c123,c456 ...

Container file labels (using MCS, Multi-Category Security):

Each container gets a unique c123,c456 category pair.
Prevents container A from reading container B's files even if both run as container_t.
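
The separation works because access requires the process's category set to include every category on the file. A minimal sketch of that comparison follows; it ignores sensitivity levels and models only the category part, whereas real policy expresses the rule through MLS/MCS constraints. The names categories and mcs_dominates are hypothetical.

```python
def categories(level):
    """Return the category numbers in a level such as 's0:c123,c456'."""
    _, _, cats = level.partition(":")
    result = set()
    for item in filter(None, cats.split(",")):
        first, dot, last = item.partition(".")   # 'c0.c1023' is a range
        low = int(first[1:])
        high = int(last[1:]) if dot else low
        result.update(range(low, high + 1))
    return result


def mcs_dominates(subject_level, object_level):
    """True if the subject holds every category assigned to the object."""
    return categories(subject_level) >= categories(object_level)


container_a = "s0:c123,c456"
container_b = "s0:c12,c789"
print(mcs_dominates(container_a, container_a))   # own files
print(mcs_dominates(container_a, container_b))   # another container's files
```

Two containers with different category pairs fail the check in both directions, while a process holding the full range c0.c1023 (such as an unconfined administrator shell) passes it for any container.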

Real Benefit: CVE-2016-5195 (Dirty COW)

Dirty COW was a race condition in Linux's copy-on-write handling that allowed a local attacker to write to any read-only memory-mapped file, gaining privilege escalation. It affected every Linux kernel from 2.6.22 to 4.8.

Without SELinux: the write-to-read-only exploit allows modifying /etc/passwd or /etc/shadow, granting root. Exploits were widely available.

With SELinux in enforcing mode: the exploit still needs only two steps:

1. Open and mmap() the target file read-only.
2. Write to that read-only mapping via the race (the kernel bug).

The racing write happens inside the kernel's memory-management code and never passes through SELinux's file write check, so SELinux does not block the bug itself and logs no write denial. What SELinux controls is the exploit's reach. Step 1 is an ordinary open for reading, which policy mediates: a confined process (httpd_t) cannot open /etc/shadow (shadow_t), so it cannot target that file at all, and the attempt is logged:

avc: denied { open } for pid=1234 comm="httpd"
     name="shadow" scontext=system_u:system_r:httpd_t:s0
     tcontext=system_u:object_r:shadow_t:s0

Files the domain is already allowed to read remain exposed, and whatever the attacker gains still runs in the httpd_t domain with that domain's permissions. SELinux therefore limits exploitation rather than preventing it: containment in practice. The actual fix was the kernel patch; Red Hat's advisory for CVE-2016-5195 gives the vendor's mitigation guidance.

Historical Context

The NSA's Flux Advanced Security Kernel (Flask) research (1990s) formalized the security model that became SELinux. Flask was ported to Linux and released in 2000. Peter Loscocco and Stephen Smalley were the primary NSA authors.

SELinux was controversial at its mainlining: the initial patch was enormous (~10K lines), and Linus Torvalds was skeptical about a mandatory-access-control system in the kernel. The introduction of the LSM (Linux Security Module) framework as an abstraction layer (2001, by Chris Wright and Crispin Cowan) resolved the architectural objection—SELinux became one implementation of LSM hooks rather than a direct kernel modification.

AppArmor (Novell, 2006; Ubuntu default) and TOMOYO Linux (NTT, 2009) are alternative LSM-based MAC systems using path-based (rather than inode-label-based) policies, which are simpler to configure but less expressive.

Production Examples

Case: Apache configuration drift mitigation. A Red Hat enterprise deployment configured SELinux to confine httpd_t. After a configuration management failure deployed a PHP shell as /var/www/html/shell.php, an attacker accessed it. The shell attempted to:

1. Read /etc/passwd: denied (passwd_file_t, not allowed for httpd_t in that deployment's policy).
2. Execute /bin/bash: denied (no execute permission from httpd_t to shell_exec_t).
3. Connect to external C2: denied (httpd_can_network_connect was off).

The attacker had code execution but could do little with it. The incident was detected via AVC audit logs within 72 hours.

Case: SSH configuration breakage after manual file move. An administrator replaced /etc/ssh/sshd_config with a file edited elsewhere and moved into place with mv, which keeps the label from the file's original location instead of the one expected under /etc/ssh. sshd failed to start. The fix: restorecon -v /etc/ssh/sshd_config. This is the most common SELinux "break" in practice: not a policy gap, but a file context mismatch from manual operations.

Debugging Notes

# View recent AVC denials
ausearch -m avc -ts recent

# View denials in audit.log
grep avc /var/log/audit/audit.log | tail -20

# sealert: human-readable denial analysis with remediation suggestions
sealert -a /var/log/audit/audit.log

# Check if SELinux is blocking a specific command
strace -f ./mycommand 2>&1 | grep -E 'EACCES|EPERM'  # permission errors returned to the process
# Then check dmesg or audit.log for AVC denial

# Temporarily set domain to permissive (for debugging only)
semanage permissive -a httpd_t  # put httpd_t in permissive mode
semanage permissive -d httpd_t  # restore

Security Implications

SELinux is not a replacement for other security controls; it is a last line of defense after exploitation. Key limitations:

Unconfined domains: unconfined_t in targeted policy is nearly unrestricted, and most user-interactive processes run unconfined.
Policy mistakes: overly permissive allow rules (from audit2allow misuse) can negate protection.
Kernel vulnerabilities: if the attacker can exploit the SELinux enforcement code itself, bypass the LSM hooks, or gain arbitrary kernel memory writes (enough to switch enforcement off), SELinux provides no protection.

The security model relies on the correctness of the policy. Policy auditing is specialized work.

Performance Implications

AVC cache hit rate: > 99% in steady state. The performance cost of SELinux in enforcing mode is typically 1–3% for I/O-intensive workloads and near zero for compute-bound workloads.

AVC miss is expensive (~10 µs policy lookup). Workloads that constantly create new file contexts (e.g., high-turnover file creation with unique labels) may see AVC misses more frequently. Use consistent file contexts.

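SELinux keeps each inode's security ID (SID) in the in-memory inode's security data, initialized from the security.selinux extended attribute the first time the inode is used. Large directory trees with millions of inodes therefore add memory and first-access cost in proportion to the number of inodes touched. AVC behavior itself (lookups, misses, reclaims) can be monitored via /sys/fs/selinux/avc/cache_stats.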
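
The counters in /sys/fs/selinux/avc/cache_stats can be reduced to a single hit rate. The sketch below assumes the file consists of a header row naming the columns (including lookups and misses) followed by one row of counters per CPU; check the layout on your kernel before relying on it. The function name avc_hit_rate is hypothetical and the sample numbers are invented.

```python
def avc_hit_rate(stats_text):
    """Sum per-CPU rows and return (lookups - misses) / lookups."""
    lines = stats_text.strip().splitlines()
    header = lines[0].split()
    totals = dict.fromkeys(header, 0)
    for row in lines[1:]:
        for name, value in zip(header, row.split()):
            totals[name] += int(value)
    lookups = totals["lookups"]
    return (lookups - totals["misses"]) / lookups if lookups else 0.0


# Invented sample with two CPUs; on a real system, read the file instead:
#   open("/sys/fs/selinux/avc/cache_stats").read()
sample_stats = """lookups hits misses allocations reclaims frees
1000 990 10 10 0 0
3000 2990 10 10 0 0
"""
print(f"{avc_hit_rate(sample_stats):.3%}")
```

The counters are cumulative since boot, so sampling the file twice and comparing the differences gives a better picture of current behavior than a single reading.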

Failure Modes and Real Incidents

The "disable SELinux" anti-pattern. A survey of enterprise deployments found that approximately 40% of RHEL systems had SELinux disabled, typically because an application vendor's documentation said "disable SELinux" to resolve installation issues. This represents a systematic failure of the tooling and education around SELinux, not a limitation of SELinux itself.

Apache + mod_wsgi context mismatch. A Python web application deployed in /srv/app failed to start in enforcing mode because the directory had default_t context. The developer disabled SELinux. The correct fix: semanage fcontext -a -t httpd_sys_content_t '/srv/app(/.*)?' + restorecon. SELinux was correct; the deployment was incomplete.

Modern Usage

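SELinux is the default MAC system on RHEL, Fedora, and CentOS, where it ships in enforcing mode. Amazon Linux 2 and 2023 include SELinux support but do not enforce it by default (Amazon Linux 2023 boots in permissive mode). Ubuntu uses AppArmor by default. Both provide MAC; SELinux is generally considered more expressive and more battle-hardened for enterprise use.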

OpenShift, Red Hat's Kubernetes distribution, uses SELinux extensively: its security context constraints (SCCs) assign SELinux MCS categories to pods, so that pods from different projects on the same node cannot access each other's files.

Future Directions

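SELinux policy language modernization: CIL (Common Intermediate Language) is the newer policy source format, integrated into the SELinux userspace toolchain in release 2.4 (2015). It is more readable and maintainable than compiled .pp policy packages and is compiled into the same binary kernel policy.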
Fedora/RHEL plans: expanding the targeted policy to cover more user-space applications that currently run as unconfined_t.
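SELinux + eBPF: SELinux mediates the bpf() system call through a dedicated bpf object class (permissions such as prog_load and map_create), available since Linux 4.15, so policy can restrict which domains may load eBPF programs and close a significant privilege escalation vector. Separately, CONFIG_BPF_LSM (Linux 5.7) lets eBPF programs themselves attach to LSM hooks.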

Exercises

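On an SELinux-enabled system, place a file in /tmp and check its context with ls -Z. Move it to /var/www/html/ with mv and observe that it keeps its original context; then copy a second file there with cp and observe that the copy receives the destination's default context. Explain why mv preserves the SELinux context while cp, by default, does not.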
Set SELINUX=permissive in /etc/selinux/config, restart, and attempt to access a file that would normally be denied. Check ausearch -m avc for the log entry. Generate a policy module with audit2allow -a -M testmodule. Review what the module would allow before installing it.
Run ps auxZ on a system with SELinux enabled. Identify which processes are confined (non-unconfined_t domains) and which are unconfined. Explain the security significance of each.
Configure a web server (Apache or Nginx) to serve files from a non-standard directory (e.g., /data/webroot). Use semanage fcontext and restorecon to set the correct context. Verify with ls -Z and test that the server can serve files.
Research CVE-2016-5195 (Dirty COW). Write a two-paragraph analysis of how SELinux's type enforcement would limit (not prevent) exploitation of a confined process (httpd_t) vs. an unconfined process.

References

Loscocco, P., Smalley, S. "Integrating Flexible Support for Security Policies into the Linux Operating System." USENIX Annual Technical Conference, 2001.
NSA SELinux documentation: https://www.nsa.gov/what-we-do/research/selinux/
Red Hat SELinux User's and Administrator's Guide: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/using_selinux/
Mayer, F. et al. "SELinux by Example." Prentice Hall, 2006.
CIL documentation: https://github.com/SELinuxProject/selinux/wiki/CIL-Introduction
CVE-2016-5195 Red Hat advisory: https://access.redhat.com/security/cve/cve-2016-5195
AppArmor vs. SELinux comparison: https://www.kernel.org/doc/html/latest/admin-guide/LSM/index.html