Node.js Event Loop
Technical Overview
Node.js is a JavaScript runtime built on three core components: the V8 JavaScript engine (execution and JIT), libuv (cross-platform asynchronous I/O, event loop, thread pool), and a set of Node.js C++ bindings that expose OS capabilities to JavaScript. The architecture enables high-throughput I/O-bound services on a single OS thread without the overhead of thread-per-request context switching, by using the operating system's native async I/O facilities (epoll on Linux, kqueue on macOS/BSD, IOCP on Windows).
The result is that a single Node.js process can handle tens of thousands of concurrent connections — a feat that would require tens of thousands of OS threads under a blocking I/O model, each consuming ~1MB of stack space and generating excessive context-switch overhead.
Prerequisites
- Understanding of OS async I/O: epoll, kqueue, select
- JavaScript event-driven programming (callbacks, Promises, async/await)
- Basic understanding of thread pools and OS threads
- Familiarity with V8 JIT (see 02-jit-compilation.md for V8 context)
Event Loop Phase Diagram
Node.js Event Loop (libuv)
        +------------------------------+
   +--->| Phase 1: Timers              |
   |    |   setTimeout()               |
   |    |   setInterval()              |
   |    +------------------------------+
   |                   |
   |                   v
   |    +------------------------------+
   |    | Phase 2: Pending Callbacks   |
   |    |   (deferred I/O errors)      |
   |    +------------------------------+
   |                   |
   |                   v
   |    +------------------------------+
   |    | Phase 3: Idle / Prepare      |
   |    |   (internal)                 |
   |    +------------------------------+
   |                   |
   |                   v
   |    +------------------------------+
   |    | Phase 4: Poll                | <-- epoll/kqueue wait
   |    |   I/O callbacks (network,    |     for I/O events
   |    |   file via thread pool)      |
   |    +------------------------------+
   |                   |
   |                   v
   |    +------------------------------+
   |    | Phase 5: Check               |
   |    |   setImmediate()             |
   |    +------------------------------+
   |                   |
   |                   v
   |    +------------------------------+
   |    | Phase 6: Close Callbacks     |
   |    |   socket.destroy()           |
   |    +------------------------------+
   |                   |
   +-------------------+

   nextTick queue + microtask queue are drained
   between each phase
Core Content
Node.js Architecture
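V8 provides JavaScript execution. Its Ignition interpreter compiles JavaScript to bytecode and runs it; functions that become hot are then compiled to optimized machine code by TurboFan. In Node.js, the event loop calls back into V8 to execute JavaScript callbacks.
libuv is the cross-platform async I/O library (official Node.js builds compile it into the node binary rather than loading a separate libuv.so). It implements: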
- The event loop itself
- The async file I/O thread pool (4 threads by default, configurable via UV_THREADPOOL_SIZE)
- Timer management (setTimeout, setInterval)
- DNS resolution (uses the thread pool for dns.lookup)
- setImmediate scheduling
Node.js C++ bindings: C++ code (in src/ of the Node.js source) wraps OS APIs and V8 primitives. The bindings register JavaScript-callable functions via V8's C++ API and handle the marshaling between V8 values and native types.
Event Loop Phases in Detail
Phase 1 — Timers: Executes callbacks registered with setTimeout() and setInterval() whose threshold time has passed. Timer callbacks are not guaranteed to fire at exactly the specified delay — they fire on the next loop iteration after the delay has elapsed, subject to other work in the loop.
Phase 2 — Pending Callbacks: Executes I/O callbacks that were deferred to the next loop iteration (error callbacks from the previous iteration, TCP errors, etc.).
Phase 3 — Idle / Prepare: Internal libuv phases, not used by userspace code.
Phase 4 — Poll: The core I/O phase. The event loop:
1. Calculates how long to block in the OS I/O poll (epoll_wait / kevent / GetQueuedCompletionStatus)
2. Blocks until I/O events arrive or the timeout expires (whichever comes first)
3. Executes I/O completion callbacks for network sockets, resolved thread-pool tasks (file I/O), etc.
4. Stays in the poll phase until the callback queue is exhausted or a system-dependent limit is reached
The poll phase is where the process sleeps when there is no work to do, consuming zero CPU. For a Node.js HTTP server with no active requests, the process is blocked here.
Phase 5 — Check: Executes setImmediate() callbacks. setImmediate fires after poll, meaning it runs after I/O callbacks in the current iteration but before timers in the next. This makes it useful for deferring work until after the current I/O handlers complete.
Phase 6 — Close Callbacks: Socket or handle close callbacks (socket.on('close', ...)).
Microtask queues (since Node.js 11 they drain after every individual callback, which also covers every phase transition, not just between full iterations):
- process.nextTick() queue: drains completely before the loop moves on to the next callback or phase
- Promise microtasks (Promise.then): drain after the nextTick queue
This means process.nextTick callbacks can starve the event loop if they recursively schedule more nextTick callbacks.
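The relative ordering of these queues can be observed with a small experiment. The sketch below is illustrative only: `observeOrder` is a hypothetical helper name, and `fs.stat('.')` is used simply as a convenient way to get into an I/O callback, where the ordering of `setImmediate` versus `setTimeout(fn, 0)` is well defined.

```javascript
const fs = require('fs');

// Records the order in which differently scheduled callbacks run when they
// are all queued from inside an I/O callback (i.e. during the poll phase).
function observeOrder() {
  return new Promise((resolve) => {
    const order = [];
    fs.stat('.', () => {
      // The timer runs last here, so it also resolves the promise.
      setTimeout(() => { order.push('setTimeout'); resolve(order); }, 0);
      setImmediate(() => order.push('setImmediate'));
      Promise.resolve().then(() => order.push('promise'));
      process.nextTick(() => order.push('nextTick'));
      order.push('sync');
    });
  });
}

observeOrder().then((order) => console.log(order.join(' -> ')));
// prints: sync -> nextTick -> promise -> setImmediate -> setTimeout
```

Outside an I/O callback (for example at the top level of a script), the relative order of `setImmediate` and `setTimeout(fn, 0)` is not guaranteed, which is why the experiment schedules everything from within one.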
libuv Thread Pool
Network I/O (TCP, UDP sockets) uses the OS's native non-blocking I/O (epoll/kqueue/IOCP) — no thread pool involved. The kernel tells libuv when a socket is readable/writable; the callback is scheduled in the poll phase.
Thread pool operations (blocking work offloaded to worker threads):
- fs module operations (most of fs.readFile, fs.stat, fs.write, etc.)
- dns.lookup() (uses getaddrinfo(), which is a blocking C library call)
- crypto module operations (pbkdf2, randomFill, scrypt)
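- zlib compression and decompression (the asynchronous APIs)
- User-defined work via napi_create_async_work (worker_threads do not use this pool; each worker runs on its own dedicated thread)
Default thread pool size: 4 threads (UV_THREADPOOL_SIZE=4). This means at most 4 blocking file I/O or DNS operations can proceed simultaneously. In applications with heavy fs or dns.lookup usage, this is frequently the bottleneck. Increase it by setting the environment variable before the process starts, e.g. UV_THREADPOOL_SIZE=64 (the maximum is 1024 since libuv 1.30.0; it was 128 before that).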
The thread pool interaction with the event loop:
1. JavaScript calls fs.readFile(path, cb)
2. Node.js C++ code submits a work item to the libuv thread pool
3. One of the 4 pool threads executes the blocking open()/read() syscall
4. When complete, the result is placed in the poll phase completion queue
5. The event loop's poll phase picks it up and calls the JavaScript callback
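The effect of the pool size can be made visible by submitting more jobs than there are pool threads. The following is a rough sketch, not a benchmark: `runJobs` is a hypothetical helper, and the password, salt and iteration count are arbitrary values chosen only to make each job take a measurable amount of time.

```javascript
const crypto = require('crypto');

// Submits `count` pbkdf2 jobs at once and reports when each one finishes,
// measured in milliseconds from submission. All parameters are illustrative.
function runJobs(count, iterations = 50000) {
  const start = process.hrtime.bigint();
  const jobs = Array.from({ length: count }, (_, id) =>
    new Promise((resolve, reject) => {
      crypto.pbkdf2('password', 'salt', iterations, 64, 'sha512', (err) => {
        if (err) return reject(err);
        const ms = Number(process.hrtime.bigint() - start) / 1e6;
        resolve({ id, ms: Math.round(ms) });
      });
    }));
  return Promise.all(jobs);
}

runJobs(8).then((results) => console.table(results));
```

On a machine with at least four cores and the default pool size, the completion times tend to cluster in two waves of four. Starting the process with a larger UV_THREADPOOL_SIZE would typically change that pattern, although the exact timings depend on the hardware.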
Single-Threaded Performance Model
The single-threaded model provides:
- No lock contention: Only one piece of JavaScript runs at a time on the main thread, so ordinary JavaScript code cannot have data races and needs no mutexes for shared data structures.
- No context switching overhead: A thread-per-connection model with 10,000 connections involves 10,000 OS threads. Each context switch costs on the order of microseconds (figures of ~5–10 µs are commonly quoted), and under heavy load context switching can consume a large share of CPU time (30–50% in bad cases). Node.js has one thread for all 10,000 connections.
- Predictable execution order: The event loop's phase model makes callback ordering deterministic within a tick.
The cost: CPU-bound work blocks the event loop for all concurrent connections. A 100ms synchronous computation causes 100ms latency spikes for every other request.
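This cost is easy to reproduce. In the sketch below, a 10 ms timer is scheduled and the thread is then kept busy; the helper names `blockFor` and `measureTimerDelay` are made up for this illustration, and the busy-wait stands in for any synchronous computation.

```javascript
// Busy-waits for `ms` milliseconds, standing in for CPU-bound work.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* spin */ }
}

// Schedules a 10 ms timer, then blocks the thread; resolves with how long
// the timer actually took to fire.
function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const scheduled = Date.now();
    setTimeout(() => resolve(Date.now() - scheduled), 10);
    blockFor(blockMs);
  });
}

measureTimerDelay(100).then((elapsed) =>
  console.log(`10 ms timer fired after ${elapsed} ms`));
```

The timer cannot fire until the synchronous work returns control to the event loop, so the reported delay is at least the blocking time rather than the requested 10 ms. Every pending request callback is delayed in the same way.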
Blocking the Event Loop
The most critical Node.js performance anti-pattern:
// WRONG: Blocks the event loop for all requests
// (POST is used because the password travels in the request body)
const crypto = require('crypto');
const fs = require('fs');
app.post('/hash', (req, res) => {
  // pbkdf2Sync blocks for ~200ms per call
  const hash = crypto.pbkdf2Sync(req.body.password, 'salt', 100000, 64, 'sha256');
  res.send(hash.toString('hex'));
});
// CORRECT: Uses the thread pool
app.post('/hash', (req, res) => {
  crypto.pbkdf2(req.body.password, 'salt', 100000, 64, 'sha256', (err, hash) => {
    if (err) return res.status(500).send('hashing failed'); // never ignore err
    res.send(hash.toString('hex'));
  });
});
// ALSO WRONG: Synchronous read and JSON parsing of a large payload blocks the event loop
const huge = JSON.parse(fs.readFileSync('10mb.json', 'utf8')); // blocks!
// WRONG: Tight computation loop
app.get('/fib', (req, res) => {
  const n = parseInt(req.query.n, 10);
  res.send(String(fib(n))); // naive recursive fib(45) can take several seconds
});
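When CPU-bound work has to stay on the main thread, one common mitigation is to split it into slices and yield to the event loop between them. The sketch below assumes a trivially divisible task (summing integers); `sumInSlices` and its `sliceSize` parameter are hypothetical and would need tuning for real workloads.

```javascript
// Sums 0..n-1 in slices, yielding to the event loop between slices so that
// pending I/O callbacks and timers can run. `sliceSize` is a tuning assumption.
function sumInSlices(n, sliceSize = 1e6) {
  return new Promise((resolve) => {
    let i = 0;
    let total = 0;
    function runSlice() {
      const end = Math.min(i + sliceSize, n);
      for (; i < end; i++) total += i;
      if (i < n) setImmediate(runSlice); // let other callbacks run first
      else resolve(total);
    }
    runSlice();
  });
}

// A timer queued alongside the computation can still get a turn between slices.
const timer = setInterval(() => console.log('event loop is responsive'), 5);
sumInSlices(5e7).then((total) => {
  clearInterval(timer);
  console.log('total:', total);
});
```

`setImmediate` is used rather than `process.nextTick` or a resolved Promise because those run before the loop advances and would therefore not let I/O callbacks in. Slicing only reduces latency spikes; the total CPU time is unchanged, so heavy computation is still better moved off the main thread.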
Detecting event loop lag:
// Measure event loop delay; histogram values are reported in nanoseconds
const { monitorEventLoopDelay } = require('perf_hooks');
const h = monitorEventLoopDelay({ resolution: 10 }); // sample every 10 ms
h.enable();
setInterval(() => {
  const p99Ms = h.percentile(99) / 1e6; // convert ns to ms
  console.log(`Event loop delay P99: ${p99Ms.toFixed(2)}ms`);
  h.reset();
}, 5000);
Worker Threads (Node.js 10.5+, stable since 12)
Worker threads provide true parallel JavaScript execution in Node.js using V8 Isolates — separate JavaScript heaps that can run in parallel OS threads.
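const { Worker, isMainThread, parentPort } = require('worker_threads');
// Naive recursive Fibonacci: deliberately CPU-heavy
const fib = n => (n < 2 ? n : fib(n - 1) + fib(n - 2));
if (isMainThread) {
  const worker = new Worker(__filename);
  worker.on('message', result => {
    console.log('Result:', result);
    worker.terminate(); // otherwise the idle worker keeps the process alive
  });
  worker.on('error', err => console.error(err));
  worker.postMessage({ n: 45 });
} else {
  parentPort.on('message', ({ n }) => {
    // This runs in a separate OS thread, won't block main thread
    parentPort.postMessage(fib(n));
  });
}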
Workers have separate V8 heaps — data is passed by structured clone (copy) or SharedArrayBuffer (shared memory). SharedArrayBuffer requires Atomics for synchronization.
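To illustrate the shared-memory path, the following sketch has several workers increment one counter stored in a SharedArrayBuffer. It is a minimal example under stated assumptions: `countInParallel` is a hypothetical helper, and the worker source is inlined with `eval: true` only to keep everything in one file.

```javascript
const { Worker } = require('worker_threads');

// Worker source is inlined (eval: true) only to keep the sketch in one file.
const workerSource = `
  const { workerData } = require('worker_threads');
  const counter = new Int32Array(workerData.shared);
  for (let i = 0; i < workerData.increments; i++) {
    Atomics.add(counter, 0, 1); // atomic read-modify-write on shared memory
  }
`;

// Starts `workers` threads that all increment the same shared counter and
// resolves with the final value once every worker has exited.
function countInParallel(workers, increments) {
  const shared = new SharedArrayBuffer(4); // room for one Int32
  const done = Array.from({ length: workers }, () =>
    new Promise((resolve, reject) => {
      const w = new Worker(workerSource, {
        eval: true,
        workerData: { shared, increments },
      });
      w.once('error', reject);
      w.once('exit', resolve);
    }));
  return Promise.all(done).then(() => Atomics.load(new Int32Array(shared), 0));
}

countInParallel(4, 100000).then((n) => console.log('final count:', n));
```

With `Atomics.add` the final count equals workers × increments. Replacing it with a plain `counter[0]++` would make lost updates possible, because the read and the write would no longer be a single atomic step.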
Streams and Backpressure
Node.js streams are the canonical way to handle large data transfers. Backpressure prevents a fast producer from overwhelming a slow consumer:
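const fs = require('fs');
const readable = fs.createReadStream('largefile.dat');
const writable = fs.createWriteStream('dest.dat');
readable.pipe(writable); // pipe() handles backpressure automatically (but does not forward errors; stream.pipeline() does)
// Manual backpressure (an alternative to pipe(), not to be combined with it):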
readable.on('data', chunk => {
const ok = writable.write(chunk);
if (!ok) {
readable.pause(); // pause if write buffer is full
writable.once('drain', () => readable.resume());
}
});
Without backpressure handling, the failure looks like this: writable.write() returns false once its internal buffer reaches the high-water mark, but a naive producer ignores the return value and keeps calling write(). The buffer grows without bound, consuming memory until the process crashes.
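The same pause-and-resume idea can be written as an async loop that awaits the 'drain' event. The sketch below uses an artificial slow sink so the effect is observable without touching the file system; `makeSlowSink` and `writeWithBackpressure` are hypothetical names, and the 1 ms delay and 1024-byte high-water mark are arbitrary.

```javascript
const { Writable } = require('stream');
const { once } = require('events');

// A deliberately slow sink: each chunk takes ~1 ms to "write".
function makeSlowSink() {
  return new Writable({
    highWaterMark: 1024, // small buffer so backpressure kicks in early
    write(chunk, encoding, callback) {
      setTimeout(callback, 1);
    },
  });
}

// Writes `count` chunks, waiting for 'drain' whenever write() returns false.
// Returns the largest buffered size observed, to show that it stays bounded.
async function writeWithBackpressure(sink, count, chunk) {
  let maxBuffered = 0;
  for (let i = 0; i < count; i++) {
    const ok = sink.write(chunk);
    maxBuffered = Math.max(maxBuffered, sink.writableLength);
    if (!ok) await once(sink, 'drain');
  }
  sink.end();
  await once(sink, 'finish');
  return maxBuffered;
}

writeWithBackpressure(makeSlowSink(), 50, Buffer.alloc(512))
  .then((max) => console.log('max buffered bytes:', max));
```

Because the loop stops producing as soon as write() returns false, the buffered amount stays near the high-water mark no matter how many chunks are written. Removing the `await once(sink, 'drain')` line would let `writableLength` grow with the number of chunks.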
Cluster Module
The cluster module forks multiple Node.js processes (typically one per CPU core), each running a full copy of the application, sharing a TCP/UDP port:
const cluster = require('cluster');
const http = require('http');
if (cluster.isPrimary) {
for (let i = 0; i < require('os').cpus().length; i++) {
cluster.fork();
}
} else {
http.createServer((req, res) => {
res.writeHead(200);
res.end('Hello');
}).listen(3000);
}
The primary process distributes incoming connections to workers using a round-robin distribution (default on all platforms except Windows). Workers are independent processes with separate V8 heaps — no shared memory. Use Redis or a shared database for cross-worker state.
Node.js Memory Model
Node.js memory has distinct regions:
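- V8 heap: Holds JavaScript objects, closures, strings. The old generation is capped by --max-old-space-size (the default depends on the V8 version and available system memory; older 64-bit builds used roughly 1.5GB). Managed by V8's GC (Scavenger for the young generation, Mark-Sweep/Mark-Compact for the old generation).
- External/ArrayBuffer memory: Off-V8-heap binary data (Buffer, TypedArray backed by native memory). Counted toward V8's external memory to influence GC timing, and reported as process.memoryUsage().external and process.memoryUsage().arrayBuffers.
- C++ heap (libuv, binding code): Native allocations that are not tied to JavaScript objects do not appear in heapUsed or external; they are visible only through RSS. process.memoryUsage().external covers C++ objects bound to JavaScript objects managed by V8.
- RSS vs heap used: process.memoryUsage().heapUsed is V8 live objects; RSS includes code, stack, libuv, and OS-level allocations.
Buffer.alloc allocates zero-filled memory outside the V8 heap, backed by an ArrayBuffer (not a SharedArrayBuffer). The contents do not go through V8 object allocation. Small Buffer.allocUnsafe and Buffer.from allocations are instead sliced from a shared internal pool.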
Performance Profiling
# Built-in V8 profiler (sampling profiler)
node --prof app.js
node --prof-process isolate-*.log > profile.txt
# 0x flame graph (best option for event loop analysis)
npx 0x -- node app.js
# Generates an HTML flamegraph
# clinic.js suite
npx clinic doctor -- node app.js # general diagnostics
npx clinic flame -- node app.js # CPU flame graph
npx clinic bubbleprof -- node app.js # async operation profiling
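A CPU profile can also be captured from inside the process with the built-in inspector module, which is convenient when a command-line flag cannot be added. This is a minimal sketch: `profile` and `busyWork` are hypothetical names, and the workload exists only to give the sampler something to record.

```javascript
const inspector = require('inspector');

// Profiles a synchronous function in-process and resolves with the
// CPU profile object (the format that Chrome DevTools can load).
function profile(fn) {
  return new Promise((resolve, reject) => {
    const session = new inspector.Session();
    session.connect();
    session.post('Profiler.enable', () => {
      session.post('Profiler.start', () => {
        fn();
        session.post('Profiler.stop', (err, result) => {
          session.disconnect();
          if (err) reject(err);
          else resolve(result.profile);
        });
      });
    });
  });
}

// Hypothetical workload, used only to give the profiler something to sample.
function busyWork() {
  let x = 0;
  for (let i = 0; i < 5e7; i++) x += Math.sqrt(i);
  return x;
}

profile(busyWork).then((p) => console.log('profile nodes:', p.nodes.length));
```

Writing `JSON.stringify(profile)` to a file with a `.cpuprofile` extension makes the result loadable in the DevTools performance tooling.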
Historical Context
Node.js was created by Ryan Dahl in 2009. The key insight was using the browser's JavaScript engine (V8, released by Google in 2008) server-side, coupled with an event loop for non-blocking I/O. Early versions used libev and libeio on Unix; libuv was created for Node.js (initially by Ryan Dahl, Ben Noordhuis, and Bert Belder, and introduced around Node.js 0.6 in 2011) to provide a unified async I/O abstraction across Linux (epoll), macOS (kqueue), and Windows (IOCP). The demonstration of thousands of concurrent connections that Ryan Dahl gave at JSConf.eu 2009 surprised a backend community accustomed to Apache's thread-per-connection model. Worker threads arrived as an experimental feature in Node.js 10.5 (2018) and became stable in Node.js 12, addressing the CPU-bound computation gap.
Production Examples
// Timer-based health check that measures real event loop lag
const { performance } = require('perf_hooks');
let lastMark = performance.now();
setInterval(() => {
const now = performance.now();
const lag = now - lastMark - 100; // should be ~0ms over 100ms interval
if (lag > 50) {
console.warn(`Event loop lag: ${lag.toFixed(1)}ms`);
// Alert: something blocked the event loop
}
lastMark = now;
}, 100);
# Heap snapshot for memory leak analysis
node --inspect app.js
# In Chrome DevTools: Memory > Heap Snapshot
# Or programmatically, from JavaScript rather than the shell:
#   require('v8').writeHeapSnapshot('/tmp/heap.heapsnapshot')
# Analyze in Chrome DevTools Memory tab
Debugging Notes
- --inspect / --inspect-brk enables the Chrome DevTools Protocol debugger; use chrome://inspect or VS Code's built-in Node debugger
- node --expose-internals allows requiring internal/* modules for deep introspection
- Memory leaks: heap snapshot comparison (before vs after suspected leak) shows retained objects and their GC roots
- unhandledRejection events from unhandled Promise rejections are critical to monitor; in production, set --unhandled-rejections=throw (the default since Node.js 15) to convert them to crashes (better than silent data corruption)
- Diagnosis of event loop blocking: instrument with the async_hooks module to trace async context propagation and find where callbacks are slow
Security Implications
- Prototype pollution: Merging user input into {} objects can override Object.prototype, affecting all objects in the V8 heap, which can lead to privilege escalation or sandbox bypass. Use Object.create(null) for safe dictionaries. Lodash <4.17.12 was vulnerable (CVE-2019-10744).
- ReDoS (Regular Expression Denial of Service): A single-threaded event loop means a catastrophically backtracking regex blocks all connections. Use the safe-regex library or WASM-based regex engines with linear-time guarantees.
- Path traversal via path.join: path.join('/uploads', req.params.file) does NOT prevent ../../etc/passwd. Use path.resolve and validate the result starts with the expected directory prefix.
- Child process injection: child_process.exec(userInput) passes the string to /bin/sh, so shell injection is trivial. Use child_process.execFile with argument arrays.
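The path traversal check can be sketched as follows. `resolveInside` is a hypothetical helper, not a Node.js API, and the check is purely lexical: it does not follow symlinks, so a stricter variant would compare fs.realpath results instead.

```javascript
const path = require('path');

// Resolves a user-supplied path against baseDir and rejects any result that
// escapes it. `resolveInside` is a hypothetical helper name.
function resolveInside(baseDir, userPath) {
  const base = path.resolve(baseDir);
  const target = path.resolve(base, userPath);
  // Compare against base + separator so that "/uploads-evil" does not pass
  // as being inside "/uploads".
  if (target !== base && !target.startsWith(base + path.sep)) {
    throw new Error('path escapes base directory');
  }
  return target;
}

console.log(resolveInside('/uploads', 'avatars/me.png'));
try {
  resolveInside('/uploads', '../../etc/passwd');
} catch (err) {
  console.log('rejected:', err.message);
}
```

Appending the separator before the prefix comparison matters: a bare `startsWith(base)` would accept sibling directories whose names merely begin with the base directory's name.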
Performance Implications
- Event loop tick overhead: V8 function call overhead is ~1–10 ns; async callback dispatch adds ~1–5 µs of libuv/V8 overhead per async operation
- process.nextTick is cheaper than Promise.resolve().then() (microtask), which is cheaper than setImmediate(), which is cheaper than setTimeout(fn, 0); they also run in roughly this priority order (the relative order of setImmediate and setTimeout(fn, 0) is only guaranteed inside an I/O callback)
- JSON.parse/stringify blocking: parsing a 10MB JSON payload synchronously in a request handler blocks for ~50ms. Use streaming JSON parsers (e.g., stream-json) for large payloads.
- Buffer.concat on many small buffers is O(n²) if done in a loop; collect the chunks in an array and call Buffer.concat(buffers) once.
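The Buffer.concat point is easiest to see side by side. Both functions below produce the same bytes; the function names are illustrative only.

```javascript
// Quadratic: every iteration copies everything accumulated so far.
function concatInLoop(chunks) {
  let out = Buffer.alloc(0);
  for (const chunk of chunks) out = Buffer.concat([out, chunk]);
  return out;
}

// Linear: keep references to the chunks, then copy each byte exactly once.
function concatOnce(chunks) {
  return Buffer.concat(chunks);
}

const chunks = Array.from({ length: 2000 }, (_, i) => Buffer.from([i % 256]));
console.log(concatInLoop(chunks).equals(concatOnce(chunks))); // same bytes either way
```

In a stream handler this usually means pushing each 'data' chunk onto an array and concatenating in the 'end' handler, rather than growing a single Buffer on every chunk.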
Failure Modes
- EventEmitter memory leak warning: MaxListenersExceededWarning is emitted when more than 10 listeners are added for the same event without removing them; a classic leak in connection pools that forget to remove listeners on cleanup
- V8 heap OOM: FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory. Increase --max-old-space-size or fix the memory leak
- UV_THREADPOOL_SIZE exhaustion: All 4 (default) thread pool threads are occupied with slow fs operations; dns.lookup() and new fs calls queue behind them, causing timeouts. Increase UV_THREADPOOL_SIZE.
- Zombie connections: Connections whose clients have disconnected but the server hasn't noticed (no keepalive timeout). They accumulate until the file descriptor limit is hit (EMFILE). Set server.keepAliveTimeout and server.headersTimeout.
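The listener leak in the first item comes down to whether each unit of work removes what it added. The sketch below contrasts the two patterns on a stand-in emitter; `connection`, `useLeaky` and `useWithCleanup` are hypothetical names for illustration.

```javascript
const { EventEmitter } = require('events');

// Hypothetical pooled resource that emits 'error' events.
const connection = new EventEmitter();

// Leaky pattern: every call adds a listener that is never removed.
function useLeaky(conn) {
  conn.on('error', () => {});
}

// Safer pattern: remove the listener when the unit of work is finished.
function useWithCleanup(conn, work) {
  const onError = () => {};
  conn.on('error', onError);
  try {
    return work();
  } finally {
    conn.off('error', onError);
  }
}

for (let i = 0; i < 5; i++) useLeaky(connection);
console.log('after leaky use:', connection.listenerCount('error')); // 5

for (let i = 0; i < 5; i++) useWithCleanup(connection, () => i);
console.log('after cleanup use:', connection.listenerCount('error')); // still 5
```

The leaky calls leave one listener behind each time, while the cleanup variant leaves the count unchanged. Checking `listenerCount` in tests or metrics is a cheap way to catch this before the warning threshold is reached.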
Modern Usage
Node.js 22 (current LTS as of 2024) includes native fetch, WebStreams, and --experimental-strip-types for running TypeScript directly. The node:test module provides a built-in test runner. AsyncLocalStorage (Node.js 12+) provides context propagation across async boundaries — the Node.js equivalent of thread-local storage, built on async_hooks.
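As a brief illustration of AsyncLocalStorage, the sketch below attaches a request ID to an async context and reads it back after an await. The `handleRequest` and `log` helpers and the ID values are hypothetical; in a real server the ID would typically come from a header or a generated UUID.

```javascript
const { AsyncLocalStorage } = require('async_hooks');

const requestContext = new AsyncLocalStorage();

// Reads the request ID from the current async context, if there is one.
function log(message) {
  const store = requestContext.getStore();
  return `[${store ? store.requestId : 'no-request'}] ${message}`;
}

// Simulates a request handler; the store set by run() is visible to all
// code called from the callback, including after asynchronous boundaries.
function handleRequest(requestId) {
  return requestContext.run({ requestId }, async () => {
    await new Promise((resolve) => setTimeout(resolve, 5)); // async boundary
    return log('handled');
  });
}

Promise.all([handleRequest('a1'), handleRequest('b2')])
  .then((lines) => lines.forEach((line) => console.log(line)));
// prints "[a1] handled" then "[b2] handled"
```

Each concurrent call sees only its own store even though both are interleaved on the same thread, which is what makes the pattern usable for request-scoped logging.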
Deno and Bun are alternative JavaScript runtimes competing with Node.js. Deno uses tokio (Rust async runtime) instead of libuv; file I/O is still offloaded to tokio's blocking thread pool rather than being natively asynchronous. Bun uses JavaScriptCore (WebKit's engine) and custom bindings written largely in Zig, targeting startup performance.
Future Directions
- Single-executable applications (SEA): Node.js 20+ supports bundling a Node.js app into a single executable that combines the Node.js binary with the application code
- WASI integration: Running WebAssembly code in Node.js via node:wasi for sandboxed native-speed modules
- Async context propagation improvements: Reducing the overhead of AsyncLocalStorage on hot paths (currently ~5–10% overhead in some benchmarks)
- HTTP/3 native support: libuv and Node.js core HTTP/3 (QUIC) support, removing the need for third-party quic modules
Exercises
- Write a Node.js HTTP server that deliberately blocks the event loop for 500ms on every 10th request (using a spin loop, not sleep). Use wrk or autocannon to load test it. Observe P50 vs P99 latency behavior and identify the blocking request in the flame graph.
- Set UV_THREADPOOL_SIZE=2 and write a server that makes 10 concurrent dns.lookup() calls per request. Load test and observe the latency cliff as the thread pool saturates. Increase to UV_THREADPOOL_SIZE=16 and re-measure.
- Demonstrate backpressure: create a readable stream that produces data faster than a writable stream can consume it (use writable._write with a 10ms delay). Show that without backpressure handling, memory grows; with pipe() or manual pause/resume, it stays bounded.
- Implement a worker thread pool for CPU-bound tasks. The main thread distributes fibonacci(n) tasks to N worker threads via message passing. Measure throughput vs N workers and compare to a single-threaded implementation.
- Use async_hooks to implement request-scoped logging: every log statement from code within a request handler automatically includes the request ID, without passing it explicitly through every function call.
References
- Ryan Dahl, "Node.js" JSConf.eu 2009 presentation. https://www.youtube.com/watch?v=ztspvPYybIY
- libuv documentation: https://docs.libuv.org/en/v1.x/design.html
- Bert Belder, "Everything You Need to Know About Node.js Event Loop." JSConf Asia 2016. https://www.youtube.com/watch?v=PNa9OMajl9s
- Deepal Jayasekara, "Node.js Event Loop Series." https://blog.insiderattack.net/event-loop-and-the-big-picture-nodejs-event-loop-part-1-1cb67a182810
- Node.js documentation: https://nodejs.org/en/docs/guides/event-loop-timers-and-nexttick
- Clinic.js docs: https://clinicjs.org/documentation/