Most local privilege escalation exploits are fragile. They depend on precise kernel version offsets, require winning a race condition reliably, or crash the system when timing slips. CVE-2026-31431, nicknamed Copy Fail, is none of those things. A 732-byte Python script with no external packages, run by any unprivileged local user, returns a root shell in seconds.
The researcher's own description of the primitive frames it concisely: "an unprivileged local user can write four controlled bytes into the page cache of any readable file on a system, and use that to gain root."
Copy Fail was assigned CVE-2026-31431 and a CVSS v3.1 base score of 7.8 (High). It was reported by Taeyang Lee, a researcher at Theori, who paired his own domain insight into the crypto subsystem with Xint Code, an AI-assisted code analysis tool also developed at Theori, to scan the kernel's crypto codepaths and surface the bug. The full disclosure is published on the Xint research blog. The kernel security team was notified on 23 March 2026, with public disclosure following on 29 April 2026. The vulnerability sat silently in virtually every mainstream distribution for nine years, from a 2017 kernel optimisation through to the April 2026 patch.
If you are familiar with CVE-2022-0847 (Dirty Pipe), the class of primitive that drives Copy Fail will already be familiar. Both exploits write into the in-memory page cache of a file without touching the on-disk content, and both bypass file integrity monitoring as a result. The underlying mechanisms differ. Dirty Pipe used a flaw in the pipe subsystem, whereas Copy Fail abuses the kernel's AF_ALG crypto socket combined with a specific AEAD template. Despite the different entry points, both fall into the same attack surface class of kernel-level page cache writes via splice(). The two are compared below.
| Property | Dirty Pipe (CVE-2022-0847) | Copy Fail (CVE-2026-31431) |
|---|---|---|
| Kernel subsystem | pipe / splice | AF_ALG crypto / splice |
| Write primitive | Arbitrary via pipe flag bug | Controlled 4-byte via authencesn scratch |
| Race condition required | Yes (originally) | No |
| Kernel offsets needed | No | No |
| On-disk file changed | No | No |
| File integrity bypass | Yes | Yes |
The "No" in the race condition row is what makes Copy Fail substantially more reliable as a weapon. A race-based exploit may fail a fifth of the time, demand a retry loop, or hang the system on a bad attempt. The Copy Fail write is deterministic. It either lands cleanly or it errors without side effects. That reliability drove the rapid weaponisation observed after disclosure, with multiple public reimplementations appearing within 24 hours.
Learning Objectives
By the end of this room, you will be able to:
- Explain the page cache write primitive that makes Copy Fail exploitable and why it bypasses file integrity monitoring
- Identify the four kernel components (page cache, AF_ALG, authencesn, splice) that combine to create the vulnerability
- Execute the proof-of-concept as an unprivileged user and obtain a root shell
- Describe the primary detection signals and apply the modprobe mitigation on Ubuntu and Debian systems
Prerequisites
This room assumes comfort with the Linux command line, including basic shell navigation and reading command output, as covered in Linux Fundamentals. It also assumes a working understanding of local privilege escalation concepts as introduced in Linux Privilege Escalation.
Connecting to the Machine
Click the Start Machine button at the top of this task and allow about a minute for it to boot. The room launches an in-browser split-view terminal that is already logged in as the unprivileged user karen, so no further setup is required to follow along. If you prefer to connect from your own terminal over SSH, the credentials are provided in Task 3. Use whichever terminal suits you best for the practical steps in Task 3.
I have successfully started my machine.
Copy Fail emerges from the interaction of four independent components of the kernel, each of which behaves correctly in isolation. The vulnerability does not exist in any single component. It exists in what happens when they meet, specifically when a 2017 optimisation intended to make one of them faster failed to account for a data path that was not considered at the time.
In this section, we cover each component in the order it appears in the exploit, before describing how the combined behaviour produces the write primitive.
The Page Cache
The page cache is a region of kernel memory that holds copies of recently-read file contents to avoid repeated disk reads. When any process reads a file on Linux, the kernel reads it from disk once into the page cache, and every subsequent read of that file by any process is served from the cached copy.
The cache is shared. All processes on the same system, including every container sharing the host kernel, see the same cached pages for a given file. Writes to a file ordinarily update both the on-disk copy and the cached copy. However, the cached copy can also be modified directly without any change being written back to disk.
A useful analogy is a photocopier with an internal print buffer. The original document remains on the glass, but every print job is reproduced from the buffer rather than from the glass. If the buffer contents are tampered with, every printout is wrong. Inspecting the original document on the glass shows nothing out of place.
This matters because file integrity tools such as AIDE, Tripwire, and IMA work by hashing the on-disk file. They read from disk, not from memory. If the page cache has been corrupted but the disk is untouched, every check passes. The hash check reports the file as clean, while the binary the kernel actually loads when the file is executed is the corrupted in-memory version.
AF_ALG: The Kernel Crypto Socket
AF_ALG (Address Family: Algorithm, socket family number 38) is a Linux socket interface that exposes the kernel's cryptographic subsystem to userspace. Introduced in kernel 2.6.38, it allows programs to request AEAD encryption and decryption, hashing, and symmetric cipher operations through standard socket calls such as bind(), sendmsg(), and recvmsg().
The property that matters for this exploit is access. AF_ALG is available to unprivileged users by default. No CAP_SYS_ADMIN or other special capability is required to open such a socket. Legitimate software that uses AF_ALG is a short and well-defined set, including cryptsetup, systemd-cryptsetup, kcapi-enc, kcapi-dgst, kcapi-mac, kcapi-speed, and the charon and charon-systemd daemons used by strongSwan for IPsec. This list becomes the baseline for detecting anomalous AF_ALG usage in Task 4.
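To make the access property concrete, the sketch below opens an AF_ALG socket as the current user and asks the kernel for a SHA-256 digest, which is one of the benign operations the interface was designed for. It uses only Python's standard socket module. The helper name kernel_sha256 is illustrative, and the function returns None wherever AF_ALG is unavailable (a non-Linux host, a restricted sandbox, or a kernel without the hash interface).

```python
import hashlib
import socket


def kernel_sha256(data: bytes):
    """Hash data through the kernel crypto API via AF_ALG; None if unavailable."""
    if not hasattr(socket, "AF_ALG"):
        return None  # not Linux, or Python built without AF_ALG support
    try:
        with socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0) as alg:
            alg.bind(("hash", "sha256"))  # (type, name) selects the algorithm
            op, _ = alg.accept()          # operation socket for one request
            with op:
                op.sendall(data)
                return op.recv(32)        # a SHA-256 digest is 32 bytes
    except OSError:
        return None  # AF_ALG blocked, or the algorithm could not be loaded


if __name__ == "__main__":
    digest = kernel_sha256(b"hello")
    if digest is None:
        print("AF_ALG hashing is not available to this user")
    else:
        matches = digest == hashlib.sha256(b"hello").digest()
        print("kernel digest matches hashlib:", matches)
```

If this runs successfully as an ordinary user, it confirms that no special capability was needed to reach the kernel crypto subsystem from userspace.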
authencesn and the Scratch Write
authencesn is a Linux kernel AEAD template used for IPsec with Extended Sequence Number (ESN) support. During an AEAD decryption operation, authencesn writes 4 bytes (the lower 32 bits of the Extended Sequence Number, stored as seqno_lo) at the offset assoclen + cryptlen within the output buffer. This write is a scratch operation the algorithm performs to rearrange the sequence number fields before verifying the authentication tag.
The detail that makes this exploitable is the order of operations. The scratch write happens before HMAC tag verification. The algorithm writes its 4 bytes into the output buffer first, then checks whether the HMAC tag is valid. If the HMAC check fails, and in this exploit it is deliberately made to fail, the error is returned to userspace. However, the write has already landed. There is no rollback, no cleanup of the output buffer, and no mechanism to undo a write that has already completed.
A common confusion point is whether a failing HMAC verification implies the operation as a whole was rolled back. It does not. The write happened first. The failure only means the authentication check at the end of the operation rejected the data. The 4 bytes written as a scratch value during processing are already in the output buffer by the time the HMAC result is evaluated.
Both the value and the destination of the write are attacker-controlled. The attacker controls the value through seqno_lo, which is derived from fields in the message they construct. The attacker controls the offset within the output buffer through assoclen and cryptlen. This gives full control over both what is written and where.
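The ordering problem can be illustrated with a toy model that runs entirely on a Python bytearray and never touches the kernel. The function toy_decrypt, the exception TagMismatch, and the big-endian byte order are all assumptions made for the illustration. Only the offset arithmetic (assoclen + cryptlen) and the write-then-verify order come from the description above.

```python
import struct


class TagMismatch(Exception):
    """Raised when the simulated authentication check fails."""


def toy_decrypt(out_buf: bytearray, assoclen: int, cryptlen: int,
                seqno_lo: int, tag_ok: bool) -> None:
    """Toy model: scratch write first, tag check second, no rollback."""
    offset = assoclen + cryptlen
    # Byte order is an assumption for this illustration.
    out_buf[offset:offset + 4] = struct.pack(">I", seqno_lo)
    if not tag_ok:
        # The error is reported, but the 4 bytes above are already written.
        raise TagMismatch("authentication failed")


buf = bytearray(32)
try:
    toy_decrypt(buf, assoclen=8, cryptlen=16, seqno_lo=0x41424344, tag_ok=False)
except TagMismatch:
    pass

print(buf[24:28])  # bytearray(b'ABCD'): present even though the call failed
```

The caller sees only the exception, yet the buffer at offset 24 (8 + 16) now holds the chosen value, which is the behaviour the section describes.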
splice() and Page References
splice() is a Linux system call that moves data between two file descriptors by transferring page references rather than copying the data. When you splice from a regular file into a socket, the kernel does not copy the file contents. It hands the socket a reference to the same memory pages already used by the file's page cache.
This zero-copy behaviour is what makes splice() fast. It is also what makes it dangerous in this context. After a splice() from /usr/bin/su into an AF_ALG socket, the AF_ALG pipeline holds a reference to the live page cache pages backing /usr/bin/su. Those pages are still kernel-owned page cache pages, not a user-controlled copy.
The 2017 Optimisation
Before 2017, algif_aead (the AEAD implementation in the AF_ALG subsystem) maintained separate source and destination scatterlists for cryptographic operations. A scatterlist is a linked list of memory pages describing where input data comes from and where output should be written.
Commit 72548b093ee3 in kernel 4.14 introduced an in-place optimisation. Instead of keeping separate source and destination scatterlists, the code merged them and set req->src = req->dst. For the common case, where the user supplies data via a normal write, this was safe. Both sides of the assignment referenced user-controlled memory.
The optimisation was written without considering the splice() data path. When pages arrive via splice() from a file, those pages are not user-controlled memory. They are the kernel's own live page cache pages for that file. Setting req->src = req->dst to the same scatterlist meant the "output" scatterlist now pointed directly at the file's page cache pages. The authencesn scratch write targeting the "output" buffer was now writing into the kernel's page cache.
The Resulting Primitive
Combining all four components produces a controlled 4-byte write to an arbitrary offset within the page cache of any file the attacker can open for reading. The attacker controls the written value through seqno_lo and controls the offset through the splice() position combined with assoclen and cryptlen. The HMAC verification fails afterwards, returning an error to userspace, but the write has already completed.
The exploit repeats this primitive approximately 40 times to write successive 4-byte chunks of shellcode into the target file's cached pages. Each iteration opens a fresh AF_ALG socket, splices the target file at the next offset, and calls recvmsg() to trigger the scratch write. Each call corrupts exactly 4 bytes of the page cache. After roughly 40 calls, enough shellcode is in place to overwrite the target binary's execution path.
There is no race condition, no kernel version-specific offset table, and no external tool dependency. The same script runs on every vulnerable kernel, which spans 4.14 (released November 2017) through 6.18.21 in the 6.18 series, and 6.19.0 through 6.19.11 in the 6.19 series, covering nine years of distributions.
What year was the optimization introduced that created this vulnerability?
Which AEAD algorithm template performs the scratch write that corrupts the page cache?
What system call transfers page cache pages into the AF_ALG socket without copying them?
The HMAC verification fails after the scratch write is performed. Does this undo the page cache corruption? (Answer Format: Yay or Nay)
The proof-of-concept (PoC) script pre-placed on the machine targets /usr/bin/su. It overwrites the cached copy of that binary with shellcode, after which any execution of su runs that shellcode with setuid root privileges. The complete operation takes a few seconds. The exploit used in this task is hosted at GitHub.
When the Start Machine button is pressed, the room opens a split-view terminal in the browser, already logged in as the unprivileged user karen. To use that terminal, run the steps below directly. To connect from your own machine over SSH instead, use the username karen and the password copyfail2026:
ssh karen@MACHINE_IP
Either approach lands you at the same prompt. The steps below should be followed in order in whichever terminal you choose.
Step 1: Confirm Your Context
Confirm the current user has no elevated privileges:
karen@ubuntu:~$ id
uid=1001(karen) gid=1001(karen) groups=1001(karen)
This is the starting context the exploit requires. No elevated privileges, no special groups, just an ordinary local user.
Step 2: Inspect the Proof of Concept
The exploit script lives at /home/karen/exploit.py. Before running it, take a quick look at its high-level structure:
head -30 /home/karen/exploit.py
The script uses only Python standard library modules. The os module supplies splice() and execve(), socket provides the AF_ALG socket calls, and zlib is used for CRC calculations when building the authentication key blob. There are no pip packages and no compiled extensions. The script runs on any Python 3.10 or later installation. Ubuntu 24.04 ships Python 3.12, so os.splice is available out of the box.
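A quick way to confirm those standard library prerequisites on any host is sketched below. The function name stdlib_prerequisites is illustrative. It only inspects the local Python installation and performs no socket or file operations.

```python
import os
import socket
import sys


def stdlib_prerequisites() -> dict:
    """Report whether this interpreter exposes the calls the section mentions."""
    return {
        "python_3_10_or_later": sys.version_info >= (3, 10),
        "os_splice": hasattr(os, "splice"),          # added in Python 3.10, Linux only
        "socket_af_alg": hasattr(socket, "AF_ALG"),  # Linux only
    }


for name, present in stdlib_prerequisites().items():
    print(f"{name}: {'yes' if present else 'no'}")
```

On a defender's side, the same check is a reminder that removing third-party packages does nothing here, since everything involved ships with the interpreter.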
At a high level, the script performs the following steps in sequence:
- Opens an AF_ALG socket bound to authencesn(hmac(sha256),cbc(aes))
- Calculates the target offset within /usr/bin/su where each shellcode chunk should land
- Constructs the AAD so that bytes 4-7 carry the shellcode value to write as seqno_lo
- Calls splice() to feed page cache pages from /usr/bin/su into the socket at the calculated offset
- Calls recvmsg() to trigger the AEAD decryption, at which point authencesn performs its scratch write and the shellcode bytes land in the page cache
- Repeats approximately 40 times to write successive 4-byte shellcode chunks
- Calls os.execve("/usr/bin/su", ...) so the kernel loads from the corrupted cache and the shellcode runs
Step 3: Run the Proof of Concept
Execute the script:
python3 /home/karen/exploit.py
The script prints progress as it writes each 4-byte chunk. After the writes complete, it calls execve() on the target binary. The shell prompt should change and the effective user ID becomes root:
karen@ubuntu:~$ python3 /home/karen/exploit.py
# id
uid=0(root) gid=0(root) groups=0(root)
Step 4: Read the Flag
root@ubuntu:~# cat /root/flag.txt
THM{xxxxxxxxxxxxxxxxxxxxxxxxxxx}
Step 5: The sha256sum Moment
While still in the root shell, verify the file just exploited against its expected on-disk hash:
root@ubuntu:~# sha256sum /usr/bin/su
c4d2e053445c5f89d13b68bb54de8d67358e1aa20a2b8f0688cb8a47a32edbdf /usr/bin/su
That hash matches a freshly-installed, unmodified system. AIDE, Tripwire, and IMA would all report this file as unmodified. A file integrity scan run at this moment returns clean.
The exploit only modified the in-memory page cache copy. The on-disk binary at /usr/bin/su was never written to. File integrity tools read from disk, compute the hash from disk, and compare against a baseline taken from disk. None of that touches the page cache. The corruption used to gain root is invisible to every filesystem-based check.
This is a core property of the vulnerability rather than an incidental side effect. Any detection strategy that relies on watching for changes to setuid binaries will miss this exploit entirely.
Note: Copy Fail is not the first exploit to bypass file integrity monitoring this way. Dirty Pipe (CVE-2022-0847) had the same property. The page cache as an attack surface for file integrity bypass is a recurring theme that predates both CVEs. What distinguishes Copy Fail is its reliability. The absence of a race condition means an attacker can run it once and expect it to work.
Step 6: Exit the Root Shell
exit
When the root shell was spawned, execution passed through the corrupted page cache. As part of cleanup, the PoC calls posix_fadvise(POSIX_FADV_DONTNEED) on /usr/bin/su. This hints to the kernel that those pages are no longer needed, and the kernel evicts them from the page cache. The next time any process reads /usr/bin/su, the kernel reloads the original, clean binary from disk.
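For reference, the eviction hint is an ordinary, unprivileged call that Python exposes as os.posix_fadvise. The sketch below applies it to a temporary file the script creates itself. The helper name drop_cached_pages is illustrative, and the hint is advisory, so the kernel may choose not to evict anything.

```python
import os
import tempfile


def drop_cached_pages(path: str) -> bool:
    """Ask the kernel to drop cached pages for path. True if the hint was issued."""
    if not hasattr(os, "posix_fadvise"):
        return False  # platform without posix_fadvise
    fd = os.open(path, os.O_RDONLY)
    try:
        # offset=0 and length=0 together mean "the whole file"
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        return True
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"example data\n" * 1024)
    print("hint issued:", drop_cached_pages(tmp.name))
    os.unlink(tmp.name)
```

Because the call needs only read access to the file, it is also a cheap signal to watch for when it targets a setuid binary.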
For a defender, this means the exploitation window is extremely narrow. The corrupted pages exist in memory only between the last recvmsg() call and the POSIX_FADV_DONTNEED cleanup, a window measured in seconds. By the time a filesystem-based alert fires and a responder opens a terminal, the page cache already shows the original binary. There is nothing to find on disk and nothing in the current page cache. Detection must come from observing the exploitation as it happens, which is covered in Task 4.
Note: Re-running the PoC in Task 4 to verify the mitigation blocks it is supported. The cleanup step evicted the corrupted pages, so the next execution of /usr/bin/su loads cleanly from disk.
What is the content of /root/flag.txt?
The exploitation window for Copy Fail is seconds wide. The on-disk file is never modified, the page cache is clean by the time cleanup runs, and the PoC uses only standard library calls that blend in with normal process activity. Filesystem monitoring, file integrity checks, and binary signature validation all miss this entirely.
What remains detectable is the process behaviour, specifically the sequence of system calls that no legitimate application produces at the volumes the exploit requires.
Detection
AF_ALG Socket Creation: The Primary Signal
The exploit must call socket(AF_ALG, SOCK_SEQPACKET, 0) approximately 40 times in rapid succession, once for each 4-byte write chunk. Each call opens a fresh AF_ALG socket, so roughly 160 bytes of shellcode require about 40 iterations.
The SOCK_SEQPACKET detail matters for detection precision. AF_ALG sockets opened with SOCK_DGRAM are commonly used by hashing and symmetric crypto operations, which produces background noise on busy systems. SOCK_SEQPACKET is the type used for AEAD operations and is far less commonly seen on a healthy system, making it the higher-fidelity filter for this exploit chain.
The list of legitimate processes that open AF_ALG AEAD sockets is short and well known. It includes cryptsetup, systemd-cryptsetup, kcapi-enc, kcapi-dgst, kcapi-mac, kcapi-speed, bluez, iwd, and the charon and charon-systemd daemons used by strongSwan for IPsec. Any process outside that list creating an AF_ALG SOCK_SEQPACKET socket is unusual. A process outside that list creating 40 or more such sockets in a few seconds is almost certainly running this exploit or a derivative of it.
The socket(AF_ALG, ...) call uses domain value 38. In auditd, that translates to -F a0=38. The first rule below fires on the very first socket call, before any page cache corruption has occurred. The second records splice() calls so they can be correlated with it:
-a always,exit -F arch=b64 -S socket -F a0=38 -k copy_fail_af_alg
-a always,exit -F arch=b64 -S splice -k copy_fail_splice
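Once the rules are logging, the records still need to be counted per process. The sketch below parses SYSCALL records in the usual audit.log key=value layout and tallies AF_ALG socket() calls by PID and command name. The sample lines are synthetic, the function name is illustrative, and the parser assumes raw (uninterpreted) records, where syscall arguments appear in hexadecimal, so domain 38 shows up as a0=26.

```python
import re
from collections import Counter

AF_ALG_HEX = format(38, "x")  # raw audit records print arguments in hex: "26"
FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')


def count_af_alg_sockets(lines, key="copy_fail_af_alg"):
    """Count AF_ALG socket() audit records per (pid, comm)."""
    counts = Counter()
    for line in lines:
        if not line.startswith("type=SYSCALL"):
            continue
        fields = {k: v.strip('"') for k, v in FIELD.findall(line)}
        if fields.get("key") == key and fields.get("a0") == AF_ALG_HEX:
            counts[(fields.get("pid"), fields.get("comm"))] += 1
    return counts


# Synthetic record for illustration only, repeated to mimic a burst.
SAMPLE = [
    'type=SYSCALL msg=audit(1777400000.101:900): arch=c000003e syscall=41 '
    'success=yes exit=3 a0=26 a1=5 a2=0 a3=0 items=0 ppid=1200 pid=4242 '
    'uid=1001 comm="python3" exe="/usr/bin/python3.12" key="copy_fail_af_alg"'
] * 40

for (pid, comm), n in count_af_alg_sockets(SAMPLE).items():
    print(f"pid={pid} comm={comm} af_alg_sockets={n}")
```

A count near 40 from a process outside the allowlist described above would match the pattern this section treats as the primary signal.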
Key Syscalls to Monitor
| Syscall | What to Watch For | Relevance |
|---|---|---|
| socket(AF_ALG, ...) | Any process outside expected allowlist | Primary signal, fires before corruption |
| splice() | Splice from a setuid binary FD into a socket FD | Page cache pages entering the crypto pipeline |
| recvmsg() on AF_ALG fd | ~40 calls in seconds from same PID | Each call writes 4 bytes of shellcode |
| posix_fadvise(DONTNEED) | Called on a setuid binary shortly after AF_ALG activity | Attacker cleanup, page eviction |
Falco Rule Outline
For environments running Falco or a compatible eBPF monitoring tool, the detection logic resembles the rule below. A live Falco installation is not required to follow the example. The rule is included as a reference for how the detection logic is encoded:
- macro: expected_af_alg_processes
  condition: >
    proc.name in (kcapi-enc, kcapi-dgst, kcapi-mac, cryptsetup,
    kcapi-speed, charon, charon-systemd)
- rule: Potential Copy Fail Exploit (AF_ALG Socket Creation)
  desc: >
    Detects AF_ALG socket creation (family 38) by unexpected processes.
    Primary vector for CVE-2026-31431 (Copy Fail) LPE.
  condition: >
    evt.type = socket and
    evt.arg.domain = AF_ALG and
    evt.res >= 0 and
    not expected_af_alg_processes
  output: >
    Anomalous AF_ALG socket created
    (user=%user.name uid=%user.loginuid command=%proc.cmdline
    pid=%proc.pid container_id=%container.id image=%container.image.repository)
  priority: CRITICAL
  tags: [host, container, exploit, privilege_escalation, cve_2026_31431]
The rule fires on the first socket() call from any process not in the allowlist. At this point, no page cache corruption has occurred yet. An IDS or SIEM that correlates repeated AF_ALG alerts from the same PID with a subsequent execve() of a setuid binary in the same short time window has a high-confidence exploit chain indicator.
Detection Timing
The window for finding evidence after the fact is very short. The posix_fadvise(DONTNEED) cleanup evicts the corrupted pages. By the time a responder opens the machine for triage, the page cache is clean, the disk is clean, and there are no modified binaries to find.
Detection for this class of vulnerability works by watching process behaviour at the time of exploitation, not by examining system state afterwards. Correlating recvmsg() calls on an AF_ALG file descriptor with execve(/usr/bin/su) in the same narrow time window is the indicator to build on. Filesystem forensics alone will not find this.
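The time-window correlation can be sketched independently of any particular telemetry source. The code below assumes events have already been normalised into (timestamp, pid, kind) tuples sorted by time. The kind labels, the function name, and the default thresholds are hypothetical, and AF_ALG socket creation is used here as a stand-in for per-descriptor recvmsg() tracking, which needs richer telemetry.

```python
from collections import defaultdict, deque


def correlate(events, min_sockets=40, window=5.0):
    """Return PIDs that exec a setuid binary right after an AF_ALG burst.

    events: iterable of (timestamp_seconds, pid, kind), sorted by timestamp,
    where kind is "af_alg_socket" or "execve_setuid" (hypothetical labels).
    """
    recent = defaultdict(deque)  # pid -> timestamps of recent AF_ALG sockets
    flagged = set()
    for ts, pid, kind in events:
        queue = recent[pid]
        # Forget socket events that fell out of the sliding window.
        while queue and ts - queue[0] > window:
            queue.popleft()
        if kind == "af_alg_socket":
            queue.append(ts)
        elif kind == "execve_setuid" and len(queue) >= min_sockets:
            flagged.add(pid)
    return flagged


# Synthetic timeline: pid 100 opens 40 sockets in two seconds, then execs.
timeline = [(i * 0.05, 100, "af_alg_socket") for i in range(40)]
timeline.append((2.5, 100, "execve_setuid"))
print(correlate(timeline))  # {100}
```

The thresholds are starting points rather than tuned values. A derivative exploit that writes fewer bytes would need a lower min_sockets, at the cost of more noise.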
MITRE ATT&CK Mapping
| Technique | MITRE ID | Primary Signal |
|---|---|---|
| Local Privilege Escalation via kernel flaw | T1068 | AF_ALG socket creation by unexpected process |
| Escape to Host from Container | T1611 | Container ID in Falco alert and subsequent host-level activity |
| Setuid binary abuse | T1548.001 | execve of setuid binary after AF_ALG activity |
| Indicator Removal via page eviction | T1070 | posix_fadvise(DONTNEED) called on setuid binary |
Mitigation
The vulnerability lives in the algif_aead kernel module. Disabling that module removes the exploit's ability to open an AF_ALG socket bound to authencesn. The permanent fix is a kernel update to 6.18.22, 6.19.12, or 7.0 (or a vendor backport of the same fix), with the mainline patch landing as commit a664bf3d603d on 1 April 2026. Until your distribution ships a patched kernel, the modprobe blacklist is the recommended interim mitigation on Ubuntu and Debian systems.
Step 1: Verify the Module Is Loadable
First confirm that algif_aead is a loadable module on this system rather than being compiled directly into the kernel:
modinfo algif_aead
Module information should be returned, including the filename path. If the filename field points to a .ko file (rather than showing (builtin)), the module is loadable and the modprobe blacklist approach will work.
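The same question can be answered from the module index files that ship with each kernel, which is convenient when checking many hosts. The sketch below reads modules.builtin and modules.dep from a given directory. The function name module_kind and its return labels are illustrative, and the standard /lib/modules/<release> layout is assumed.

```python
import os
import platform


def module_kind(name: str, modules_dir: str) -> str:
    """Return "builtin", "loadable", or "absent" for a kernel module name."""
    wanted = name.replace("-", "_")

    def names_in(filename):
        found = set()
        try:
            with open(os.path.join(modules_dir, filename)) as fh:
                for line in fh:
                    # modules.dep lines look like "path.ko: deps..."
                    path = line.split(":", 1)[0].strip()
                    base = os.path.basename(path)
                    # Drop ".ko" and any compression suffix such as ".zst".
                    found.add(base.split(".ko", 1)[0].replace("-", "_"))
        except OSError:
            pass  # index file missing or unreadable
        return found

    if wanted in names_in("modules.builtin"):
        return "builtin"
    if wanted in names_in("modules.dep"):
        return "loadable"
    return "absent"


if __name__ == "__main__":
    release_dir = os.path.join("/lib/modules", platform.release())
    print("algif_aead:", module_kind("algif_aead", release_dir))
```

A result of "loadable" corresponds to the case where the modprobe approach applies, while "builtin" corresponds to the case covered in the warning further down.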
Step 2: Apply the Modprobe Blacklist
echo "install algif_aead /bin/false" | sudo tee /etc/modprobe.d/disable-algif-aead.conf
sudo rmmod algif_aead 2>/dev/null || true
The first command writes a configuration file telling modprobe to run /bin/false instead of actually loading algif_aead. The second unloads the module if it is currently in memory. The || true on the rmmod line prevents a harmless error from stopping the command if the module is already unloaded.
Step 3: Verify the Block Is Active
sudo modprobe algif_aead
This command should now return an error. The module load is blocked.
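For fleet-wide verification, the same two facts (an install override exists and the module is not currently loaded) can be checked without invoking modprobe. The sketch below reads the configuration directory and /proc/modules directly. Both function names are illustrative, the paths are parameters so the logic can be exercised against test fixtures, and the parser does not handle the dash and underscore equivalence that modprobe applies to module names.

```python
import glob
import os


def install_override(module: str, conf_dir: str = "/etc/modprobe.d"):
    """Return (file, command) for the first 'install <module> ...' rule, else None."""
    for path in sorted(glob.glob(os.path.join(conf_dir, "*.conf"))):
        try:
            with open(path) as fh:
                for line in fh:
                    parts = line.split("#", 1)[0].split()  # ignore comments
                    if (len(parts) >= 3 and parts[0] == "install"
                            and parts[1] == module):
                        return path, " ".join(parts[2:])
        except OSError:
            continue
    return None


def is_loaded(module: str, proc_modules: str = "/proc/modules") -> bool:
    """True if the module name appears in the loaded-module list."""
    try:
        with open(proc_modules) as fh:
            return any(line.split()[0] == module for line in fh if line.strip())
    except OSError:
        return False


if __name__ == "__main__":
    print("override:", install_override("algif_aead"))
    print("loaded:  ", is_loaded("algif_aead"))
```

The mitigated state is an override whose command is /bin/false together with loaded being False.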
Step 4: Confirm the PoC Fails
Re-run the exploit script:
python3 /home/karen/exploit.py
The script should now fail at the very first step. Setting up the AF_ALG socket for the AEAD algorithm returns an error because the algif_aead module cannot be loaded, and the rest of the exploit chain never executes. This is the expected outcome once the mitigation is applied.
Warning: The modprobe blacklist only works on distributions where algif_aead is a loadable kernel module. On RHEL, CentOS, and AlmaLinux, `algif_aead` is compiled directly into the kernel (`CONFIG_CRYPTO_USER_API_AEAD=y`). The `modprobe` configuration file is silently ignored on these systems because the module load never goes through `modprobe`. RHEL-family systems require the `grubby` approach.
sudo grubby --update-kernel=ALL --args="initcall_blacklist=algif_aead_init"
sudo reboot
Run sudo grubby --info=ALL | grep initcall_blacklist to verify the argument was applied before rebooting. The change can be reverted with --remove-args="initcall_blacklist=algif_aead_init" once the kernel patch has been applied.
Patch Status at Disclosure
The mainline fix was committed on 1 April 2026 (commit a664bf3d603d), nearly a month before public disclosure on 29 April. The fix reverts the 2017 in-place optimisation entirely, restoring separate req->src and req->dst scatterlists so that page cache pages from splice() can never end up in the output scatterlist. No vendor distribution had shipped a patched kernel on disclosure day. AlmaLinux was the first to release a patched kernel, on 1 May 2026. Ubuntu, Debian, RHEL, SUSE, and others followed in the days and weeks after.
The affected kernels range from 4.14 through 6.18.21 in the 6.18 series and from 6.19.0 through 6.19.11 in the 6.19 series, covering every mainstream distribution released between late 2017 and the April 2026 patch.
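The version boundaries can be encoded as a small triage helper. The sketch below compares only the upstream version numbers quoted in this room. The function name is illustrative, and a distribution kernel that carries a vendor backport may fall inside the range while already being patched, so the result is a prompt to check the vendor advisory rather than a verdict.

```python
import re


def in_affected_range(release: str) -> bool:
    """True if the upstream version falls in the ranges quoted above."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not match:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    version = (int(match.group(1)), int(match.group(2)), int(match.group(3) or 0))
    if version[:2] == (6, 19):
        return version < (6, 19, 12)          # 6.19.0 through 6.19.11
    return (4, 14, 0) <= version < (6, 18, 22)  # 4.14 through 6.18.21


for sample in ("4.13.0", "6.8.0-45-generic", "6.18.22", "6.19.11", "7.0.0"):
    print(sample, in_affected_range(sample))
```

Feeding it the output of uname -r gives a first-pass answer for a single host.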
What command checks whether algif_aead is a loadable module or compiled into the kernel?
Takeaways
The page cache is shared infrastructure. One process writing to it affects every process sharing the same kernel, including containers that appear to be fully isolated by namespaces and capability restrictions. Isolation at the namespace layer does not extend down to the page cache.
Exploits with no race condition and no kernel offsets change the operational risk window. Public reimplementations of Copy Fail in C, Rust, Go, and arm64 surfaced on GitHub within days of disclosure, alongside the original Python release. Weapons this reliable are used quickly, not gradually.
File integrity tools hash from disk, while this exploit writes to memory. For this class of vulnerability, detection requires syscall-level monitoring, whether through auditd rules watching socket(AF_ALG) calls, eBPF-based tools such as Falco tracking process behaviour, or kernel-level telemetry correlating the recvmsg() volume with the subsequent execve(). Filesystem monitoring on its own is not sufficient.
The kernel update to 6.18.22, 6.19.12, or 7.0 should be applied as soon as the distribution makes it available. The modprobe blacklist is an effective interim control on Ubuntu and Debian, but it is not a substitute for patching.
I can now exploit CVE-2026-31431!