CVE-2025-38554 in Linux

Summary

by MITRE • 08/19/2025

In the Linux kernel, the following vulnerability has been resolved:

mm: fix a UAF when vma->mm is freed after vma->vm_refcnt got dropped

By inducing delays in the right places, Jann Horn created a reproducer for a hard-to-hit UAF issue that became possible after VMAs were allowed to be recycled by adding SLAB_TYPESAFE_BY_RCU to their cache.

Race description is borrowed from Jann's discovery report: lock_vma_under_rcu() looks up a VMA locklessly with mas_walk() under rcu_read_lock(). At that point, the VMA may be concurrently freed, and it can be recycled by another process. vma_start_read() then increments the vma->vm_refcnt (if it is in an acceptable range), and if this succeeds, vma_start_read() can return a recycled VMA.
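The "acceptable range" check mentioned above can be illustrated with a small user-space sketch. This is not the kernel's implementation: toy_refcnt, toy_inc_not_zero_limited() and the limit parameter are hypothetical names, and C11 atomics stand in for the kernel's refcount primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a VMA reference counter; not the kernel's refcount_t. */
typedef struct {
    atomic_int refs;
} toy_refcnt;

/*
 * Take one reference only while the counter is in the open range (0, limit).
 * 0 models an object with no users; a value at or above 'limit' models a
 * counter that a reader is not allowed to bump any further.
 */
static bool toy_inc_not_zero_limited(toy_refcnt *r, int limit)
{
    assert(limit > 0);
    int old = atomic_load_explicit(&r->refs, memory_order_relaxed);

    do {
        if (old <= 0 || old >= limit)
            return false;   /* out of range: leave the counter untouched */
        /* On CAS failure 'old' is reloaded and the range check runs again. */
    } while (!atomic_compare_exchange_weak_explicit(&r->refs, &old, old + 1,
                                                    memory_order_acquire,
                                                    memory_order_relaxed));
    return true;
}
```

A successful increment only says that the counter was in range at that instant. It says nothing about whether the object still belongs to the mm in which it was looked up, which is why a recycled VMA can pass this step.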

In this scenario where the VMA has been recycled, lock_vma_under_rcu() will then detect the mismatching ->vm_mm pointer and drop the VMA through vma_end_read(), which calls vma_refcount_put(). vma_refcount_put() drops the refcount and then calls rcuwait_wake_up() using a copy of vma->vm_mm. This is wrong: It implicitly assumes that the caller is keeping the VMA's mm alive, but in this scenario the caller has no relation to the VMA's mm, so the rcuwait_wake_up() can cause UAF.

The diagram depicting the race:

    T1                                            T2                            T3
    ==                                            ==                            ==
    lock_vma_under_rcu
      mas_walk
                                                  <VMA gets removed from mm>
                                                                                mmap
                                                                                  <the same VMA is reallocated>
      vma_start_read
        __refcount_inc_not_zero_limited_acquire
                                                                                munmap
                                                                                  __vma_enter_locked
                                                                                    refcount_add_not_zero
      vma_end_read
        vma_refcount_put
          __refcount_dec_and_test
                                                                                    rcuwait_wait_event
                                                                                <finish operation>
          rcuwait_wake_up [UAF]

Note that rcuwait_wait_event() in T3 does not block because refcount was already dropped by T1. At this point T3 can exit and free the mm causing UAF in T1.

To avoid this we move vma->vm_mm verification into vma_start_read() and grab vma->vm_mm to stabilize it before vma_refcount_put() operation.
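The ordering behind this fix (pin the mm, drop the VMA reference, wake the waiter, unpin the mm) can be replayed deterministically in user space. The following is a toy model under stated assumptions, not the kernel patch: every toy_* name is hypothetical, the "freed" state is a flag rather than a real free() so the example has no undefined behaviour, and the other task's exit is simulated inline at the point where the race window opens.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-ins; names and layout are hypothetical, not kernel definitions. */
struct toy_mm {
    atomic_int count;   /* lifetime references (models mmgrab/mmdrop) */
    bool freed;         /* set when the last reference goes away */
    int wakeups;        /* number of wake-ups delivered while alive */
};

struct toy_vma {
    atomic_int refcnt;
    struct toy_mm *mm;
};

static void toy_mm_init(struct toy_mm *mm)
{
    atomic_init(&mm->count, 1);     /* the owning process holds one reference */
    mm->freed = false;
    mm->wakeups = 0;
}

static void toy_vma_init(struct toy_vma *vma, struct toy_mm *mm)
{
    atomic_init(&vma->refcnt, 2);   /* one for being attached, one for the reader */
    vma->mm = mm;
}

static void toy_mmgrab(struct toy_mm *mm)
{
    atomic_fetch_add(&mm->count, 1);
}

static void toy_mmdrop(struct toy_mm *mm)
{
    assert(!mm->freed);             /* dropping a dead mm is a bug in this model */
    if (atomic_fetch_sub(&mm->count, 1) == 1)
        mm->freed = true;           /* real code would release the memory here */
}

/*
 * Models the reference drop followed by a wake-up through a copy of the mm
 * pointer. Returns false if the wake-up would have touched a dead mm.
 */
static bool toy_refcount_put(struct toy_vma *vma)
{
    struct toy_mm *mm = vma->mm;    /* copy taken before the drop */

    atomic_fetch_sub(&vma->refcnt, 1);
    toy_mmdrop(mm);                 /* simulated: the owner exits in the race window */
    if (mm->freed)
        return false;               /* this is where the UAF would occur */
    mm->wakeups++;
    return true;
}

/* Old ordering: the caller holds nothing that keeps the foreign mm alive. */
static bool toy_drop_unsafe(struct toy_vma *vma)
{
    return toy_refcount_put(vma);
}

/* Fixed ordering: pin the mm first, unpin only after the wake-up is done. */
static bool toy_drop_stabilized(struct toy_vma *vma)
{
    struct toy_mm *other_mm = vma->mm;
    bool ok;

    toy_mmgrab(other_mm);
    ok = toy_refcount_put(vma);
    toy_mmdrop(other_mm);
    return ok;
}
```

In this model toy_drop_unsafe() reports that the wake-up hit a dead mm, while toy_drop_stabilized() delivers exactly one wake-up and lets the mm go away only afterwards.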

[[email protected]: v3]


Analysis

by VulDB Data Team • 12/15/2025

The vulnerability CVE-2025-38554 represents a use-after-free condition within the Linux kernel's memory management subsystem, specifically affecting the Virtual Memory Area (VMA) handling mechanism. This flaw arises from a race condition that occurs during the concurrent manipulation of VMA structures and their associated memory management contexts. The issue manifests when a VMA structure is freed and subsequently recycled by the kernel's memory allocator, creating a scenario where operations on the recycled structure can lead to memory corruption. The vulnerability is particularly concerning because it can be triggered through carefully orchestrated timing delays, making it difficult to detect during normal system operation.

The technical root cause of this vulnerability lies in the improper handling of reference counting and memory management during VMA access operations under RCU (Read-Copy-Update) protection. When lock_vma_under_rcu() performs a lockless lookup using mas_walk() under rcu_read_lock(), it can encounter a VMA that has already been freed by another thread. The subsequent vma_start_read() operation increments the vm_refcnt, but this increment can succeed even when the VMA has been recycled by a different process. The critical failure occurs in vma_end_read() when vma_refcount_put() is called, which invokes rcuwait_wake_up() using a copy of vma->vm_mm that may no longer be valid, as the original memory management context could have been freed by the time this function executes.

This vulnerability maps directly to CWE-416 (Use After Free) and corresponds to ATT&CK technique T1068 (Exploitation for Privilege Escalation). The race condition arises from the interaction between SLAB_TYPESAFE_BY_RCU memory allocation and the RCU-based VMA lookup mechanism, which opens a window in which a freed VMA can be reallocated and then accessed by a reader that looked it up in a different mm. The flaw demonstrates the complexity of concurrent memory management in kernel space, where recycling of memory structures can create unexpected dependencies between threads operating on seemingly unrelated memory contexts. Because the VMA reference is dropped before the memory management context is known to be stable, the subsequent wake-up can touch freed memory, which can lead to memory corruption, privilege escalation, or system instability.

The operational impact of this vulnerability extends beyond simple memory corruption, as it can enable attackers to manipulate kernel memory structures and potentially achieve arbitrary code execution. The timing-dependent nature of the race condition means that exploitation requires precise control over system scheduling and memory allocation patterns, but once triggered, the consequences can be severe. The vulnerability affects systems that rely heavily on memory mapping operations and concurrent access to virtual memory areas, making it particularly dangerous in multi-threaded environments or systems with high memory pressure.

The fix implemented addresses the core issue by moving the vm_mm verification into vma_start_read() and ensuring that the memory management context is stabilized before any reference count operations occur. This approach prevents the use of potentially freed memory contexts in the rcuwait_wake_up() call, effectively eliminating the use-after-free condition while maintaining the intended functionality of the VMA management system. The mitigation strategy ensures that proper reference counting and memory validation occur before any operations that might access freed memory structures, aligning with best practices for concurrent kernel memory management.

Responsible

Linux

Reservation

04/16/2025

Disclosure

08/19/2025

Moderation

accepted

CPE

ready

EPSS

0.00164

KEV

no

Activities

very low

Sources
