CVE-2022-49236 in Linux

Summary

by MITRE • 02/26/2025

In the Linux kernel, the following vulnerability has been resolved:

bpf: Fix UAF due to race between btf_try_get_module and load_module

While working on code to populate kfunc BTF ID sets for module BTF from its initcall, I noticed that by the time the initcall is invoked, the module BTF can already be seen by userspace (and the BPF verifier). The existing btf_try_get_module calls try_module_get which only fails if mod->state == MODULE_STATE_GOING, i.e. it can increment module reference when module initcall is happening in parallel.

Currently, BTF parsing happens from MODULE_STATE_COMING notifier callback. At this point, the module initcalls have not been invoked. The notifier callback parses and prepares the module BTF, allocates an ID, which publishes it to userspace, and then adds it to the btf_modules list allowing the kernel to invoke btf_try_get_module for the BTF.

However, at this point, the module has not been fully initialized (i.e. its initcalls have not finished). The code in module.c can still fail and free the module, without caring for other users. Yet nothing stops btf_try_get_module from succeeding during the state transition from MODULE_STATE_COMING to MODULE_STATE_LIVE.
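The reference-taking rule described above can be illustrated with a small userspace model. This is only a sketch, not kernel code: model_module and model_try_module_get are hypothetical names, and the model keeps just the one property the text relies on, namely that the get is refused only for a module in MODULE_STATE_GOING.

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel's module states (names follow the text). */
enum module_state {
    MODULE_STATE_LIVE,
    MODULE_STATE_COMING,
    MODULE_STATE_GOING,
};

/* Hypothetical userspace model of a module: only a state and a refcount. */
struct model_module {
    enum module_state state;
    int refcnt;
};

/*
 * Models the behaviour described for try_module_get: the reference is
 * refused only when the module is going away. A module whose init is
 * still running (MODULE_STATE_COMING) is accepted. Assumes mod is not NULL.
 */
static bool model_try_module_get(struct model_module *mod)
{
    if (mod->state == MODULE_STATE_GOING)
        return false;
    mod->refcnt++;
    return true;
}
```

In this model a caller can take a reference while the state is still MODULE_STATE_COMING, which is exactly the window the description is concerned with.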

This leads to a use-after-free when a BPF program loads successfully during that state transition, load_module's do_init_module call then fails and frees the module, and the BPF program fd, on close, calls module_put on the freed module. A later patch adds a test case to verify that we don't regress in this area.

There are multiple points after prepare_coming_module (in load_module) where failure can occur and module loading can return error. We illustrate and test for the race using the last point where it can practically occur (in module __init function).

An illustration of the race:

CPU 0                           CPU 1
                                load_module
                                  notifier_call(MODULE_STATE_COMING)
                                    btf_parse_module
                                    btf_alloc_id    // Published to userspace
                                    list_add(&btf_mod->list, btf_modules)
                                  mod->init(...)
...                                 ^
bpf_check                           |
check_pseudo_btf_id                 |
  btf_try_get_module                |
    returns true                    |  ...
...                                 |  module __init in progress
return prog_fd                      |  ...
...                                 V
                                  if (ret < 0)
                                    free_module(mod)
                                  ...
close(prog_fd)
  ...
  bpf_prog_free_deferred
    module_put(used_btf.mod) // use-after-free

We fix this issue by setting a flag BTF_MODULE_F_LIVE, from the notifier callback when MODULE_STATE_LIVE state is reached for the module, so that we return NULL from btf_try_get_module for modules that are not fully formed. Since try_module_get already checks that module is not in MODULE_STATE_GOING state, and that is the only transition a live module can make before being removed from btf_modules list, this is enough to close the race and prevent the bug.
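The fix described above can be sketched as a userspace model. This is not the kernel source: the model_* names are hypothetical, the bit value chosen for BTF_MODULE_F_LIVE is an assumption, and locking and the btf_modules list walk are omitted. It shows only the two pieces the text names: the notifier sets the flag on MODULE_STATE_LIVE, and the lookup returns NULL unless the flag is set and the module is not going away.

```c
#include <stdbool.h>
#include <stddef.h>

enum module_state {
    MODULE_STATE_LIVE,
    MODULE_STATE_COMING,
    MODULE_STATE_GOING,
};

/* Flag name from the text; the bit value is an assumption for this sketch. */
#define BTF_MODULE_F_LIVE (1u << 0)

/* Hypothetical userspace models of a module and its btf_modules entry. */
struct model_module {
    enum module_state state;
    int refcnt;
};

struct model_btf_module {
    struct model_module *module;
    unsigned int flags;
};

/* Refuses a reference only for a module that is going away. */
static bool model_try_module_get(struct model_module *mod)
{
    if (mod->state == MODULE_STATE_GOING)
        return false;
    mod->refcnt++;
    return true;
}

/* Models the notifier callback: mark the entry once the module is live. */
static void model_btf_module_notify(struct model_btf_module *bm,
                                    enum module_state event)
{
    if (event == MODULE_STATE_LIVE)
        bm->flags |= BTF_MODULE_F_LIVE;
}

/* Models the fixed lookup: NULL for modules that are not fully formed. */
static struct model_module *
model_btf_try_get_module(struct model_btf_module *bm)
{
    if ((bm->flags & BTF_MODULE_F_LIVE) && model_try_module_get(bm->module))
        return bm->module;
    return NULL;
}
```

With this ordering, a lookup made while the module's init is still running returns NULL and takes no reference, so there is nothing left to put if init later fails.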

A later selftest patch crafts the race condition artificially to verify that it has been fixed, and that the verifier fails to load the program (with ENXIO).
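How that failure would look from the caller's side can be sketched as follows. This is a hypothetical model, not the selftest and not the verifier: model_verifier_result and model_prog_load are invented names, and the returned descriptor value is a placeholder. It only illustrates the usual convention that a negative kernel error such as -ENXIO reaches userspace as a -1 return with errno set.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a module reference handed to the verifier. */
struct model_module {
    int refcnt;
};

/* Models the verifier outcome: no module reference means -ENXIO. */
static int model_verifier_result(const struct model_module *got)
{
    return got != NULL ? 0 : -ENXIO;
}

/*
 * Models the syscall boundary: a negative kernel error becomes a -1
 * return with errno set; success yields a placeholder descriptor.
 */
static int model_prog_load(const struct model_module *got)
{
    int err = model_verifier_result(got);

    if (err < 0) {
        errno = -err;
        return -1;
    }
    return 3; /* placeholder fd, an assumption of this sketch */
}
```

A test built along these lines would treat a load that fails with errno equal to ENXIO as the expected result while the module is not yet live.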

Lastly, a couple of comments:

1. Even if this race didn't exist, it seems more appropriate to only access resources (ksyms and kfuncs) of a fully formed module which has been initialized completely.

2. This patch was born out of need for synchronization against module initcall for the next patch, so it is needed for correctness even without the aforementioned race condition. The BTF resources initialized by module initcall are set up once and then only looked up, so just waiting until the initcall has finished ensures correct behavior.


Analysis

by VulDB Data Team • 06/12/2025

CVE-2022-49236 resides in the Linux kernel's BPF (Berkeley Packet Filter) subsystem, specifically in the handling of module BTF (BPF Type Format) data while a module is being loaded. The flaw is a use-after-free caused by a race between publishing a module's BTF and the module's own initialization sequence. It can cause system instability and, like other use-after-free bugs in kernel memory, could potentially be abused for privilege escalation through BPF program loading.

The root cause lies in the interaction between the BTF subsystem and the kernel's module loading infrastructure. A module passes through several states while it is loaded and unloaded, including MODULE_STATE_COMING, MODULE_STATE_LIVE, and eventually MODULE_STATE_GOING. From the MODULE_STATE_COMING notifier callback, the BTF subsystem parses and prepares the module's BTF data, allocates an ID, which publishes it to userspace, and adds it to the btf_modules list, all before the module's initcalls have run. The function btf_try_get_module, which the BPF verifier uses to take a reference on the module behind a BTF object, relies on try_module_get, and that call fails only when the module is in MODULE_STATE_GOING. It therefore takes a reference even while initcalls are still in progress. This creates a window in which a BPF program can acquire a reference to a module that is still being initialized, leading to a use-after-free when the initialization fails and the module is freed.

The race window opens in load_module once the MODULE_STATE_COMING notifier callback has parsed the BTF data and made it visible to userspace, and it stays open while the module's initialization code runs. If initialization then fails, the module is freed without regard to other users, but a BPF program that loaded during the window still holds a module reference and calls module_put on the freed module when its fd is closed. The vulnerability is categorized as CWE-416 (Use After Free), a serious class of memory safety issue in kernel space that can lead to system crashes or unintended access to kernel memory.
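The interleaving described above can be replayed deterministically in a small userspace model. This is only a sketch under stated assumptions: the types and replay_failed_init_race are hypothetical, the two CPUs are collapsed into one fixed ordering, and the freed module is represented by a boolean instead of real memory so that the program itself never touches freed storage.

```c
#include <stdbool.h>

/* Hypothetical model: flags instead of real kernel state and memory. */
struct model_module {
    bool going;  /* true once the module is being removed */
    bool freed;  /* true once the loader has freed the module */
    int refcnt;
};

struct model_btf_module {
    struct model_module *module;
    bool live;   /* models the flag set when MODULE_STATE_LIVE is reached */
};

/*
 * Replays the failed-init interleaving in a fixed order.
 * Returns 1 if the replay ends with a put on an already freed module,
 * 0 otherwise. check_live_flag selects the behaviour with the fix.
 */
static int replay_failed_init_race(bool check_live_flag)
{
    struct model_module mod = { .going = false, .freed = false, .refcnt = 1 };
    struct model_btf_module bm = { .module = &mod, .live = false };
    bool prog_holds_ref = false;

    /* Loader side: BTF is already published; module init is running. */

    /* BPF side: program load tries to take a module reference. */
    if ((!check_live_flag || bm.live) && !mod.going) {
        mod.refcnt++;
        prog_holds_ref = true;
    }

    /* Loader side: init fails and the module is freed regardless of refs. */
    mod.freed = true;

    /* BPF side: closing the program fd drops the reference taken earlier. */
    if (prog_holds_ref && mod.freed)
        return 1; /* a real module_put here would touch freed memory */
    return 0;
}
```

Calling replay_failed_init_race(false) models the behaviour before the fix and reports the dangling put; calling it with true models the liveness check and reports none, because the program never obtained a reference.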

The operational impact goes beyond instability, because a use-after-free gives an attacker a possible way to influence kernel memory. Triggering the race requires loading a BPF program that references a module's BTF data while that module's initialization is still running, and then having the initialization fail. An attacker who can control both the module loading process and its timing could potentially exploit the flaw to execute code with kernel privileges. The issue is of particular concern because it sits in core kernel paths, module loading and BPF program verification, on which system security and stability depend.

The fix introduces a new flag, BTF_MODULE_F_LIVE, which is set from the notifier callback when a module reaches MODULE_STATE_LIVE. btf_try_get_module returns NULL for any module that does not carry the flag, which closes the race window: BPF programs can only take a reference on modules that have completed initialization. Because try_module_get already rejects modules in MODULE_STATE_GOING, and that is the only transition a live module can make before it is removed from the btf_modules list, no further check is needed. In ATT&CK terms, the fix removes a kernel memory corruption condition of the kind associated with privilege escalation. A later selftest patch reproduces the race artificially and confirms that BPF program loading now fails with ENXIO when it targets the BTF of a module that has not finished initializing.

The fix addresses a synchronization gap between the module loader and the BTF subsystem: module resources such as ksyms and kfuncs should become available to other kernel subsystems only once the module has been initialized completely. The change itself is small, consisting of a flag set from the notifier callback and a check of that flag in btf_try_get_module. As the commit message notes, the same synchronization is needed for correctness even without the race, because BTF resources set up by a module's initcall are initialized once and then only looked up, so waiting until the initcall has finished is enough to ensure correct behavior.

Responsible

Linux

Reservation

02/26/2025

Disclosure

02/26/2025

Moderation

accepted

CPE

ready

EPSS

0.00252

KEV

no

Activities

very low

Sources
