CVE-2022-49101 in Linux

Summary

by MITRE • 02/26/2025

In the Linux kernel, the following vulnerability has been resolved:

xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32

sched_clock() can be used very early since commit 857baa87b642 ("sched/clock: Enable sched clock early"). In addition, with commit 38669ba205d1 ("x86/xen/time: Output xen sched_clock time from 0"), the kdump kernel in a Xen HVM guest may panic at a very early stage when accessing &__this_cpu_read(xen_vcpu)->time, as shown below:

setup_arch() -> init_hypervisor_platform() -> x86_init.hyper.init_platform = xen_hvm_guest_init() -> xen_hvm_init_time_ops() -> xen_clocksource_read() -> src = &__this_cpu_read(xen_vcpu)->time;

This is because Xen HVM supports at most MAX_VIRT_CPUS=32 'vcpu_info' structures embedded inside 'shared_info' during the early stage, until xen_vcpu_setup() is used to allocate/relocate the 'vcpu_info' for the boot cpu at an arbitrary address.

However, when a Xen HVM guest panics on vcpu >= 32, xen_clocksource_read() on vcpu >= 32 would panic, since xen_vcpu_info_reset(0) sets per_cpu(xen_vcpu, cpu) = NULL when vcpu >= 32.

This patch calls xen_hvm_init_time_ops() again later in xen_hvm_smp_prepare_boot_cpu() after the 'vcpu_info' for boot vcpu is registered when the boot vcpu is >= 32.

This issue can be reproduced on purpose with the command below, run on the guest side when kdump/kexec is enabled:

"taskset -c 33 echo c > /proc/sysrq-trigger"

The bugfix for PVM is not implemented due to the lack of testing environment.

[boris: xen_hvm_init_time_ops() returns on errors instead of jumping to end]

Analysis

by VulDB Data Team • 02/26/2025

CVE-2022-49101 is an initialization-ordering flaw in the Linux kernel's support for Xen HVM (hardware virtual machine) guests. It affects guests with kdump enabled when the kdump kernel boots on a virtual CPU whose ID is 32 or higher. In that case the Xen time code reads the boot CPU's vcpu_info before that structure has been registered, and the kernel panics very early in boot. The exposure exists because sched_clock() can be used much earlier than before since commit 857baa87b642 ("sched/clock: Enable sched clock early"), and because, with commit 38669ba205d1 ("x86/xen/time: Output xen sched_clock time from 0"), the Xen clock source is read from xen_hvm_init_time_ops() during early platform setup.

The root cause lies in how a Xen HVM guest locates per-CPU vcpu_info during early boot. The failing path is setup_arch() -> init_hypervisor_platform() -> xen_hvm_guest_init() -> xen_hvm_init_time_ops() -> xen_clocksource_read(), which evaluates &__this_cpu_read(xen_vcpu)->time. At this stage only MAX_VIRT_CPUS=32 vcpu_info structures are available, and they are embedded in the shared_info structure. A vCPU with an ID of 32 or higher has no embedded slot, and xen_vcpu_setup() has not yet been called to allocate or relocate the boot CPU's vcpu_info at an arbitrary address, so the per-CPU xen_vcpu pointer for that CPU is still NULL.

The defect is a NULL pointer dereference caused by initialization order, not a race condition: early boot runs on a single CPU and the failure is deterministic. Because xen_vcpu_info_reset(0) sets per_cpu(xen_vcpu, cpu) = NULL when the vCPU ID is 32 or higher, xen_clocksource_read() dereferences a NULL pointer when it reaches the time field. The resulting panic stops the kdump kernel before it can capture a crash dump. Nothing in the upstream description indicates privilege escalation; the documented effect is the failed boot of the kdump kernel, which affects availability and crash analysis.
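The NULL-pointer condition can be illustrated with a small user-space model. This is only a sketch and not kernel code: the class `VcpuInfo`, the list `shared_info`, and the helpers `vcpu_info_reset`, `clocksource_read` and `early_boot_read` are hypothetical stand-ins named after the kernel symbols in the advisory, and a Python `AttributeError` plays the role of the kernel panic.

```python
MAX_VIRT_CPUS = 32  # vcpu_info slots embedded in shared_info (value from the advisory)


class VcpuInfo:
    """Hypothetical stand-in for one 'vcpu_info' entry; only 'time' is modelled."""

    def __init__(self, system_time_ns):
        self.time = system_time_ns


# Model of the 'shared_info' page: exactly MAX_VIRT_CPUS embedded slots.
shared_info = [VcpuInfo(1000 + i) for i in range(MAX_VIRT_CPUS)]


def vcpu_info_reset(vcpu_id):
    """Model of xen_vcpu_info_reset(): embedded slot if one exists, else None."""
    if vcpu_id < MAX_VIRT_CPUS:
        return shared_info[vcpu_id]
    return None  # stands in for per_cpu(xen_vcpu, cpu) = NULL


def clocksource_read(xen_vcpu):
    """Model of xen_clocksource_read(): dereferences xen_vcpu->time."""
    return xen_vcpu.time  # raises AttributeError when xen_vcpu is None


def early_boot_read(vcpu_id):
    """Return the clock value, or 'panic' if the per-CPU pointer was never set."""
    try:
        return clocksource_read(vcpu_info_reset(vcpu_id))
    except AttributeError:
        return "panic"


if __name__ == "__main__":
    for vcpu in (0, 31, 32, 33):
        print(vcpu, early_boot_read(vcpu))
```

In this model every vCPU ID below 32 resolves to an embedded slot, while IDs 32 and above fall through to the None case, which is the situation the kdump kernel hits when it boots on such a vCPU.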

The operational impact is the loss of crash-dump capability. If a guest with kdump enabled crashes on a vCPU numbered 32 or higher, the kdump kernel boots on that vCPU and panics as soon as the Xen time code reads the clock source, so no crash dump is captured and the original failure cannot be analyzed. This matters most in environments that depend on crash dumps for troubleshooting and for maintaining service availability. The issue can be reproduced deliberately in a guest with kdump/kexec enabled by running "taskset -c 33 echo c > /proc/sysrq-trigger", which pins the command to CPU 33 so that the crash is triggered there.
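Before running the reproducer on a test guest, it can help to confirm that the target CPU actually exists and is online. The sketch below only reads the standard sysfs file /sys/devices/system/cpu/online and never writes to /proc/sysrq-trigger. The helper names `parse_cpu_list`, `can_reproduce` and `read_online_cpus` are hypothetical, and the check assumes that the guest's Linux CPU number corresponds to the Xen vCPU ID, which may not hold for every configuration.

```python
def parse_cpu_list(text):
    """Parse a kernel CPU list such as '0-3,8,10-11' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate empty input
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def can_reproduce(online_text, target_cpu=33, max_virt_cpus=32):
    """True if target_cpu is online and lies at or above the 32-vCPU boundary."""
    return target_cpu >= max_virt_cpus and target_cpu in parse_cpu_list(online_text)


def read_online_cpus(path="/sys/devices/system/cpu/online"):
    """Read the online CPU list exported by the kernel through sysfs."""
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__":
    try:
        online = read_online_cpus()
    except OSError:
        online = ""  # not Linux, or sysfs unavailable
    print("CPU 33 usable for the reproducer:", can_reproduce(online))
```

A guest with 32 or fewer online CPUs cannot exercise the failing path with this command, because taskset has no CPU 33 to pin to.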

The fix delays the time setup when it cannot yet run safely. xen_hvm_init_time_ops() is called again from xen_hvm_smp_prepare_boot_cpu(), after the vcpu_info for the boot vCPU has been registered, so a kdump kernel that boots on vCPU 32 or higher performs the time initialization only once the per-CPU xen_vcpu pointer is valid. The maintainer's note on the commit records one adjustment to the submitted patch: xen_hvm_init_time_ops() returns directly on errors instead of jumping to an end label. The upstream description does not name a CWE; the behavior it describes is a NULL pointer dereference caused by initialization order, not a race between subsystems initializing concurrently.
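The ordering introduced by the fix can be sketched as a toy state machine. This is a user-space illustration under stated assumptions and not the kernel patch: the class `Guest`, its method names, the `time_ops_ready` flag and the log strings are hypothetical, and the "already initialized" guard is an assumption of the sketch that keeps the second call harmless when the first one succeeded.

```python
MAX_VIRT_CPUS = 32  # vcpu_info slots embedded in shared_info (value from the advisory)


class Guest:
    """Toy model of the boot CPU's state in a Xen HVM guest."""

    def __init__(self, boot_vcpu_id):
        self.boot_vcpu_id = boot_vcpu_id
        self.xen_vcpu = None         # models the per-CPU xen_vcpu pointer
        self.time_ops_ready = False  # hypothetical flag: time ops installed
        self.log = []

    def vcpu_info_reset(self):
        # Only the first MAX_VIRT_CPUS vcpu_info slots live in shared_info.
        if self.boot_vcpu_id < MAX_VIRT_CPUS:
            self.xen_vcpu = {"time": 0}
        else:
            self.xen_vcpu = None

    def vcpu_setup(self):
        # Models xen_vcpu_setup(): vcpu_info registered at an arbitrary address.
        self.xen_vcpu = {"time": 0}

    def init_time_ops(self):
        if self.time_ops_ready:
            return  # assumed guard: nothing to do on the second call
        if self.xen_vcpu is None:
            self.log.append("delayed")  # return early instead of dereferencing
            return
        _ = self.xen_vcpu["time"]  # the access that used to panic
        self.time_ops_ready = True
        self.log.append("initialized")

    def boot(self):
        self.vcpu_info_reset()
        self.init_time_ops()  # early call from the platform init path
        self.vcpu_setup()     # boot CPU preparation registers vcpu_info
        self.init_time_ops()  # second call, as described for the fix
        return self.log


if __name__ == "__main__":
    for vcpu in (0, 33):
        print(vcpu, Guest(vcpu).boot())
```

The design point the model captures is that the early call becomes a no-op when the pointer is missing, and the later call completes the work, so both low and high boot vCPU IDs end with the time operations initialized exactly once.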

From a security perspective the issue is an availability problem. A guest that crashes on a high-numbered vCPU loses its crash dump, which hampers recovery and forensic analysis, and the kdump kernel itself fails to boot. The case also shows how sensitive early boot code is to initialization order when per-CPU data depends on structures shared with the hypervisor. Organizations running Xen HVM guests with kdump and more than 32 vCPUs should apply the patch so that crash dumps remain available regardless of which vCPU a crash occurs on. The commit message notes that the corresponding fix for PVM was not implemented because no testing environment was available, so a similar ordering problem may remain in that mode.

Responsible: Linux
Reservation: 02/26/2025
Disclosure: 02/26/2025
Moderation: accepted
CPE: ready
EPSS: 0.00000
KEV: no
Activities: very low
