CVE-2024-35843 in Linux
Summary
by MITRE • 05/17/2024
In the Linux kernel, the following vulnerability has been resolved:
iommu/vt-d: Use device rbtree in iopf reporting path
The existing I/O page fault handler currently locates the PCI device by calling pci_get_domain_bus_and_slot(). This function searches the list of all PCI devices until the desired device is found. To improve lookup efficiency, replace it with device_rbtree_find() to search the device within the probed device rbtree.
The I/O page fault is initiated by the device, which does not have any synchronization mechanism with the software to ensure that the device stays in the probed device tree. Theoretically, a device could be released by the IOMMU subsystem after device_rbtree_find() and before iopf_get_dev_fault_param(), which would cause a use-after-free problem.
Add a mutex to synchronize the I/O page fault reporting path and the IOMMU release device path. This lock doesn't introduce any performance overhead, as the conflict between I/O page fault reporting and device releasing is very rare.
Analysis
by VulDB Data Team • 04/08/2025
CVE-2024-35843 affects the Linux kernel's IOMMU (Input-Output Memory Management Unit) subsystem, specifically the Intel VT-d (Virtualization Technology for Directed I/O) driver, in the path that reports I/O page faults raised by devices. Before the change, the I/O page fault handler located the faulting device with pci_get_domain_bus_and_slot(), which searches the list of all PCI devices until it finds a match. The commit replaces that call with device_rbtree_find(), which searches the rbtree of probed devices. The security-relevant part is not the lookup cost itself but a potential use-after-free: nothing synchronizes a device-initiated fault with the release of that device by the IOMMU subsystem.
Two aspects of the fault reporting path are involved. The first is efficiency: pci_get_domain_bus_and_slot() scans PCI devices linearly, so a lookup costs O(n) in the number of devices, and that cost grows on systems with many PCI devices and PCIe switches. device_rbtree_find() uses a red-black tree of probed devices, which brings the lookup down to O(log n). The second is object lifetime: the I/O page fault is initiated by the device, which has no synchronization mechanism with the software to ensure that the device stays in the probed device tree. The commit description therefore notes that a device could theoretically be released by the IOMMU subsystem after device_rbtree_find() returns and before iopf_get_dev_fault_param() runs. The same change closes that window with a mutex.
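The difference between the two lookup strategies can be illustrated with a small userspace sketch. This is not kernel code: the device keys, `linear_find`, and `indexed_find` are hypothetical names, and a binary search over a sorted list stands in for the red-black tree, since both give O(log n) lookups.

```python
import bisect

# Hypothetical device keys: (domain, bus, devfn) tuples, loosely modeled
# on PCI addressing. 4 buses x 8 functions = 32 devices.
devices = [(0, bus, devfn) for bus in range(4) for devfn in range(8)]


def linear_find(devs, key):
    """O(n): walk the list until the key matches, counting comparisons."""
    steps = 0
    for dev in devs:
        steps += 1
        if dev == key:
            return dev, steps
    return None, steps


def indexed_find(sorted_devs, key):
    """O(log n): binary search over a sorted index (stand-in for an rbtree)."""
    i = bisect.bisect_left(sorted_devs, key)
    if i < len(sorted_devs) and sorted_devs[i] == key:
        return sorted_devs[i]
    return None


index = sorted(devices)
target = (0, 3, 7)  # last device in the list: worst case for the scan

found_linear, steps = linear_find(devices, target)
found_indexed = indexed_find(index, target)
```

In this model the linear scan touches every entry before reaching the last device, while the indexed lookup needs only a handful of comparisons. Neither function says anything about how long the returned object stays valid, which is the separate lifetime question.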
The operational impact is a potential use-after-free rather than a performance problem. If the IOMMU subsystem releases a device after device_rbtree_find() completes but before iopf_get_dev_fault_param() executes, the fault handler continues with a reference to state that may already have been freed. A use-after-free in kernel context can generally lead to memory corruption or a system crash, and in the worst case could serve as a building block for privilege escalation; the commit description itself characterizes the race as theoretical. Exposure is most plausible on systems where devices are released while they can still raise I/O page faults, such as virtualization hosts and cloud infrastructure where devices are frequently added and removed.
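The ordering that makes the race dangerous can be replayed deterministically in a toy model. The sketch below is an assumption-laden illustration, not the kernel's data structures: `Device`, `Registry`, and `get_fault_param` are hypothetical, a dict replaces the rbtree, and a `released` flag stands in for freed memory so that the stale access is detected instead of corrupting anything.

```python
class Device:
    def __init__(self, name):
        self.name = name
        self.fault_param = {"queue": []}  # stands in for per-device fault state
        self.released = False


class Registry:
    """Toy stand-in for the probed-device tree (a dict, not an rbtree)."""

    def __init__(self):
        self.devs = {}

    def probe(self, key, dev):
        self.devs[key] = dev

    def find(self, key):
        return self.devs.get(key)

    def release(self, key):
        dev = self.devs.pop(key)
        dev.fault_param = None  # models freeing the fault state
        dev.released = True


def get_fault_param(dev):
    """Hypothetical analogue of fetching the fault parameters of a device."""
    if dev.released:
        raise RuntimeError("stale device: state already released")
    return dev.fault_param


registry = Registry()
registry.probe((0, 1, 0), Device("nic0"))

dev = registry.find((0, 1, 0))   # step 1: fault path looks up the device
registry.release((0, 1, 0))      # step 2: release path runs in the gap
try:
    get_fault_param(dev)         # step 3: fault path uses the stale reference
    outcome = "ok"
except RuntimeError:
    outcome = "stale"
```

The three steps are run in sequence here to force the problematic interleaving. In a real system steps 1 and 3 run in the fault handler and step 2 runs concurrently, so the window is narrow and, as the commit description says, rarely hit.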
The fix adds a mutex that synchronizes the I/O page fault reporting path with the IOMMU release device path. With the lock held, a device cannot be released between the rbtree lookup and the call to iopf_get_dev_fault_param(), which removes the use-after-free window. According to the commit description, the lock is not expected to add performance overhead because contention between fault reporting and device release is very rare. In classification terms, the flaw is a race condition of the kind described by CWE-362 (Concurrent Execution using Shared Resource with Improper Synchronization), and the memory corruption it could cause is the sort of primitive associated with ATT&CK technique T1068 (Exploitation for Privilege Escalation).
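The locking pattern can be sketched in userspace with a single lock shared by both paths. This is a minimal model under stated assumptions, not the actual patch: `Registry`, `report_fault`, and `release` are hypothetical names, `threading.Lock` stands in for the kernel mutex, and a dict replaces the rbtree.

```python
import threading


class Registry:
    def __init__(self):
        self.lock = threading.Lock()  # one lock shared by both paths
        self.devs = {}

    def probe(self, key, name):
        with self.lock:
            self.devs[key] = {"name": name, "faults": []}

    def release(self, key):
        # Release path: removal and teardown happen under the lock.
        with self.lock:
            dev = self.devs.pop(key, None)
            if dev is not None:
                dev["faults"] = None  # models freeing the fault state
            return dev is not None

    def report_fault(self, key, fault):
        # Fault path: lookup and use happen under the same lock hold,
        # so the device cannot be released in between.
        with self.lock:
            dev = self.devs.get(key)
            if dev is None:
                return False  # device already gone: drop the fault
            dev["faults"].append(fault)
            return True


registry = Registry()
key = (0, 1, 0)
registry.probe(key, "nic0")
results = []


def reporter():
    for i in range(1000):
        results.append(registry.report_fault(key, i))


t1 = threading.Thread(target=reporter)
t2 = threading.Thread(target=registry.release, args=(key,))
t1.start()
t2.start()
t1.join()
t2.join()
```

Whatever the thread scheduling, each call to `report_fault` either records the fault on a live device or finds no device at all; it never touches torn-down state. The cost is one uncontended lock acquisition per fault in the common case, which matches the reasoning that conflicts with device release are rare.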
The case shows that a lookup optimization in kernel code has to be paired with a lifetime guarantee for the object it returns: the rbtree makes the lookup faster, and the mutex keeps the result valid while it is in use. Organizations should apply the patch to systems running affected Linux kernel versions, giving priority to cloud, data center, and virtualization hosts that rely on the IOMMU for device isolation.