CVE-2025-21829 in Linux

Summary

by MITRE • 03/06/2025

In the Linux kernel, the following vulnerability has been resolved:

RDMA/rxe: Fix the warning "__rxe_cleanup+0x12c/0x170 [rdma_rxe]"

The Call Trace is as below:

<TASK>
? show_regs.cold+0x1a/0x1f
? __rxe_cleanup+0x12c/0x170 [rdma_rxe]
? __warn+0x84/0xd0
? __rxe_cleanup+0x12c/0x170 [rdma_rxe]
? report_bug+0x105/0x180
? handle_bug+0x46/0x80
? exc_invalid_op+0x19/0x70
? asm_exc_invalid_op+0x1b/0x20
? __rxe_cleanup+0x12c/0x170 [rdma_rxe]
? __rxe_cleanup+0x124/0x170 [rdma_rxe]
rxe_destroy_qp.cold+0x24/0x29 [rdma_rxe]
ib_destroy_qp_user+0x118/0x190 [ib_core]
rdma_destroy_qp.cold+0x43/0x5e [rdma_cm]
rtrs_cq_qp_destroy.cold+0x1d/0x2b [rtrs_core]
rtrs_srv_close_work.cold+0x1b/0x31 [rtrs_server]
process_one_work+0x21d/0x3f0
worker_thread+0x4a/0x3c0
? process_one_work+0x3f0/0x3f0
kthread+0xf0/0x120
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x22/0x30
</TASK>

When too many rdma resources are allocated, rxe needs more time to handle these rdma resources. Sometimes with the current timeout, rxe can not release the rdma resources correctly.

Compared with other rdma drivers, a bigger timeout is used.


Analysis

by VulDB Data Team • 02/16/2026

CVE-2025-21829 affects the Linux kernel's rxe driver (rdma_rxe, the software implementation of RDMA over Converged Ethernet) and concerns resource cleanup during queue pair destruction. The flaw manifests as a kernel warning raised from __rxe_cleanup when a large number of RDMA resources are allocated and the function cannot finish releasing an object in time. In the reported call trace, the request starts in a workqueue item, rtrs_srv_close_work, and descends through rtrs_cq_qp_destroy, rdma_destroy_qp, ib_destroy_qp_user and rxe_destroy_qp before reaching __rxe_cleanup, where the warning fires. The warning indicates that the timeout used by the rxe driver is too short to release RDMA resources under high load, which can leave resources improperly released and may lead to instability or degraded performance. The issue therefore affects the reliability of RDMA on Linux systems that use the rxe driver.

The root cause is an inadequate timeout in the rxe driver's resource cleanup path. __rxe_cleanup, which releases queue pairs and associated RDMA resources, must complete within a fixed time window; when many resources are allocated, or many are destroyed at once, that window can expire before cleanup finishes. The result is a potential denial of service in which RDMA operations fail or stop responding, and incomplete cleanup may leak memory or exhaust resources over time. The entry maps the issue to CWE-674 and CWE-362. CWE-362 (race condition) is a plausible fit for a cleanup that competes with work still in flight. CWE-674 is formally titled Uncontrolled Recursion and does not describe timeout handling, so that mapping should be treated with caution.
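The timeout behavior described above can be modeled in user space. The following is a minimal sketch, not the kernel code: it assumes the cleanup path drops its own reference and then waits a bounded time for the remaining holders to finish. `PoolElem`, `put`, `cleanup` and `demo` are hypothetical names, and Python's `threading.Event` stands in for the kernel's completion mechanism.

```python
import threading


class PoolElem:
    """Hypothetical stand-in for a reference-counted pool element."""

    def __init__(self, refs=1):
        self._refs = refs
        self._lock = threading.Lock()
        # Set once the last reference has been dropped.
        self.complete = threading.Event()

    def put(self):
        """Drop one reference; signal completion when none remain."""
        with self._lock:
            self._refs -= 1
            if self._refs == 0:
                self.complete.set()


def cleanup(elem, timeout_s):
    """Drop the caller's reference, then wait for other holders.

    Returns True if every reference was released within timeout_s,
    False if the wait timed out (the case that corresponds to the
    warning in the call trace).
    """
    elem.put()
    return elem.complete.wait(timeout_s)


def demo(holder_delay_s, timeout_s):
    """Run one cleanup while a busy worker still holds a reference."""
    elem = PoolElem(refs=2)  # one for the owner, one for the worker
    worker = threading.Timer(holder_delay_s, elem.put)
    worker.start()
    ok = cleanup(elem, timeout_s)
    worker.join()
    return ok
```

With a short timeout, `demo(0.3, 0.05)` reports a failure even though the worker releases its reference shortly afterwards. The same workload with a longer timeout, for example `demo(0.3, 1.0)`, finishes cleanly, which is the effect a larger timeout is meant to achieve.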

The operational impact goes beyond performance degradation and can affect reliability and availability in environments that depend on RDMA. When the rxe driver cannot clean up resources before the timeout expires, subsequent RDMA operations may fail or behave unpredictably. This is relevant to high-performance computing clusters, data center infrastructure, and other systems that rely on RDMA for low-latency communication, particularly under heavy load when many RDMA operations run concurrently. Operators may observe intermittent RDMA failures, slow resource deallocation, or service unavailability as resources approach saturation. There is also a security dimension: an attacker able to drive the system into this state might use incomplete cleanup to cause instability or denial of service.

The fix is delivered in the driver code, so the primary remediation is to update the Linux kernel to a patched version, in which the rxe driver uses a larger timeout during resource cleanup than other RDMA drivers do. Until the update is deployed, administrators can reduce exposure by monitoring RDMA resource utilization and alerting when systems approach saturation, and by distributing RDMA workloads so that fewer resources are created and destroyed at once. Periodic performance testing with workloads that stress RDMA resource management helps confirm that cleanup completes without warnings in the operational environment. In ATT&CK terms, the relevant technique is T1499.004 (Endpoint Denial of Service: Application or System Exploitation); ensuring that resources can be released under all operating conditions reduces exposure to it.
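As a first triage step, it can help to know whether the rxe driver is loaded on a host at all. The sketch below parses the `/proc/modules` format, in which each line begins with the module name. The function names are hypothetical, and the script does not determine whether the running kernel contains the fix.

```python
from pathlib import Path


def module_loaded(modules_text, name="rdma_rxe"):
    """Return True if `name` is the module name on any /proc/modules line."""
    for line in modules_text.splitlines():
        fields = line.split()
        # The first field is the module name; later fields list size,
        # use count and dependent modules, which must not match here.
        if fields and fields[0] == name:
            return True
    return False


def rxe_loaded(proc_modules="/proc/modules"):
    """Check the running system; returns False if the file is unreadable."""
    try:
        return module_loaded(Path(proc_modules).read_text())
    except OSError:
        return False


if __name__ == "__main__":
    print("rdma_rxe loaded:", rxe_loaded())
```

A negative result only means that no loadable module of that name is present. A driver compiled directly into the kernel does not appear in `/proc/modules`, so the kernel configuration should be checked as well.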

Responsible

Linux

Reservation

12/29/2024

Disclosure

03/06/2025

Moderation

accepted

CPE

ready

EPSS

0.00168

KEV

no

Activities

very low
