CVE-2023-52632 in Linux

Summary

by MITRE • 04/02/2024

In the Linux kernel, the following vulnerability has been resolved:

drm/amdkfd: Fix lock dependency warning with srcu

======================================================
WARNING: possible circular locking dependency detected
6.5.0-kfd-yangp #2289 Not tainted
------------------------------------------------------
kworker/0:2/996 is trying to acquire lock:
        (srcu){.+.+}-{0:0}, at: __synchronize_srcu+0x5/0x1a0

but task is already holding lock:
        ((work_completion)(&svms->deferred_list_work)){+.+.}-{0:0}, at: process_one_work+0x211/0x560

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 ((work_completion)(&svms->deferred_list_work)){+.+.}-{0:0}:
        __flush_work+0x88/0x4f0
        svm_range_list_lock_and_flush_work+0x3d/0x110 [amdgpu]
        svm_range_set_attr+0xd6/0x14c0 [amdgpu]
        kfd_ioctl+0x1d1/0x630 [amdgpu]
        __x64_sys_ioctl+0x88/0xc0

-> #2 (&info->lock#2){+.+.}-{3:3}:
        __mutex_lock+0x99/0xc70
        amdgpu_amdkfd_gpuvm_restore_process_bos+0x54/0x740 [amdgpu]
        restore_process_helper+0x22/0x80 [amdgpu]
        restore_process_worker+0x2d/0xa0 [amdgpu]
        process_one_work+0x29b/0x560
        worker_thread+0x3d/0x3d0

-> #1 ((work_completion)(&(&process->restore_work)->work)){+.+.}-{0:0}:
        __flush_work+0x88/0x4f0
        __cancel_work_timer+0x12c/0x1c0
        kfd_process_notifier_release_internal+0x37/0x1f0 [amdgpu]
        __mmu_notifier_release+0xad/0x240
        exit_mmap+0x6a/0x3a0
        mmput+0x6a/0x120
        do_exit+0x322/0xb90
        do_group_exit+0x37/0xa0
        __x64_sys_exit_group+0x18/0x20
        do_syscall_64+0x38/0x80

-> #0 (srcu){.+.+}-{0:0}:
        __lock_acquire+0x1521/0x2510
        lock_sync+0x5f/0x90
        __synchronize_srcu+0x4f/0x1a0
        __mmu_notifier_release+0x128/0x240
        exit_mmap+0x6a/0x3a0
        mmput+0x6a/0x120
        svm_range_deferred_list_work+0x19f/0x350 [amdgpu]
        process_one_work+0x29b/0x560
        worker_thread+0x3d/0x3d0

other info that might help us debug this:

Chain exists of:
  srcu --> &info->lock#2 --> (work_completion)(&svms->deferred_list_work)

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock((work_completion)(&svms->deferred_list_work));
                               lock(&info->lock#2);
                               lock((work_completion)(&svms->deferred_list_work));
  sync(srcu);


Analysis

by VulDB Data Team • 03/17/2025

CVE-2023-52632 affects the Linux kernel's amdgpu driver, specifically the drm/amdkfd component, where the kernel's lock dependency checker reported a possible circular locking dependency. The cycle involves sleepable read-copy update (SRCU) synchronization and workqueue completion locks, and it could deadlock the kernel when the affected code paths run concurrently. The warning was captured on a 6.5.0-kfd-yangp kernel build: a worker thread running svms->deferred_list_work calls __synchronize_srcu while holding that work item's completion lock, even though an existing dependency chain already leads from srcu back to the same work item.

The flaw is an inconsistent lock order in the driver's memory management code, and the trace records the four dependencies involved. During process exit, __mmu_notifier_release calls kfd_process_notifier_release_internal, which cancels and flushes process->restore_work (#1). The restore worker takes the info->lock mutex in amdgpu_amdkfd_gpuvm_restore_process_bos (#2). The ioctl path through svm_range_set_attr flushes svms->deferred_list_work in svm_range_list_lock_and_flush_work, which lockdep recorded as depending on info->lock (#3). Finally, svm_range_deferred_list_work calls mmput, which can reach exit_mmap and __mmu_notifier_release and then wait in __synchronize_srcu (#0). The first three dependencies form the chain srcu --> &info->lock#2 --> (work_completion)(&svms->deferred_list_work), and the last one closes it into a cycle that ties together GPU memory management, workqueue processing, and the MMU notifier.
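The dependency chain described above can be modelled as a directed graph in which an edge means "this lock was requested while that one was held". The following is a minimal Python sketch of that idea, not the kernel's lockdep implementation; the node labels are shortened names chosen for illustration, and the edges are read off the four numbered entries in the report.

```python
from typing import Dict, List, Optional

# "held -> wanted" edges taken from the report (labels are shortened, illustrative names).
EDGES: Dict[str, List[str]] = {
    "srcu": ["restore_work"],                # 1: restore_work flushed from the MMU notifier release path
    "restore_work": ["info->lock"],          # 2: the restore worker takes the info->lock mutex
    "info->lock": ["deferred_list_work"],    # 3: deferred_list_work flushed on the ioctl path
    "deferred_list_work": ["srcu"],          # 0: the deferred worker ends up in __synchronize_srcu
}


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of nodes (first node repeated at the end), or None."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current DFS path, fully explored
    state: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        state[node] = GREY
        path.append(node)
        for nxt in graph.get(node, []):
            seen = state.get(nxt, WHITE)
            if seen == GREY:
                # Back edge: nxt is already on the current path, so we found a cycle.
                return path[path.index(nxt):] + [nxt]
            if seen == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[node] = BLACK
        return None

    for start in graph:
        if state.get(start, WHITE) == WHITE:
            found = visit(start)
            if found:
                return found
    return None
```

Calling find_cycle(EDGES) returns a path that starts and ends at srcu. Dropping the deferred_list_work entry, which stands for the final sync(srcu) step, leaves a graph with no cycle. This is why breaking any single edge of the chain is enough to remove the deadlock risk.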

The issue affects systems that drive AMD graphics hardware through the amdgpu driver on a kernel without the fix. Because the cycle can deadlock, stability may suffer during intensive GPU workloads in which process restoration, memory mapping, and virtual memory operations run concurrently with workqueue processing. The result can be a system hang that needs manual intervention to recover. The risk is highest in server and high-performance computing environments that rely heavily on GPU acceleration.

Mitigation consists of applying the kernel patch "drm/amdkfd: Fix lock dependency warning with srcu", which resolves the issue by reordering lock acquisition or otherwise restructuring the dependency chain so that it is no longer circular. System administrators should update to a kernel version that includes the fix. Monitoring kernel logs for lock dependency warnings can also help identify affected systems, particularly in production environments where GPU workloads are common. This entry is classified under CWE-264 (permissions, privileges, and access controls) and may relate to ATT&CK techniques that exploit kernel vulnerabilities to disrupt system resources.
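The log monitoring mentioned above can be sketched as a small text scan. The Python example below is an illustrative sketch rather than an official tool: the function name find_lockdep_warnings and the returned field names are hypothetical, and the parsing assumes the report layout shown in the summary. Note that these warnings appear only on kernels built with lock dependency checking (CONFIG_PROVE_LOCKING), so their absence does not prove that a system is unaffected.

```python
import re
from typing import Dict, List

# Header line that lockdep prints for this class of report.
MARKER = "WARNING: possible circular locking dependency detected"
ACQUIRE_SUFFIX = "is trying to acquire lock:"

# Optional dmesg-style timestamp prefix such as "[   12.345678] ".
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def find_lockdep_warnings(log_text: str) -> List[Dict[str, str]]:
    """Return one dict per report with the kernel build line and the acquiring task."""
    lines = [_TIMESTAMP.sub("", raw).strip() for raw in log_text.splitlines()]
    reports: List[Dict[str, str]] = []
    for i, line in enumerate(lines):
        if MARKER not in line:
            continue
        report = {"kernel": "", "task": ""}
        # Inspect the few lines that follow the header.
        for follow in lines[i + 1 : i + 6]:
            if not follow or set(follow) <= {"-", "="}:
                continue  # skip blank lines and separator rules
            if follow.endswith(ACQUIRE_SUFFIX):
                report["task"] = follow[: -len(ACQUIRE_SUFFIX)].strip()
                break
            if not report["kernel"]:
                report["kernel"] = follow  # first text line after the header
        reports.append(report)
    return reports
```

A typical use would be to pass in the saved output of dmesg or journalctl -k and alert when the returned list is not empty.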

The root cause is an inconsistent lock order in kernel space: SRCU synchronization conflicts with the work completion locks already used by the GPU memory management code. The fix addresses the dependency chain so that locks are acquired in a consistent order and the cycle can no longer form, while GPU memory management keeps its intended behavior. The warning message gives kernel developers enough diagnostic information to trace the problematic acquisition sequence, and the case shows how much correct lock ordering matters in concurrent kernel code.

Reservation: 03/06/2024

Disclosure: 04/02/2024

Moderation: accepted

CPE: ready

EPSS: 0.00168

KEV: no

Activities: very low
