CVE-2024-40943 in Linux

Summary

by MITRE • 07/12/2024

In the Linux kernel, the following vulnerability has been resolved:

ocfs2: fix races between hole punching and AIO+DIO

After commit "ocfs2: return real error code in ocfs2_dio_wr_get_block", fstests/generic/300 changed from always failing to sometimes failing:

========================================================================
[ 473.293420 ] run fstests generic/300

[ 475.296983 ] JBD2: Ignoring recovery information on journal
[ 475.302473 ] ocfs2: Mounting device (253,1) on (node local, slot 0) with ordered data mode.
[ 494.290998 ] OCFS2: ERROR (device dm-1): ocfs2_change_extent_flag: Owner 5668 has an extent at cpos 78723 which can no longer be found
[ 494.291609 ] On-disk corruption discovered. Please run fsck.ocfs2 once the filesystem is unmounted.
[ 494.292018 ] OCFS2: File system is now read-only.
[ 494.292224 ] (kworker/19:11,2628,19):ocfs2_mark_extent_written:5272 ERROR: status = -30
[ 494.292602 ] (kworker/19:11,2628,19):ocfs2_dio_end_io_write:2374 ERROR: status = -3
fio: io_u error on file /mnt/scratch/racer: Read-only file system: write offset=460849152, buflen=131072
=========================================================================

In __blockdev_direct_IO, ocfs2_dio_wr_get_block is called to add unwritten extents to a list. The extents are also inserted into the extent tree in ocfs2_write_begin_nolock. Another thread then calls fallocate to punch a hole at one of the unwritten extents, and ocfs2_remove_extent() removes the extent at that cpos. When the end-io worker thread later runs, ocfs2_search_extent_list finds no extent at the cpos.

T1                                 T2                  T3
                                   inode lock
                                     ...
                                     insert extents
                                     ...
                                   inode unlock
ocfs2_fallocate
 __ocfs2_change_file_space
  inode lock
  lock ip_alloc_sem
  ocfs2_remove_inode_range inode
   ocfs2_remove_btree_range
    ocfs2_remove_extent
    ^---remove the extent at cpos 78723
  ...
  unlock ip_alloc_sem
  inode unlock
                                                       ocfs2_dio_end_io
                                                        ocfs2_dio_end_io_write
                                                         lock ip_alloc_sem
                                                         ocfs2_mark_extent_written
                                                          ocfs2_change_extent_flag
                                                           ocfs2_search_extent_list
                                                           ^---failed to find extent
                                                          ...
                                                          unlock ip_alloc_sem
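The interleaving described here can be replayed deterministically in a toy model. The following is a minimal Python sketch, not kernel code: the `ExtentMap` class and `replay_race` function are hypothetical names invented for illustration, and the extent tree is reduced to a dictionary keyed by cluster position (cpos).

```python
class ExtentMap:
    """Toy stand-in for an inode's extent tree (hypothetical, not an ocfs2 structure)."""

    def __init__(self):
        self.extents = {}  # cpos -> "unwritten" or "written"

    def insert_unwritten(self, cpos):
        # Models the DIO write path inserting an unwritten extent.
        self.extents[cpos] = "unwritten"

    def punch_hole(self, cpos):
        # Models fallocate removing the extent at cpos.
        self.extents.pop(cpos, None)

    def mark_written(self, cpos):
        # Models the end-io worker looking the extent up again.
        if cpos not in self.extents:
            raise LookupError(f"extent at cpos {cpos} can no longer be found")
        self.extents[cpos] = "written"


def replay_race(cpos=78723):
    """Run the three steps in the problematic order and report the outcome."""
    emap = ExtentMap()
    pending = []

    # T2: the DIO write inserts the extent and queues it for completion.
    emap.insert_unwritten(cpos)
    pending.append(cpos)

    # T1: fallocate punches a hole before the completion has run.
    emap.punch_hole(cpos)

    # T3: the end-io worker tries to mark the queued extent as written.
    try:
        for queued in pending:
            emap.mark_written(queued)
    except LookupError as exc:
        return f"error: {exc}"
    return "ok"
```

Calling `replay_race()` returns an error string because the completion step looks up an extent that the hole punch has already removed, which mirrors the failed lookup in the trace.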

Most filesystems do not support fallocate racing with AIO+DIO, so fix this by waiting for all in-flight DIO to complete before fallocate/punch_hole proceeds, as ext4 does.
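The idea of waiting for outstanding direct I/O before punching a hole can be sketched in user space as follows. This is a toy model under stated assumptions: `ToyInode`, its methods, and `demo_fixed_ordering` are hypothetical names, and a single condition variable stands in for both the inode lock and the in-flight DIO counter. The source does not name the kernel call used; ext4's hole-punch path performs this wait with inode_dio_wait(), and the sketch assumes the ocfs2 fix is analogous.

```python
import threading


class ToyInode:
    """Toy inode with an in-flight DIO counter (hypothetical, not kernel code)."""

    def __init__(self):
        self.extents = {}  # cpos -> "unwritten" or "written"
        self.log = []      # order in which operations took effect
        self._dio_count = 0
        self._cond = threading.Condition()

    def dio_begin(self, cpos):
        # Submission: insert the unwritten extent and count the DIO as in flight.
        with self._cond:
            self.extents[cpos] = "unwritten"
            self._dio_count += 1

    def dio_end(self, cpos):
        # Completion: mark the extent written, then drop the in-flight count.
        with self._cond:
            try:
                if cpos not in self.extents:
                    raise LookupError(f"extent at cpos {cpos} can no longer be found")
                self.extents[cpos] = "written"
                self.log.append(f"written {cpos}")
            finally:
                self._dio_count -= 1
                self._cond.notify_all()

    def punch_hole(self, cpos):
        with self._cond:
            # The modeled fix: block until no DIO is in flight.
            self._cond.wait_for(lambda: self._dio_count == 0)
            self.extents.pop(cpos, None)
            self.log.append(f"punched {cpos}")


def demo_fixed_ordering(cpos=78723):
    inode = ToyInode()
    inode.dio_begin(cpos)
    puncher = threading.Thread(target=inode.punch_hole, args=(cpos,))
    puncher.start()      # cannot remove the extent while the DIO is pending
    inode.dio_end(cpos)  # completion finds the extent and marks it written
    puncher.join()
    return inode.log
```

Whichever thread acquires the lock first, the hole punch only takes effect after the completion has marked the extent as written. One simplification to keep in mind: the condition variable releases its lock while waiting, so this model does not stop new writes from starting during the wait, whereas a real filesystem would hold the inode lock for that purpose.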


Analysis

by VulDB Data Team • 09/17/2025

The vulnerability described in CVE-2024-40943 affects the Linux kernel's ocfs2 filesystem implementation and stems from a race condition between hole punching operations and asynchronous I/O with direct I/O. This issue manifests as a potential for on-disk corruption when concurrent operations attempt to modify the same file extent during different phases of I/O processing. The problem specifically occurs in the interaction between fallocate operations that punch holes and asynchronous direct I/O write operations that manage unwritten extents. The race condition creates a scenario where an extent that was initially identified for writing becomes unavailable when the I/O completion worker thread attempts to mark it as written, resulting in error codes and ultimately a read-only filesystem state.

The technical flaw is missing synchronization between the fallocate path, which removes extents, and direct I/O completion, which updates those same extents. During a direct I/O write, ocfs2_dio_wr_get_block adds unwritten extents to a list, and ocfs2_write_begin_nolock inserts them into the extent tree. If a concurrent fallocate removes one of these extents through ocfs2_remove_extent after the write has been submitted but before the I/O completion worker processes it, the later ocfs2_search_extent_list call cannot find the extent. ocfs2 reports on-disk corruption, ocfs2_mark_extent_written logs status -30 (EROFS, read-only file system), and the filesystem switches to read-only. The race breaks the consistency the filesystem is expected to provide and may lead to data loss or corruption.

The operational impact of this vulnerability is significant for systems relying on ocfs2 filesystems with concurrent I/O operations. When the race condition occurs, it can cause filesystem corruption that requires manual intervention through fsck.ocfs2 to repair. The error messages indicate that the filesystem detects on-disk corruption and automatically switches to read-only mode to prevent further damage. This type of corruption can affect applications that depend on consistent file system behavior, particularly those performing concurrent fallocate operations alongside direct I/O writes. The vulnerability affects the reliability and availability of storage systems using ocfs2, especially in high-concurrency environments where asynchronous I/O and file space management operations occur simultaneously.
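To check whether a system has hit this condition, the kernel log can be scanned for the messages shown in the summary. The sketch below is one hedged way to do that: the `SIGNATURES` table and `scan_kernel_log` function are hypothetical names, and the patterns are derived only from the log excerpt in this entry, so the exact wording may differ on other kernel versions.

```python
import re

# Patterns taken from the log excerpt quoted in this entry (assumption:
# the message wording is the same on the system being checked).
SIGNATURES = {
    "extent_missing": re.compile(
        r"ocfs2_change_extent_flag: Owner \d+ has an extent at cpos \d+ "
        r"which can no longer be found"
    ),
    "read_only": re.compile(r"OCFS2: File system is now read-only"),
    "fsck_advised": re.compile(r"Please run fsck\.ocfs2"),
}


def scan_kernel_log(lines):
    """Return a dict mapping each signature name to the log lines that match it."""
    hits = {name: [] for name in SIGNATURES}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits[name].append(line.strip())
    return hits
```

The function accepts any iterable of lines, for example the output of `dmesg` split on newlines. A hit on `extent_missing` together with `read_only` matches the sequence shown in the trace.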

The fix follows the pattern already used by other filesystems such as ext4: all in-flight direct I/O must complete before a fallocate or hole punching operation proceeds. This prevents extents from being removed while pending I/O still refers to them. The flaw fits CWE-362, which covers concurrent execution on a shared resource without proper synchronization. With the wait in place, ocfs2 keeps its extent management consistent and avoids the corruption reported when concurrent operations modify the same extent without synchronization.

Responsible: Linux
Reservation: 07/12/2024
Disclosure: 07/12/2024
Moderation: accepted
CPE: ready
EPSS: 0.00185
KEV: no
Activities: very low

Sources
