CVE-2021-47010 in Linux
Summary
by MITRE • 02/28/2024
In the Linux kernel, the following vulnerability has been resolved:
net: Only allow init netns to set default tcp cong to a restricted algo
tcp_set_default_congestion_control() is netns-safe in that it writes to &net->ipv4.tcp_congestion_control, but it also sets ca->flags |= TCP_CONG_NON_RESTRICTED which is not namespaced. This has the unintended side-effect of changing the global net.ipv4.tcp_allowed_congestion_control sysctl, despite the fact that it is read-only: 97684f0970f6 ("net: Make tcp_allowed_congestion_control readonly in non-init netns")
Resolve this netns "leak" by only allowing the init netns to set the default algorithm to one that is restricted. This restriction could be removed if tcp_allowed_congestion_control were namespace-ified in the future.
This bug was uncovered with https://github.com/JonathonReinhart/linux-netns-sysctl-verify
Analysis
by VulDB Data Team • 08/08/2024
CVE-2021-47010 is a namespace isolation flaw in the Linux kernel's networking subsystem, in the code that selects the default TCP congestion control algorithm. Setting the default algorithm from inside a non-initial network namespace has a side effect on state that is shared by all namespaces. An operation that should stay local to one namespace therefore changes a global setting, the list of allowed congestion control algorithms, from a context in which that setting is supposed to be read-only. This breaks the isolation that network namespaces are meant to provide.
The root cause is in tcp_set_default_congestion_control(). The function is namespace-safe in one respect: it stores the chosen algorithm in the per-namespace field net->ipv4.tcp_congestion_control. It also executes ca->flags |= TCP_CONG_NON_RESTRICTED, and that flag is not namespaced, so the write is visible in every namespace. Because the flag determines what net.ipv4.tcp_allowed_congestion_control reports, the write changes that sysctl globally. Commit 97684f0970f6 ("net: Make tcp_allowed_congestion_control readonly in non-init netns") made the sysctl read-only in non-initial namespaces, yet setting the default algorithm inside such a namespace still added the chosen algorithm to the global allowed list, bypassing the read-only restriction.
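The leak can be illustrated with a small userspace model. This is a sketch, not kernel code: struct cong_ops, struct netns, find_ca(), is_allowed() and set_default_unpatched() are hypothetical stand-ins for the kernel's structures and functions, and the flag value is illustrative. It only shows how a per-namespace write and a global write sit side by side in the same function.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative flag bit; only its presence or absence matters here. */
#define TCP_CONG_NON_RESTRICTED 0x1

/* Hypothetical stand-in for the kernel's per-algorithm object:
 * one global instance per algorithm, shared by every namespace. */
struct cong_ops {
    const char *name;
    unsigned int flags;
};

/* Hypothetical stand-in for per-namespace state. */
struct netns {
    bool is_init;
    const struct cong_ops *default_ca;
};

/* Global registry. "bbr" starts out restricted (flag not set). */
static struct cong_ops algos[] = {
    { "cubic", TCP_CONG_NON_RESTRICTED },
    { "reno",  TCP_CONG_NON_RESTRICTED },
    { "bbr",   0 },
};

struct cong_ops *find_ca(const char *name)
{
    for (size_t i = 0; i < sizeof(algos) / sizeof(algos[0]); i++)
        if (strcmp(algos[i].name, name) == 0)
            return &algos[i];
    return NULL;
}

/* Models what the global allowed list would report for `name`. */
bool is_allowed(const char *name)
{
    const struct cong_ops *ca = find_ca(name);
    return ca && (ca->flags & TCP_CONG_NON_RESTRICTED);
}

/* Models the pre-fix behavior of setting the default algorithm. */
int set_default_unpatched(struct netns *net, const char *name)
{
    struct cong_ops *ca = find_ca(name);

    if (!ca)
        return -ENOENT;
    net->default_ca = ca;                 /* per-namespace write: isolated */
    ca->flags |= TCP_CONG_NON_RESTRICTED; /* global write: visible everywhere */
    return 0;
}
```

In this model, calling set_default_unpatched() on a struct netns whose is_init is false with the name "bbr" succeeds, and is_allowed("bbr") then returns true for every namespace, even though the init namespace's own default_ca was never touched.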
The direct effect is that an algorithm the administrator had left restricted becomes allowed system-wide, in every namespace, when the default is set from inside a non-initial network namespace. The attacker needs the ability to set net.ipv4.tcp_congestion_control inside such a namespace, which typically means a process holding CAP_NET_ADMIN in that namespace, for example root inside a container. The consequences are a change to global congestion control configuration that could affect network performance or contribute to denial of service conditions, and a privilege boundary violation in the sense that a namespace-confined user alters a host-wide setting. Affected systems are those running kernels that include commit 97684f0970f6, which introduced the read-only restriction for non-initial network namespaces, but not the fix. The issue matters most in containerized environments and other setups that rely on multiple network namespaces for separation.
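What the allowed list governs can be observed from userspace with the TCP_CONGESTION socket option, which per tcp(7) limits unprivileged processes to the algorithms in tcp_allowed_congestion_control. The helper below is a minimal sketch for Linux; try_congestion_control() is a hypothetical name, and the result for any given algorithm depends on the running kernel, its loaded modules, and the caller's privileges.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Try to select the congestion control algorithm `name` on a fresh
 * TCP socket. Returns 0 on success or a negative errno value. */
int try_congestion_control(const char *name)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int ret = 0;

    if (fd < 0)
        return -errno;
    if (setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION,
                   name, (socklen_t)strlen(name)) < 0)
        ret = -errno;   /* save errno before close() can overwrite it */
    close(fd);
    return ret;
}
```

Run as an unprivileged user, a call such as try_congestion_control("bbr") would be expected to fail while the algorithm is restricted and to succeed once it appears in the allowed list, which is the state change this vulnerability lets a non-initial namespace trigger.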
The fix adds a check to tcp_set_default_congestion_control() so that only the init network namespace may set the default to an algorithm that is currently restricted. A non-initial namespace can still change its own default, but only to an algorithm that is already non-restricted; in that case ca->flags |= TCP_CONG_NON_RESTRICTED changes nothing, so no global state leaks. An attempt from a non-initial namespace to select a restricted algorithm is rejected. The init namespace keeps its previous behavior, including the side effect of marking the chosen algorithm non-restricted. As the commit message notes, this restriction could be removed if tcp_allowed_congestion_control were namespace-ified in the future.
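The effect of the added check can be sketched with a userspace model. This is not the kernel patch itself: struct cong_ops, struct netns, find_ca() and set_default_patched() are hypothetical stand-ins, the flag value is illustrative, and the -EPERM error code is an assumption of this sketch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative flag bit; only its presence or absence matters here. */
#define TCP_CONG_NON_RESTRICTED 0x1

/* Hypothetical stand-in: one global object per algorithm. */
struct cong_ops {
    const char *name;
    unsigned int flags;
};

/* Hypothetical stand-in for per-namespace state. */
struct netns {
    bool is_init;
    const struct cong_ops *default_ca;
};

/* Global registry. "bbr" starts out restricted (flag not set). */
static struct cong_ops algos[] = {
    { "cubic", TCP_CONG_NON_RESTRICTED },
    { "reno",  TCP_CONG_NON_RESTRICTED },
    { "bbr",   0 },
};

struct cong_ops *find_ca(const char *name)
{
    for (size_t i = 0; i < sizeof(algos) / sizeof(algos[0]); i++)
        if (strcmp(algos[i].name, name) == 0)
            return &algos[i];
    return NULL;
}

/* Models the post-fix behavior of setting the default algorithm. */
int set_default_patched(struct netns *net, const char *name)
{
    struct cong_ops *ca = find_ca(name);

    if (!ca)
        return -ENOENT;
    /* Added guard: only the init namespace may pick a restricted
     * algorithm. The error code is an assumption of this sketch. */
    if (!net->is_init && !(ca->flags & TCP_CONG_NON_RESTRICTED))
        return -EPERM;

    net->default_ca = ca;
    /* For a non-init namespace the flag is already set at this point,
     * so this write no longer changes global state. */
    ca->flags |= TCP_CONG_NON_RESTRICTED;
    return 0;
}
```

With this guard, a non-init namespace asking for "bbr" is refused and the global flags stay unchanged, while the same namespace can still switch between "cubic" and "reno". Once the init namespace has selected "bbr", other namespaces may select it too, which matches the behavior of a single global allowed list.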
This vulnerability relates to CWE-665 (Improper Initialization) and CWE-362 (Concurrent Execution using Shared Resource with Improper Synchronization), as it involves improper handling of a shared global resource in a multi-namespace environment. The related ATT&CK techniques are T1068 (Exploitation for Privilege Escalation) and T1499 (Endpoint Denial of Service), since the flaw could let an attacker manipulate network behavior and potentially disrupt system operations.