CVE-2025-4138 in CPython
Summary
by MITRE • 06/03/2025
Allows the extraction filter to be ignored, allowing symlink targets to point outside the destination directory, and the modification of some file metadata.
You are affected by this vulnerability if you use the tarfile module to extract untrusted tar archives with TarFile.extractall() or TarFile.extract() and pass the filter= parameter with a value of "data" or "tar". See the tarfile extraction filters documentation (https://docs.python.org/3/library/tarfile.html#tarfile-extraction-filter) for more information. Only Python versions 3.12 or later are affected by these vulnerabilities; earlier versions don't include the extraction filter feature.
Note that for Python 3.14 or later the default value of filter= changed from "no filtering" to "data", so if you rely on this new default behavior, your usage is also affected.
Note that none of these vulnerabilities significantly affect the installation of source distributions that are tar archives, because source distributions already allow arbitrary code execution during the build process. However, when evaluating source distributions, it's important to avoid installing source distributions with suspicious links.
Analysis
by VulDB Data Team • 08/16/2025
CVE-2025-4138 is a security flaw in Python's tarfile module that undermines the protections designed to prevent directory traversal during archive extraction. It affects Python 3.12 and later, where the extraction filter feature was introduced to mitigate such risks. The flaw lets an attacker bypass the filters that should confine extraction to the destination directory, so a crafted archive can create symbolic links that point outside it. The caller still requests a filter, but the filter does not enforce the boundary it is meant to enforce.
The flaw manifests when TarFile.extractall() or TarFile.extract() is called with filter="data" or filter="tar". These filters are supposed to reject symbolic links whose targets fall outside the destination directory. With this vulnerability, a crafted archive can still produce symbolic links whose targets point to arbitrary locations outside the extraction directory. The bypass also allows some file metadata to be modified, which may include permissions, ownership, and similar attributes, in ways that could compromise system integrity. The issue is particularly concerning because it sits at the core of archive handling, where untrusted input is processed and symbolic link targets are not validated correctly.
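To make the idea of a link target pointing outside the destination concrete, the sketch below inspects an archive without extracting it and reports link members whose targets are absolute or lexically climb above the archive root. It is an illustrative audit helper under stated assumptions, not the check CPython performs: find_suspicious_links is a hypothetical name, the comparison is purely textual, and it does not follow chains of links created by earlier members or examine ".." in member names.

```python
import io
import posixpath
import tarfile


def find_suspicious_links(archive_bytes: bytes) -> list:
    """Return (member name, link target) pairs that look like escapes.

    Assumption: a purely lexical check. It is deliberately conservative and
    can miss escapes built from several cooperating links.
    """
    suspicious = []
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as archive:
        for member in archive.getmembers():
            if not (member.issym() or member.islnk()):
                continue
            target = member.linkname
            if member.issym():
                # Symlink targets are relative to the link's own directory.
                base = posixpath.dirname(member.name)
            else:
                # Hard link targets name another member, relative to the root.
                base = ""
            resolved = posixpath.normpath(posixpath.join(base, target))
            escapes = resolved == ".." or resolved.startswith("../")
            if posixpath.isabs(target) or escapes:
                suspicious.append((member.name, target))
    return suspicious
```

Because the check never touches the file system, it is unaffected by how paths resolve on disk, but it should be treated as a triage aid rather than a complete defense.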
The operational impact of CVE-2025-4138 extends beyond simple directory traversal to potentially enable more sophisticated compromise scenarios. Attackers could use this vulnerability to create symbolic links that point to critical system files, potentially allowing them to overwrite or modify sensitive configuration files, system binaries, or other critical resources. The ability to modify file metadata compounds the threat, since inappropriate permissions, ownership, or timestamps could evade detection or interfere with normal system operations. This vulnerability aligns with CWE-22 (Path Traversal) and CWE-73 (External Control of File Name or Path), as it allows unauthorized access to files outside the intended extraction directory. The threat model also maps loosely to ATT&CK techniques such as T1059 (Command and Scripting Interpreter) and T1078 (Valid Accounts), as compromised systems could be used to execute malicious commands or to abuse altered file permissions for persistence.
Organizations and developers using Python 3.12 or later should mitigate this vulnerability promptly, particularly when processing untrusted tar archives. The most effective approach is to upgrade to a CPython release that includes the fix. Omitting the filter parameter does not help: before Python 3.14 it means no filtering at all, and from Python 3.14 onward it selects "data", which is affected, so the vulnerability is more prevalent in newer installations. Where upgrading is not immediately possible, add validation that rejects link members or verifies that symbolic link targets remain within the extraction directory. Security practitioners should also consider automated scanning of source distributions for suspicious symbolic links and strict policies for handling untrusted archives. The vulnerability is a reminder not to rely solely on built-in security mechanisms and to apply defense in depth when processing untrusted archive content.