CVE-2024-37064 in ydata-profiling

Summary

by MITRE • 06/04/2024

Deserialization of untrusted data can occur in versions 3.7.0 or newer of Ydata's ydata-profiling open-source library, enabling a maliciously crafted dataset to run arbitrary code on an end user's system when loaded.


Analysis

by VulDB Data Team • 03/26/2025

CVE-2024-37064 affects Ydata's ydata-profiling open-source library in versions 3.7.0 and newer. It is a deserialization flaw: the library deserializes input without sufficient validation, so untrusted data can be crafted to execute arbitrary code on any system where the library loads it. ydata-profiling is a profiling tool for data analysis and machine learning pipelines and a common dependency in data science environments, where it routinely processes datasets from many different sources.
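
As a first triage step, the installed version can be compared against the affected range stated above. The following is a minimal sketch, assuming the package is installed under its PyPI distribution name `ydata-profiling`. The helper names (`release_tuple`, `in_affected_range`, `installed_status`) are hypothetical, and because this entry names no fixed release, the check only tests the lower bound of 3.7.0.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Lower bound taken from the advisory text ("3.7.0 or newer").
# No upper bound is applied because this entry does not name a fixed release.
FIRST_AFFECTED = (3, 7, 0)


def release_tuple(version_string):
    """Parse the leading 'X.Y.Z' digits; pre-release suffixes are ignored."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", version_string)
    if match is None:
        raise ValueError(f"unrecognised version: {version_string!r}")
    # Missing components (for example '4.8') are treated as zero.
    return tuple(int(part or 0) for part in match.groups())


def in_affected_range(version_string):
    """Return True when the version is 3.7.0 or newer."""
    return release_tuple(version_string) >= FIRST_AFFECTED


def installed_status(package="ydata-profiling"):
    """Describe whether the installed package falls in the affected range."""
    try:
        found = version(package)
    except PackageNotFoundError:
        return f"{package} is not installed"
    state = "in" if in_affected_range(found) else "below"
    return f"{package} {found} is {state} the affected range"


print(installed_status())
```

A result of "in the affected range" only means the version matches the range given in this entry; it does not confirm that a particular deployment is exploitable.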

The flaw lies in how the library's data loading code handles serialized data structures. When ydata-profiling processes a dataset, it deserializes the input without the sanitization or validation checks that would stop a malicious payload from executing. This matches CWE-502 (Deserialization of Untrusted Data). An attacker can craft a dataset containing specially formatted serialized objects that trigger code execution on the end user's system when the library processes them. The vector is easy to overlook because it operates at the data processing level: users tend to trust their analysis tools and do not expect the data itself to carry executable code.
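
The general mechanism behind CWE-502 can be shown with a harmless example. This entry does not name the serialization format involved, so the sketch below assumes Python's `pickle` purely for illustration and does not reproduce the ydata-profiling code path. The class `CraftedCell` is hypothetical, and the payload only evaluates an arithmetic expression where a real attacker would invoke something harmful.

```python
import pickle


class CraftedCell:
    """Hypothetical object an attacker might embed in a serialized dataset."""

    def __reduce__(self):
        # pickle records "call eval('6 * 7')" instead of the object's state.
        # The call is executed by the loader, not by the code that built it.
        return (eval, ("6 * 7",))


# The attacker serializes the object and ships the bytes as "data".
payload = pickle.dumps(CraftedCell())

# The victim only loads the data, yet the embedded call runs immediately.
result = pickle.loads(payload)

print(result)  # 42: the expression was evaluated during loading
```

Note that the loaded value is not a `CraftedCell` at all. The loader simply ran the callable stored in the byte stream, which is why validating the resulting object after loading comes too late.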

The operational impact is most significant in data science and analytics environments. Systems that use ydata-profiling become susceptible to code execution when they process untrusted datasets, potentially allowing an attacker to take complete control of the affected system. The threat is particularly severe in enterprise environments where data scientists regularly process datasets from external sources, third-party vendors, or public repositories without security screening. A malicious dataset can arrive through email attachments, web downloads, or shared datasets, so the exposure persists across many operational contexts. Automated data processing pipelines face heightened risk because they can ingest a malicious dataset and trigger the flaw without anyone manually opening it, potentially compromising entire data analysis workflows.

Mitigation for CVE-2024-37064 should cover both immediate remediation and longer-term hardening. The primary recommendation is to upgrade to a patched version of ydata-profiling where one is available, coordinating the change with existing data science workflows and dependencies. Organizations should validate and sanitize all input to data processing activities, particularly external datasets that may contain untrusted serialized content. Security teams should consider network segmentation and access controls to limit the impact of a successful exploit. Data governance policies should also require security screening of datasets before processing and map that screening to relevant ATT&CK techniques, such as those covering data manipulation and command and control. More broadly, the vulnerability shows why data science tools and open-source libraries need secure coding practices when they handle serialized data formats, which can be abused for privilege escalation or system compromise.
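
One way to apply strict input validation to serialized content is to refuse any stream that references importable objects. The sketch below again assumes Python's `pickle`, which this entry does not confirm as the format involved, and follows the "restricting globals" pattern from the Python documentation. The names `NoGlobalsUnpickler`, `safe_loads`, and `Crafted` are hypothetical. It is not the library's fix, and it would also reject legitimate objects such as DataFrames, so it suits screening rather than normal loading.

```python
import io
import pickle


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses every reference to a module-level object."""

    def find_class(self, module, name):
        # Called whenever the stream asks to import a callable or class.
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def safe_loads(data):
    """Load bytes that contain only primitives and built-in containers."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


# Plain containers of primitives need no globals and still load.
plain = safe_loads(pickle.dumps({"rows": [1, 2, 3]}))


class Crafted:
    """Hypothetical malicious object; asks the loader to call eval."""

    def __reduce__(self):
        return (eval, ("6 * 7",))


try:
    safe_loads(pickle.dumps(Crafted()))
    blocked = False
except pickle.UnpicklingError:
    # The reference to builtins.eval was rejected before it could run.
    blocked = True

print(plain, blocked)
```

Where the workflow allows it, exchanging datasets in formats that carry no executable objects, such as CSV, avoids the problem more directly than filtering a format that does.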

Responsible

HiddenLayer

Reservation

05/31/2024

Disclosure

06/04/2024

Moderation

accepted

CPE

ready

EPSS

0.00239

KEV

no

Activities

very low
