CVE-2021-29600 in TensorFlow
Summary
by MITRE • 05/15/2021
TensorFlow is an end-to-end open source platform for machine learning. The implementation of the `OneHot` TFLite operator is vulnerable to a division by zero error (https://github.com/tensorflow/tensorflow/blob/f61c57bd425878be108ec787f4d96390579fb83e/tensorflow/lite/kernels/one_hot.cc#L68-L72). An attacker can craft a model such that at least one of the dimensions of `indices` would be 0. In turn, the `prefix_dim_size` value would become 0. The fix will be included in TensorFlow 2.5.0. We will also cherry-pick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.
Analysis
by VulDB Data Team • 05/19/2021
CVE-2021-29600 affects the OneHot operator in TensorFlow Lite, the TensorFlow runtime for on-device inference. The flaw is a division by zero in the kernel implementation in one_hot.cc: the code computes prefix_dim_size from the dimensions of the indices tensor and uses it as a divisor without first checking that it is nonzero. An attacker can craft a model whose indices tensor has at least one zero-sized dimension, so that prefix_dim_size becomes zero and the division fails. This is a classic pattern in which insufficient input validation leads to an arithmetic exception, with denial of service as the expected impact.
The OneHot operator in TensorFlow Lite takes a multi-dimensional indices tensor and produces a one-hot encoded output, a representation commonly used for classification labels in neural networks. When the indices tensor has a zero-sized dimension, the computed prefix dimension size is zero and the subsequent division crashes the application or process executing the model. The weakness maps to CWE-369 (Divide By Zero). Its practical effect is an application crash, so in ATT&CK terms it falls under endpoint denial of service rather than privilege escalation. The affected releases are those preceding the fixed versions 2.1.4, 2.2.3, 2.3.3, 2.4.2 and 2.5.0, which covers every release line that was still supported when the advisory was published.
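The arithmetic described above can be modelled in a few lines of Python. This is a simplified sketch, not the kernel's C++ code: the function name `one_hot_dim_sizes` is hypothetical, `axis` is assumed to be already resolved to a non-negative position, and Python raises `ZeroDivisionError` where integer division by zero in compiled C++ is undefined behaviour that typically terminates the process.

```python
from math import prod


def one_hot_dim_sizes(indices_shape, axis):
    """Model the unguarded prefix/suffix size computation (illustrative only)."""
    # Product of the dimensions that precede the one-hot axis.
    # An empty prefix (axis == 0) yields 1.
    prefix_dim_size = prod(indices_shape[:axis])
    # Total number of elements in the indices tensor.
    num_elements = prod(indices_shape)
    # Unguarded division: fails when a dimension before `axis` is zero.
    suffix_dim_size = num_elements // prefix_dim_size
    return prefix_dim_size, suffix_dim_size
```

In this sketch a well-formed shape such as `(2, 3)` with `axis=1` is processed normally, while a shape such as `(0, 3)` with `axis=1` makes the divisor zero and raises the exception.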
Operationally, the vulnerability is a vector for denial of service against systems that use TensorFlow Lite for inference. The most exposed applications are those that load models supplied by users or other untrusted sources: a single malicious model can crash the process and interrupt the service, and the impact is larger where the interpreter runs without process isolation or sandboxing. Exploitation requires only a model whose indices tensor has a zero-sized dimension, which underlines the need for input validation in machine learning frameworks. Organizations running TensorFlow Lite in production should review where their ML pipelines accept external models and upgrade to a fixed release.
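For patch planning, the fixed versions named in the advisory can be encoded as a small check. This is a minimal sketch, assuming plain `major.minor.patch` version strings. The helper names are hypothetical and not part of TensorFlow, and release lines older than 2.1 are treated as unfixed because the advisory lists no backport for them.

```python
import re

# First fixed patch release on each backported line, as listed in the advisory.
FIRST_FIXED_PATCH = {(2, 1): 4, (2, 2): 3, (2, 3): 3, (2, 4): 2}


def parse_version(version):
    """Extract (major, minor, patch) from the start of a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def has_onehot_fix(version):
    """Return True if the version is at or above a fixed release on its line."""
    major, minor, patch = parse_version(version)
    if (major, minor) >= (2, 5):
        return True  # 2.5.0 and later include the fix
    first_fixed = FIRST_FIXED_PATCH.get((major, minor))
    # Lines without a listed backport are reported as unfixed.
    return first_fixed is not None and patch >= first_fixed
```

The installed version string, for example `tf.__version__`, could be passed to `has_onehot_fix` as part of a deployment check.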
The fix for CVE-2021-29600 adds validation to the OneHot operator so that input tensors with zero-sized dimensions no longer reach the division. The TensorFlow team included the fix in version 2.5.0 and cherry-picked it to 2.4.2, 2.3.3, 2.2.3 and 2.1.4, the release lines still supported at the time. Because the backports shipped as patch releases, users of older release lines can upgrade within their line without a major version change. The check validates tensor dimensions before the arithmetic is performed, which prevents the exception while preserving the operator's behavior for legitimate inputs.
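A defensive check of this kind could look like the following Python sketch. It is an illustration under assumptions rather than the upstream patch: the function name is hypothetical, `axis` is assumed to be non-negative, and returning `None` for a degenerate input is a choice made for this example.

```python
from math import prod


def guarded_one_hot_dim_sizes(indices_shape, axis):
    """Compute prefix/suffix sizes, refusing to divide by a zero prefix."""
    # Product of the dimensions that precede the one-hot axis.
    prefix_dim_size = prod(indices_shape[:axis])
    if prefix_dim_size == 0:
        # Degenerate indices tensor: there is nothing to encode,
        # so skip the division entirely.
        return None
    # Safe: the divisor is known to be nonzero here.
    suffix_dim_size = prod(indices_shape) // prefix_dim_size
    return prefix_dim_size, suffix_dim_size
```

Validating the divisor immediately before the division keeps the guard next to the operation it protects, so legitimate shapes follow the same path as before.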