CVE-2018-11796 in Tika
Summary
by MITRE
In Apache Tika 1.19 (CVE-2018-11761), we added an entity expansion limit for XML parsing. However, Tika reuses SAXParsers and calls reset() after each parse, which, for Xerces2 parsers, as per the documentation, removes the user-specified SecurityManager and thus removes entity expansion limits after the first parse. Apache Tika versions from 0.1 to 1.19 are therefore still vulnerable to entity expansions which can lead to a denial of service attack. Users should upgrade to 1.19.1 or later.
Analysis
by VulDB Data Team • 05/23/2023
Apache Tika 1.19 added an entity expansion limit to its XML parsing as the fix for CVE-2018-11761. The limit was meant to stop maliciously crafted XML documents from causing a denial of service through entity expansion. A flaw in how the limit was applied undermined it. Tika reuses SAXParser instances and calls the reset() method after each parse. For Xerces2 parsers, the documentation for reset() states that it removes the user-specified SecurityManager, and with it the entity expansion limit. The limit therefore protects only the first parse performed by a reused parser; every later parse on that instance runs without it.
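The reuse pattern, and one way to keep the limit in place across resets, can be sketched in Java as follows. This is an illustration under stated assumptions, not Tika's actual code: HardenedParserPool and limitEntityExpansion are hypothetical names, and the limit of 20 is an arbitrary value. The sketch relies on Xerces2's org.apache.xerces.util.SecurityManager class and the http://apache.org/xml/properties/security-manager property, looked up reflectively so that it compiles without Xerces2 on the classpath and does nothing when they are unavailable.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;

/** Hypothetical parser pool (not a Tika class) that re-applies hardening on every acquire. */
final class HardenedParserPool {
    // Arbitrary illustrative limit; choose a value that suits the documents you parse.
    static final int MAX_ENTITY_EXPANSIONS = 20;

    private final SAXParserFactory factory = SAXParserFactory.newInstance();
    private final Deque<SAXParser> idle = new ArrayDeque<>();
    private final Consumer<SAXParser> hardening;

    HardenedParserPool(Consumer<SAXParser> hardening) {
        this.hardening = hardening;
    }

    synchronized SAXParser acquire() throws ParserConfigurationException, SAXException {
        SAXParser parser = idle.poll();
        if (parser == null) {
            parser = factory.newSAXParser();
        }
        // Apply the limits here, not only at creation time:
        // a reset() on release may have removed them.
        hardening.accept(parser);
        return parser;
    }

    synchronized void release(SAXParser parser) {
        try {
            parser.reset();
        } catch (UnsupportedOperationException e) {
            return; // this parser cannot be reset, so do not reuse it
        }
        idle.push(parser);
    }

    /** Tries to install a Xerces2 SecurityManager with an entity expansion limit. */
    static void limitEntityExpansion(SAXParser parser) {
        try {
            Class<?> type = Class.forName("org.apache.xerces.util.SecurityManager");
            Object manager = type.getDeclaredConstructor().newInstance();
            type.getMethod("setEntityExpansionLimit", int.class)
                .invoke(manager, MAX_ENTITY_EXPANSIONS);
            parser.setProperty("http://apache.org/xml/properties/security-manager", manager);
        } catch (ReflectiveOperationException | SAXException | RuntimeException e) {
            // Xerces2 is not on the classpath, or this parser rejects the property.
            // A real implementation should log this rather than ignore it.
        }
    }
}
```

A caller would construct the pool with new HardenedParserPool(HardenedParserPool::limitEntityExpansion). The design point is that the hardening step runs on every acquire, so it no longer matters what reset() removes.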
As a result, Apache Tika versions 0.1 through 1.19 remain exposed: an attacker can use entity expansion to consume excessive system resources and cause a denial of service. The weakness maps to CWE-400 (Uncontrolled Resource Consumption); the more specific class for this kind of attack is CWE-776 (Improper Restriction of Recursive Entity References in DTDs, also known as XML Entity Expansion). The flaw sits at the parser configuration level: the reset operation strips the security settings, so subsequent parses are open to entity expansion attacks. It is a case of an incomplete fix that creates a false sense of security while leaving the underlying vulnerability in place.
The attack surface extends to any application that uses Apache Tika for document processing, particularly those that handle untrusted XML content from external sources. An attacker can craft an XML document whose nested entity definitions expand to far more data than the file contains, so that the parser consumes excessive CPU time and memory. The vulnerability is especially relevant to web applications, content management systems, and document processing services that rely on Tika to parse office documents, PDFs, and other structured formats that may contain embedded XML. Organizations running affected versions should upgrade to Tika 1.19.1 or later, which addresses the issue by ensuring that the security manager configuration persists across parser reset operations. The attack corresponds to ATT&CK technique T1499.004 (Endpoint Denial of Service: Application or System Exploitation). The case also shows why security controls need careful state management: a limit that is silently dropped when an object is reset no longer protects anything.
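To check whether a given parser configuration still enforces an expansion limit, a test along the following lines may help. It is a sketch in Java, not a Tika utility: EntityExpansionCheck, nestedEntities and isRejected are hypothetical names. The test document is deliberately small (at most a few hundred kilobytes when fully expanded with 5 levels), so it is harmless even if no limit is enforced. The sketch assumes the JAXP implementation honours XMLConstants.FEATURE_SECURE_PROCESSING by capping entity expansions, and that it reports a breach as a SAXParseException.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper for probing entity expansion limits. */
final class EntityExpansionCheck {

    /** Builds a document whose root references an entity nested `levels` deep, fan-out 10. */
    static String nestedEntities(int levels) {
        StringBuilder xml = new StringBuilder("<?xml version=\"1.0\"?>\n<!DOCTYPE r [\n");
        xml.append("<!ENTITY e0 \"x\">\n");
        for (int i = 1; i <= levels; i++) {
            xml.append("<!ENTITY e").append(i).append(" \"");
            for (int j = 0; j < 10; j++) {
                xml.append("&e").append(i - 1).append(';');
            }
            xml.append("\">\n");
        }
        xml.append("]>\n<r>&e").append(levels).append(";</r>");
        return xml.toString();
    }

    /** Returns true if the parser refuses the document, false if it parses to the end. */
    static boolean isRejected(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        SAXParser parser = factory.newSAXParser();
        try {
            parser.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new DefaultHandler());
            return false;
        } catch (SAXParseException e) {
            // A well-formed document failing here most likely hit a processing limit.
            return true;
        }
    }
}
```

Calling isRejected(nestedEntities(5)) asks the parser to perform over 100,000 entity expansions, which a parser with a typical limit in place should refuse, while isRejected(nestedEntities(1)) needs only 11 and should pass. To exercise the reuse scenario described here, the same check would have to run on a parser instance that has already been used and reset, not on a fresh one.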