Billion Laughs Attack: The XML That Brings Servers to Their Knees

How a tiny XML file under 1KB can consume gigabytes of memory and crash your servers through exponential entity expansion.
Introduction: When Laughter Becomes a Weapon
In the world of cybersecurity, some of the most devastating attacks come from the most unexpected places. The Billion Laughs attack—also known as an XML bomb or exponential entity expansion attack—is a perfect example of how a seemingly harmless XML document can bring enterprise servers crashing down. This denial-of-service (DoS) attack exploits a fundamental feature of XML parsers, turning their helpful entity expansion capability into a devastating weapon.
First reported as early as 2002 and gaining widespread attention around 2008, this vulnerability continues to affect modern applications. Recent CVEs in 2024 and 2025, including vulnerabilities in LangChain libraries (CVE-2024-1455) and sitemap parsers (CVE-2025-3225), demonstrate that this attack vector remains relevant more than two decades after its discovery.
Understanding XML Entities and DTDs
Before diving into the attack mechanics, it’s essential to understand how XML entities work. XML (Extensible Markup Language) allows developers to define entities—essentially reusable pieces of content—within a Document Type Definition (DTD). These entities act as variables or constants that can be referenced throughout an XML document.
There are two types of DTDs:
Internal DTD: Defined directly within the XML document itself, embedded between the <!DOCTYPE> declaration and the closing bracket.
External DTD: Declared in a separate file and linked to the XML document via a URI reference.
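Entities serve legitimate purposes, such as defining frequently used text strings or referencing external files. However, this flexibility creates a dangerous attack surface when entity definitions are nested, with each entity referencing another one several times. (XML forbids an entity from referencing itself, so the attack relies on layered nesting rather than true recursion.)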
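As a benign illustration of the mechanism, the sketch below parses a document whose internal DTD declares one entity and then references it in the body. The entity name `company` and its value are made up for the example; the parser is Python's standard `xml.etree.ElementTree`, which expands internal entities declared in the document.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the entity name "company" and its value are
# invented for illustration only.
DOC = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY company "Example Corp">
]>
<note>Welcome to &company;!</note>"""

# The parser substitutes the entity's replacement text for &company;.
root = ET.fromstring(DOC)
print(root.text)
```

The same substitution step, applied to entities that reference other entities, is what the attack abuses.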
The Anatomy of a Billion Laughs Attack
The classic Billion Laughs payload is elegantly destructive in its simplicity. Here’s the structure of the attack:
<?xml version="1.0"?>
<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ELEMENT lolz (#PCDATA)>
<!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
<!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
<!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
<!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
<!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
<!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
<!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
]>
<lolz>&lol9;</lolz>
How the Exponential Expansion Works
The attack defines 10 entities, from lol through lol9. The first entity simply contains the string “lol.” Each subsequent entity references the previous one ten times. When an XML parser processes this document, it encounters &lol9; in the document body.
Here’s where the mathematics become terrifying:
- &lol9; expands to 10 instances of &lol8;
- Each &lol8; expands to 10 instances of &lol7;
- This continues recursively down to &lol;
The result? A document containing 10^9 (one billion) copies of the string “lol.” This tiny XML file—under 1 kilobyte in size—expands to approximately 3 gigabytes in memory. The name “Billion Laughs” comes directly from this billion-fold repetition of “lol.”
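The arithmetic can be checked without ever expanding the payload. The sketch below models the ten entity definitions as a dictionary and computes the expanded length by summing the lengths of referenced entities. The helper names `build_definitions` and `expanded_length` are hypothetical, and the regular expression is a simplification that only handles plain `&name;` references.

```python
import re
from functools import lru_cache

# Simplified pattern: matches general entity references such as &lol3;
ENTITY_REF = re.compile(r"&(\w+);")


def build_definitions(levels: int = 9, fanout: int = 10) -> dict[str, str]:
    """Model the payload's entities: lol, lol1, ..., lol<levels>."""
    defs = {"lol": "lol"}
    prev = "lol"
    for i in range(1, levels + 1):
        name = f"lol{i}"
        defs[name] = f"&{prev};" * fanout
        prev = name
    return defs


def expanded_length(defs: dict[str, str], name: str) -> int:
    """Return the character count of an entity after full expansion."""

    @lru_cache(maxsize=None)
    def length(entity: str) -> int:
        text = defs[entity]
        literal = len(ENTITY_REF.sub("", text))  # characters that are not references
        return literal + sum(length(ref) for ref in ENTITY_REF.findall(text))

    return length(name)


defs = build_definitions()
total = expanded_length(defs, "lol9")
print(total)          # 3000000000 characters
print(total // 3)     # 1000000000 copies of "lol"
```

Because each level multiplies the previous one by ten, nine levels of nesting turn a 3-character string into 3 × 10^9 characters.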
The Quadratic Blowup Variant
Security researchers and attackers have developed variants to bypass defensive countermeasures. The quadratic blowup attack takes a different approach that evades detection of deeply nested entities.
Instead of using recursive nesting, this variant defines a single large entity containing thousands of characters, then references that entity thousands of times. A payload of approximately 200 kilobytes can expand to 2.5 gigabytes when parsed. This approach avoids triggering parser countermeasures that specifically check for deeply nested entity structures.
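The figures above are consistent with a simple back-of-the-envelope model. The sketch below assumes one entity of 50,000 characters referenced 50,000 times under a one-character entity name; those specific numbers are assumptions chosen to reproduce the quoted sizes, and the small overhead of the surrounding markup is ignored.

```python
def quadratic_blowup_sizes(entity_len: int, ref_count: int,
                           entity_name: str = "a") -> tuple[int, int]:
    """Estimate (payload_bytes, expanded_bytes) for a quadratic blowup document.

    Assumes single-byte characters and ignores the DOCTYPE and element markup.
    """
    reference = f"&{entity_name};"                       # e.g. "&a;" is 3 bytes
    payload = entity_len + ref_count * len(reference)    # what the attacker sends
    expanded = entity_len * ref_count                    # what the parser produces
    return payload, expanded


payload, expanded = quadratic_blowup_sizes(50_000, 50_000)
print(payload)    # 200000 bytes, about 200 KB
print(expanded)   # 2500000000 bytes, about 2.5 GB
```

The payload grows linearly with the entity length and the reference count, while the expanded size grows with their product, which is where the name "quadratic" comes from.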
Real-World Impact and Historical Incidents
WordPress and Drupal Vulnerability (2014)
One of the most significant incidents occurred in 2014 when millions of WordPress and Drupal installations were found vulnerable to XML quadratic blowup attacks. The vulnerability in PHP’s XML processor, used by XMLRPC implementations, allowed attackers to cause CPU and memory exhaustion, potentially bringing sites offline and overwhelming databases with connection requests.
MediaWiki Vulnerabilities (2015)
MediaWiki versions before 1.24.2 were susceptible to Billion Laughs attacks through SVG uploads and XMP metadata parsing, demonstrating how the attack can propagate through seemingly innocent file uploads.
Modern AI Framework Vulnerabilities (2024-2025)
Even cutting-edge technologies remain vulnerable. CVE-2024-1455 affected the popular LangChain AI library, where XML parsing in certain components could be exploited for Billion Laughs attacks. This highlights how the vulnerability transcends traditional web applications and affects modern AI infrastructure.
Beyond XML: Attack Vectors in Other Formats
While originally targeting XML, the Billion Laughs concept applies to any format supporting macro or entity expansion.
YAML Parsers
YAML parsers face similar risks through anchors (&) and aliases (*), which enable recursive references causing exponential data expansion during deserialization. The PyYAML library, for instance, has documented vulnerabilities related to this attack pattern.
SVG and Image Metadata
SVG files, being XML-based, can carry Billion Laughs payloads. Similarly, XMP metadata in formats like PDF or JPEG can contain XML bombs that exploit metadata extraction tools.
JSON Considerations
While JSON lacks native entity support, analogous denial-of-service attacks occur through deeply nested or recursive structures. The Jackson library in Java, for example, experienced CVE-2020-36518, where unbounded nesting triggered stack overflow exceptions.
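One possible guard, sketched here in Python rather than Java, is to measure nesting depth with a linear scan before handing the text to the real parser. The function names and the default limit of 64 are assumptions for illustration, not values taken from any library.

```python
import json


def max_nesting_depth(text: str) -> int:
    """Return the deepest level of [ ] / { } nesting, ignoring brackets in strings."""
    depth = deepest = 0
    in_string = escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False          # character after a backslash is literal
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "[{":
            depth += 1
            deepest = max(deepest, depth)
        elif ch in "]}":
            depth -= 1
    return deepest


def loads_with_depth_limit(text: str, max_depth: int = 64):
    """Parse JSON only if its nesting depth stays within max_depth (assumed limit)."""
    if max_nesting_depth(text) > max_depth:
        raise ValueError(f"JSON nesting exceeds {max_depth} levels")
    return json.loads(text)


print(loads_with_depth_limit('{"a": [1, {"b": "[[["}]}'))
```

The scan runs iteratively, so it cannot itself overflow the stack, and it rejects hostile input before any recursive parsing starts.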
Common Attack Scenarios
Attackers can deliver Billion Laughs payloads through multiple vectors:
Web Application POST Requests: Sending malicious XML directly to endpoints that accept XML input, such as SOAP services or REST APIs.
File Uploads: Uploading SVG images, Office documents (DOCX, XLSX), or other XML-containing files that trigger parsing on the server.
SOAP Messages: Enterprise web services using SOAP are particularly vulnerable when processing untrusted XML payloads.
Configuration Files: Applications that accept XML configuration files from users may inadvertently process malicious entities.
Prevention and Mitigation Strategies
Protecting applications from Billion Laughs attacks requires a multi-layered defense approach.
1. Disable DTD Processing Entirely
The most effective defense is completely disabling DTD (Document Type Definition) processing. When DTDs are disallowed, nearly all XML entity attacks become impossible. According to OWASP guidelines, this should be the primary defense mechanism.
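A minimal sketch of this idea in Python, using only the standard library: a first pass with `xml.parsers.expat` aborts as soon as a DOCTYPE declaration starts, and only DTD-free documents reach the real parser. The names `DTDForbidden` and `parse_without_dtd` are hypothetical, and a maintained library such as defusedxml would normally be preferable to hand-rolled code like this.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DTDForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # Called by expat when it reaches <!DOCTYPE ...>, before any entity
    # declared inside the DTD has been processed.
    raise DTDForbidden(f"DOCTYPE declaration {name!r} is not allowed")


def parse_without_dtd(data: bytes) -> ET.Element:
    """Parse XML, refusing any document that declares a DTD."""
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    checker.Parse(data, True)      # raises DTDForbidden or expat.ExpatError
    return ET.fromstring(data)     # reached only for DTD-free input


print(parse_without_dtd(b"<note>hi</note>").text)
```

Because the check fires at the start of the DOCTYPE declaration, the entity definitions are never read and nothing is expanded.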
2. Limit Entity Expansion
If DTD processing cannot be disabled, configure strict limits on entity expansion. For .NET applications, the MaxCharactersFromEntities property restricts how much content entities can expand to. Java applications can use XMLConstants.FEATURE_SECURE_PROCESSING to enable security restrictions.
3. Use Secure Parser Configurations
Different programming languages require specific configurations:
Java: Set the feature http://apache.org/xml/features/disallow-doctype-decl to true on DocumentBuilderFactory or SAXParserFactory.
Python: Use the defusedxml library, or when using lxml, configure parsers with resolve_entities=False.
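PHP: For versions before 8.0, explicitly disable external entity loading with libxml_disable_entity_loader(true). PHP 8.0 and newer prevent XXE by default, but that setting covers external entities rather than internal entity expansion, so avoid passing flags such as LIBXML_NOENT or LIBXML_PARSEHUGE when parsing untrusted input.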
.NET: .NET Framework 4.5.2 and later include built-in protections, but older versions require explicit configuration to set DtdProcessing to Prohibit or Ignore.
4. Implement Lazy Entity Expansion
Rather than expanding all entities immediately, configure parsers to expand entities only when and to the extent their content is actually needed. This limits the impact of exponential expansion.
5. Consider Alternative Formats
JSON has become the preferred alternative to XML for data exchange in many scenarios. Since JSON doesn’t support external entity references or entity expansion, it eliminates entity-based attacks, although JSON parsers still need limits on nesting depth and input size.
6. Input Validation and Sanitization
Before processing any XML input, validate and sanitize it. Block or escape XML metacharacters like <, >, ", and & when they appear in user-provided content that will be embedded in XML documents.
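For the escaping half of this advice, Python's standard library already provides a helper. The sketch below wraps `xml.sax.saxutils.escape` so that quotes are escaped along with `&`, `<`, and `>`; the wrapper name `to_xml_text` and the sample comment are made up for the example.

```python
from xml.sax.saxutils import escape


def to_xml_text(value: str) -> str:
    """Escape user-provided text so it can be embedded in an XML document."""
    # escape() handles &, < and > itself; the mapping adds both quote characters.
    return escape(value, {'"': "&quot;", "'": "&apos;"})


comment = 'Tom & "Jerry" <b>'
xml = f"<comment>{to_xml_text(comment)}</comment>"
print(xml)  # <comment>Tom &amp; &quot;Jerry&quot; &lt;b&gt;</comment>
```

Escaping protects documents the application builds itself; it does not replace the parser-level defenses above when the application receives a complete XML document from an untrusted source.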
Testing for Vulnerabilities
Security teams should regularly test applications for XML bomb vulnerabilities using tools like:
- OWASP ZAP: Includes checks for exponential entity expansion vulnerabilities
- Burp Suite: Can inject malicious XML payloads to test parser behavior
- Nikto: Identifies XXE and related vulnerabilities in web applications
Fuzzing techniques that inject malicious entities into XML payloads can reveal parser weaknesses before attackers exploit them.
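Before reaching for a full-size payload, a harmless probe can reveal whether a parser expands internal entities at all. The sketch below is one way to write such a check in Python; the probe document, the function name `expands_internal_entities`, and the assumption that the parser under test returns an `ElementTree` element are all illustrative choices.

```python
import xml.etree.ElementTree as ET
from typing import Callable

# Tiny, harmless probe: one entity, referenced once.
PROBE = '<!DOCTYPE probe [<!ENTITY x "expanded">]><probe>&x;</probe>'


def expands_internal_entities(parse: Callable[[str], ET.Element]) -> bool:
    """Return True if the given parse function expands entities declared in a DTD."""
    try:
        root = parse(PROBE)
    except Exception:
        return False  # the parser rejected the DTD or the entity reference
    return root.text == "expanded"


# The standard-library parser expands internal entities by default.
print(expands_internal_entities(ET.fromstring))
```

A parser that returns True here still needs its expansion limits checked, while one that returns False has rejected or ignored the DTD.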
Framework-Specific Guidance
.NET Framework
Applications using .NET Framework 4.5.1 and earlier are vulnerable by default. The XmlDocument, XPathNavigator, and XmlReader classes all require explicit configuration to disable DTD processing:
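using System.Xml;

// Reject any document that contains a DTD, and cap entity expansion as a second line of defense.
XmlReaderSettings settings = new XmlReaderSettings()
{
    DtdProcessing = DtdProcessing.Prohibit,
    MaxCharactersFromEntities = 1024
};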
Java Applications
Java applications should configure parser factories to disallow doctype declarations:
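import javax.xml.parsers.DocumentBuilderFactory;

// setFeature throws ParserConfigurationException if the parser does not support the feature.
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
dbf.setXIncludeAware(false);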
Python Applications
The standard library’s XML modules have varying levels of vulnerability. The official Python documentation includes a dedicated section on XML vulnerabilities, recommending the use of defusedxml for processing untrusted XML.
Conclusion: A Legacy Vulnerability That Persists
The Billion Laughs attack represents a fascinating intersection of clever engineering and devastating impact. A concept first documented over two decades ago continues to affect modern applications, from content management systems to cutting-edge AI frameworks.
The persistence of this vulnerability underscores several important lessons for the security community. First, fundamental design flaws in widely-used technologies can have remarkably long lifespans. Second, security must be built into parser configurations by default, not left as an exercise for developers. Third, even well-understood vulnerabilities require constant vigilance as they appear in new contexts and technologies.
For developers and security professionals, the path forward is clear: disable DTD processing wherever possible, implement strict entity expansion limits when DTDs are necessary, regularly audit applications for XML parsing vulnerabilities, and consider migrating to safer data formats where XML’s features aren’t essential.
The Billion Laughs attack may have a humorous name, but there’s nothing funny about a crashed production server. By understanding this attack’s mechanics and implementing proper defenses, organizations can ensure the last laugh belongs to them—not the attackers.