JavaScript DOM XSS Injection validation

JavaScript DOM XSS Injection validation - javascript

Is this regular expression enough to catch all cross site scripting attempts when embedding HTML into the DOM. eg: Such as with document.write()
(javascript:|<\s*script.*?\s*>)
It is referenced in this document from modsecurity.com
http://www.modsecurity.org/documentation/Ajax_Fingerprinting_and_Filtering_with_ModSecurity_2.0.pdf
Would it catch all <\sscript.?\s*> variants in UTF-8 for instance?

Unfortunately not. There are actually quite a few ways to sneak past that regex if an attacker is really trying. With modern browsers, that regex should do a pretty good job, but its not exhaustive. For example, something along the lines of this could open javascript without explicitly saying script or javascript
<img src="blah.jpg" alt="" onmousedown="alert('a')" />
Check out here (somewhat outdated but gets the point across) and here for more examples

Is this regular expression enough to catch all cross site scripting attempts
Hahahahahahahahahaha.
Sorry. But really... no, that's not even the tip of the iceberg.
Daniel has mentioned one other method of injecting script, but really there are hundreds. It is not at all possible to sanitise HTML using a simple regex. The only approach (and even then it's not trivial) is to properly parse the HTML, throwing out all malformed sequences and element/attribute names except for a few known-safe ones.
Of course this only applies when you are actually deliberately accepting HTML input and you want to limit its potential harm. If the situation is that you're accepting text but forgetting to escape it properly on the way out, you need to fix that HTML-escaping, because no amount of input-sniffing will fix an output-problem.
This is why mod_security is utterly bogus. It is giving you the illusion of improved security by catching a few of the most basic injection techniques, while letting everything else through to a vulnerable application. It won't, in the end, prevent you from being hacked, but the more injection signatures you add, the more it'll deny and mess up legitimate requests. For example it might prevent me from entering this message because it contains the string <script>.

The other respondents are right: there are many contexts through which injection may occur. Remember, the solution must consider both the many contexts in which an injection can occur. Blacklist (or "known bad") approach to filtration won't work because they fall prey to attacks that encode injections using unexpected character sets, creative use of whitespace, and other techniques. For more information, see OWASP DOM Based XSS. That page's Links educate on the 'problem' side.
As for the solution, consider the OWASP XSS DOM Prevention Cheat Sheet, which we just published. There are several tool kits referenced within the Cheat Sheet that help you implement escaping or encoding strategies. Probably MY FAVORITE approach to assuring that server-written client-side code is encoded and escaped appropriately is JXT. From its google code page:
<!-- Automatically escaped content -->
Hello ${user.getName()}!
<!-- Example tag with 3 different contextual encoding requirements -->
<img src="/profile-photo?user=${user.getId()}"
alt="Photo of ${user.getName()}"
onclick="openProfile('${user.getId()}')" />
<!-- Override the default escape, rare, but occasionally needed: -->
<jxt:out value="${user.getProfileHtml()}" escape="none"/>
Note that this includes auto-escaping for contexts but also a custom tag that allows for un-escaped output, in case special elements of your page/app would be broken by a cart blanch encoding regime.

Related

Cross Site Scripting: Is restricting the use of < and > tags an effective way to reduce Cross Site Scripting?

If I want to prevent XSS, would restricting the input of special characters such as < and > in all text entry forms be the best way to prevent it?
I mean, this would prevent the entry of html tags such as <script> , <img> etc. and effectively block XSS.
Would you agree?

No. The best way to prevent it is to ensure that all the information you output onto the page is appropriately encoded.
Some possible examples of why angle brackets (and other special character blocking) is insufficient:
https://security.stackexchange.com/questions/36629/cross-site-scripting-without-special-chars

One of the biggest problems with preventing XSS is that a single webpage has many different encoding contexts, some of which may or may not overlap. There's a reason double-encoding is considered inherently dangerous.
Let's see an example. You prohibit < and >, so I can no longer input a HTML element in your page, right? Well, not quite. For example, if you put the text I loaded into an attribute, it will be interpreted differently:
onload="document.write('<script>window.alert("Gotcha!")</script>')"
There's plenty of such opportunities, and each needs their own variant of correct encoding. Even encoding the input as proper HTML text (e.g. turning < into <) may be a vulnerability if the text is then taken in javascript, and used in something like innerHTML, for example.
The same kind of issue occurs with any kind of URL (img src="javascript:alert('I can't let you do that, Dave')"), or with embedding user input in any kind of script (\x3C). URL is especially dangerous, since it does triple encoding - URL encoding, (X)HTML encoding and possibly JavaScript encoding. I'm not sure if it's even possible to have user input that is safe under those conditions :D
Ideally, you want to limit your area of exposure as much as you can. Do not read from the generated document unless you trust the user (e.g. an admin). Avoid multiple encoding, and always make sure you know exactly where each potentially unsafe encoding goes. In XHTML, you have a great option in CDATA sections, which make encoding potentially dangerous code easy, but that might be interpreted incorrectly by browsers that don't support XHTML correctly. Otherwise, use a proper documented encoding method - in JS, this would be innerText. Of course, you need to make sure that your JS script isn't compromised due to user data.

Security comparison of eval and innerHTML for clientside javascript?

I've been doing some experimenting with innerHTML to try and figure out where I need to tighten up security on a webapp I'm working on, and I ran into an interesting injection method on the mozilla docs that I hadn't thought about.
var name = "<img src=x onerror=alert(1)>";
element.innerHTML = name; // Instantly runs code.
It made me wonder a.) if I should be using innerHTML at all, and b.) if it's not a concern, why I've been avoiding other code insertion methods, particularly eval.
Let's assume I'm running javascript clientside on the browser, and I'm taking necessary precautions to avoid exposing any sensitive information in easily accessible functions, and I've gotten to some arbitrarily designated point where I've decided innerHTML is not a security risk, and I've optimized my code to the point where I'm not necessarily worried about a very minor performance hit...
Am I creating any additional problems by using eval? Are there other security concerns other than pure code injection?
Or alternatively, is innerHTML something that I should show the same amount of care with? Is it similarly dangerous?

tl;dr;
Yes, you are correct in your assumption.
Setting innerHTML is susceptible to XSS attacks if you're adding untrusted code.
(If you're adding your code though, that's less of a problem)
Consider using textContent if you want to add text that users added, it'll escape it.
What the problem is
innerHTML sets the HTML content of a DOM node. When you set the content of a DOM node to an arbitrary string, you're vulnerable to XSS if you accept user input.
For example, if you set the innerHTML of a node based on the input of a user from a GET parameter. "User A" can send "User B" a version of your page with the HTML saying "steal the user's data and send it to me via AJAX".
See this question here for more information.
What can I do to mitigate it?
What you might want to consider if you're setting the HTML of nodes is:
Using a templating engine like Mustache which has escaping capabilities. (It'll escape HTML by default)
Using textContent to set the text of nodes manually
Not accepting arbitrary input from users into text fields, sanitizing the data yourself.
See this question on more general approaches to prevent XSS.
Code injection is a problem. You don't want to be on the receiving end.
The Elephant in the room
That's not the only problem with innerHTML and eval. When you're changing the innerHTML of a DOM node, you're destroying its content nodes and creating new ones instead. When you're calling eval you're invoking the compiler.
While the main issue here is clearly un-trusted code and you said performance is less of an issue, I still feel that I must mention that the two are extremely slow to their alternatives.

The quick answer is: you did not think of anything new. If anything, do you want an even better one?
<scr\0ipt>alert("XSSed");</scr\0ipt>
The ground, bottom line is that there are more ways to trigger XSS than you think there is. All the following are valid:
onerror, onload, onclick, onhover, onblur etc... are all valid
The use of character encoding to bypass filters (null byte highlighted above)
eval falls into another category, however - it is a byproduct, most of the time to obfuscate. If you're falling to eval and not innerHTML, you're in a very, very small minority.
The key to all this is to sanitize your data using a parser that keeps up to date with what pen testers discover. There are a couple of those around. They absolutely need to at least filter all the ones on the OWASP list - those are pretty much common.

innerHTML isn't insecure in and of itself. (Nor is eval, if only used on your code. It's actually more of a bad idea for several other reasons.) The insecurity arises in displaying visitor-submitted content. And that risk applies to any mechanism with which you embed user-content: eval, innerHTML, etc. on the client-side, and print, echo, etc. on the server-side.
Anything you put on the page from a visitor must be sanitized. It doesn't matter a great deal whether you do it when the initial page is being built or added asynchronously on the client-side.
So ... yes, you need to show some care when using innerHTML if you're displaying user-submitted content with it.

Is there a way to "stop script" from running using JavaScript?

How can I stop script from execution in JavaScript? In case of cross-site scripting (XSS) attacks, the fundamental requirements are injection + execution of script.
Imagine a scenario where attacker is able to inject JavaScript in a page & our goal is to stop attacker's script from execution. The injection point as an example can be any user-supplied input area.

Always always escape your input. For XSS, use htmlentities() to escape HTML and JS. Here's a good article on PHP Security
http://www.phpfreaks.com/tutorial/php-security

There are basically two things to be careful of when dealing with XSS:
Escape your output. Escaping the input just takes more resources for nothing. Escape your user-submitted content output. It also means that non-escaped content is in your database, which is a good thing (in case of false positives you can fix that without losing content, in case of a new XSS policy you don't need to modify all your database, etc).
Secure your javascript code. Be very careful not to include some flaw using eval() or something like it.

As others said, the best and easiest way to protect yourself from XSS is validating input and properly escape output depending on the insertion point (HTML, most likely, with entities or JavaScript / CSS blocks -- unlikely and more difficult to properly escape).
However, if your use case is outputting raw user input which is supposed to contain arbitrary HTML and you just want to prevent injected JavaScript to mess with your site, you can either:
1) Frame the content in a different, unique domain (so it cannot share cookies with your main document), e.g. xyz123.usercontent.com (with xyz123 different for any user)
2) Wait for and/or CSP's sandbox directive to be standardized in every browser you support (and, of course, denying access to uncapable browsers).

Your only solution is to prevent scripts from being injected. There's several things you can do to achieve this:
Never trust input from the user. That means form inputs, query string parameters, cookie content, or any other data obtained from an incoming request.
Sanitize everything you render, everywhere you render it. I like to achieve this with two clearly-named rendering functions in templates, render and render_unsafe. Rails has a similar interface since 3.0 which sanitizes all template data unless you specifically ask for unsanitized rendering. Having a clearly-named interface will make it easier to keep your templates in check, and ensuring that unsanitized renders are the exception forces you to make a decision every time you dump data into a template.
If you must allow the user to run functions directly, always do it through a whitelist. Have them supply a function name or some other identifier as a string and their arguments as JSON or some other parseable construct. Have a look at the design for Shopify's Liquid templating system which uses a similar execution-safe whitelisting pattern.
Never trust input from the user. Not ever.

Prevent XSS attacks site-wide

I'm new to ColdFusion, so I'm not sure if there's an easy way to do this. I've been assigned to fix XSS vulnerabilities site-wide on this CF site. Unfortunately, there are tons of pages that are taking user input, and it would be near impossible to go in and modify them all.
Is there a way (in CF or JS) to easily prevent XSS attacks across the entire site?

I hate to break it out to you, but -
XSS is an Output problem, not an Input problem. Filtering/Validating input is an additional layer of defence, but it can never protect you completely from XSS. Take a look at XSS cheatsheet by RSnake - there's just too many ways to escape a filter.
There is no easy way to fix a legacy application. You have to properly encode anything that you put in your html or javascript files, and that does mean revisiting every piece of code that generates html.
See OWASP's XSS prevention cheat sheet for information on how to prevent XSS.
Some comments below suggest that input validation is a better strategy rather than encoding/escaping at the time of output. I'll just quote from OWASP's XSS prevention cheat sheet -
Traditionally, input validation has been the preferred approach for handling untrusted data. However, input validation is not a great solution for injection attacks. First, input validation is typically done when the data is received, before the destination is known. That means that we don't know which characters might be significant in the target interpreter. Second, and possibly even more importantly, applications must allow potentially harmful characters in. For example, should poor Mr. O'Malley be prevented from registering in the database simply because SQL considers ' a special character?
To elaborate - when the user enters a string like O'Malley, you don't know whether you need that string in javascript, or in html or in some other language. If its in javascript, you have to render it as O\x27Malley, and if its in HTML, it should look like O'Malley. Which is why it is recommended that in your database the string should be stored exactly the way the user entered, and then you escape it appropriately according to the final destination of the string.

One thing you should look at is implementing an application firewall like Portcullis: http://www.codfusion.com/blog/page.cfm/projects/portcullis which includes a much stronger system then the built in scriptProtect which is easily defeated.
These are a good starting point for preventing many attacks but for XSS you are going to end up going in by hand and verifying that you are using things like HTMLEditFormat() on any outputs that can be touched by the client side or client data to prevent outputting valid html/js code.

The ColdFusion 9 Livedocs describe a setting called "scriptProtect" which allows you to utilize coldfusion's protection. I've have not used it yet, so I'm not sure how effective it is.
However, if you implement a third-party or your own method of handling it, you would most likely want to put it in the "onRequestStart" event of the application to allow it to handle the entire site when it comes to URL and FORM scope violations (because every request would execute that code).

Besides applying all the ColdFusion hot fixes and patches you can also:
Not full proof but helps, Set the following under CFADMIN > Settings > "Enable Global Script Protection"
Add CSRFToken to your forms http://www.owasp.org/index.php/Cross-Site_Request_Forgery_%28CSRF%29_Prevention_Cheat_Sheet
Check http Referer
Add validation for all User inputs
Use cfqueryparam for your queries
Add HTMLEditFormat() on any outputs
Besides Peter Freitag's excellent blog you should also subscribe to Jason Dean's http://www.12robots.com

nested quotes in javascript

A bit of a noob question here...
I have a javascript function on a list of table rows
<tr onclick="ClosePopup('{ScenarioID}', '{Name}');" />
However, the {Name} value can sometimes contain the character "'" (single quote). At the moment the error Expected: ')' comes up as a result because it is effectivly ending the javascript function early and destroying the syntax.
What is the best way to prohibit the single quotes in {Name} value from effecting the javascript?
Cheers!

You're committing the first mortal sin of insecure web template programming - not escaping the content of the values being rendered into the template. I can almost guarantee you that if you take that approach, your web app will be vulnerable to XSS (cross site scripting) and any third party will be able to run custom javascript in your page, stealing user data and wreaking havoc as they wish.
Check it out. http://en.wikipedia.org/wiki/Cross-site_scripting
The solution is to escape the content. And to do that properly in the javascript, which is also inside html, is a lot more than just putting escape sequences in front of backslashes.
Any decent templating engine out there should provide you a way to escape content as it's written to the template. Your database values can be left as-is, the important part is escaping it at output time. If your template engine or dynamic web app framework doesn't allow for this, change to one that does. :)

In support of the prior comment please read the following to gain a better understanding of why the security advice is so important.
http://eval.symantec.com/mktginfo/enterprise/white_papers/b-whitepaper_web_based_attacks_03-2009.en-us.pdf

I would think that you could kill just about any code injection by, for example, replacing
"Hello"
with
String.fromCharCode(72,101,108,108,111)

Although the security information provided by everyone is very valuable, it was not so relevant to me in this situation as everything in this instance is clientside, security measures are applied when getting the data and rendering the XML. The page is also protected through windows authentication (adminsitration section only) and the web app framework cannot be changed. The answer i was looking for was really quite simple in the end.
<tr onclick='ClosePopup("{ScenarioID}", "{Name}");' />

Develop Reference

JavaScript is the programming language of the Web.