This repository has been archived by the owner on Mar 8, 2018. It is now read-only.

Protecting Against XSS Injection

kheyse-autodesk edited this page May 12, 2016 · 23 revisions

The goal of this page is to explain how Drywall (by default) helps protect against user XSS injection. Surely there is more we can do to keep Drywall secure, so if you know of a vulnerability that has been overlooked please let us know by opening an issue.

What is XSS injection?

Cross-site scripting (XSS) is a type of computer security vulnerability typically found in Web applications. XSS enables attackers to inject client-side script into Web pages viewed by other users.

Source: http://en.wikipedia.org/wiki/Cross-site_scripting

Security is everyone's responsibility.

Luckily, there are great people dedicating their time to creating communities like OWASP (the Open Web Application Security Project). What does OWASP do? Well, they say it best...

Our mission is to make software security visible, so that individuals and organizations worldwide can make informed decisions about true software security risks.

Relevant Reading: XSS Filter Evasion Cheat Sheet

Should we filter data on the way in or on the way out?

The short answer: it depends.

We found an answer on StackOverflow and this part stuck with us:

You're not supposed to filter input in order to protect HTML output. It will not work or will work by needlessly malforming the data. You're supposed to HTML-escape text in HTML output.

Special Data

With some data, like usernames and emails, we are strict about what is considered valid. Special characters are not allowed in usernames or emails, so it's unlikely that an XSS vulnerability would be found in either of those pieces of data. Because we validate this data on the way in, we are more relaxed when rendering it.

Generic Data

However, there is a lot of other data that we validate loosely (like the names of things) or don't validate at all (like notes). That's where it's more likely that a user could inject something nasty. We like keeping things simple, so littering our code with validation logic or sanitization functions didn't seem like the best fit. For this kind of data we opted to filter on the way out.

Escaping HTML in Templates

Drywall uses Jade templates on the server and Underscore.js templates on the client. Both provide ways to render escaped and unescaped values.

Jade automatically escapes values for you. p= 'This code is' + ' <escaped>!' becomes <p>This code is &lt;escaped&gt;!</p>. You can use the != modifier to render unescaped values. p!= 'This code is <strong>not</strong> escaped!' becomes <p>This code is <strong>not</strong> escaped!</p>.

Underscore.js templates render escaped values using <%- value %>. You can render unescaped values with a slight modification: <%= value %>.

At first, we thought the only change we needed to make was to update our templates to use the escaped syntax when displaying generic data. We were almost right.

Bootstrapping Data for Models & Collections

Drywall uses Backbone.js on the client to handle the models and views. You can load data into a model by making an Ajax request to the server or by having the data ready in the page already. Embedding the first batch of data in the page itself, so that it is available as soon as the page loads, is called "bootstrapping".

The Backbone.js website talks about this:

Loading Bootstrapped Models

When your app first loads, it's common to have a set of initial models that you know you're going to need, in order to render the page. Instead of firing an extra AJAX request to fetch them, a nicer pattern is to have their data already bootstrapped into the page.

...

You have to escape </ within the JSON string, to prevent javascript injection attacks.

Source: http://backbonejs.org/#FAQ-bootstrap

So when we embed data into the page to be "bootstrapped", we need to escape it. If we don't and a user throws a </script> tag into a record, the browser closes our script block early: the page breaks, and whatever follows the tag is parsed as HTML, which lets the user inject a script of their own.
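To make the risk concrete, here is a minimal sketch in plain JavaScript. The record, the `window.bootstrap` variable, and the helpers `escapeClosingTags`, `renderBootstrapScript` and `countClosers` are hypothetical and not part of Drywall. The sketch shows the `</` escaping that the Backbone.js FAQ describes, which is a different technique from the escape-based approach Drywall uses further down.

```javascript
// Hypothetical record containing a malicious note.
var results = [{ name: 'Widgets', notes: '</script><script>alert(1)</script>' }];

// Replace every "</" with "<\/" so the HTML parser never sees a closing tag.
// Inside a JSON string "\/" is a valid escape for "/", so the data is unchanged.
function escapeClosingTags(json) {
  return json.replace(/<\//g, '<\\/');
}

// Build the inline script that would carry the bootstrapped data.
function renderBootstrapScript(json) {
  return '<script>window.bootstrap = ' + json + ';</script>';
}

// Count how many times "</script>" appears in a piece of markup.
function countClosers(html) {
  return html.split('</script>').length - 1;
}

var rawJson = JSON.stringify(results);
var safeJson = escapeClosingTags(rawJson);

var unsafeHtml = renderBootstrapScript(rawJson);
var safeHtml = renderBootstrapScript(safeJson);

console.log(countClosers(unsafeHtml)); // 3 (two of them come from the user's note)
console.log(countClosers(safeHtml));   // 1 (only the one we wrote ourselves)
console.log(JSON.parse(safeJson)[0].notes === results[0].notes); // true
```

In the unescaped version the first `</script>` inside the note ends the script block, so everything after it is handed to the HTML parser.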

Escaping & Unescaping Data with JavaScript

Our goal is simply to HTML encode our data. HTML encoding means converting certain characters into their entity equivalents:

HTML Character    Encoded Version
&                 &amp;
"                 &quot;
'                 &#39;
<                 &lt;
>                 &gt;

However, JavaScript doesn't have a cross-platform HTML encoder/decoder built in. There are techniques that work in the browser, and functions we can write ourselves to help with this on the server.
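As an illustration of the kind of function we could write ourselves, here is a minimal sketch that covers only the five characters listed above. The names `ENTITIES`, `CHARACTERS`, `encodeHtml` and `decodeHtml` are hypothetical; this is not the code Drywall ships, and a maintained library is probably a better choice for real use.

```javascript
// Map each character from the table to its HTML entity.
var ENTITIES = { '&': '&amp;', '"': '&quot;', "'": '&#39;', '<': '&lt;', '>': '&gt;' };

// Build the reverse map (entity -> character) for decoding.
var CHARACTERS = {};
Object.keys(ENTITIES).forEach(function (ch) {
  CHARACTERS[ENTITIES[ch]] = ch;
});

// Replace each special character with its entity in a single pass.
function encodeHtml(text) {
  return String(text).replace(/[&"'<>]/g, function (ch) {
    return ENTITIES[ch];
  });
}

// Replace each known entity with its character in a single pass,
// so "&amp;lt;" decodes to "&lt;" rather than "<".
function decodeHtml(text) {
  return String(text).replace(/&(?:amp|quot|#39|lt|gt);/g, function (entity) {
    return CHARACTERS[entity];
  });
}

console.log(encodeHtml('<b>"Tom" & \'Jerry\'</b>'));
// &lt;b&gt;&quot;Tom&quot; &amp; &#39;Jerry&#39;&lt;/b&gt;
```

Because both functions are plain string replacements, the same sketch runs unchanged in the browser and in Node.js.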

Luckily for us, JavaScript does have escape and unescape functions built in, and they work in the browser and on the server. They don't produce HTML entities, though: they percent-encode, which happens to neutralize the same characters along with many others.

escape

Encodes a string, replacing all characters except for ASCII digits, lower and upper case letters, and the characters * + - . / @ _ with a hexadecimal escape sequence.

Source: https://developer.mozilla.org/en-US/docs/Web/API/Window.escape

unescape

Decodes a value that has been encoded in hexadecimal (e.g., a cookie) including characters not escaped by window.escape.

Source: https://developer.mozilla.org/en-US/docs/Web/API/window.unescape
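A short sketch of what these two functions do to a hostile string may help. The sample string is made up for illustration. Note that MDN now lists both functions as deprecated, so treat this as a description of the approach on this page rather than a recommendation for new code.

```javascript
// escape() percent-encodes everything except ASCII letters, digits
// and the characters * + - . / @ _
var note = '<script>alert("hi")</script> & more';
var encoded = escape(note);

console.log(encoded);
// %3Cscript%3Ealert%28%22hi%22%29%3C/script%3E%20%26%20more

// None of the five HTML-sensitive characters survive the encoding.
console.log(/[&"'<>]/.test(encoded)); // false

// unescape() reverses the encoding exactly.
console.log(unescape(encoded) === note); // true

// Characters above 0xFF use the longer %uXXXX form.
console.log(escape('é')); // %E9
console.log(escape('€')); // %u20AC
```

The `/` is left alone, but since `<` and `>` are always encoded, a `</script>` sequence can no longer appear in the output.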

How We're Doing It

Escaping on the Server

So when we want to load bootstrapped data into our template we just escape our JSONified data like so:

...
res.render('admin/categories/index', { data: { results: escape(JSON.stringify(results)) } });
...

And check this out: when our views make Ajax requests for more or updated data, we don't need to escape anything. The response isn't placed inside HTML, and it is transferred with the application/json content type. We just send it raw like so:

...
res.send(results);
...

Unescaping on the Client

Now on the client, in our view's initialize method, we need to unescape the bootstrapped data before parsing it as JSON like so:

...
this.results = JSON.parse( unescape($('#data-results').html()) );
...
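The whole round trip can be sketched without Express or jQuery by treating each side as a plain function. The helpers `serverPrepare` and `clientParse` and the shape of `results` are hypothetical stand-ins: the first mirrors what is passed to res.render, and the second mirrors what the view does with the string it reads from $('#data-results').html().

```javascript
// Server side (hypothetical helper): the value handed to the template.
function serverPrepare(results) {
  return escape(JSON.stringify(results));
}

// Client side (hypothetical helper): what the view's initialize method
// does with the string it reads back out of the page.
function clientParse(markup) {
  return JSON.parse(unescape(markup));
}

// Made-up data containing quotes, an ampersand and a script injection attempt.
var results = {
  data: [{ name: 'Tom & "Jerry"', notes: '</script><script>alert(1)</script>' }],
  pages: { current: 1, total: 1 }
};

var embedded = serverPrepare(results);

// The embedded string contains no characters that HTML treats specially,
// so the template engine and the browser both leave it untouched.
console.log(/[&"'<>]/.test(embedded)); // false

// The client recovers the original data unchanged.
var restored = clientParse(embedded);
console.log(restored.data[0].name === results.data[0].name);   // true
console.log(restored.data[0].notes === results.data[0].notes); // true
```

Because the escaped string is inert, it does not matter whether the template renders it with the escaped or the unescaped syntax.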

Use the Force

I hope this was helpful. If you have questions or think this page should be expanded please contribute by opening an issue or updating this page.