Having worked as a developer for over ten years before branching into pentesting, I always aimed to bake security into my work, even if I didn’t have the wealth of knowledge that I now have in order to do my day job.

One of the things that always struck me was that avoiding injection flaws such as Cross-Site Scripting (XSS) and SQL Injection (SQLi) also meant getting the product right from a functional point of view. I mean, how good is a web application that doesn’t let the user set their surname to O'Leary, or use any of the other fields that may legitimately contain “special characters”? This applies across the whole range of possible inputs and application functionality, from storing code snippets (think Stack Overflow) to binary or other script data.

To explain, say that the website outputs the following into an HTML page:

<script>
var surname = '<user supplied>';
</script>

And say Des O’Connor dutifully comes to your site to register. Now, if your app isn’t taking into account the “special character” in his name, this JavaScript falls over:

<script>
var surname = 'O'Connor';
</script>

Uncaught SyntaxError: Unexpected identifier

Being aware that characters can be input that will disrupt your site is the first step to ensuring your application is secure against injection flaws. Notice the word “can”. Of course, users who just want to use your app rather than attack it won’t do this on purpose, but you should never trust all users to act in such a way and, as demonstrated above, functional flaws (rather than security flaws) can be unearthed even by trusted users.
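A quick way to see this for yourself outside a browser is sketched below. The helper names (renderInlineScript, parses) are my own for illustration and not part of any framework; new Function is used only to compile the generated source, so nothing is actually executed.

```javascript
// Hypothetical helper: builds the same inline script body as the example
// above by naive string concatenation, with no escaping at all.
function renderInlineScript(surname) {
  return "var surname = '" + surname + "';";
}

// Returns true if the generated source parses as JavaScript.
// new Function() only compiles the source here; the result is never called.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

for (const name of ["Smith", "O'Connor"]) {
  const verdict = parses(renderInlineScript(name)) ? "ok" : "breaks the page script";
  console.log(name + ": " + verdict);
}
```

A check like this with a handful of perfectly ordinary names is a cheap functional test that also happens to flag a likely injection point.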

The same applies with SQL injection if prepared statements are not used.

INSERT INTO user(surname) VALUES('O'Connor');

You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'Connor')' at line 1
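The difference between the two approaches can be sketched without a database. This is only an illustration: both function names and the { sql, params } shape are my own assumptions, and the ? placeholder follows the MySQL style used in the error above. A real driver would take the SQL text and the values separately and bind them itself.

```javascript
// Naive approach: user data is pasted into the SQL text itself.
function buildNaiveInsert(surname) {
  return "INSERT INTO user(surname) VALUES('" + surname + "');";
}

// Parameterised shape: the SQL text is constant and the data travels
// separately. The { sql, params } object is a hypothetical structure used
// here for illustration only.
function buildParameterisedInsert(surname) {
  return { sql: "INSERT INTO user(surname) VALUES(?);", params: [surname] };
}

console.log(buildNaiveInsert("O'Connor"));
console.log(buildParameterisedInsert("O'Connor"));
```

With the parameterised form the statement text never changes, so the apostrophe in the surname cannot alter its syntax.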

In some of the applications I have tested the developers do not seem to have realised that their application can be broken so easily, even by a non-malicious user simply setting their name.

Remediation

So if the first step is realising that your app can be broken by such input, the second is knowing how to fix it. On first take, it appears that the JavaScript problem can be solved by escaping ' to \'.

<script>
var surname = 'O\'Connor';
</script>

(Fiddle)

Problem solved?

Not quite. If the above is rendered for another user (say an admin, to make things more juicy), then a malicious user could set their surname to O\';alert("i haz XSS'd you")// to achieve stored XSS. If only the current user is affected we still have XSS, but only self-XSS.

So in our admin scenario we get this rendered:

<script>
var surname = 'O\\';alert("i haz XSS\'d you")//';
</script>

(Fiddle)
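The bypass can be reproduced with a few lines of JavaScript. This is a sketch under my own assumptions: escapeQuoteOnly stands in for whatever the application does server-side, and countAlerts is a hypothetical harness that runs the generated script body with a stub in place of alert().

```javascript
// First-attempt fix from the text: escape ' as \' and nothing else.
function escapeQuoteOnly(value) {
  return value.replace(/'/g, "\\'");
}

// Hypothetical test harness: runs the generated script body with a stub
// in place of alert() and reports how many times the stub was called.
function countAlerts(scriptBody) {
  let calls = 0;
  new Function("alert", scriptBody)(() => { calls += 1; });
  return calls;
}

// The attacker's surname, exactly as typed: O\';alert("i haz XSS'd you")//
const attackerSurname = String.raw`O\';alert("i haz XSS'd you")//`;
const scriptBody = "var surname = '" + escapeQuoteOnly(attackerSurname) + "';";

console.log(scriptBody);
console.log("alert calls:", countAlerts(scriptBody));
```

The attacker's own backslash pairs up with the one the application adds, which leaves the quote unescaped and the string closed early.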

Of course you need to use your imagination here and realise the attacker won’t just inject a simple alert; they are more likely to use something such as BeEF or their own custom script to compromise the session (frightfully easy if there are session cookies without the HttpOnly flag).

So the next step is to escape \ first:

<script>
var surname = 'O\\\';alert("i haz XSS\'d you")//';
</script>

(Fiddle)
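As a sketch of that ordering (again with helper names of my own choosing), the backslash is escaped before the quote so that the backslashes added for quotes are not themselves doubled. The run helper is a hypothetical harness that stubs alert() and reads the variable back.

```javascript
// Escape backslashes first, then single quotes.
function escapeBackslashThenQuote(value) {
  return value.replace(/\\/g, "\\\\").replace(/'/g, "\\'");
}

// Hypothetical test harness: runs the script body with a stubbed alert()
// and returns both the number of alert calls and the resulting variable.
function run(scriptBody) {
  let alerts = 0;
  const surname = new Function("alert", scriptBody + "\nreturn surname;")(() => { alerts += 1; });
  return { alerts, surname };
}

// The attacker's surname, exactly as typed: O\';alert("i haz XSS'd you")//
const attackerSurname = String.raw`O\';alert("i haz XSS'd you")//`;
const scriptBody = "var surname = '" + escapeBackslashThenQuote(attackerSurname) + "';";

const result = run(scriptBody);
console.log(scriptBody);
console.log("alert calls:", result.alerts);
console.log("value preserved:", result.surname === attackerSurname);
```

The payload now ends up as harmless data inside the string, and the value the script sees is exactly what the user typed.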

All well and good then: the only alert box shown is the one the application developers want, and not the one injected by the attacker. Problem solved this time?

Then the web’s most prolific hacker comes along and stuffs in </script><script>alert("All your base are belong to us")</script>.

<script>
var surname = '</script><script>alert("All your base are belong to us")</script>';
</script>

(Fiddle)

which breaks things again. Although our escaping of ' and \ solved the problem in a JavaScript context, we need to be aware that the browser processes HTML before JavaScript. A closing </script> therefore tells the browser to switch from a JavaScript context back to HTML, where the attacker can simply start another script block.
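To make the ordering concrete, here is a rough sketch. splitAtFirstScriptEnd is a deliberately simplified stand-in for the HTML parser (an assumption of mine, nothing like a full implementation): it just cuts the page at the first closing script tag, which is the part of the browser's behaviour that matters here.

```javascript
// Same escaping as before: backslashes first, then single quotes.
function escapeBackslashThenQuote(value) {
  return value.replace(/\\/g, "\\\\").replace(/'/g, "\\'");
}

// Simplified stand-in for the HTML parser: the first script element ends
// at the first closing script tag, whatever JavaScript string it sits in.
function splitAtFirstScriptEnd(html) {
  const openTag = "<script>";
  const bodyStart = html.indexOf(openTag) + openTag.length;
  const afterOpen = html.slice(bodyStart);
  const end = afterOpen.search(/<\/script[\s/>]/i);
  if (end === -1) return { scriptBody: afterOpen, remainingHtml: "" };
  return { scriptBody: afterOpen.slice(0, end), remainingHtml: afterOpen.slice(end) };
}

const payload = '</script><script>alert("All your base are belong to us")</script>';
const page = "<script>\nvar surname = '" + escapeBackslashThenQuote(payload) + "';\n</script>";

const parts = splitAtFirstScriptEnd(page);
console.log(JSON.stringify(parts.scriptBody));
console.log(parts.remainingHtml);
```

The payload contains neither ' nor \, so the escaping leaves it untouched. The first script block is cut short in the middle of the string, and the attacker's block is then parsed as a script element in its own right.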

That’s why it’s important to follow the OWASP XSS prevention guidelines as closely as possible; in this case Rule #3 applies:

Except for alphanumeric characters, escape all characters less than 256 with the \xHH format

<script>
var surname = '\x3c\x2fscript\x3e\x3cscript\x3ealert\x28\x22All\x20your\x20base\x20are\x20belong\x20to\x20us\x22\x29\x3c\x2fscript\x3e';
</script>

(Fiddle)
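A minimal encoder along the lines of Rule #3 might look like the sketch below. The function name is my own, and the \uHHHH handling for characters at or above 256 is an assumption that goes beyond the rule as quoted; for real applications a maintained encoding library is the safer choice.

```javascript
// Sketch of Rule #3: leave ASCII alphanumerics alone and escape every other
// character below 256 as \xHH. Characters at or above 256 are written as
// \uHHHH per UTF-16 code unit, which is an addition of mine and not part
// of the rule as quoted above.
function escapeForJsString(value) {
  const text = String(value);
  let out = "";
  for (let i = 0; i < text.length; i += 1) {
    const ch = text[i];
    const code = text.charCodeAt(i);
    if (/^[A-Za-z0-9]$/.test(ch)) {
      out += ch;
    } else if (code < 256) {
      out += "\\x" + code.toString(16).padStart(2, "0");
    } else {
      out += "\\u" + code.toString(16).padStart(4, "0");
    }
  }
  return out;
}

const payload = '</script><script>alert("All your base are belong to us")</script>';
console.log("var surname = '" + escapeForJsString(payload) + "';");
```

Because the output contains only alphanumerics and backslash escapes, there is no quote, angle bracket or slash left for either the HTML parser or the JavaScript parser to act on.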

Notice from the fiddle how everything within our application’s alert is properly formatted. The output is functionally correct, as well as being secure against injection. The great thing about the OWASP recommendation is that it works for untrusted data within <script> tags, and for untrusted data within HTML attributes, whether double quoted, single quoted or unquoted (as long as it’s in the “Rule #3” context of course). Other solutions such as HTML encoding only work inside <script> tags and not in attributes. This is because the HTML parser runs before the JavaScript engine and decodes HTML entities (e.g. &quot;) in attribute values, which it does not do inside <script> tags. So if you move some working code from a <script> block to an event handler, you could become vulnerable.
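The attribute problem can be sketched as follows. decodeAttribute is a simplified stand-in of my own for what the HTML parser does to an attribute value (it only handles decimal numeric entities), and countAlerts is a hypothetical harness with a stubbed alert().

```javascript
// HTML-encode the characters that are significant in HTML.
function htmlEncode(value) {
  return value.replace(/[&<>"']/g, (ch) => "&#" + ch.charCodeAt(0) + ";");
}

// Simplified stand-in for the HTML parser's handling of attribute values:
// decimal numeric entities are decoded before the JavaScript engine sees them.
function decodeAttribute(value) {
  return value.replace(/&#(\d+);/g, (_, n) => String.fromCharCode(Number(n)));
}

// Hypothetical test harness: counts calls to a stubbed alert().
function countAlerts(source) {
  let calls = 0;
  new Function("alert", source)(() => { calls += 1; });
  return calls;
}

const payload = "';alert(1)//";
const generated = "var surname = '" + htmlEncode(payload) + "';";

// Inside a <script> block the source reaches the JavaScript engine as-is.
console.log("script block alerts:", countAlerts(generated));
// Inside an event handler attribute the entities are decoded first.
console.log("event handler alerts:", countAlerts(decodeAttribute(generated)));
```

The same generated code is inert in one context and exploitable in the other, which is why the encoding has to match the context the data finally lands in.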

Conclusion

I guess what I’m saying is that many injection flaws can be identified in an application by the fact that it is not properly functional in the first place.

Indicators of the security flaw can be “smelt” by the following application behaviours:

- Odd behaviour when single or double quotes are entered:
  - Display/UI issues (possible XSS).
  - Items not saving properly (possible SQLi).
- Markup (e.g. &amp;) being displayed in the UI.
  - This could indicate the wrong type of encoding is being used (HTML encoding instead of \xHH hex encoding), which may be vulnerable to injection.

And once found, it is important to fix them in the correct way. As shown above, a fix aimed only at the functional issue can leave edge cases that only an attacker would be likely to use (the </script> breakout). Other “fixes” to watch out for include client-side escaping or filtering; the problem should be fixed server-side using output encoding or parameterised queries.