Re: [xsl] XSL Injection, is it possible?

Subject: Re: [xsl] XSL Injection, is it possible?
From: "G. T. Stresen-Reuter" <tedmasterweb@xxxxxxx>
Date: Tue, 30 May 2006 12:56:27 +0100

Thank you for the excellent reply. See my comments below.

On May 29, 2006, at 11:53 PM, David Carlisle wrote:

Currently my sanitizing function just escapes <, >, ', and " in the
If you are taking in a string and want to ensure that it is encoded in
XML as itself (in character data) rather than as markup, then you need
to escape < and & (and > if it follows ]]). You don't need to escape " or
' unless you are putting the string in attribute values.
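
To make the two contexts concrete, here is a minimal Python sketch. It assumes the sanitizing function could be written in Python and relies on the standard library's xml.sax.saxutils; the names to_character_data and to_attribute are made up for illustration and do not come from this thread.

```python
from xml.sax.saxutils import escape, quoteattr


def to_character_data(text: str) -> str:
    """Escape text so it can be placed inside an XML element as character data."""
    # escape() replaces & first, then < and >. Quotes are left untouched,
    # which is fine for element content.
    return escape(text)


def to_attribute(text: str) -> str:
    """Return text as a complete, quoted XML attribute value."""
    # quoteattr() escapes &, < and >, then picks a quote character and
    # escapes embedded quotes only when it has to.
    return quoteattr(text)


user_input = 'a < b && c > d "quoted"'
print(f"<item>{to_character_data(user_input)}</item>")
print(f"<item label={to_attribute(user_input)}/>")
```

Note that escape() always escapes >, which is stricter than strictly required but harmless.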

Excellent clarification. In some instances users are allowed to insert XHTML and for those instances I'm running standard HTML input sanitization routines (encoding potentially dangerous elements as entities, for example).

Are these characters recognized by the XSLT engine
if they are hex or unicode encoded?

All XML text is Unicode encoded in one way or another, so it's not quite
clear what you mean there. Encoding issues are resolved by the XML
parser before XSLT really sees the input. If you are taking unknown text
you should be escaping & as &amp; so then a character ref such as &#xa0;
would be escaped to &amp;#xa0;.
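
The point about character references can be checked with a small Python sketch. It assumes Python's standard xml.etree.ElementTree parser stands in for whatever parser feeds the XSLT engine; parse_paragraph is a hypothetical helper, not something from this thread. It shows two things: a character reference for < is resolved by the parser into plain text, not markup, and once & is escaped the reference is no longer resolved at all.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def parse_paragraph(content: str) -> tuple[str, int]:
    """Parse <p>content</p> and return (text seen after parsing, number of child elements)."""
    root = ET.fromstring(f"<p>{content}</p>")
    return root.text or "", len(root)


raw = "&#x3C;script&#x3E;"  # hex character references for < and >

# Inserted as-is: the parser resolves the references into text.
# No <script> element is created.
print(parse_paragraph(raw))

# Inserted after escaping &: the references survive as literal text.
print(parse_paragraph(escape(raw)))
```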

It's not clear what I mean because the whole Unicode/UTF-n topic is unclear to me, in spite of how much I read about it, but I understand what you're saying and you seem to have understood where I'm coming from. The bottom line is that I want to avoid the kinds of attacks that are common in URLs, where the less-than and greater-than symbols of a SCRIPT element can be URL-encoded and, in some browsers/servers, go undetected.
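
For the URL case, one possible approach is to decode the value before escaping it, so the check runs on the same characters the downstream consumer will see. This is a rough Python sketch under the assumption that the value reaches the code still percent-encoded (many frameworks decode it once already, in which case the unquote step should be dropped); sanitize_query_value is a hypothetical name.

```python
from urllib.parse import unquote
from xml.sax.saxutils import escape


def sanitize_query_value(raw_value: str) -> str:
    """Percent-decode a URL value once, then escape it for XML character data."""
    decoded = unquote(raw_value)  # %3C becomes <, %26 becomes &, and so on
    return escape(decoded)        # & < > become &amp; &lt; &gt;


print(sanitize_query_value("%3Cscript%3Ealert(1)%3C%2Fscript%3E"))
```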

but I was wondering if anyone knows of other vectors by which
attackers can enter

Attacks are as likely to come from what is inserted into XML character data as from any XML markup that is inserted. Specifically, if the stylesheets are generating HTML and there is a danger of script being inserted, you need to quote (or disable) possible script syntax.

Yes. These situations are handled with standard HTML sanitizing routines prior to insertion, but it did make me wonder what other doors I might be leaving open by providing users with completely valid XHTML on the output. This article, in particular, opened my eyes to what is possible with JavaScript. Now that more and more browsers are shipping with XSLT processors built in (or could ship that way), it opens the door for client-side processing with somewhat unpredictable results, doesn't it?

Thanks again for your concise reply!

Ted Stresen-Reuter
