Re: [xsl] Saxon vulnerability

Subject: Re: [xsl] Saxon vulnerability
From: "Martin Honnen martin.honnen@xxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sat, 8 Mar 2025 13:30:01 -0000
On 08/03/2025 13:55, Roger L Costello costello@xxxxxxxxx wrote:
Hi Folks,



Below is my writeup of the vulnerability. Please let me know of any
inaccuracies. /Roger


I think the fact that an XML document can declare entities that access the file system is a known feature, not necessarily a security vulnerability. These days, however, the use of DTDs/DOCTYPEs is often considered an entry point for XXE attacks, so in some XML stacks (e.g. .NET) DTD use is prohibited or ignored by default.

This particular vulnerability is more about Saxon having a configuration
property allowedProtocols that you can set to e.g. "https,http" to
explicitly allow only HTTPS and HTTP URIs to be resolved in the
context of XML/XSLT/XQuery/XPath, while file URI access should fail.
However, currently, while I think that with such a setting doing e.g.

unparsed-text('file:///etc/password')

is blocked, the resolver (chain?) Saxon sets up detects the prohibited
file URI access in the entity resolution but somehow fails to block the
parsing of the XML (or at least the external entity referencing a local
file).
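
One way to see which of the two access paths is actually blocked is a small probe stylesheet. The following is only a minimal sketch, assuming an XSLT 3.0 processor that supports xsl:try and assuming allowedProtocols has already been set to "https,http" through whatever configuration mechanism you use (not shown here). The parameter name probe-uri and the result element names are made up for illustration; the default file URI is the one used elsewhere in this thread.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:err="http://www.w3.org/2005/xqt-errors"
                exclude-result-prefixes="#all"
                version="3.0">

  <!-- Hypothetical parameter: the file URI whose access is being probed -->
  <xsl:param name="probe-uri" as="xs:string"
             select="'file:///Windows/win.ini'"/>

  <xsl:template name="xsl:initial-template">
    <probe>
      <!-- Path 1: direct file URI access through unparsed-text() -->
      <unparsed-text>
        <xsl:try>
          <xsl:value-of select="unparsed-text($probe-uri)"/>
          <xsl:catch>
            <xsl:value-of select="'failed: ' || $err:description"/>
          </xsl:catch>
        </xsl:try>
      </unparsed-text>
      <!-- Path 2: file URI access through an external entity in parse-xml() -->
      <parse-xml>
        <xsl:try>
          <xsl:value-of select="
            parse-xml('&lt;!DOCTYPE root [&lt;!ENTITY xxe SYSTEM &quot;'
                      || $probe-uri
                      || '&quot;>]>&lt;root>&amp;xxe;&lt;/root>')"/>
          <xsl:catch>
            <xsl:value-of select="'failed: ' || $err:description"/>
          </xsl:catch>
        </xsl:try>
      </parse-xml>
    </probe>
  </xsl:template>

</xsl:stylesheet>
```

If the restriction worked on both paths, both elements would report a failure. A failure can also just mean the file does not exist, so the error description needs to be read before drawing conclusions.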





Below is an XSLT program that reads the Windows/win.ini file. A bad actor
could use the program to read and display the contents of any file on your
machine. This is a vulnerability. The SAXON team is working to fix this
vulnerability.



Explanation of how the vulnerability works




Sometimes you write an XSLT program that dynamically builds XML. The
XML--which is a string--may then be dynamically processed using the XPath
parse-xml(string) function. Let's dig into dynamically generated XML that can
read arbitrary files on your machine.
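
As a minimal illustration of that pattern, the sketch below builds a harmless XML string and hands it to parse-xml(). The variable name xml-string and the sample greeting markup are made up for illustration. Note that < must be escaped as &lt; because the string sits inside an XML attribute.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="3.0">

  <xsl:template name="xsl:initial-template">
    <!-- At this point the XML is just a string -->
    <xsl:variable name="xml-string"
                  select="'&lt;greeting>Hello&lt;/greeting>'"/>
    <Results>
      <!-- parse-xml() turns the string into a document node -->
      <xsl:sequence select="parse-xml($xml-string)"/>
    </Results>
  </xsl:template>

</xsl:stylesheet>
```

The parsed greeting element ends up as a child of Results. Nothing dangerous happens here because the string contains no DOCTYPE.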



Recall that XML has five built-in entities: lt for the < symbol, gt for the >
symbol, amp for the ampersand symbol, quot for the " symbol, and apos for
the ' symbol. You can create your own user-defined entities using <!ENTITY
args>, where args is the name of the new entity--e.g., xxe (not a very
readable entity name, that's okay)--followed by the value for the entity. The
value may be given in-line as a string, or a file may be referenced to provide
the value. Let's assign xxe the value of the Windows/win.ini file. Follow xxe
with the keyword SYSTEM and then the location to the file. Here's how to
create a user-defined xxe entity whose value is the content of the
Windows/win.ini file:



<!ENTITY xxe SYSTEM "file:///Windows/win.ini">




Place that entity declaration inside a DOCTYPE declaration:



<!DOCTYPE root [

<!ENTITY xxe SYSTEM "file:///Windows/win.ini">

]>



The DOCTYPE comes before the XML document's root element. Here is XML which
uses--displays--the value of the xxe entity:



<root>&xxe;</root>
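
Put together, the pieces above form one complete document. The sketch below assembles them in order and, for contrast, adds an entity whose value is given in-line; that greeting entity is a made-up example and not part of the original writeup.

```xml
<!DOCTYPE root [
  <!-- Internal entity: value given in-line (hypothetical example) -->
  <!ENTITY greeting "Hello">
  <!-- External entity: value taken from the referenced file -->
  <!ENTITY xxe SYSTEM "file:///Windows/win.ini">
]>
<root>&greeting; &xxe;</root>
```

A parser that resolves external entities would replace &xxe; with the content of the referenced file, while &greeting; is always replaced with the in-line string.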




With that technical background, the following XSLT program should be
understandable.

----------------------------------------------------------------------
XSLT program that could be exploited to read--and output--any file on
your machine.
----------------------------------------------------------------------

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                exclude-result-prefixes="#all"
                version="3.0">

  <xsl:template match="/">
    <Results>
      <xsl:sequence select="
        parse-xml(
          '
          &lt;!DOCTYPE root [
            &lt;!ENTITY xxe SYSTEM &quot;file:///Windows/win.ini&quot;>
          ]>
          &lt;root>&amp;xxe;&lt;/root>
          '
        )
      "/>
    </Results>
  </xsl:template>

</xsl:stylesheet>
