RE: [xsl] double escaping problem [re-visited]

Subject: RE: [xsl] double escaping problem [re-visited]
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Tue, 13 Nov 2007 09:14:21 -0000
> Hmmm.  I was afraid of that.  I am still baffled as to how to 
> go about telling my stylesheet that the input it gets from a 
> particular source tree by way of the document() function that 
> it will have already been escaped and therefore that '&amp;' 
> need not be escaped again (making it '&amp;amp;').
> 

The document() function invokes an XML parser and it can only do what an XML
parser does.

In fact an XML parser removes one level of escaping, and a serializer adds
it back. So the parser turns "&amp;" into "&" and "&amp;amp;" into "&amp;",
and the serializer turns them back into "&amp;" and "&amp;amp;"
respectively, unless disable-output-escaping (d-o-e) is set, in which case
they are written out as "&" and "&amp;" respectively. All the evidence is that your XML source as read
by the parser was actually double-escaped. This quite often happens when you
have fragments of XML stored in a database: if you try to extract it as XML,
and the database software doesn't realise that it's already in XML format,
then the database software adds a level of escaping that you don't want. The
way to get rid of it is to change the way you do the database query.
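
To see the round trip for yourself, here is a minimal sketch in Python using
the standard library's ElementTree parser and serializer. It is only an
illustration of the principle, not your XSLT processor: the element name
<t> and the helper round_trip() are made up for the example, and an XSLT
processor's serializer will differ in detail, though it handles "&" in text
the same way.

```python
import xml.etree.ElementTree as ET

def round_trip(xml_text):
    """Parse xml_text and return (text the application sees, re-serialized XML).

    Hypothetical helper for illustration only.
    """
    elem = ET.fromstring(xml_text)                        # parser removes one level of escaping
    serialized = ET.tostring(elem, encoding="unicode")    # serializer adds one level back
    return elem.text, serialized

# Correctly escaped source: the application sees a bare ampersand.
print(round_trip("<t>&amp;</t>"))        # ('&', '<t>&amp;</t>')

# Double-escaped source: the application sees the five characters "&amp;",
# and the serializer faithfully writes them back out as "&amp;amp;".
print(round_trip("<t>&amp;amp;</t>"))    # ('&amp;', '<t>&amp;amp;</t>')
```

If the second case matches what you see in your output, the extra level of
escaping was already in the document before the stylesheet ever read it.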

Michael Kay 
