RE: [xsl] Variables and HTML

Subject: RE: [xsl] Variables and HTML
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Thu, 10 Mar 2005 23:47:11 -0000
> <xsl:variable name="italicOpen">&lt;i></xsl:variable>

I think that in general double-markup (markup disguised as text) is a bad
idea, because it's very confusing and not well supported by tools. It's much
better wherever possible to exploit the fact that XML is fully hierarchic,
so markup can always be nested.
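
As a rough sketch of the nested alternative: instead of holding "&lt;i>" and
a matching close tag in variables, the i element can be written as a literal
result element. The element name "title" is only an assumption for
illustration; it does not come from the original question.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- "title" is a hypothetical source element used for illustration -->
  <xsl:template match="title">
    <!-- Literal result element: creates a real i element node in the
         result tree, so the start and end tags cannot get out of step -->
    <i>
      <xsl:value-of select="."/>
    </i>
  </xsl:template>

</xsl:stylesheet>
```

Because the i element is a node rather than text, the serializer writes both
tags itself and no escaping question arises.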

However, the problem does come up quite often. Sometimes it's done for bad
reasons, but there are also some plausible reasons:

(a) the inner markup is HTML and is not well-formed XML

(b) the inner markup represents a complete XML document containing a DTD
internal subset
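
To make the two cases concrete, here are two made-up input documents. All
element names and content are assumptions for illustration. In (a) the HTML
has an unclosed br, so it could not be nested as real XML elements. In (b)
the inner document has its own XML declaration and DOCTYPE, which are only
allowed at the start of a document, so it has to travel as text (here in a
CDATA section).

```xml
<!-- (a) HTML that is not well-formed XML, carried as escaped text -->
<item>
  <body>&lt;p>First line&lt;br>second line&lt;/p></body>
</item>
```

```xml
<!-- (b) a complete XML document with a DTD internal subset, carried as text -->
<envelope>
  <payload><![CDATA[<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY company "Example Ltd">
]>
<note>&company;</note>]]></payload>
</envelope>
```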

The process of turning angle brackets into nodes is called parsing. Parsing
also turns an &lt; escape sequence into an angle bracket. If nested markup
like this is to be manipulated by XSLT, then it needs to be turned into
nodes. To get from &lt; via < to a node you need to parse it twice: the XML
parser does the first pass when it reads the source document, and you can do
the second using an extension such as saxon:parse().
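
A minimal sketch of the second parse, assuming a Saxon release that provides
saxon:parse() in the http://saxon.sf.net/ namespace (the namespace URI and
availability differ between Saxon versions, so check the documentation for
yours). The element name "payload" is hypothetical. Note that this only
works when the wrapped text is well-formed XML, so it does not help with
case (a).

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <!-- "payload" is a hypothetical element whose text content is
       escaped, well-formed XML -->
  <xsl:template match="payload">
    <!-- The first parse (by the XML parser) has already turned &lt; into <.
         The second parse turns that string into a document node. -->
    <xsl:variable name="inner" select="saxon:parse(string(.))"/>
    <!-- The result is a tree, so it can be processed like any other input -->
    <xsl:apply-templates select="$inner/*"/>
  </xsl:template>

</xsl:stylesheet>
```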

The reverse of parsing is serialization. During serialization, nodes are
turned into angle brackets, and angle brackets in text are turned into escape
sequences (entity references such as &lt;).
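
A small illustration of that round trip, as a sketch (the element name "out"
is made up). The stylesheet itself is parsed, so the &lt; inside xsl:text
becomes a text node containing a real < character. The serializer then
escapes it again on the way out.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml"/>

  <xsl:template match="/">
    <!-- After the stylesheet is parsed, this text node holds: a < b -->
    <out><xsl:text>a &lt; b</xsl:text></out>
    <!-- The serializer writes the element as: <out>a &lt; b</out> -->
  </xsl:template>

</xsl:stylesheet>
```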

You want the &lt; in the input to become < in the output. This means you
either need to parse it twice and serialize it once, or you need to parse it
once and bypass the usual action of the serializer - which is what
disable-output-escaping does.
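
The second route might look like the sketch below, assuming a hypothetical
"body" element that holds escaped HTML as text. It relies on the processor
doing the serialization itself and on the processor supporting the
attribute, which is optional.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- "body" is a hypothetical element holding escaped HTML as text -->
  <xsl:template match="body">
    <!-- Ask the serializer to write the text as-is, without turning
         each < back into &lt; -->
    <xsl:value-of select="." disable-output-escaping="yes"/>
  </xsl:template>

</xsl:stylesheet>
```

If the result tree is passed on to another component instead of being
serialized (a DOM result, for example), the request is typically lost and
the text arrives escaped.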

Disable-output-escaping is generally derided because it's so frequently
misused by beginners who haven't understood that XSLT is dealing with trees.
But there are use cases for it, and a prime one is extracting HTML documents
or fragments that have been wrapped in an XML wrapper. It's very problematic
architecturally because it distorts the interface between the transformer
and the serializer, and that's why not every processor supports it. However,
there are cases where it's the best solution available.

Michael Kay
