Re: [xsl] url encoding of ampersands

Subject: Re: [xsl] url encoding of ampersands
From: Mike Brown <mike@xxxxxxxx>
Date: Mon, 26 Feb 2001 10:22:37 -0700 (MST)

Sivan Mozes wrote:
> Question: if the xslt processor was passed this character from the xml
> parser, how is it selected for conversion as opposed to other characters, by
> range?

Yes, though the details are up to the processor implementation. For
compatibility with Netscape, processors tend to emit named entity
references for characters 128-255 and numeric character references for
characters 256 and higher. This applies only when the output method is HTML.
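
To make the range-based selection concrete, here is a minimal Python sketch
of how an HTML serializer might choose between the two forms. It is not
taken from Xalan or any other processor; the function names are made up for
the example, and the exact ranges and fallbacks are assumptions based on the
behavior described above.

```python
from html.entities import codepoint2name


def escape_html_char(ch: str) -> str:
    """Escape one character the way an HTML serializer might (illustrative only)."""
    cp = ord(ch)
    # Markup-significant characters are always escaped.
    if ch == "&":
        return "&amp;"
    if ch == "<":
        return "&lt;"
    if ch == ">":
        return "&gt;"
    # Plain ASCII passes through untouched.
    if cp < 128:
        return ch
    # Assumed policy: 128-255 get a named entity reference when HTML defines one.
    if cp < 256 and cp in codepoint2name:
        return f"&{codepoint2name[cp]};"
    # Everything else (including 128-159, which have no names) gets a
    # numeric character reference.
    return f"&#{cp};"


def escape_html(text: str) -> str:
    return "".join(escape_html_char(ch) for ch in text)


print(escape_html("caf\u00e9 & \u03a9"))  # caf&eacute; &amp; &#937;
```

A real processor may draw the lines differently (for example, emitting the
characters directly when the output encoding can represent them).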

> Shouldn't there be a way to be more specific in HTML output mode in regards
> to ampersand handling? (using xerces/xalan).

The idea is that you shouldn't have to think about such things: whatever
the output method, XML or HTML, you can rely on the output being
well-formed (or HTML's equivalent of well-formed).

That is why the disable-output-escaping attribute on xsl:text and
xsl:value-of is held in low regard; it was added to the spec at the last
minute, and it makes it possible to force a processor to emit something
that can't be read back in by a conforming XML parser or HTML user agent.
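
The effect is easy to reproduce outside XSLT. The following Python sketch
(standard library only; the URL and the helper name are invented for the
example) compares text written with normal escaping against the same text
written raw, which is roughly what disabling output escaping amounts to.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical URL containing a bare ampersand.
url = "http://example.com/?a=1&b=2"

# What a serializer normally emits: the ampersand becomes &amp;
escaped_doc = f"<link>{escape(url)}</link>"

# What you get if escaping is bypassed: the ampersand is written as-is.
raw_doc = f"<link>{url}</link>"


def is_well_formed(doc: str) -> bool:
    """Return True if an XML parser can read the document back in."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(escaped_doc))  # True
print(is_well_formed(raw_doc))      # False
```

Parsing the escaped version gives back the original URL with its bare
ampersand, so nothing is lost by letting the serializer do the escaping.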

> I eventually wrapped all my entities in CDATA so I can later on encode the
> ampersands, and did another assignment pass w/ disable-output-escaping to
> parse these entities for displaying link content.
> Although it works, I don't like the idea of introducing an exception into
> the xml itself, which is being handled by non-techies.

...and of course, a CDATA section strips whatever is in it of any logical
structure; the content is just a run of character data. For example,
&foo; in a CDATA section means the 5 characters & f o o ; rather than a
reference to a general parsed entity named foo. By using CDATA you are
saying you want them to be characters, not markup, so ideally your
serializer should never emit them in a way that lets them be
misinterpreted as markup. But of course, that *is* what you are
asking for. I'm just explaining why you don't have the level of control
you want.
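
As a rough illustration of the difference, this Python sketch parses the
same five characters once inside a CDATA section and once as a real entity
reference. The element name p and the replacement text "bar" are made up
for the example; Python's ElementTree stands in for whatever parser and
serializer you actually use.

```python
import xml.etree.ElementTree as ET

# Inside a CDATA section, &foo; is just five characters of text.
cdata_doc = "<p><![CDATA[&foo;]]></p>"

# Outside CDATA, &foo; is a reference to an entity declared in the DTD.
entity_doc = '<!DOCTYPE p [<!ENTITY foo "bar">]><p>&foo;</p>'

cdata_text = ET.fromstring(cdata_doc).text
entity_text = ET.fromstring(entity_doc).text

print(repr(cdata_text), len(cdata_text))  # '&foo;' 5
print(repr(entity_text))                  # 'bar'

# Serializing the CDATA-derived text escapes the ampersand, so the output
# cannot be misread as an entity reference.
reserialized = ET.tostring(ET.fromstring(cdata_doc), encoding="unicode")
print(reserialized)  # <p>&amp;foo;</p>
```

The last line is the behavior described above: once the parser has reported
the content as characters, a well-behaved serializer writes it back out as
characters.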

> Only for specific elements CDATA needs to be used for entities while
> everywhere else, entities are handled in a standard manner. I don't
> think this can be specified in the DTD.

That's correct.

   - Mike
Mike J. Brown, software engineer in Denver, Colorado, USA
