Re: [xsl] Can I suppress entity substitution in XSLT?

Subject: Re: [xsl] Can I suppress entity substitution in XSLT?
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Sat, 26 Jul 2003 14:27:31 -0400
Taro,

To add to the picture Ken describes:

XSLT operates only on the XPath tree, which knows nothing of entities. How characters are represented in an output file, once a serializer writes that tree out, is therefore generally outside the scope of the XSLT processor. This separation gives a clean interface between the "tree transformation" XSLT performs and the business of file-writing (XML, HTML or anything else). The down side is that XSLT cannot directly answer requirements like "keep my entity references (to characters)". The up side is that we know where to turn if we *do* want to represent characters in serialized output in an unorthodox way (not "keep" the references, but write them out wherever we can) -- namely, the serializer.
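To make the separation concrete, here is a small sketch using Python's standard xml.etree.ElementTree rather than an XSLT processor (that substitution is mine, as are the sample document and variable names). The parser resolves every spelling of the character before any tree consumer sees it, and the serializer picks a representation on the way out without regard to the input syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: four spellings of U+00E9 (a declared entity, a decimal
# reference, a hex reference, and the literal character).
source = (
    '<!DOCTYPE p [<!ENTITY eacute "&#233;">]>'
    '<p>&eacute;&#233;&#xE9;\u00e9</p>'
)
root = ET.fromstring(source)

# The parser has already made the substitutions; the tree holds four
# identical characters and no record of how each one was written.
same_characters = root.text == "\u00e9" * 4

# The serializer alone decides the output representation.
unicode_out = ET.tostring(root, encoding="unicode")  # literal characters
ascii_out = ET.tostring(root, encoding="us-ascii")   # numeric references

print(same_characters)
print(unicode_out)
print(ascii_out)
```

The same division of labour applies to an XSLT pipeline: the stylesheet sees only what the tree holds, and the choice between a literal character and a reference belongs to whatever writes the file.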

Not all XSLT processes terminate by serializing their output, and writing the output tree to a file is a necessary and desirable operation only some (maybe most) of the time. However, if you are doing this, you can do any of three things:

1. Remove the problem from XSLT's scope altogether. Use a post-processing routine, such as a Perl or Python script, or sed, to perform substitutions of characters with named entities in the output files after the XSLT processor writes them. Maintaining the well-formedness of your output is your concern.

2. Hack the serializer. You could extend XSLT with, for example, a custom output method. Your serializer would intercept the characters you want represented as entities, and take care of the escaping. Back when James Clark's XT was the de facto reference implementation for XSLT, he demonstrated how to do something not unlike this.

3. Drive the serializer from within XSLT. Some consider this bad form, since it crosses the line between XSLT processing and serialization, but you can nonetheless do it with the optional disable-output-escaping feature, if your processor has it (it's available in Xalan). I'd call this a legitimate use of d-o-e, assuming of course you understand that this kind of processing is limited (a) to scenarios where file-writing is part of the pipeline, and (b) to the engines that support it -- i.e., it's not as portable as XSLT generally. For this reason and others, you may want to implement it in a separate stylesheet from your main transform, and run it as the terminal process in a chain.
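As a rough illustration of option 1, a post-processing pass in Python might look like the sketch below. The function name, the mapping, and the sample string are all hypothetical; you would list whichever characters and entity names your DTD actually declares.

```python
# Hypothetical mapping from characters to entity names. The names must be
# declared in the DTD of the output, or the result will not be well-formed.
ENTITY_NAMES = {
    "\u00e9": "eacute",
    "\u00a0": "nbsp",
    "\u2014": "mdash",
}


def restore_entities(serialized, names=ENTITY_NAMES):
    """Replace selected characters in serialized output with entity references."""
    # str.translate accepts a dict keyed by code point; values may be strings.
    table = {ord(ch): "&" + name + ";" for ch, name in names.items()}
    return serialized.translate(table)


sample = "<p>caf\u00e9\u00a0\u2014 ok</p>"
result = restore_entities(sample)
print(result)
```

A plain character substitution like this never touches markup characters, so it cannot unbalance tags, but it also does nothing to declare the entities it writes; that part remains your concern, as noted above.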

See http://www.biglist.com/lists/xsl-list/archives/200110/msg00115.html for more hints.

Cheers,
Wendell

At 07:40 PM 7/25/2003, you wrote:
At 2003-07-25 18:03 -0400, Taro Ikai wrote:
I sometimes want to keep the character entities as they are.
Is that outside of XSLT specification?

Keeping the characters represented by the entities is within the XSLT 1.0 specification.


Keeping the syntax used to represent the character entities in the file used to create the XPath node tree is not within the XSLT 1.0 specification.

For all entities, the substitutions are made by the XML processor used by the XSLT processor and the XSLT processor isn't told of any syntax that might have been used for any of the markup ... all it sees is the information as understood by the XML processor after accommodating the markup.

I hope this helps.

......................... Ken

___&&__&_&___&_&__&&&__&_&__&__&&____&&_&___&__&_&&_____&__&__&&_____&_&&_
"Thus I make my own use of the telegraph, without consulting the directors, like the sparrows, which I perceive use it extensively for a perch." -- Thoreau


XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list


