Subject: [xsl] XSLT is not an editor [ was: [xsl] retaining entity declarations while converting from one xml format to another]
From: "steve.majewski@xxxxxxxxx" <steve.majewski@xxxxxxxxx>
Date: Wed, 16 Dec 2009 10:53:22 -0500
I had a similar problem, and while searching thru the archives
( since I was sure it must be a FAQ ) I found a thread with a
good explanation by Ken Holman of why XSLT is NOT an EDITOR
that helped me to think about these problems:



The short version:
* XSLT constructs a new output (write-only) document from an input (read-only) document.
* XSLT does not process XML syntax -- it processes XML infoset trees.
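
A small illustration of the second point, outside XSLT: this is only a sketch using Python's standard xml.etree.ElementTree parser ( my choice for the example, not something from the thread ), and the sample document and the entity name "co" are made up. Any tool that works on the parsed tree sees the same thing an XSLT processor does: the entity reference has already been replaced by its text, and the DOCTYPE is not part of the tree that gets serialized.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document with one internal entity declaration.
SOURCE = """<!DOCTYPE doc [
  <!ENTITY co "Example Corp">
]>
<doc>Published by &co;.</doc>"""


def parse_and_serialize(xml_text):
    """Parse the text into a tree, then write the tree back out as text."""
    root = ET.fromstring(xml_text)  # the parser expands &co; while building the tree
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # The printed result has no DOCTYPE and no &co; reference,
    # only the expanded text.
    print(parse_and_serialize(SOURCE))
```

The lexical details are lost at parse time, before any transformation runs, which is why a stylesheet has nothing to copy them from.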

I suggest anyone who is confused about why you can't do this sort of thing
in XSLT read that thread. See also:
Amelia Lewis: http://www.stylusstudio.com/xmldev/200905/post10080.html
Michael Kay: http://www.stylusstudio.com/xmldev/200905/post30080.html

However, despite the fact that XSLT was not designed for this sort of job,
it does get used as an editor or filter of xml documents, for lack of a better
solution. ( Is XQuery with update intended to fill this gap ? )

[1] We have tried using a perl script that:
* grabs and saves everything before the root element,
* translates every '&' character into some other character that doesn't occur
  in the document ( settable on the command line ),
* calls out to do an xslt transform ( with Saxon or xsltproc ),
* pastes the saved prefix onto the output,
* and does the reverse character translation back to '&'.
( There may be some document cases where those steps would fail, but it works
for our selection. )
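
The steps in [1] can be sketched roughly as follows. This is not the perl script itself, just a Python outline of the same idea under some stated assumptions: the document is UTF-8, the root start tag is the first '<' followed by a name character ( so a comment or entity value in the prolog containing something like "<foo" would fool it ), and any XML declaration the transform emits is dropped before the saved prolog is pasted back. The function names and the default placeholder character are hypothetical.

```python
import os
import re
import subprocess
import tempfile

# Assumption: the first "<" followed by a name-start character opens the root element.
ROOT_START = re.compile(r"<[A-Za-z_:]")
# An XML declaration at the start of the transform output (not <?xml-stylesheet ...?>).
XML_DECL = re.compile(r"^\s*<\?xml\s[^>]*\?>\s*")


def split_prolog(text):
    """Return (prolog, body); body begins at the root element's start tag."""
    match = ROOT_START.search(text)
    if match is None:
        raise ValueError("no root element found")
    return text[:match.start()], text[match.start():]


def transform_preserving_entities(text, transform, placeholder="\u00a7"):
    """Hide every '&' from the transform, then restore it and the saved prolog."""
    prolog, body = split_prolog(text)
    if placeholder in body:
        raise ValueError("placeholder occurs in the document; choose another")
    protected = body.replace("&", placeholder)
    result = transform(protected)
    result = XML_DECL.sub("", result, count=1)  # avoid a second XML declaration
    return prolog + result.replace(placeholder, "&")


def xsltproc_transform(stylesheet_path):
    """Build a transform callable that shells out to xsltproc (must be installed)."""
    def transform(xml_text):
        with tempfile.NamedTemporaryFile(
            "w", suffix=".xml", encoding="utf-8", delete=False
        ) as tmp:
            tmp.write(xml_text)
        try:
            done = subprocess.run(
                ["xsltproc", stylesheet_path, tmp.name],
                capture_output=True,
                check=True,
            )
        finally:
            os.unlink(tmp.name)
        return done.stdout.decode("utf-8")  # assumes the stylesheet outputs UTF-8
    return transform
```

Usage would be something like transform_preserving_entities(open("in.xml", encoding="utf-8").read(), xsltproc_transform("edit.xsl")), where both file names are placeholders. Passing the transform in as a callable keeps the '&' handling separate from the choice of processor, so Saxon could be substituted by writing a different callable.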

[2] It's probably possible to do something similar to the above strictly in XSLT
by reading the document as both XML and unparsed text. See Michael Kay's article
on "Up-conversion using XSLT" :
for input tips. Pasting the two output streams back together may also require some tricks.
[ This is probably more complex than the perl solution above, but doing it all
in XSLT may have workflow advantages in some cases. ]

[3] Others have mentioned Andrew Welch's LexEv, which is an XMLReader that converts lexical
events (syntax) into markup:


This seems like a much cleaner way to handle the pass thru of syntax elements,
but it requires explicitly encoding the inverse operation in the stylesheet.
Some examples in:


[4] Before finding the simpler solution (#1 above) I considered writing a tool that
would pass thru all of the xml unchanged except for fragments that matched an
xpath expression -- those fragments would be processed thru an xslt stylesheet.

Does anyone have any other tips on how to handle this sort of problem ?
Are there any XML editors that can work in batch mode using XPath expressions ?
Is XQuery with update intended to solve this problem ? ( and are there any good
implementations for this use ? )
[ This may be a bit off topic for xsl-list, but it does seem to be a FAQ. ]

-- Steve Majewski / UVA Alderman Library

