[xsl] Preserving character entities

Subject: [xsl] Preserving character entities
From: "Richard Lewis" <richardlewis@xxxxxxxxxxxxxx>
Date: Thu, 25 Nov 2004 13:10:55 +0000
Hello List,

I'm trying to write some temporary XSL to convert my website content
from one *BIG* XML document into a directory structure populated with
lots of little documents. I've written two XSLT 2.0 stylesheets: the
first generates a shell script that creates a directory structure based
on the structural hierarchy of the original document, and the second
(using the xsl:result-document element) divides the original XML up into
files in the new filesystem hierarchy. Great.
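
To make the first step concrete, here is a rough Python equivalent of the
idea, not the actual XSLT. The nested <section id="..."> elements and the
mkdir_commands helper are assumptions made up for illustration; the real
document structure is not shown in this message.

```python
import shlex
import xml.etree.ElementTree as ET

# Hypothetical input: nested <section id="..."> elements stand in for the
# real site structure, which is not shown in this message.
SAMPLE = """<site>
  <section id="music">
    <section id="scores"/>
    <section id="recordings"/>
  </section>
  <section id="about"/>
</site>"""


def mkdir_commands(xml_text):
    """Return one 'mkdir -p' command per <section>, in document order."""
    root = ET.fromstring(xml_text)
    commands = []

    def walk(element, path):
        # Visit direct <section> children only, so nesting depth in the
        # XML becomes nesting depth in the directory tree.
        for child in element.findall("section"):
            child_path = path + [child.get("id")]
            commands.append("mkdir -p " + shlex.quote("/".join(child_path)))
            walk(child, child_path)

    walk(root, [])
    return commands


if __name__ == "__main__":
    print("#!/bin/sh")
    for command in mkdir_commands(SAMPLE):
        print(command)
```

The second stylesheet would then write one file into each of those
directories, which is the job xsl:result-document does in the XSLT version.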

The problem is that, when the XSLT processor (Saxon 8) reads the
document, all the character entities (used, for example, for accented
characters in people's names) are resolved to the actual characters.
What I really want is to have the character entities from the original
document left intact in the new documents.
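
To illustrate the output I am after, here is a small Python sketch of the
effect on a single string. It is only an illustration of the desired
result, not part of the stylesheets: the to_char_refs name is made up, it
produces decimal character references rather than restoring whatever
entity names the source used, and it does not escape markup characters.

```python
def to_char_refs(text):
    """Replace every non-ASCII character with a decimal character reference."""
    # 'xmlcharrefreplace' is a standard Python codec error handler that
    # substitutes &#NNN; for each character the target encoding lacks.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    # U+00E9 is e with an acute accent.
    print(to_char_refs("Jos\u00e9"))
```

Plain ASCII text passes through unchanged, so only the accented
characters end up written as references.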

I've written a little Python script which generates a big
xsl:character-map with the characters from #130 to #254 mapped to
literal numeric character references:

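#!/usr/bin/python
# Generates an XSLT module holding a character map that writes each
# character in the range below as a numeric character reference.
print("<?xml version=\"1.0\" ?>")
print("<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">")
print("<xsl:character-map name=\"preserve-entities\">")

# range(130, 255) covers code points 130 to 254 inclusive
for a in range(130, 255):
        print("\t<xsl:output-character character=\"&#%d;\" string=\"&amp;#%d;\" />" % (a, a))

print("</xsl:character-map>")
print("</xsl:stylesheet>")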

and then tried xsl:including the document which this produces in the
stylesheet which generates the new XML documents. The stylesheet runs
fine but the foreign characters don't come out as entity references,
just as normal characters.

Should what I'm trying to do work? Is there a way of doing it which does
work?

Thanks,
Richard
-- 
  Richard Lewis
  richardlewis@xxxxxxxxxxxxxx
