RE: [xsl] Preseving character entities

Subject: RE: [xsl] Preseving character entities
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Thu, 25 Nov 2004 13:39:17 -0000
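Perhaps you forgot to specify use-character-maps="preserve-entities" in
your xsl:output declaration? A character map has no effect unless an
xsl:output declaration refers to it by name.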
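
If the character map is kept, the generated module can also carry the
xsl:output declaration that activates it. What follows is only a rough
Python 3 sketch of the generator script with that line added. The function
name build_character_map_module, its parameters, version="2.0", and the
choice to put xsl:output in the included module rather than the main
stylesheet are assumptions made for illustration, not part of the original
script.

```python
def build_character_map_module(start=130, stop=255, name="preserve-entities"):
    """Return an XSLT module holding a character map plus the xsl:output
    declaration that switches it on (hypothetical helper, for illustration)."""
    lines = [
        '<?xml version="1.0"?>',
        # xsl:character-map is an XSLT 2.0 feature, hence version="2.0" here.
        '<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"'
        ' version="2.0">',
        # Without this reference the character map is never applied.
        '  <xsl:output use-character-maps="%s"/>' % name,
        '  <xsl:character-map name="%s">' % name,
    ]
    # Same range as the original script: code points 130 to 254 inclusive.
    for code in range(start, stop):
        lines.append(
            '    <xsl:output-character character="&#%d;" string="&amp;#%d;"/>'
            % (code, code)
        )
    lines.append('  </xsl:character-map>')
    lines.append('</xsl:stylesheet>')
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_character_map_module())
```

The output can be redirected to a file and pulled into the main stylesheet
with xsl:include, as before.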

But why do you need the character map? Just specify:

<xsl:output encoding="us-ascii"/>
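
The reason the encoding alone is enough: a serializer writing us-ascii
cannot represent non-ASCII characters directly, so it has to emit each one
as a numeric character reference. The small Python 3 sketch below imitates
that behaviour. It is not Saxon's code, and the helper name
to_ascii_xml_text is made up for illustration.

```python
def to_ascii_xml_text(text):
    """Escape XML markup characters, then write every non-ASCII character
    as a decimal character reference (hypothetical helper)."""
    escaped = (
        text.replace("&", "&amp;")
            .replace("<", "&lt;")
            .replace(">", "&gt;")
    )
    # "xmlcharrefreplace" turns each unencodable character into &#NNN;.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    print(to_ascii_xml_text("Jos\u00e9 M\u00fcller"))  # Jos&#233; M&#252;ller
```

Note that this yields numeric references for every non-ASCII character,
whatever form the character took in the source document.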

Michael Kay
http://www.saxonica.com/ 

> -----Original Message-----
> From: Richard Lewis [mailto:richardlewis@xxxxxxxxxxxxxx] 
> Sent: 25 November 2004 13:11
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] Preseving character entities
> 
> Hello List,
> 
> I'm just trying to create some temporary XSL to convert my website
> content from one *BIG* XML document to a directory structure populated
> with lots of little documents. I've written two XSLT 2 stylesheets which
> first generate a shell script to create a directory structure based on
> the structural hierarchy of the original document and next (using the
> xsl:result-document element) divide the original XML up into files in
> the new filesystem hierarchy. Great.
> 
> The problem is that, when the XSLT processor (Saxon 8) parses the
> document, it resolves all the character entities (such as the foreign
> characters in people's names) to the actual characters. What I really
> want is to have the character entities from the original document left
> intact in the new documents.
> 
> I've written a little Python script which generates a big
> xsl:character-map with the Unicode characters from #130 to #254 mapped
> to literal character references:
> 
> #!/usr/bin/python
> print "<?xml version=\"1.0\" ?>"
> print "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
> print "<xsl:character-map name=\"preserve-entities\">"
> 
> for a in range(130, 255):
>     print "\t<xsl:output-character character=\"&#%d;\" string=\"&amp;#%d;\" />" % (a,a)
> 
> print "</xsl:character-map>"
> print "</xsl:stylesheet>"
> 
> and then tried xsl:including the document which this produces in the
> stylesheet which generates the new XML documents. The stylesheet runs
> fine but the foreign characters don't come out as entity references,
> just as normal characters.
> 
> Should what I'm trying to do work? Is there a way of doing it which
> does work?
> 
> Thanks,
> Richard
> -- 
>   Richard Lewis
>   richardlewis@xxxxxxxxxxxxxx
