RE: [xsl] RETAINING THE ENTITY NAMES IN THE RESULTING TREE

Subject: RE: [xsl] RETAINING THE ENTITY NAMES IN THE RESULTING TREE
From: "Michael Kay" <mhk@xxxxxxxxx>
Date: Fri, 6 Aug 2004 09:35:48 +0100

This is a FAQ.

Entity references are expanded by the XML parser, and the XSLT processor
(Saxon in this case) never gets to see them. There is no way the
transformation can preserve the original entity references, because they
have already gone before the transformation starts.

The characters in the output are not junk. They only look like junk because
you are displaying them using software that doesn't understand them. They
are encoded in UTF-8, and you can display them correctly by selecting UTF-8
format in your text editor, or by changing the <xsl:output> in your
stylesheet to select a different encoding, e.g. encoding="iso-8859-1".
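
As a sketch of the second option, the <xsl:output> declaration from the
stylesheet quoted below could be changed as follows. Only the encoding
attribute is new. Characters that iso-8859-1 cannot represent (alpha, beta
and the en dash among them) are then written by the serializer as numeric
character references, which are plain ASCII and so look the same in any
editor:

```xml
<!-- encoding is the only addition to the original declaration -->
<xsl:output method="xml" indent="yes" encoding="iso-8859-1"/>
```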

If you want the serializer to generate entity references for specific
characters, you can do this (in XSLT 2.0) by defining a character map. See
http://www.saxonica.com/documentation/xsl-elements/character-map.html

You will have to make sure that any such entities are declared in the
DOCTYPE of the result document; the system won't do this for you.
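
A minimal sketch of such a character map, for the three entities in the
message quoted below. It makes several assumptions: the map name
"entities" is made up for illustration, ent.dtd is assumed to declare
alpha, beta and ndash with the usual code points (U+03B1, U+03B2, U+2013),
and the same DTD is assumed to be reachable from wherever the result
document ends up:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">

  <!-- doctype-system asks the serializer to write a DOCTYPE declaration
       pointing at ent.dtd, so the entity references emitted below are
       declared in the result; use-character-maps names the map to apply -->
  <xsl:output method="xml" indent="yes"
              doctype-system="ent.dtd"
              use-character-maps="entities"/>

  <!-- each character on the left is replaced, at serialization time only,
       by the literal string on the right (written with &amp; so that the
       stylesheet itself remains well-formed) -->
  <xsl:character-map name="entities">
    <xsl:output-character character="&#945;"  string="&amp;alpha;"/>
    <xsl:output-character character="&#946;"  string="&amp;beta;"/>
    <xsl:output-character character="&#8211;" string="&amp;ndash;"/>
  </xsl:character-map>

  <!-- the existing template rules go here unchanged -->

</xsl:stylesheet>
```

The map is applied to every occurrence of these characters in text and
attribute nodes, whether or not they came from an entity reference in the
source, so a literal en dash typed into the input would also come out as
&ndash;.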

Michael Kay


> -----Original Message-----
> From: Arul Kumar [mailto:arulxml@xxxxxxxxxxxx]
> Sent: 06 August 2004 09:28
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] RETAINING THE ENTITY NAMES IN THE RESULTING TREE
>
> Greetings all.
>
> My task is transforming the XML into another XML tree. During the
> transformation I am losing the generic names of the special characters.
> For example, in my source xml I have an entity called "&alpha;" but in
> the resulting tree I am getting only junk characters instead. But I need
> to maintain the '&alpha;' as is in the resulting xml.
>
> I am using Saxon 8.0 for this transformation, and the code
> is given below:
>
> MY INPUT XML (main.xml)
> <?xml version="1.0"?>
> <?xml-stylesheet href="main.xsl" type="text/xsl"?>
> <!DOCTYPE root SYSTEM "ent.dtd">
> <root>
> <center><h2>sss</h2></center>
> <center><h2>&alpha;</h2></center>
> <center><h2>&beta;</h2></center>
> <center><h2>&ndash;</h2></center>
> </root>
>
> MY XSL (data.xsl)
> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> version='1.0'>
> <xsl:output method="xml" indent="yes"/>
> <xsl:template match="root">
> <html>
> <body>
> <xsl:for-each select="center/child::*">
> <center>
> <xsl:element name="{name(.)}">
> <xsl:if test="@*">
> <xsl:attribute name="{name(@*)}"><xsl:value-of
> select="string(@*)"/></xsl:attribute>
> </xsl:if>
> <xsl:apply-templates/>
> </xsl:element>
> </center>
> </xsl:for-each>
> </body>
> </html>
> </xsl:template>
> </xsl:stylesheet>
>
>
> MY RESULTING XML
> <?xml version="1.0" encoding="UTF-8"?>
> <html>
> <body>
> <center>
> <h2>sss</h2>
> </center>
> <center>
> <h2>N1</h2>
> </center>
> <center>
> <h2>N2</h2>
> </center>
> <center>
> <h2>b"</h2>
> </center>
> </body>
> </html>
>
> MY COMMAND LINE ARGUMENT
> E:\>java -jar saxon8.jar -ds data.xml main.xsl
>
> Please advise how to retain the generic entity names in the
> resulting xml.
>
>
> Thanks and regards
> Arul Kumar
