Re: [xsl] Sorting Hex v. Decimal

Subject: Re: [xsl] Sorting Hex v. Decimal
From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 2 Apr 2015 07:28:32 -0000
Start by using <xsl:sort lang="en"/> and see whether the results are
satisfactory. If not, try some other language more appropriate to the
data-set. If you want to refine it further, define a collation. For example:

<xsl:sort collation="http://saxon.sf.net/collation?ignore-case=yes"/>

Information on Saxon collations is at

http://www.saxonica.com/documentation/index.html#!extensibility/config-extend/collation/implementing-collation
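
As a rough sketch of how the first suggestion might be applied to the stylesheet quoted below: the only changes are the lang attribute on xsl:sort and version="2.0". The version change is an assumption, made because Saxon 9.x is an XSLT 2.0 processor. The collation alternative is left in a comment, using the URI given above. Whether either setting produces the desired ordering for a particular data-set would need to be checked by running it.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">
    <!-- ASCII output encoding: characters outside ASCII are written
         as numeric character references -->
    <xsl:output encoding="ASCII"/>
    <xsl:template match="/">
        <html>
            <body>
                <h1>Author List</h1>
                <xsl:for-each select="//author">
                    <!-- First attempt: language-sensitive sort -->
                    <xsl:sort select="surname" lang="en"/>
                    <!-- Alternative if lang="en" is not satisfactory:
                         replace the xsl:sort above with
                         <xsl:sort select="surname"
                             collation="http://saxon.sf.net/collation?ignore-case=yes"/>
                    -->
                    <p>
                        <xsl:value-of select="surname"/>
                    </p>
                </xsl:for-each>
            </body>
        </html>
    </xsl:template>
</xsl:stylesheet>
```

If accents also need to be ignored, Saxon's collation URI is understood to accept further parameters such as strength=primary; treat that as an assumption to verify against the collation documentation for the Saxon version in use.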


Michael Kay
Saxonica
mike@xxxxxxxxxxxx
+44 (0) 118 946 5893




On 2 Apr 2015, at 01:25, Charles O'Connor charles.oconnor@xxxxxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

> Hi all,
>
> As a test for something a bit more complex, I am trying to do a simple sort
> of names, some of which start with character entity references:
>
> <root>
>    <author><surname>&#x00D6;born</surname></author>
>    <author><surname>Jones</surname></author>
>    <author><surname>Edwards</surname></author>
>    <author><surname>Osgood</surname></author>
>    <author><surname>&#x00C8;meraldo</surname></author>
>    <author><surname>Smith</surname></author>
> </root>
>
> Using this transform:
>
> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>    version="1.0">
>    <xsl:output encoding="ASCII"/>
>    <xsl:template match="/">
>        <html>
>            <body>
>                <h1>Author List</h1>
>                <xsl:for-each select="//author">
>                    <xsl:sort select="surname"/>
>                        <p>
>                         <xsl:value-of select="surname"/>
>                       </p>
>                </xsl:for-each>
>            </body>
>        </html>
>    </xsl:template>
> </xsl:stylesheet>
>
> I want two things: the entities to come out in hex, and the sort to treat
> characters with diacriticals as equivalent to the same characters without
> diacriticals. So, e with an acute accent should be sorted equivalently with e
> without an acute accent.
>
> Using Oxygen, the sort works as intended when the transformer is Saxon
> 6.5.5, but the entities come out as decimal.
>
> <html>
>   <body>
>      <h1>Author List</h1>
>      <p>Edwards</p>
>      <p>&#200;meraldo</p>
>      <p>Jones</p>
>      <p>&#214;born</p>
>      <p>Osgood</p>
>      <p>Smith</p>
>   </body>
> </html>
>
> If I change the transformer to Saxon 9.4.0.4, I get hex, but all the author
> names that start with a character entity reference get stuck at the end.
>
> <html>
>   <body>
>      <h1>Author List</h1>
>      <p>Edwards</p>
>      <p>Jones</p>
>      <p>Osgood</p>
>      <p>Smith</p>
>      <p>&#xc8;meraldo</p>
>      <p>&#xd6;born</p>
>   </body>
> </html>
>
> Like anyone else, I'd like to have my cake and eat it too. But, how?
>
> Thanks,
> Charles
