RE: [xsl] case-sensitivity in xml

Subject: RE: [xsl] case-sensitivity in xml
From: "Pawson, David" <David.Pawson@xxxxxxxxxxx>
Date: Wed, 26 Jan 2005 11:05:51 -0000
    -----Original Message-----
    From: Michael Kay
    > What (in that query string) tells me that it's case-unaware?
    > OK, it's English, I can infer that.
    > Primary strength?


    The notion of the "strength" of a collation is explained in
    the Unicode Collation Algorithm.
<snip/>

    A collation defined with
    strength=primary considers two characters to be different
    only if they have a primary difference; a secondary
    difference (accents) or a tertiary difference (case) doesn't
    count. If you want to ignore case but not accents, specify
    strength=secondary.

Ah! Now it makes sense. Thanks Michael.
A bit subtle though!
http://pistos.pe.kr/javadocs/etc/icu4j2_4/j2h/com/ibm/icu/text/Collator.java.html
gives a list of strengths from primary to identical, and
http://pistos.pe.kr/javadocs/etc/icu4j2_4/j2h/com/ibm/icu/dev/test/collator/CollationMonkeyTest.java.html
shows some testing.
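
To make the strength levels concrete, here is a minimal sketch using the
JDK's own java.text.Collator rather than ICU or Saxon. The class and
method names are made up for illustration. It assumes the JDK's English
rules, where an accent is a secondary difference and case is a tertiary
difference, as in the Unicode Collation Algorithm.

```java
import java.text.Collator;
import java.util.Locale;

final class CollationStrengthDemo {

    // Builds an English collator at the requested strength
    // (Collator.PRIMARY, SECONDARY, TERTIARY or IDENTICAL).
    static Collator english(int strength) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        collator.setStrength(strength);
        return collator;
    }

    // True when the two strings collate as equal at the given strength.
    static boolean same(int strength, String a, String b) {
        return english(strength).compare(a, b) == 0;
    }

    public static void main(String[] args) {
        // A case pair, an accent pair (\u00e9 is e-acute) and a base-letter pair.
        String[][] pairs = { {"abc", "ABC"}, {"cafe", "caf\u00e9"}, {"abc", "abd"} };
        int[] strengths = { Collator.PRIMARY, Collator.SECONDARY, Collator.TERTIARY };
        String[] names = { "primary", "secondary", "tertiary" };
        for (int i = 0; i < strengths.length; i++) {
            for (String[] p : pairs) {
                System.out.println(names[i] + ": " + p[0] + " vs " + p[1] + " -> "
                        + (same(strengths[i], p[0], p[1]) ? "same" : "different"));
            }
        }
    }
}
```

At primary strength both the case pair and the accent pair should come
out as the same; at secondary strength only the case pair should.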



    > Any references where I might look?

    The Unicode Collation Algorithm is a good place. Collation
    machinery based on this is now built into a number of
    software environments, such as Java, Oracle, the IBM ICU
    toolkit, the Windows platform, and so on. Note however that
    the XSLT/XPath specs don't restrict the choice of
    collations to those that conform to the UCA.

Eliot's 2002 paper for Extreme
(the Google cache link is
http://64.233.183.104/search?q=cache:I4OBiTB9wJoJ:www.isogen.com/downloads/white_papers/botb-index-i18n.pdf+java+collation+machinery&hl=en
but google on isogen papers for his site)
uses XSLT 1.0 and Saxon. It explains the general approach, though
without using this idea of strengths.

It would be nice to have an approach documented for unusual collations,
whatever form they take.
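
One form an unusual collation could take, sketched in plain Java: a
java.text.RuleBasedCollator built from a hand-written rule string. The
rule string and class name below are invented for illustration and are
not taken from any of the papers or specs mentioned here. In the rule
syntax, "<" marks a primary difference and "," a tertiary one.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Arrays;

final class CustomCollationSketch {

    // Hypothetical rules: c sorts before b, which sorts before a;
    // upper and lower case differ only at the tertiary level.
    static final String RULES = "< c , C < b , B < a , A";

    // Builds the collator, turning a malformed rule string into an unchecked error.
    static RuleBasedCollator build() {
        try {
            return new RuleBasedCollator(RULES);
        } catch (ParseException e) {
            throw new IllegalStateException("bad collation rules", e);
        }
    }

    // Returns a sorted copy of the words under the custom collation.
    static String[] sorted(String... words) {
        String[] copy = words.clone();
        Arrays.sort(copy, build());
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sorted("ab", "ca", "bc")));
    }
}
```

Only the letters named in the rules are exercised here; how other
characters sort under such a collator is left open in this sketch.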

My googling hasn't found enough... yet.

I'll keep looking.

Thanks Michael.

regards DaveP



