Re: [xsl] XSLT script to report Unicode characters and code blocks in file?

Subject: Re: [xsl] XSLT script to report Unicode characters and code blocks in file?
From: David Carlisle <davidc@xxxxxxxxx>
Date: Fri, 30 May 2008 12:47:03 +0100

> Yes. XML Schema (and hence XPath) regular expressions.

They don't help do they?

Take alpha U+0391.  The UCD says that is Lu so it matches \p{Lu} but that
just tells you it's an upper case letter, it doesn't tell you it's in the
      <block start="00370" end="003FF" name="Greek and Coptic"/>
does it? The code I pointed to in the message you replied to would take
an alpha, get its code point, and find the string "0039" as being the
first four digits of a five digit hex representation of the codepoint,
then find this block element in unicode.xml, and thus (for example) get to
the pdf file for that block, which has the alpha glyph example.
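
As a rough illustration of that lookup (not the code referred to above), here
is a minimal Python sketch. It assumes the block data is a list of <block>
elements shaped like the one quoted; the wrapper element name, the Basic Latin
entry and the function name block_name are made up for the example, and it
compares code points numerically instead of matching the hex-prefix string.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt of the block data. Only the "Greek and Coptic" entry
# is quoted in the message; the wrapper element and the "Basic Latin" entry
# are assumptions added so the example has more than one block to search.
BLOCKS_XML = """
<unicodeblocks>
  <block start="00000" end="0007F" name="Basic Latin"/>
  <block start="00370" end="003FF" name="Greek and Coptic"/>
</unicodeblocks>
"""


def block_name(char, blocks_root):
    """Return the name of the block whose range contains char, or None."""
    cp = ord(char)
    for block in blocks_root.iter("block"):
        start = int(block.get("start"), 16)  # attributes hold hex strings
        end = int(block.get("end"), 16)
        if start <= cp <= end:
            return block.get("name")
    return None


root = ET.fromstring(BLOCKS_XML)
print(format(ord("\u0391"), "05X"))  # five-digit hex form of the code point
print(block_name("\u0391", root))
```

The same comparison can be written as an XPath predicate over the block
elements once the start and end attributes have been converted to numbers.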

Actually regexp could help: you could take the block range information
and build a regexp that matches each block by generating the required
character range expressions, but I think it's more natural to do that as
an XPath query rather than forcing it through the regexp engine.
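
For comparison, the regexp route might look something like the sketch below.
It is in Python rather than XPath, the Cyrillic entry is added only for
illustration, and the names BLOCKS, block_pattern and blocks_in are invented
for the example. Each block's start/end pair is turned into one character
range expression.

```python
import re

# (start, end, name) triples in the same hex form as the block elements.
# "Greek and Coptic" is the entry from the message; "Cyrillic" is an extra
# entry assumed here so there is more than one pattern to test against.
BLOCKS = [
    ("00370", "003FF", "Greek and Coptic"),
    ("00400", "004FF", "Cyrillic"),
]


def block_pattern(start_hex, end_hex):
    """Compile a one-character class covering start_hex..end_hex."""
    lo, hi = int(start_hex, 16), int(end_hex, 16)
    # \UXXXXXXXX escapes keep the pattern readable for any code point.
    return re.compile("[\\U{:08X}-\\U{:08X}]".format(lo, hi))


PATTERNS = {name: block_pattern(s, e) for s, e, name in BLOCKS}


def blocks_in(text):
    """List the names of the blocks that have at least one character in text."""
    return [name for name, pat in PATTERNS.items() if pat.search(text)]


print(blocks_in("\u0391\u03b2"))
```

This runs one regexp search per block over the whole text, which is part of
why a direct range comparison on the code point seems the more natural fit.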

