Re: [xsl] XSLT script to report Unicode characters and code blocks in file?

Subject: Re: [xsl] XSLT script to report Unicode characters and code blocks in file?
From: David Carlisle <davidc@xxxxxxxxx>
Date: Fri, 30 May 2008 12:47:03 +0100

> Yes. XML Schema (and hence XPath) regular expressions.

They don't help do they?

Take alpha U+0391.  The UCD says that is Lu so it matches \p{Lu} but that
just tells you it's an upper case letter, it doesn't tell you it's in the
      <block start="00370" end="003FF" name="Greek and Coptic"/>
does it? The code I pointed to in the message you replied to would take
an alpha, get its code point, and find the string "0039" as being the
first four digits of a five digit hex representation of the codepoint,
then find this block element in unicode.xml, and thus (for example) to
which is the pdf file which has the alpha glyph example.
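
For anyone who wants to see the lookup outside XSLT, here is a rough sketch
in Python. It is not the stylesheet referred to above: it compares the code
point numerically against each block's start and end rather than matching on
the hex string. The wrapper element name "blocks", the two extra block
entries, and the function names are assumptions made for the illustration;
only the Greek and Coptic element is taken from this message.

```python
import xml.etree.ElementTree as ET

# Illustrative excerpt only. The Greek and Coptic entry is quoted from the
# message; the <blocks> wrapper and the other two entries are assumptions.
BLOCKS_XML = """
<blocks>
  <block start="00000" end="0007F" name="Basic Latin"/>
  <block start="00370" end="003FF" name="Greek and Coptic"/>
  <block start="00400" end="004FF" name="Cyrillic"/>
</blocks>
"""

def load_blocks(xml_text):
    """Parse <block> elements into (start, end, name) tuples of ints and str."""
    root = ET.fromstring(xml_text)
    return [
        (int(b.get("start"), 16), int(b.get("end"), 16), b.get("name"))
        for b in root.iter("block")
    ]

def block_of(char, blocks):
    """Return the name of the block containing char, or None if not listed."""
    cp = ord(char)
    for start, end, name in blocks:
        if start <= cp <= end:
            return name
    return None

blocks = load_blocks(BLOCKS_XML)
print(block_of("\u0391", blocks))  # U+0391, the alpha from the example
```

A general category test such as \p{Lu} cannot give you this answer, since
the category and the block are independent properties of the character.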

Actually regexp could help, you could take the block range information
and build a regexp that matches each block by generating the required
character range expressions, but I think it's more natural to do that as
an XPath query rather than forcing it through the regexp engine.
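
To make the regexp route concrete, a minimal sketch, again in Python rather
than XPath, of generating a character range expression from a block's start
and end attributes. The function name block_regex is hypothetical, and the
\U escape syntax is Python's, not XML Schema's, so treat it as an
illustration of the idea only.

```python
import re

def block_regex(start_hex, end_hex):
    """Build a one-character class covering start_hex..end_hex inclusive."""
    start = int(start_hex, 16)
    end = int(end_hex, 16)
    # \UXXXXXXXX escapes keep the pattern readable and avoid having to
    # escape characters that are special inside a character class.
    return re.compile("[\\U{:08X}-\\U{:08X}]".format(start, end))

# Range taken from the Greek and Coptic block element quoted in this thread.
greek = block_regex("00370", "003FF")

sample = "A \u0391 b \u03a9"
greek_chars = greek.findall(sample)  # characters of sample inside the block
```

You would generate one such expression per block element, which is why the
direct range comparison feels like less machinery.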


