Subject: Re: [xsl] XSLT 2.0 : Unicode hex notation in regular expressions
From: David Carlisle <davidc@xxxxxxxxx>
Date: Thu, 12 Aug 2004 12:14:56 +0100
> Sorry to insist : why don't they work ?
Because that's life:-)
>  Aren't they supposed to do ?
No, the regex syntax in XSLT is (except where otherwise noted) that of W3C XML
Schema, and that doesn't have any notation like that.
> If so, is it a Saxon-related problem or a more general one that would 
> indicate that UTS #18 is still to be implemented, is irrelevant or 
> whatever ?
The _semantics_ of Unicode regexps comes from there, e.g. the predefined
character classes (you may prefer to use a character class referring to
the Arabic block, for example, rather than explicit code points), but (I
would guess) the U+ notation wasn't supported because it is the Unicode
standard's way of referring to characters by code point in plain
ASCII text, and that is never used in an XML context. U+06FF is legal XML
character data, but it is just those 6 characters; if you want to refer to
character hex 06FF you always use &#x06FF; in XML.
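To make the difference concrete, here is a minimal sketch in Python (my choice for illustration, not something from the thread) using the standard library's xml.etree.ElementTree. The element name `t` is made up for the example. It parses the same element once with the plain text U+06FF and once with the character reference, then compares what the parser hands back.

```python
import xml.etree.ElementTree as ET

# "U+06FF" written in XML is ordinary character data: six separate characters.
literal = ET.fromstring("<t>U+06FF</t>").text

# "&#x06FF;" is a character reference: the parser resolves it to one character.
referenced = ET.fromstring("<t>&#x06FF;</t>").text

print(len(literal))                # number of characters in the literal text
print(len(referenced))             # number of characters after resolving the reference
print(referenced == "\u06ff")      # whether the reference resolved to code point U+06FF
```

The same resolution happens before an XSLT processor ever sees the regex string, which is why the character reference works inside a pattern and the U+ form does not.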
> How, for example, to use a useful syntax like
> matches(.,'\p{Script:Arabic}+') ?
schema-2 says: http://www.w3.org/TR/xmlschema-2/#regexs
[Definition:] [Unicode Database] groups code points into a number of
blocks such as Basic Latin (i.e., ASCII), Latin-1 Supplement, Hangul
Jamo, CJK Compatibility, etc. The set containing all characters that
have block name X (with all white space stripped out), can be identified
with a block escape \p{IsX}. The complement of this set is specified
with the block escape \P{IsX}. ([\P{IsX}] = [^\p{IsX}]).
...
For example,
the "block escape" for identifying the ASCII characters is \p{IsBasicLatin}.
so that would be \p{IsArabic}
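For anyone who wants to see what that block escape selects, here is a rough Python sketch. Python's `re` module has no \p{IsX} syntax, so this spells the block out as an explicit range, assuming the Arabic block is U+0600 through U+06FF. The function name `contains_arabic_block` is hypothetical, and `search` is used because XPath matches() is unanchored as well.

```python
import re

# Assumption: the Unicode "Arabic" block covers U+0600..U+06FF, so this
# explicit range stands in for the schema block escape \p{IsArabic}.
ARABIC_BLOCK_RUN = re.compile(r"[\u0600-\u06FF]+")


def contains_arabic_block(text: str) -> bool:
    """Return True if text contains at least one character from the Arabic block.

    Like XPath matches(), this succeeds when any substring matches;
    the pattern is not anchored to the whole string.
    """
    return ARABIC_BLOCK_RUN.search(text) is not None


print(contains_arabic_block("\u0633\u0644\u0627\u0645"))  # Arabic letters only
print(contains_arabic_block("abc \u06ff"))                # mixed Latin and Arabic
print(contains_arabic_block("hello"))                     # Latin only
```

Note that a block is a range of code points, not a script, so characters of the Arabic script that live in other blocks are outside this range.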
David