Subject: Re: [xsl] Filtering, xslt 2.0
From: "C. M. Sperberg-McQueen cmsmcq@xxxxxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 2 Nov 2022 17:50:43 -0000
"Liam R. E. Quin liam@xxxxxxxxxxxxxxxx"
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> writes:
> [a brief somewhat pedantic side-track]
[And .. a brief thoroughly pedantic side-track from the side-track:]
> On Wed, 2022-11-02 at 14:19 +0000, Graydon graydon@xxxxxxxxx wrote:
>> ... This is true, though I would note that in general, the Unicode
>> character category,
>>
>> tokenize($param,',\p{Zs}*')
>>
>> can be safer. \s usually matches a space, a tab, a carriage return, a
>> line feed, or a form feed, but what the exact match is depends on the
>> regular expression implementation.
>
>
> For XSLT 2 and later it's defined as equivalent to the character class
> [&#x20;\t\n\r] by XML Schema so there should not be any variation.
>
> Unicode properties, however, are defined by the Unicode Consortium and
> can vary over time - usually by additions.
>
> (actually XSD omits the "&" but i think we can safely say that's a typo
> and i seem to remember there may be an erratum about it.)
For what it's worth, not a typo. The XSD spec uses hash mark + 'x' +
hexadecimal number to refer to Unicode code points. This is explained
in a note in section 4.3.6:
Note: The notation #xA used here (and elsewhere in this
specification) represents the Universal Character Set (UCS) code
point hexadecimal A (line feed), which is denoted by U+000A.
This notation is to be distinguished from &#xA;, which is the
XML character reference to that same UCS code point.
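A minimal sketch of that distinction, written in Python rather than XSLT only so that it runs standalone; the element name p and the variable names are purely illustrative. It suggests how an XML parser treats the two spellings: the character reference becomes a single line feed, while the spec-prose notation is just three ordinary characters.

```python
import xml.etree.ElementTree as ET

# "&#xA;" is an XML character reference: the parser replaces it with the
# single character U+000A (line feed).
parsed = ET.fromstring("<p>&#xA;</p>").text

# "#xA" without the ampersand and semicolon means nothing special to an XML
# parser; it is only the XSD specification's prose notation for the code point.
literal = ET.fromstring("<p>#xA</p>").text

print(repr(parsed))
print(repr(literal))
```

So when the XSD spec writes the \s class with #x20, it is naming the space character in its own notation, not writing (or mistyping) a character reference.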
--
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com