Subject: Re: [xsl] Filtering, xslt 2.0
From: "C. M. Sperberg-McQueen cmsmcq@xxxxxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 2 Nov 2022 17:50:43 -0000

"Liam R. E. Quin liam@xxxxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> writes:

> [a brief somewhat pedantic side-track]

[And .. a brief thoroughly pedantic side-track from the side-track:]

> On Wed, 2022-11-02 at 14:19 +0000, Graydon graydon@xxxxxxxxx wrote:
>> ... This is true, though I would note that in general, the Unicode
>> character category,
>>
>> tokenize($param,',\p{Zs}*')
>>
>> can be safer. \s usually matches a space, a tab, a carriage return, a
>> line feed, or a form feed, but what the exact match is depends on the
>> regular expression implementation.
>
>
> For XSLT 2 and later it's defined as equivalent to the character class
> [ \t\n\r] by XML Schema so there should not be any variation.
>
> Unicode properties, however, are defined by the Unicode Consortium and
> can vary over time - usually by additions.
>
> (actually XSD omits the "&" but i think we can safely say that's a typo
> and i seem to remember there may be an erratum about it.

For what it's worth, not a typo. The XSD spec uses hash mark + 'x' +
hexadecimal number to refer to Unicode code points. This is explained
in a note in section 4.3.6:

    Note: The notation #xA used here (and elsewhere in this
    specification) represents the Universal Character Set (UCS) code
    point hexadecimal A (line feed), which is denoted by U+000A. This
    notation is to be distinguished from &#xA;, which is the XML
    character reference to that same UCS code point.

-- 
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com