Subject: Re: [xsl] unpacking percent-escaped URI components
From: "Graydon graydon@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 7 Nov 2022 21:53:14 -0000

On Mon, Nov 07, 2022 at 09:36:47PM -0000, Martin Honnen martin.honnen@xxxxxx scripsit:
> On 11/7/2022 10:29 PM, Martin Honnen martin.honnen@xxxxxx wrote:
> > On 11/7/2022 9:55 PM, Graydon graydon@xxxxxxxxx wrote:
> > > Unpacking RFC 3986 percent-escaped strings for code points less than
> > > 256 is straightforward --
> > >
> > >     tokenize($value,'%')[normalize-space()] ! local:H2D(.) !
> > >     codepoints-to-string(.)
> > >
> > > where local:H2D is a hex-digits-to-decimal-integer function.
> > >
> > > When the escaped value goes above 255, as with U+201C and U+201D, “”,
> > > the escapes start being multi-octet UTF-8, so %E2%80%9C and %E2%80%9D.
> > >
> > > Is there a useful way to turn those multi-octet escapes back into single
> > > characters in XPath or XSLT?
> >
> > I wonder whether with Saxon PE or EE you can use e.g.
> >
> >     (tokenize($value,'%')[normalize-space()] ! local:H2D(.)) =>
> >     saxon:octets-to-hexBinary() => saxon:hexBinary-to-string('UTF8')
>
> Yes, now tested e.g.
>
>     (226, 128, 156, 226, 128, 157) => saxon:octets-to-hexBinary() =>
>     saxon:hexBinary-to-string('UTF8')
>
> gives “”

I am impressed! (And only feeling a little bit like a cabbage.)

Thank you!

-- 
Graydon Saunders | graydonish@xxxxxxxxx
Þæs oferéode, ðisses swá mæg.
-- Deor ("That passed, so may this.")
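
For anyone wanting to try the Saxon pipeline quoted above, here is a minimal sketch of a complete XSLT 3.0 stylesheet. The body of local:H2D is an assumption (the thread only describes it as a hex-digits-to-decimal-integer function), as are the local namespace URI and the wrapper name local:unescape. The two saxon: extension functions are the ones named in the thread and need Saxon PE or EE. Like the original expression, this assumes the input consists entirely of percent-escaped octets.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:local="urn:example:local"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="#all">

  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Hypothetical helper (not from the thread): hex digits to an integer.
       Each digit's value is its position in '0123456789ABCDEF'.
       Characters that are not hex digits are silently counted as 0. -->
  <xsl:function name="local:H2D" as="xs:integer">
    <xsl:param name="hex" as="xs:string"/>
    <xsl:sequence select="
      fold-left(
        string-to-codepoints(upper-case($hex)), 0,
        function($acc, $cp) {
          16 * $acc
          + string-length(
              substring-before('0123456789ABCDEF', codepoints-to-string($cp)))
        })"/>
  </xsl:function>

  <!-- Hypothetical wrapper around the expression suggested in the thread:
       split on '%', convert each hex pair to an octet, then let Saxon
       decode the whole octet sequence as UTF-8. -->
  <xsl:function name="local:unescape" as="xs:string">
    <xsl:param name="value" as="xs:string"/>
    <xsl:sequence select="
      (tokenize($value, '%')[normalize-space()] ! local:H2D(.))
      => saxon:octets-to-hexBinary()
      => saxon:hexBinary-to-string('UTF8')"/>
  </xsl:function>

  <!-- Entry point: decode the two escapes discussed in the thread. -->
  <xsl:template name="xsl:initial-template">
    <xsl:sequence select="local:unescape('%E2%80%9C%E2%80%9D')"/>
  </xsl:template>

</xsl:stylesheet>
```

Run with the initial-template option (for example `-it` on the Saxon command line). Going by the test reported in the thread, the output should be the two characters U+201C and U+201D. The octets have to be decoded as one sequence rather than one at a time, which is why the wrapper parenthesises the tokenize step before applying the arrow operator.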