[xsl] unpacking percent-escaped URI components

Subject: [xsl] unpacking percent-escaped URI components
From: "Graydon graydon@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 7 Nov 2022 20:54:47 -0000
Hello --

Unpacking RFC 3986 percent-escaped strings for code points less than
256 is straightforward --

tokenize($value,'%')[normalize-space()] ! local:H2D(.) ! codepoints-to-string(.)

where local:H2D is a hex-digits-to-decimal-integer function.

When the escaped value goes above 255, as with U+201C and U+201D, “”,
the escapes start being multi-octet UTF-8, so %E2%80%9C and %E2%80%9D.

Is there a useful way to turn those multi-octet escapes back into single
characters in XPath or XSLT?
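
For illustration, one possible shape of an answer, as a sketch only: decode
each run of escapes into octets, then fold each UTF-8 lead octet and its
continuation octets into one code point with plain integer arithmetic. It
assumes XSLT 3.0; the urn:example:local namespace and the function names
local:hex-octet (a stand-in for local:H2D), local:utf8-to-codepoints and
local:percent-decode are all made up for this example, and nothing here
validates malformed or truncated input.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:local="urn:example:local"
    exclude-result-prefixes="xs local">

  <xsl:output method="text"/>

  <!-- Hypothetical stand-in for local:H2D: two hex digits to 0..255. -->
  <xsl:function name="local:hex-octet" as="xs:integer">
    <xsl:param name="hex" as="xs:string"/>
    <xsl:variable name="digits" select="'0123456789ABCDEF'"/>
    <xsl:variable name="h" select="upper-case($hex)"/>
    <!-- A digit's value is the number of digits that precede it. -->
    <xsl:sequence select="
      16 * string-length(substring-before($digits, substring($h, 1, 1)))
         + string-length(substring-before($digits, substring($h, 2, 1)))"/>
  </xsl:function>

  <!-- Turn a sequence of UTF-8 octets into code points (no validation). -->
  <xsl:function name="local:utf8-to-codepoints" as="xs:integer*">
    <xsl:param name="octets" as="xs:integer*"/>
    <xsl:if test="exists($octets)">
      <xsl:variable name="b" as="xs:integer" select="head($octets)"/>
      <!-- The lead octet says how many octets the character occupies. -->
      <xsl:variable name="n" as="xs:integer" select="
        if ($b lt 128) then 1
        else if ($b ge 240) then 4
        else if ($b ge 224) then 3
        else 2"/>
      <!-- Continuation octets each carry six payload bits. -->
      <xsl:variable name="c" as="xs:integer*"
          select="subsequence($octets, 2, $n - 1) ! (. mod 64)"/>
      <xsl:sequence select="
        if ($n eq 1) then $b
        else if ($n eq 2) then ($b mod 32) * 64 + $c[1]
        else if ($n eq 3) then (($b mod 16) * 64 + $c[1]) * 64 + $c[2]
        else ((($b mod 8) * 64 + $c[1]) * 64 + $c[2]) * 64 + $c[3]"/>
      <!-- Recurse on whatever follows this character. -->
      <xsl:sequence
          select="local:utf8-to-codepoints(subsequence($octets, $n + 1))"/>
    </xsl:if>
  </xsl:function>

  <!-- Decode every run of %XX escapes; copy everything else through. -->
  <xsl:function name="local:percent-decode" as="xs:string">
    <xsl:param name="value" as="xs:string"/>
    <xsl:variable name="parts" as="xs:string*">
      <xsl:analyze-string select="$value" regex="(%[0-9A-Fa-f]{{2}})+">
        <xsl:matching-substring>
          <xsl:sequence select="
            codepoints-to-string(
              local:utf8-to-codepoints(
                tokenize(., '%')[. ne ''] ! local:hex-octet(.)))"/>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
          <xsl:sequence select="."/>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </xsl:variable>
    <xsl:sequence select="string-join($parts, '')"/>
  </xsl:function>

  <!-- Demo entry point: decodes the two escapes from this message. -->
  <xsl:template name="xsl:initial-template">
    <xsl:value-of select="local:percent-decode('%E2%80%9Chi%E2%80%9D')"/>
  </xsl:template>

</xsl:stylesheet>
```

Worked by hand for %E2%80%9C: the octets are 226, 128, 156; 226 is a
three-octet lead, so the code point is ((226 mod 16) * 64 + 0) * 64 + 28
= 8220, which is U+201C. Run from the initial template, the demo should
produce “hi”.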

Thanks!

-- 
Graydon Saunders  | graydonish@xxxxxxxxx
Þæs oferéode, ðisses swá mæg.
-- Deor  ("That passed, so may this.")
