Subject: Re: [xsl] Safe-guarding codepoints-to-string() from wrong input
From: Abel Braaksma <abel.online@xxxxxxxxx>
Date: Wed, 20 Dec 2006 16:18:31 +0100
If you are receiving strings containing literal control characters, then they're almost certainly encoded in Windows-1252. Just decode them using that encoding and you'll be OK.
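As a rough illustration of the advice above, here is a minimal Python sketch. It assumes the "control characters" are bytes in the 0x80-0x9F range, which Latin-1 treats as C1 controls but Windows-1252 maps to printable characters such as curly quotes and dashes. The function name `decode_windows_1252` and the sample bytes are made up for the example.

```python
def decode_windows_1252(raw: bytes) -> str:
    """Decode bytes as Windows-1252 (Python codec name: "cp1252").

    Assumption: the input really is Windows-1252. Python's codec raises
    UnicodeDecodeError for the five byte values that Windows-1252 leaves
    undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D).
    """
    return raw.decode("cp1252")


# Hypothetical sample: curly quotes (0x93, 0x94), an en dash (0x96), e-acute (0xE9).
sample = b"\x93quoted\x94 \x96 caf\xe9"

as_latin1 = sample.decode("latin-1")        # 0x93 becomes the control character U+0093
as_cp1252 = decode_windows_1252(sample)     # 0x93 becomes U+201C, a left curly quote
```

Decoding the same bytes as Latin-1 is what produces the stray control characters in the first place, so the two results differ only for bytes in the 0x80-0x9F range.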
If the string contains control characters as character references, then it's a bit harder, because the references get expanded using Unicode codepoints and not those specified in the Windows-1252 mappings. So you need to parse and serialize the string to expand the references. (I personally use JTidy with the CharEncoding set to Configuration.RAW, which forces Tidy to output the bytes instead of a reference.)
It's a pain...
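The character-reference case can be sketched without JTidy. The following is not the JTidy approach described above but a swapped-in alternative in Python: it rewrites numeric character references whose value falls in 0x80-0x9F to the character that Windows-1252 assigns to that byte, and leaves every other reference alone. The function name `fix_c1_references` and the regular expression are assumptions made for this example.

```python
import re

# Matches decimal (&#147;) and hexadecimal (&#x93;) numeric character references.
_NCR = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def fix_c1_references(text: str) -> str:
    """Replace references to codepoints 0x80-0x9F with their Windows-1252 characters."""

    def _replace(match: re.Match) -> str:
        hex_digits, dec_digits = match.group(1), match.group(2)
        codepoint = int(hex_digits, 16) if hex_digits else int(dec_digits)
        if 0x80 <= codepoint <= 0x9F:
            try:
                # Treat the number as a Windows-1252 byte, not a Unicode codepoint.
                return bytes([codepoint]).decode("cp1252")
            except UnicodeDecodeError:
                # Byte is undefined in Windows-1252 (e.g. 0x81): keep the reference.
                return match.group(0)
        return match.group(0)

    return _NCR.sub(_replace, text)


fixed = fix_c1_references("&#147;hi&#148; &#x96; &#233; &#129;")
```

Here `&#147;`, `&#148;` and `&#x96;` become a left curly quote, a right curly quote and an en dash, while `&#233;` and the undefined `&#129;` are left as they were. This has to run on the serialized text before an XML parser expands the references.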
Thanks, -- Abel