Subject: Re: [xsl] Character Encoding Problem
From: "Tony Graham tgraham@xxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 25 Sep 2014 15:11:32 -0000

On Thu, September 25, 2014 11:32 am, Tony Graham tgraham@xxxxxxxxxx wrote:
> On Tue, September 23, 2014 9:32 pm, Michael Kay mike@xxxxxxxxxxxx wrote:
>> On 23 Sep 2014, at 21:23, Craig Sampson craig.sampson@xxxxxxx
>> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>> We're trying to create a Java properties file using XSLT 2.0 in
>>> Saxon.
>>> The input is XML encoded as UTF-8. The properties file needs to be
>>> encoded as ISO-8859-1. The character giving the problem, in the input
>>> file, is “ which is a left hand double quote. Looking at the
>>> ISO-8859-1 character set the closest character appears to be a double
>>> quote with no hand (left/right).
>
> To move the goalposts,

Since I inadvertently ended up repeating most of Wolfgang Laun's advice,
let me try again with something more original:

----
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:m="http://www.mentea.net/namespace"
    version="2.0"
    exclude-result-prefixes="m xs">

<xsl:output method="text" encoding="ISO-8859-1" />

<!-- Escape each character that ISO-8859-1 cannot represent. -->
<xsl:template match="text()">
  <xsl:analyze-string select="." regex="[Ā-]">
    <xsl:matching-substring>
      <xsl:value-of select="m:escape(.)" />
    </xsl:matching-substring>
    <xsl:non-matching-substring>
      <xsl:value-of select="." />
    </xsl:non-matching-substring>
  </xsl:analyze-string>
</xsl:template>

<!-- '\u' plus the codepoint as hex, zero-padded to four digits. -->
<xsl:function name="m:escape" as="xs:string">
  <xsl:param name="char" as="xs:string" />

  <xsl:variable name="hex-chars"
                select="m:to-hex(string-to-codepoints($char))"
                as="xs:string+" />

  <xsl:sequence
      select="string-join(('\u',
                           substring('000', count($hex-chars)),
                           $hex-chars),
                          '')" />
</xsl:function>

<!-- Recursively convert a codepoint to a sequence of hex digits. -->
<xsl:function name="m:to-hex" as="xs:string+">
  <xsl:param name="codepoint" as="xs:decimal" />

  <xsl:sequence
      select="if ($codepoint >= 16)
                then m:to-hex(floor($codepoint div 16))
              else ()" />
  <xsl:sequence
      select="substring('0123456789ABCDEF',
                        ($codepoint mod 16) + 1,
                        1)" />
</xsl:function>

</xsl:stylesheet>
----

(though it does borrow from and correct
http://www.biglist.com/lists/xsl-list/archives/200012/msg00426.html).

Regards,

Tony Graham tgraham@xxxxxxxxxx
Consultant http://www.mentea.net
Chair, Print and Page Layout Community Group @ W3C XML Guild member
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Mentea XML, XSL-FO and XSLT consulting, training and programming
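
For anyone wanting to check the output on the Java side, here is a minimal sketch, not part of the stylesheet above. It assumes the goal is the four-hex-digit backslash-u form that java.util.Properties.load() decodes when reading an ISO-8859-1 stream. The class name EscapeCheck and the helpers escape() and roundTrip() are hypothetical names made up for this illustration; escape() only mirrors what m:escape is meant to produce and does not handle backslashes or other characters that are special in properties files.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class EscapeCheck {

    // Hypothetical helper: replaces every UTF-16 code unit above U+00FF
    // with a backslash-u escape of four uppercase hex digits.
    // Characters outside the BMP therefore come out as two escapes
    // (one per surrogate), which is the form Properties.load() expects.
    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0xFF) {
                out.append(String.format("\\u%04X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Writes "key=escaped value" as ISO-8859-1 bytes and reads it back
    // through java.util.Properties, returning the decoded value.
    static String roundTrip(String key, String value) {
        byte[] bytes = (key + "=" + escape(value) + "\n")
                .getBytes(StandardCharsets.ISO_8859_1);
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        String value = "\u201CHello\u201D";  // left and right double quotes
        System.out.println(escape(value));
        System.out.println(value.equals(roundTrip("greeting", value)));
    }
}
```

If the escaped text survives the round trip, the properties file can stay in ISO-8859-1 and the application still sees the original left-hand double quote rather than a plain one.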