Subject: Re: [xsl] Character Encoding Problem
From: "Tony Graham tgraham@xxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 25 Sep 2014 15:11:32 -0000
On Thu, September 25, 2014 11:32 am, Tony Graham tgraham@xxxxxxxxxx wrote:
> On Tue, September 23, 2014 9:32 pm, Michael Kay mike@xxxxxxxxxxxx wrote:
>> On 23 Sep 2014, at 21:23, Craig Sampson craig.sampson@xxxxxxx
>> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>> We're trying to create a Java properties file using XSLT 2.0 in
>>> Saxon.
>>> The input is XML encoded as UTF-8. The properties file needs to be
>>> encoded as ISO-8859-1. The character giving the problem, in the input
>>> file, is “ which is a left hand double quote. Looking at the
>>> ISO-8859-1 character set the closest character appears to be a double
>>> quote with no hand (left/right).
>
> To move the goalposts,
Since I inadvertently ended up repeating most of Wolfgang Laun's advice,
let me try again with something more original:
----
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:m="http://www.mentea.net/namespace"
    version="2.0"
    exclude-result-prefixes="m xs">

  <xsl:output method="text" encoding="ISO-8859-1" />

  <!-- Escape characters above U+00FF, which ISO-8859-1 cannot represent.
       The upper bound of the range did not survive in the archived copy
       of this message; U+10FFFF is an assumption. -->
  <xsl:template match="text()">
    <xsl:analyze-string select="."
                        regex="[&#x100;-&#x10FFFF;]">
      <xsl:matching-substring>
        <xsl:value-of select="m:escape(.)" />
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="." />
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

  <!-- Returns the Java-style \uXXXX escape for a single character,
       zero-padded to at least four hex digits. -->
  <xsl:function name="m:escape" as="xs:string">
    <xsl:param name="char" as="xs:string" />
    <xsl:variable name="hex-chars"
                  select="m:to-hex(string-to-codepoints($char))"
                  as="xs:string+" />
    <xsl:sequence
        select="string-join(('\u',
                             substring('000', count($hex-chars)),
                             $hex-chars),
                            '')" />
  </xsl:function>

  <!-- Returns the hex digits of $codepoint, most significant first,
       one string per digit. -->
  <xsl:function name="m:to-hex" as="xs:string+">
    <xsl:param name="codepoint" as="xs:decimal" />
    <xsl:sequence
        select="if ($codepoint >= 16)
                then m:to-hex(floor($codepoint div 16))
                else ()" />
    <xsl:sequence select="substring('0123456789ABCDEF',
                                    ($codepoint mod 16) + 1, 1)" />
  </xsl:function>
</xsl:stylesheet>
----
(though it does borrow from and correct
http://www.biglist.com/lists/xsl-list/archives/200012/msg00426.html).
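
One case the stylesheet above does not appear to cover is a character beyond U+FFFF. A Java properties file reads exactly four hex digits after \u, so such a character would have to be written as a UTF-16 surrogate pair. The following is only a sketch of one way to do that, not the approach from this thread: the x: namespace URI and the x:escape and x:unit functions are hypothetical names made up for illustration, and it walks the code points directly instead of using xsl:analyze-string.
----
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:x="http://example.com/hypothetical"
    version="2.0"
    exclude-result-prefixes="x xs">

  <xsl:output method="text" encoding="ISO-8859-1" />

  <!-- Escape every code point above U+00FF; copy the rest unchanged. -->
  <xsl:template match="text()">
    <xsl:value-of
        select="for $cp in string-to-codepoints(.)
                return if ($cp gt 255)
                       then x:escape($cp)
                       else codepoints-to-string($cp)"
        separator="" />
  </xsl:template>

  <!-- Hypothetical helper: one \uXXXX escape for a BMP code point,
       a high/low surrogate pair of escapes for anything above U+FFFF. -->
  <xsl:function name="x:escape" as="xs:string">
    <xsl:param name="cp" as="xs:integer" />
    <xsl:sequence
        select="if ($cp gt 65535)
                then concat(x:unit(55296 + (($cp - 65536) idiv 1024)),
                            x:unit(56320 + (($cp - 65536) mod 1024)))
                else x:unit($cp)" />
  </xsl:function>

  <!-- Hypothetical helper: a 16-bit value as \u plus exactly four
       uppercase hex digits. -->
  <xsl:function name="x:unit" as="xs:string">
    <xsl:param name="n" as="xs:integer" />
    <xsl:sequence
        select="concat('\u',
                       string-join(for $shift in (4096, 256, 16, 1)
                                   return substring('0123456789ABCDEF',
                                                    ($n idiv $shift) mod 16 + 1,
                                                    1),
                                   ''))" />
  </xsl:function>
</xsl:stylesheet>
----
With this sketch, U+201C (the left double quotation mark from the original question) comes out as \u201C, and U+1F600 comes out as the pair \uD83D\uDE00.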
Regards,
Tony Graham tgraham@xxxxxxxxxx
Consultant http://www.mentea.net
Chair, Print and Page Layout Community Group @ W3C XML Guild member
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Mentea XML, XSL-FO and XSLT consulting, training and programming