[xsl] RE: [xsl] Re: [xsl] Problems transforming currency values using British pound (£) and Euro (€) signs

Subject: [xsl] RE: [xsl] Re: [xsl] Problems transforming currency values using British pound (£) and Euro (€) signs
From: "Michael Kay" <mhk@xxxxxxxxx>
Date: Thu, 9 Oct 2003 16:57:34 +0100
> I would be grateful if someone could help me with my problem 
> transforming pound signs and Euro signs.

First point: currency symbols are not legal in the picture of
format-number(), though some processors may accept them.
(format-number() was defined by reference to the DecimalFormatSymbols
class of JDK 1.1, and currency symbols were added to the JDK in a later
version.)
> Scenario:
> A format string (£#,##0.00) is used to format a number, using 
> the built-in
> format-number() XSLT function, into a currency value. The 
> format value is obtained from the data XML file that is being 
> transformed, and looks a little like this:
> <Data TotalRows="4">
>   <ColumnHeadings>
>     <ColumnHeading Number="1">Value</ColumnHeading>
>     <ColumnHeading Number="2">Label</ColumnHeading>
>   </ColumnHeadings>
>   <Rows>
>     <Row Number="1" GUID="{F553FCD9-10C1-D511-91D9-000629864A98}">
>       <Column Number="1" Type="Number" Format="£#,##0">12</Column>
>       <Column Number="2" Type="Standard">A &amp; F GRANT LTD</Column>
>     </Row>
>     <Row Number="2" GUID="{CD4B980A-9ADC-D511-91EB-000629864A98}">
>       <Column Number="1" Type="Number" Format="£#,##0">6</Column>
>       <Column Number="2" Type="Standard">ACME Products</Column>
>     </Row>
>     <Row Number="3" GUID="{C18BEED0-956B-D611-9211-000629864A98}">
>       <Column Number="1" Type="Number" Format="£#,##0">87</Column>
>       <Column Number="2" Type="Standard">ABAC SERVICES LTD</Column>
>     </Row>
>     <Row Number="4" GUID="{3C2A26E3-5D51-D611-920B-000629864A98}">
>       <Column Number="1" Type="Number" Format="£#,##0">1</Column>
>       <Column Number="2" Type="Standard">ABACUS SERVICES LTD</Column>
>     </Row>
>   </Rows>
> </Data>
> Using this data, I generate an SVG bar chart, which is 
> essentially another XML document.
> Problem:
> The resulting transformed XML document does not parse 
> properly, complaining of illegal characters, when viewed in 
> IE. (Error: An invalid character was found in text content. 
> Error processing resource bla..)

It should not be possible to produce output from an XML transformation
that can't be parsed by an XML parser. So the question is, how did you
output the result of the transformation, and what did you do to it
before parsing it?

(I might also ask, why did you serialize the output, if all you wanted
to do was to parse it again? Why didn't you just write the output of the
transformation directly to a DOM?)
> Theory:
> This is as a direct result of the pound (£) sign, since using 
> a dollar ($) sign works fine. Research has shown this 
> character to be a part of the Latin-1 (extended latin) 
> character set, which is not part of ASCII, the default 
> characterset used by Windows. (Please correct me if my facts 
> aren't accurate). This means that, while the XML above looks 
> well formed, the pound character needs to have been converted 
> to a Unicode character which, I believe looks something like 
> this: Â£ (that is an accented letter A together with the 
> pound sign), although in some editors it still appears as one 
> character.

There are many inaccuracies in the above. Firstly, "the default
character set used by Windows" is not ASCII; in fact, there is no
default. It all depends on how you have configured Windows and which
software you are using.

Secondly, a Unicode £ sign looks like "£" if you display it correctly
using software that knows it is Unicode. It only looks like Â£ if you
(a) encode it using the UTF-8 encoding of Unicode, and (b) display it
using software that doesn't understand UTF-8 encoding.
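
To make the point about (a) and (b) concrete, here is a minimal Python
sketch (Python is my choice for illustration; it is not part of the
original setup). It encodes the pound sign as UTF-8 and then deliberately
misreads those bytes as ISO-8859-1:

```python
pound = "\u00a3"  # POUND SIGN, a single Unicode character

utf8_bytes = pound.encode("utf-8")          # two bytes: C2 A3
latin1_bytes = pound.encode("iso-8859-1")   # one byte:  A3

# Software that does not understand UTF-8 reads each byte as one character,
# so the two UTF-8 bytes show up as A-circumflex followed by the pound sign.
misread = utf8_bytes.decode("iso-8859-1")

print(utf8_bytes.hex(), latin1_bytes.hex(), misread)
```

The character itself never changes; only the bytes used to store it, and
the assumption made when reading those bytes back, differ.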

> 1st Attempt:
> So, armed with this theory, I attempted to create the data 
> file by allowing the MSXML4 parser to convert the values to 
> Unicode for me (at least that's what I hoped for), by setting 
> the XML encoding. To do this I tried to set the document 
> encoding (<?xml version="1.0" encoding="iso-8859-1"?>) prior 
> to building the rest of the document, in the hope that the 
> parser would understand that I'm trying to input my values in 
> a characterset other than ASCII, in Latin-1 or iso-8859-1.

If the encoding of the file is iso-8859-1, which it probably is if you
have a Western European version of Windows, and a run-of-the-mill text
editor, and if you avoid the special characters that Microsoft has added
to 8859-1, then you should put this XML declaration at the start of the
file. If it isn't, then you shouldn't. 
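
As a rough illustration of why the declaration has to match the bytes,
the following Python sketch uses the standard-library expat-based parser
as a stand-in for MSXML (an assumption: I am not claiming the two report
the error identically). The one-element document is hypothetical, cut down
from the data above:

```python
import xml.etree.ElementTree as ET

# Hypothetical one-element document; the pound sign is U+00A3.
body = '<Column Format="\u00a3#,##0">12</Column>'
latin1_body = body.encode("iso-8859-1")  # pound sign stored as the single byte A3


def parses(data: bytes) -> bool:
    """Return True if the bytes are accepted as well-formed XML."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# Same bytes, with and without a declaration naming their encoding.
labelled = b'<?xml version="1.0" encoding="iso-8859-1"?>' + latin1_body
unlabelled = latin1_body  # no declaration, so the parser assumes UTF-8

print(parses(labelled))
print(parses(unlabelled))
```

Without the declaration the parser treats the input as UTF-8, in which a
lone A3 byte is not a valid character, so the file is rejected in much
the same way as the "invalid character" error reported above.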

> used the VB syntax:
>   objDOM.appendChild objDOM.createProcessingInstruction("xml",
> "version=""1.0"" encoding=""iso-8859-1""")

Firstly, the XML declaration is not a processing instruction. Secondly,
character encoding applies only to an unparsed document. Once the data
is in a DOM, it consists of characters not bytes, and the encoding is
none of your concern any more. Once the data has been parsed, if the
encoding was wrongly labelled then there is no way of undoing the damage.
> 2nd Attempt:
> The next thing I attempted was to create the data XML file 
> without changing the encoding and, before consuming the XML 
> contents, append the declaration to the front of the XML string.
>   strXML = "<?xml version=""1.0"" encoding=""iso-8859-1""?>" 
> & objDOM.xml
> Again, the results of the transform failed, but this time the 
> data file contents were visible in IE without causing error.

You shouldn't be mucking around with the output of the serializer. It's
the serializer's job to output an XML declaration that reflects the
encoding it is actually using.
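
A small sketch of the same idea, again in Python with the standard
library rather than MSXML (so the calls below are Python's, not the VB
API): the tree holds characters, and the serializer picks the bytes and
writes a declaration that agrees with them.

```python
import xml.etree.ElementTree as ET

# Build the tree from characters; no encoding is involved at this stage.
root = ET.Element("Column", {"Format": "\u00a3#,##0"})
root.text = "12"

# Let the serializer choose the bytes and write the matching declaration.
as_latin1 = ET.tostring(root, encoding="iso-8859-1")
as_utf8 = ET.tostring(root, encoding="utf-8", xml_declaration=True)

print(as_latin1)
print(as_utf8)

# Both byte strings parse back to the same characters.
same = ET.fromstring(as_latin1).get("Format") == ET.fromstring(as_utf8).get("Format")
print(same)
```

Prepending a declaration by hand to an already-serialized string skips
this step, so nothing guarantees that the label and the bytes agree.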
> 3rd Attempt:
> Using the method described in the 1st attempt, I then called 
> the save method of the DOM to save the contents to a 
> file. This gave me the same results as mentioned in my 2nd 
> attempt, in that the data was visible in IE without causing 
> error, but the results of the transform still failed.

Either your original XML file is incorrectly encoded, or you are doing
something odd to the output of the transformation before passing it back
into an XML parser.

Michael Kay

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
