[xsl] RE: [xsl] RE: [xsl] Re: [xsl] Problems transforming currency values using British pound (£) and Euro (€) signs

Subject: [xsl] RE: [xsl] RE: [xsl] Re: [xsl] Problems transforming currency values using British pound (£) and Euro (€) signs
From: "Kaine Varley" <kaine.varley@xxxxxxxxxxxx>
Date: Thu, 9 Oct 2003 18:27:07 +0100
Michael,


> First point: currency symbols are not legal in the picture of
> format-number(), though some processors may accept them.
> (format-number() was defined by reference to the DecimalFormatSymbols
> class of JDK 1.1, and currency symbols were added to the JDK at a later
> version than this).

Thanks, I was not aware of that. However, the MS parser seems to cope
adequately with this.



> It should not be possible to produce output from an XML transformation
> that can't be parsed by an XML parser. So the question is, how did you
> output the result of the transformation, and what did you do to it
> before parsing it?

The XML data and the result of the transform can both be loaded into a DOM,
but, if saved as a file with a .XML extension and viewed using IE, generate
the error described.



> (I might also ask, why did you serialize the output, if all you wanted
> to do was to parse it again? Why didn't you just write the output of the
> transformation directly to a DOM?)

We use n-tier architecture where a backend object compiles the data. It
never passes objects across this boundary, so the XML is serialized and
handed across as a string.



> There are many inaccuracies in the above. Firstly, "the default
> character set used by Windows" is not ASCII, in fact there is no
> default. It all depends on how you have configured Windows and which
> software you are using.

I thought I was chancing my arm a bit there. I was trying to interpret Dave
Pawson's page on special characters
(http://www.dpawson.co.uk/xsl/characters.html) and must have got my wires
crossed a bit.


> Secondly, a Unicode £ sign looks like "£" if you display it correctly
> using software that knows it is Unicode. It only looks like "Â£" if you
> (a) encode it using the UTF-8 encoding of Unicode, and (b) display it
> using software that doesn't understand UTF-8 encoding.

That confirms some of my thoughts. After saving the file using XML Spy, I
ended up with two identical-looking files, but one worked and the other
didn't. Only after a great deal of messing did I finally land on the WinDiff
application that compares two files. It, I guess, doesn't understand UTF-8,
which is how I was eventually able to distinguish the differences between
the files.
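
For anyone following along, the byte-level difference can be reproduced
with a minimal Python sketch. Python is only an illustration here (it is
not part of my setup), and the variable names are my own:

```python
# The pound sign is the single Unicode character U+00A3.
POUND = "\u00a3"

utf8_bytes = POUND.encode("utf-8")         # two bytes: C2 A3
latin1_bytes = POUND.encode("iso-8859-1")  # one byte: A3

# Software that does not understand UTF-8 reads each byte as one
# ISO-8859-1 character, so the two UTF-8 bytes become two characters.
misread = utf8_bytes.decode("iso-8859-1")

print(utf8_bytes.hex(), latin1_bytes.hex(), misread)
```

The first file on disk holds one byte for the sign and the second holds
two, which is the kind of difference a byte-oriented comparison reveals.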


> If the encoding of the file is iso-8859-1, which it probably is if you
> have a Western European version of Windows, and a run-of-the-mill text
> editor, and if you avoid the special characters that Microsoft has added
> to 8859-1, then you should put this XML declaration at the start of the
> file. If it isn't, then you shouldn't. 

Ah, I have avoided using this encoding on transforms throughout my
application, rather relying on the default value, UTF-8 I believe. Since I'm
based in the UK and using UK settings, will the omission of the
[encoding="iso-8859-1"] attribute on my transforms be detrimental? 
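
A rough sketch of why the declaration matters when the bytes really are
iso-8859-1. It uses Python's xml.etree.ElementTree as a stand-in parser;
MSXML may report the problem differently, and the <price> element is a
made-up example:

```python
import xml.etree.ElementTree as ET

# A document whose bytes are ISO-8859-1: the pound sign is the single byte A3.
body = "<price>\u00a3 10.00</price>".encode("iso-8859-1")

# With a matching declaration the parser decodes the byte correctly.
declared = b'<?xml version="1.0" encoding="iso-8859-1"?>' + body
text_with_declaration = ET.fromstring(declared).text

# Without a declaration the parser assumes UTF-8, and a lone A3 byte
# is not valid UTF-8, so parsing fails.
try:
    ET.fromstring(body)
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False

print(repr(text_with_declaration), undeclared_ok)
```

Going the other way, bytes that really are UTF-8 need no encoding
declaration at all, since UTF-8 is what a parser assumes by default.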



> Firstly, the XML declaration is not a processing instruction. 

I was aware of that and concerned about it as well. However, it appears that
that is the recommended way of doing it in the Microsoft help files.


> Secondly,
> character encoding applies only to an unparsed document. Once the data
> is in a DOM, it consists of characters not bytes, and the encoding is
> none of your concern any more. Once the data has been parsed, if the
> encoding was wrongly labelled then there is no way of undoing the
> damage.

This confirms my findings. My feeling is that the data, not the XSLT, is
being incorrectly loaded. My aim was to provide a way of inputting the data
in the correct format, and once in, not having to worry about it again.
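
To illustrate what a wrongly labelled input does, here is a small sketch,
again using Python's xml.etree.ElementTree purely as an example parser and
an invented <price> element. The bytes are UTF-8 but the declaration
claims iso-8859-1:

```python
import xml.etree.ElementTree as ET

# The bytes are really UTF-8 (pound sign = C2 A3) ...
body = "<price>\u00a3 10.00</price>".encode("utf-8")

# ... but the declaration claims ISO-8859-1.
mislabelled = b'<?xml version="1.0" encoding="iso-8859-1"?>' + body

# Parsing succeeds, yet the tree now holds two characters where the
# author meant one. Nothing in the tree records that this happened.
parsed_text = ET.fromstring(mislabelled).text

print(repr(parsed_text), len(parsed_text))
```

No error is raised, so the wrong characters travel through the transform
and only show up in the output.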



> You shouldn't be mucking around with the output of the serializer. It's
> the serializer's job to output an XML declaration that reflects the
> encoding it is actually using.


I was trying anything and everything to find a solution.
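
For comparison, this is roughly what leaving it to the serializer looks
like. The sketch uses Python's xml.etree.ElementTree rather than MSXML,
and the <price> element is hypothetical:

```python
import xml.etree.ElementTree as ET

root = ET.Element("price")
root.text = "\u00a3 10.00"

# Ask the serializer for ISO-8859-1. It encodes the characters and
# writes a declaration naming that same encoding.
data = ET.tostring(root, encoding="iso-8859-1")

# Because declaration and bytes agree, the output parses back cleanly.
round_trip = ET.fromstring(data).text

print(data)
print(repr(round_trip))
```

Editing the declaration by hand afterwards would break that agreement
between the label and the bytes.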



> Either your original XML file is incorrectly encoded, or you are doing
> something odd to the output of the transformation before passing it back
> into an XML parser.

It is my hunch that it is the data, not the output, that is wrong. The output
of the transform is simply perpetuating the problem.


Thanks again for your help. Sometimes I find myself hitting my head against
a wall several times before the penny drops. I hope I don't have to hit it
too many times more ;-).


Kaine




 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

