[xsl] Problems transforming currency values using British pound (£) and Euro (€) signs

Subject: [xsl] Problems transforming currency values using British pound (£) and Euro (€) signs
From: Kaine Varley <kaine.varley@xxxxxxxxxxxx>
Date: Thu, 9 Oct 2003 15:00:30 +0100
Hello,

I would be grateful if someone could help me with my problem transforming
pound signs and Euro signs.

Scenario:
A format string (£#,##0.00) is used to format a number, using the built-in
format-number() XSLT function, into a currency value. The format value is
obtained from the data XML file that is being transformed, and looks a
little like this:

<Data TotalRows="4">
  <ColumnHeadings>
    <ColumnHeading Number="1">Value</ColumnHeading>
    <ColumnHeading Number="2">Label</ColumnHeading>
  </ColumnHeadings>
  <Rows>
    <Row Number="1" GUID="{F553FCD9-10C1-D511-91D9-000629864A98}">
      <Column Number="1" Type="Number" Format="£#,##0">12</Column>
      <Column Number="2" Type="Standard">A &amp; F GRANT LTD</Column>
    </Row>
    <Row Number="2" GUID="{CD4B980A-9ADC-D511-91EB-000629864A98}">
      <Column Number="1" Type="Number" Format="£#,##0">6</Column>
      <Column Number="2" Type="Standard">ACME Products</Column>
    </Row>
    <Row Number="3" GUID="{C18BEED0-956B-D611-9211-000629864A98}">
      <Column Number="1" Type="Number" Format="£#,##0">87</Column>
      <Column Number="2" Type="Standard">ABAC SERVICES LTD</Column>
    </Row>
    <Row Number="4" GUID="{3C2A26E3-5D51-D611-920B-000629864A98}">
      <Column Number="1" Type="Number" Format="£#,##0">1</Column>
      <Column Number="2" Type="Standard">ABACUS SERVICES LTD</Column>
    </Row>
  </Rows>
</Data>


Using this data, I generate an SVG bar chart, which is essentially another
XML document.


Problem:
The resulting transformed XML document does not parse properly, complaining
of illegal characters, when viewed in IE. (Error: An invalid character was
found in text content. Error processing resource bla..) 
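
For illustration only, the same class of error can be reproduced outside IE and MSXML. The sketch below uses Python's standard-library parser (an assumption made for convenience; it is not part of the setup described in this post) and feeds it a pound sign stored as the single Latin-1 byte 0xA3, first with no encoding declared and then with a matching declaration. The helper name parses is hypothetical.

```python
import xml.etree.ElementTree as ET

# The pound sign stored as a single Latin-1 byte (0xA3), no encoding declared.
# Without a declaration an XML parser assumes UTF-8, where a lone 0xA3 is invalid.
undeclared = b'<Column Format="\xa3#,##0">12</Column>'

# The same bytes, preceded by a declaration naming the encoding actually used.
declared = b'<?xml version="1.0" encoding="iso-8859-1"?>' + undeclared


def parses(data: bytes) -> bool:
    """Return True if the bytes form a well-formed XML document."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


undeclared_ok = parses(undeclared)   # rejected: invalid byte sequence for UTF-8
declared_ok = parses(declared)       # accepted: declaration matches the bytes
fmt = ET.fromstring(declared).get("Format")  # the decoded format string
```

The point of the sketch is only that the declaration has to agree with the bytes actually present in the document; whether that is what goes wrong in the MSXML pipeline here is not established.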


Theory:
This is a direct result of the pound (£) sign, since using a dollar ($)
sign works fine. Research has shown this character to be part of the
Latin-1 (extended Latin) character set, which is not part of ASCII, the
default character set used by Windows. (Please correct me if my facts aren't
accurate.) This means that, while the XML above looks well formed, the pound
character needs to have been converted to a Unicode character which, I
believe, looks something like this: Â£ (that is an accented letter A together
with the pound sign), although in some editors it still appears as one
character.
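
As a rough check of the byte-level picture behind this theory, the following Python sketch (Python is an assumption here, chosen only because it makes the bytes easy to inspect) encodes the pound sign in Latin-1 and in UTF-8, and then deliberately misreads the UTF-8 bytes as Latin-1, which produces the accented A followed by the pound sign.

```python
pound = "\u00a3"  # the pound sign, Unicode code point U+00A3

latin1_bytes = pound.encode("iso-8859-1")  # one byte: A3
utf8_bytes = pound.encode("utf-8")         # two bytes: C2 A3

# Reading the UTF-8 bytes as if they were Latin-1 yields two characters:
# U+00C2 (A with circumflex) followed by U+00A3 (the pound sign).
misread = utf8_bytes.decode("iso-8859-1")

# The dollar sign is plain ASCII, so it is the same single byte in both
# encodings, which would explain why "$" never triggers the problem.
dollar_same = "$".encode("iso-8859-1") == "$".encode("utf-8")

print(latin1_bytes.hex(), utf8_bytes.hex(), [hex(ord(c)) for c in misread])
```

This suggests the two-character form is the UTF-8 encoding of the one pound character viewed in a tool that assumes a single-byte code page, rather than a different character.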

I arrived at this conclusion after noticing that the XML data above, when
opened in XML Spy and saved to a file without editing, worked fine. Comparing
the two files with the MS WinDiff utility highlighted this double-character
difference, which I suspect is Unicode.


1st Attempt:
So, armed with this theory, I attempted to create the data file by allowing
the MSXML4 parser to convert the values to Unicode for me (at least that's
what I hoped for), by setting the XML encoding. To do this I tried to set
the document encoding (<?xml version="1.0" encoding="iso-8859-1"?>) prior to
building the rest of the document, in the hope that the parser would
understand that I'm trying to input my values in a character set other than
ASCII, namely Latin-1 (iso-8859-1). I used the VB syntax:

  objDOM.appendChild objDOM.createProcessingInstruction("xml", "version=""1.0"" encoding=""iso-8859-1""")

This had no effect, and the data file and the transformed document both
reported errors when viewed in IE. The encoding attribute of the XML
declaration did not appear in the output when the objDOM.xml property was
read, even though it had been supplied.


2nd Attempt:
The next thing I attempted was to create the data XML file without changing
the encoding and, before consuming the XML contents, append the declaration
to the front of the XML string. 

  strXML = "<?xml version=""1.0"" encoding=""iso-8859-1""?>" & objDOM.xml

Again, the result of the transform failed to parse, but this time the data
file contents were visible in IE without causing an error.


3rd Attempt:
Using the method described in the 1st attempt, I then called the save
method of the DOM to save the contents to a file. This gave me the same
results as in my 2nd attempt, in that the data was visible in IE without
causing an error, but the result of the transform still failed to parse.
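
The difference between reading a document out as a string and saving it to a file can be illustrated by analogy. The sketch below uses Python's standard library, not MSXML, so it is an assumption that the same distinction applies to objDOM.xml versus the save method: a text string carries no byte encoding, whereas a serialisation to bytes has to pick one and can record it in the declaration.

```python
import xml.etree.ElementTree as ET

# Build one <Column> element like those in the data file.
col = ET.Element("Column", {"Format": "\u00a3#,##0"})
col.text = "12"

# Serialising to a text string: no bytes are produced, so there is no
# encoding to declare and no XML declaration is written.
as_text = ET.tostring(col, encoding="unicode")

# Serialising to bytes: the chosen encoding is written into the declaration
# and the pound sign becomes the single Latin-1 byte 0xA3.
as_latin1 = ET.tostring(col, encoding="iso-8859-1")

has_decl_text = as_text.startswith("<?xml")
has_decl_bytes = as_latin1.startswith(b"<?xml")
```

Seen this way, prepending an encoding declaration to a string by hand only labels the text; it does not change how the characters are later turned into bytes.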


Conclusion:
I would have expected the parser to convert the input from whatever
character set I used, provided I let it know my character set, into Unicode,
but this doesn't appear to be happening. Simply stating the correct
character set encoding in the XML source doesn't help, since loading this
again doesn't appear to convert it into Unicode, and thus the result of the
transform contains invalid characters.

What am I doing wrong? Am I wrong in my assumption that the parser is
required to convert the input to Unicode? Should I be converting the input
to Unicode before adding it into the DOM?
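
One encoding-independent way to carry the pound sign, offered here only as a sketch and not as a confirmed fix for the MSXML case, is a numeric character reference (&#163;). The Python example below (standard library only, an assumption made for illustration) serialises the element as pure ASCII, which forces the reference to be used, and then parses it back to show the original character is recovered.

```python
import xml.etree.ElementTree as ET

# Build one <Column> element like those in the data file.
col = ET.Element("Column", {"Format": "\u00a3#,##0"})
col.text = "12"

# Serialising as ASCII cannot emit the pound sign directly, so the
# serialiser falls back to the numeric character reference &#163;.
ascii_bytes = ET.tostring(col, encoding="us-ascii")

# Parsing the ASCII-only bytes restores the real character.
round_trip = ET.fromstring(ascii_bytes).get("Format")
```

Because every byte of such output is plain ASCII, it reads the same whether a consumer assumes ASCII, Latin-1 or UTF-8.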



Kaine



