Re: [xsl] How to convert XML doc from UTF-8 to ISO-8859-1 char encoding?

Subject: Re: [xsl] How to convert XML doc from UTF-8 to ISO-8859-1 char encoding?
From: "G. Ken Holman" <gkholman@xxxxxxxxxxxxxxxxxxxx>
Date: Mon, 11 Jan 2010 09:50:52 -0500
At 2010-01-11 15:30 +0100, Ben Stover wrote:
Assume I have an XML doc which is UTF-8 encoded.

Can I convert it somehow to ISO-8859-1 encoding?

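You can create a new document using XSLT: copy the nodes with an identity transform and request the new encoding through the encoding= attribute of <xsl:output>. The document is then reconstituted in the new encoding, provided your XSLT processor accepts your request for that encoding of the result.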

And how do I encode it in the opposite direction?

Same way.
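
The same parse-and-reserialize idea can be sketched outside XSLT. What follows is only an illustration, assuming Python's standard xml.etree.ElementTree module in place of an XSLT processor; the sample document and the variable names are made up for the example. Note that a character with no ISO-8859-1 code point (the euro sign here) comes out as a numeric character reference rather than being lost.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: UTF-8 bytes with a matching XML Declaration.
source_text = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<note>caf\u00e9 \u00a9 2010, 5 \u20ac</note>'
)
utf8_bytes = source_text.encode("utf-8")

# Parsing decodes the bytes according to the XML Declaration.
root = ET.fromstring(utf8_bytes)

# Reserialize as ISO-8859-1. The serializer writes a new declaration, and
# characters that ISO-8859-1 cannot represent become character references.
latin1_bytes = ET.tostring(root, encoding="ISO-8859-1")

# The opposite direction works the same way: parse, then reserialize.
utf8_again = ET.tostring(
    ET.fromstring(latin1_bytes), encoding="UTF-8", xml_declaration=True
)

print(latin1_bytes)
print(utf8_again)
```

The tree in the middle carries no encoding at all; the encoding is chosen only at the moment the document is written out again.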


Normally the encoding is defined in an attribute of the topmost <xml> tag.

False.


The XML Declaration at the beginning of XML
documents informs an XML processor about the
syntax of the XML document.  It is not a tag.  It does not have attributes.

The XML Declaration has the same syntax as a
processing instruction, and the parameters of the
declaration borrow the same syntax as attributes,
but the XML Declaration does not show up in the
XPath data model for XML because it isn't data ... it is syntax.
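
A small sketch may make the distinction concrete. It assumes Python's standard xml.dom.minidom module and uses the DOM only as a stand-in for the XPath data model (they are not the same model); the sample document and the style.xsl name are hypothetical. A real processing instruction survives parsing as a node, while the XML Declaration does not.

```python
from xml.dom import minidom

# Hypothetical document: an XML Declaration, a genuine processing
# instruction, and an empty document element.
doc_bytes = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<?xml-stylesheet type="text/xsl" href="style.xsl"?>'
    b'<doc/>'
)

dom = minidom.parseString(doc_bytes)

# List the top-level nodes of the parsed document.
top_level = [node.nodeName for node in dom.childNodes]
print(top_level)
```

Only the xml-stylesheet processing instruction and the doc element appear in the list; the declaration was consumed by the parser as syntax.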

Is there a way to detect if this declaration is
true and corresponds with the real encoding in the full XML doc?
Or if it is faked/misplaced by mistake?

If the encoding actually used in the document does not match the encoding stated in the XML Declaration, then at best your XML processor will halt and report an error. At worst, your XML processor will not encounter an encoding error at all, and you will end up with the wrong characters in your data model for your XML without knowing they are wrong.

For example, if you have ever seen a capital A
with circumflex "Â" followed by a copyright
symbol "©", that is because the two bytes of the
UTF-8 encoded copyright symbol have been successfully
interpreted as two ISO-8859-1 characters.  That
is wrong, but it is not an error in the
ISO-8859-1 encoding, so you get wrong data with no error message.
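
Both outcomes can be reproduced in a few lines. This is a sketch only, assuming Python's standard xml.etree.ElementTree module as the XML processor; the two mislabelled documents are constructed by hand for the illustration.

```python
import xml.etree.ElementTree as ET

# The copyright symbol U+00A9 is two bytes in UTF-8: 0xC2 0xA9.
copyright_utf8 = "\u00a9".encode("utf-8")

# Read as ISO-8859-1, each byte is a valid character on its own.
misread = copyright_utf8.decode("iso-8859-1")

# Worst case: UTF-8 bytes under an ISO-8859-1 declaration parse silently.
mislabelled_latin1 = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?><doc>'
    + copyright_utf8
    + b"</doc>"
)
silent_text = ET.fromstring(mislabelled_latin1).text

# Best case: ISO-8859-1 bytes under a UTF-8 declaration are rejected,
# because a lone 0xA9 byte is not valid UTF-8.
mislabelled_utf8 = (
    b'<?xml version="1.0" encoding="UTF-8"?><doc>'
    + "\u00a9".encode("iso-8859-1")
    + b"</doc>"
)
try:
    ET.fromstring(mislabelled_utf8)
    reported_error = False
except ET.ParseError:
    reported_error = True

print(repr(misread), repr(silent_text), reported_error)
```

The first mislabelled document yields the two wrong characters without any complaint; only the second one produces an error.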

Character encoding integrity is not the XML
processor's responsibility, but the creator's
responsibility.  The XML processor must take the XML Declaration at face
value.

I hope this helps.

. . . . . . . . . . . . . Ken

--
UBL and Code List training:      Copenhagen, Denmark 2010-02-08/10
XSLT/XQuery/XPath training after http://XMLPrague.cz 2010-03-15/19
XSLT/XQuery/XPath training:   San Carlos, California 2010-04-26/30
Vote for your XML training:   http://www.CraneSoftwrights.com/s/i/
Crane Softwrights Ltd.          http://www.CraneSoftwrights.com/s/
Training tools: Comprehensive interactive XSLT/XPath 1.0/2.0 video
Video lesson:    http://www.youtube.com/watch?v=PrNjJCh7Ppg&fmt=18
Video overview:  http://www.youtube.com/watch?v=VTiodiij6gE&fmt=18
G. Ken Holman                 mailto:gkholman@xxxxxxxxxxxxxxxxxxxx
Male Cancer Awareness Nov'07  http://www.CraneSoftwrights.com/s/bc
Legal business disclaimers:  http://www.CraneSoftwrights.com/legal
