Re: [xsl] How to read the encoding of an XML document

Subject: Re: [xsl] How to read the encoding of an XML document
From: Joerg Pietschmann <joerg.pietschmann@xxxxxx>
Date: Fri, 26 Oct 2001 09:32:31 +0200
James Garriss <jpgarriss@xxxxxxxx> wrote:
> Ok.  If you recall, I started this discussion by mentioning that I am
> receiving XML documents from several European countries.  So the pertinent
> question for me is "if UTF-8 and/or UTF-16 will be the output encoding set
> I must use, will they handle characters from the languages I care about?"

Look at it the other way around: XML handles, basically, Unicode
only (ignore nitpicks at this level), so if someone wants you to
handle stuff which cannot be expressed in Unicode characters, you
can't use XML/XSL anyway.

> So it seems to me that I should be safe outputting my data to UTF-16.  That
> make sense?

Actually, you can use whatever encoding you like: all text will be
output in a form any other XML tool will understand, provided that
tool understands the encoding you chose. Characters which the
chosen encoding cannot represent are written as numeric character
references. The only real difference is the size of the output
file: for mainly Western European languages the file will be
smallest with ISO-8859-1 or UTF-8, while for East Asian languages
UTF-16 may yield smaller output files. All XML parsers are required
to understand UTF-8 and UTF-16, and most modern browsers understand
both as well.
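
To make the size argument concrete, here is a rough sketch in Python
(not XSLT) which counts the bytes the same text needs in each
encoding. The sample strings and the helper name encoded_sizes are
made up for illustration. The last lines imitate how a serializer
might fall back to a numeric character reference when the target
encoding lacks a character.

```python
# Compare how many bytes the same text needs in different encodings.
# The sample strings are illustrative assumptions, not data from this thread.

# "utf-16-le" is used so the count excludes the 2-byte byte order mark
# that a real UTF-16 file would normally start with.
ENCODINGS = ("iso-8859-1", "utf-8", "utf-16-le")


def encoded_sizes(text):
    """Return {encoding: byte count}, or None where the text cannot be encoded."""
    sizes = {}
    for name in ENCODINGS:
        try:
            sizes[name] = len(text.encode(name))
        except UnicodeEncodeError:
            # The encoding has no representation for at least one character.
            sizes[name] = None
    return sizes


western = "Grüße aus München"   # mostly ASCII plus a few accented letters
east_asian = "日本語のテキスト"    # no ASCII characters at all

print(encoded_sizes(western))
print(encoded_sizes(east_asian))

# A character missing from the target encoding can still be written
# as a numeric character reference, which any XML parser resolves.
euro_in_latin1 = "10 €".encode("iso-8859-1", errors="xmlcharrefreplace")
print(euro_in_latin1)
```

For the Western European sample ISO-8859-1 is smallest and UTF-8 is
close behind; for the East Asian sample UTF-16 needs fewer bytes than
UTF-8, and ISO-8859-1 cannot encode the text directly at all.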

The caveat is that people tend to view XML documents with basic
editors which don't care about encodings, and complain if the
editor displays stuff they don't expect. Recent Windows releases
have an editor shipped with them which can be configured to
understand UTF-8.
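
What such an editor shows can be reproduced with a small Python
sketch. It assumes the editor treats every byte as ISO-8859-1; the
sample word is made up for illustration.

```python
# Decode UTF-8 bytes with the wrong encoding, as an editor which
# ignores the XML declaration might do.
original = "für"
utf8_bytes = original.encode("utf-8")       # the "ü" becomes two bytes

# Assumption: the editor interprets each byte as one ISO-8859-1 character.
misread = utf8_bytes.decode("iso-8859-1")
print(misread)

# Decoding with the encoding the file was written in restores the text.
restored = utf8_bytes.decode("utf-8")
print(restored)
```

The file itself is fine in both cases; only the interpretation of
the bytes differs.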

HTH
J.Pietschmann

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

