RE: [xsl] This can't be right, XML with no root element: Saxon & XT vs. Xalan

Subject: RE: [xsl] This can't be right, XML with no root element: Saxon & XT vs. Xalan
From: Tony Graham <Tony.Graham@xxxxxxxxxxxxxxx>
Date: Mon, 30 Jul 2001 11:07:30 +0100 (BST)
Dylan Walsh wrote at 27 Jul 2001 14:37:16 +0100:
 > Thank you for that detailed answer. Two years working with XML/XSLT and

Happy to be of service.

 > Is it correct then to use this in a text declaration?:
 > <?xml encoding="utf-8"?>

Yes, even though the text declaration is unnecessary when the encoding 
is UTF-8.

 > On the more general issue, I take it that either Xalan or Cocoon has a
 > bug in this area (a bug I like, but a bug all the same). I'm wondering
 > about Schematron's output. While it may not be incorrect, wouldn't the
 > best thing be to use the text output mode? It is just text after all.

Possibly it would, but by using the XML output mode, you get something
that you can include in an XML document as an external parsed entity.
If there were any unescaped '&' or '<' characters in your text output,
you couldn't then use the output as part of an XML document.

 > When do you use external parsed entities with plain text? Do all
 > external parsed entities have to have the declaration, so you cannot
 > read in a simple ASCII file, for example?

An external parsed entity does not need to have a text declaration
if it's in UTF-8 or UTF-16, because the XML processor can recognise
those encodings and, therefore, doesn't need the one required part of
the text declaration: the encoding declaration.  Conversely, if it's
not in UTF-8 or UTF-16, you have to have the text declaration so you
can signal the encoding to the XML processor.

Yes, you can have an external parsed entity that is "a simple ASCII
file": ASCII is a subset of UTF-8, so you don't need the text
declaration.  (More precisely, UTF-8 represents the characters in the
ASCII character set the same way as they are represented in ASCII.
UTF-8, being Unicode, also uses the same code points for the
characters in ISO 8859-1 as ISO 8859-1 does, but represents them
differently, and you definitely can't treat ISO 8859-1 text as UTF-8,
or vice-versa, the way that you can with ASCII and UTF-8 over the
range of ASCII characters.)
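
To make the byte-level difference concrete, here is a minimal sketch in
Python (my choice of language; nothing in this thread depends on it).
The helper name decodes_as_utf8 is hypothetical, introduced only for
illustration. It shows that ASCII text has identical bytes in ASCII and
UTF-8, while a character from the upper half of ISO 8859-1 does not:

```python
# ASCII-only text encodes to identical bytes in ASCII and in UTF-8.
ascii_text = "plain ASCII text"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")

# U+00E9 (e with acute accent) has the same code point in Unicode and
# in ISO 8859-1, but the two encodings represent it with different bytes.
e_acute = "\u00e9"
latin1_bytes = e_acute.encode("iso-8859-1")  # one byte: 0xE9
utf8_bytes = e_acute.encode("utf-8")         # two bytes: 0xC3 0xA9


def decodes_as_utf8(data: bytes) -> bool:
    """Hypothetical helper: report whether the bytes are valid UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(same_bytes)
print(latin1_bytes, utf8_bytes)
print(decodes_as_utf8(latin1_bytes))
```

The last check is the practical consequence: an ISO 8859-1 file that
contains such a character is not valid UTF-8, so it needs a text
declaration naming its encoding.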

Note that your "simple ASCII file" has to escape any '&' or '<'
characters, otherwise your external parsed entity will be an
unparsable external parsed entity.
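
As a rough illustration of that escaping requirement, here is a sketch
in Python using the standard library's xml.sax.saxutils.escape.  It only
approximates an entity reference by wrapping the text in a <wrapper>
element before parsing; the wrapper element, the sample text, and the
helper name parses_as_element_content are all assumptions made for the
example, not anything Xalan, Saxon, or XT does:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def parses_as_element_content(entity_text: str) -> bool:
    """Hypothetical helper: approximate referencing the entity inside an
    element by wrapping its text in a made-up <wrapper> element."""
    try:
        ET.fromstring("<wrapper>" + entity_text + "</wrapper>")
        return True
    except ET.ParseError:
        return False


# Sample "simple ASCII file" content containing both problem characters.
raw = "if a < b && c: AT&T"

# escape() replaces '&' with '&amp;', '<' with '&lt;' and '>' with '&gt;'.
escaped = escape(raw)

print(parses_as_element_content(raw))
print(parses_as_element_content(escaped))
print(escaped)
```

The unescaped text is rejected by the parser, while the escaped text
parses and yields the original characters back as element content.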


Tony Graham
Tony Graham                           mailto:tony.graham@xxxxxxxxxxxxxxx
Sun Microsystems Ireland Ltd                       Phone: +353 1 8199708
Hamilton House, East Point Business Park, Dublin 3            x(70)19708
