Subject: Re: ISO-8859-1 encoding and XmlDecl omision (was Re: [xsl] Looking up keys in a separate xml file)
From: David Carlisle <davidc@xxxxxxxxx>
Date: Wed, 7 Jan 2004 15:19:34 GMT
Andrew

> I would have thought that as ascii is a subset of utf-8, the processor could happily leave the declaration out knowing that any future parsing of the document would use utf-8 (by default) and could correctly read the file.

The reason why I had the reference to this part of the spec to hand was that I had quoted it in a comment on the official comment list:

http://lists.w3.org/Archives/Public/public-qt-comments/2003Nov/0050.html

According to
http://www.w3.org/TR/2003/WD-xslt-xquery-serialization-20030502/#N400318
the omit-xml-declaration parameter should be ignored if the standalone parameter is present, or if the encoding parameter specifies a value other than UTF-8 or UTF-16.

There is one other case where it would be very useful to omit the declaration (or at least to use a value of utf-8), namely iso-646 (aka ASCII aka US-ASCII).

It may be politically incorrect to say that ascii characters are still more interoperable than non-ascii characters, but in practice this is still the case. Especially in XML, which specifies that a charset given in the MIME headers takes precedence, it is hard to give (say) a utf-8 file to someone to serve from their website without first finding out what HTTP server they use, and how to make sure it won't serve the thing as Latin-1, resulting in a non-well-formed document. (See the current discussion on the W3C's TAG list about this.)

One style of producing XML files that avoids these problems is to produce files that don't have an XML declaration (or have one that specifies utf-8) but to encode all non-ascii characters as numeric character references.
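To make that output style concrete, here is a minimal sketch in Python (not part of the original comment, and not how Saxon or any XSLT processor does it). The helper name to_ascii_xml and the sample document are made up for illustration, and the sketch assumes that non-ascii characters occur only in text and attribute values, since character references are not allowed in element names, comments or CDATA sections.

```python
import xml.etree.ElementTree as ET


def to_ascii_xml(xml_text: str) -> bytes:
    """Hypothetical helper: return the markup as pure ASCII bytes.

    Every character outside ASCII is replaced by a decimal numeric
    character reference via Python's built-in 'xmlcharrefreplace'
    error handler. No XML declaration is added.
    """
    return xml_text.encode("ascii", errors="xmlcharrefreplace")


# Sample document (an assumption for illustration): e-acute and a euro sign.
doc = "<p>caf\u00e9 costs 3 \u20ac</p>"
data = to_ascii_xml(doc)

# Pure ASCII bytes read the same whether a server or parser labels them
# US-ASCII, UTF-8 or ISO-8859-1, so a wrong charset header cannot corrupt them.
assert data.decode("ascii") == data.decode("utf-8") == data.decode("iso-8859-1")

# With no declaration the parser falls back to UTF-8 and resolves the
# character references back to the original characters.
root = ET.fromstring(data)
assert root.text == "caf\u00e9 costs 3 \u20ac"
```

The point of the design is that the bytes on the wire never leave the ascii range, so the declared or assumed charset stops mattering for any ascii-compatible encoding.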
Currently, in a production XSLT 1 setup, I use

<xsl:output encoding="US-ASCII"/>

with Saxon and post-process with sed to remove the US-ASCII encoding declaration (which stops the file being parsed on several XML systems I have locally).

I think that it would be very desirable if

<xsl:output encoding="iso-646" omit-xml-declaration="yes"/>

was defined to work, and produce files of the form described above. Failing that, it would be good if the specification at least allowed it when the system understands that encoding.

--
http://www.dcarlisle.demon.co.uk/matthew

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list