Re: [xsl] text extraction

Subject: Re: [xsl] text extraction
From: Abel Braaksma <abel.online@xxxxxxxxx>
Date: Thu, 12 Oct 2006 19:23:56 +0200
Andrew Welch wrote:


"The value of the encoding attribute provides the value of the
encoding parameter to the serialization method. The default value is
implementation-defined, but in the case of the xml and xhtml methods
it must be either UTF-8 or UTF-16."

(http://www.w3.org/TR/xslt20/#element-output)

...which took me a little by surprise - It seems to say that when the
output method is xml or xhtml the encoding MUST be either UTF-8 or
UTF-16?  Saxon doesn't seem to mind...

I remember reading somewhere that all XML-capable applications MUST at least be able to handle UTF-8 and UTF-16. (Strange, I never see UTF-32 mentioned anywhere.) Since an XSLT processor must parse XML and write it, it follows that at the very *least* it must understand UTF-8/16, I think.


Reading the passage above, I understand that for the text and html output methods the default may be something other than UTF-8/16. Good to know: I must set the encoding explicitly, since I almost exclusively do text processing with UTF-8 (but I'm afraid I'm drifting away from the subject now...)
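
For what it's worth, setting it explicitly might look roughly like the sketch below. It assumes text output in UTF-8; the template is only there for illustration and simply writes out the string value of the input document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- State the encoding explicitly instead of relying on the
       implementation-defined default for the text method. -->
  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Illustrative template (an assumption, not from the thread):
       emit the string value of the whole input document. -->
  <xsl:template match="/">
    <xsl:value-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

With the encoding attribute present, the serializer should no longer fall back on whatever default the processor happens to choose.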

-- Abel
