Re: [xsl] CJK UTF-16 test

Subject: Re: [xsl] CJK UTF-16 test
From: Benjamin Franz <snowhare@xxxxxxxxxxx>
Date: Wed, 28 Mar 2001 09:00:53 -0800 (PST)
On Wed, 28 Mar 2001, David Carlisle wrote:

> 
> > as I don't have any parser that will swallow UTF-16. 
> 
> utf-16 support is _mandated_ by the XML spec. If you have anything that
> calls itself an XML parser it must be able to read utf-16.

XML does NOT support UTF-16 since UTF-16 includes the surrogates - that is
in fact what *distinguishes* it from UCS-2. That the XML 1.0 spec ('scuse
me, 'Recommendation') *says* that it requires support for UTF-16 is in
fact an error in the text since it explicitly forbids surrogates (aka
UTF-16) in the allowed char range spec. It is like saying 'We require
Japanese support, except you can't use *any* Japanese.' It's a nonsense
statement.

  "Character Range

   [2]   Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] |
                  [#x10000-#x10FFFF]

  /* any Unicode character, excluding the surrogate blocks, FFFE,
  and FFFF. */

         The mechanism for encoding character code points into bit
  patterns may vary from entity to entity. All XML processors must accept
  the UTF-8 and UTF-16 encodings of 10646;
                   ^ 
                   |
           The Error. What it actually requires is a
           specified subset of UTF-8 and UCS-2 encodings.
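
For anyone who wants to poke at production [2] directly, here is a rough Python sketch that tests code points against the ranges quoted above. The ranges are copied from the production; the names XML_CHAR_RANGES, is_xml_char and first_invalid_char are mine, made up for illustration, and are not part of any parser's API. It only looks at code points, not at how any particular encoding lays them out in bytes.

```python
# Ranges copied from production [2] as quoted above.
# All names here are illustrative assumptions, not a real parser API.
XML_CHAR_RANGES = (
    (0x9, 0x9),
    (0xA, 0xA),
    (0xD, 0xD),
    (0x20, 0xD7FF),
    (0xE000, 0xFFFD),
    (0x10000, 0x10FFFF),
)


def is_xml_char(code_point: int) -> bool:
    """Return True if code_point falls in one of the ranges of production [2]."""
    return any(lo <= code_point <= hi for lo, hi in XML_CHAR_RANGES)


def first_invalid_char(text: str):
    """Return the index of the first character outside production [2], or None."""
    for index, ch in enumerate(text):
        if not is_xml_char(ord(ch)):
            return index
    return None
```

With this, is_xml_char(0xD800) and is_xml_char(0xFFFE) are False, while is_xml_char(0x4E2D) (a CJK ideograph) and is_xml_char(0x10000) are True, which is just the production as written.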

-- 
Benjamin Franz

"Real programmers can write assembly code in any language." 
                                 -- Larry Wall 


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

