Re: [xsl] are all strings in a sequence valid potential QNames

Subject: Re: [xsl] are all strings in a sequence valid potential QNames
From: Justin Johansson <procode@xxxxxxxxxxx>
Date: Sat, 06 Feb 2010 06:11:10 +1030

Hi Liam,

Thanks for your kind explanation, particularly the historical
context of the reasons leading to 5e names.

As a C/C++ developer, having now noted that both the libxml and
expat libraries seem to have adopted 5e names as the default
(though allowing a compile-time switch for pre-5e names), and the
broad adoption of 5e names at large, it's beginning to appear that
the whole issue was a storm in a teacup.

It is with a certain amount of regret that I now feel unreasonably
"swayed by the blogs".  Further, I now understand that there was
little other choice that could have been made in giving the users
what they wanted.  Hopefully over time, any specs that remain
out of sync will be brought into line.

To conclude my interest in the topic, though, can you please say
exactly what will happen to the XML 1.1 rec?  Will it now be
deprecated or repealed?

With kind regards,

Justin Johansson

Liam R E Quin wrote:
On Thu, 2010-02-04 at 21:48 +1030, Justin Johansson wrote:

With respect, Liam, I find your comments a tad dismissive of the problem.

I'm sorry, I do not mean to sound dismissive.

What the issue is about is the question of validating source data
in (to the XML parser) and writing valid result data upon serialization.
As many others have pointed out, in the absence of an explicit XML version
identifier it's pretty difficult to unambiguously determine whether you
are validating an XML 1.0 source document against pre- or post- 1.0 5th

As with any spec with errata, or with multiple editions, one should
normally implement the most recent. The most important thing is that
the 5e names are a strict superset of 4e names: that is, every
document that is well-formed according to 4e remains well-formed
according to 5e, and similarly for validity.

So for reading, if you accept 5e you should be fine, and for writing,
the only time it makes a difference is if there are names that use
characters that were not allowed in earlier editions - 5e says that
you can serialize them in that case, and the only alternative is
to go wrong, since there's no escaping mechanism [1].
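
To make the reading side concrete, here is a minimal sketch of a 5e
name check in Python. The ranges are transcribed from the NameStartChar
and NameChar productions of XML 1.0 Fifth Edition; the function names
are made up for illustration and do not come from any parser. The
pre-5e tables (Appendix B of the earlier editions) are much longer and
are not reproduced here, so this only shows what 5e accepts.

```python
# Ranges from the NameStartChar production of XML 1.0, Fifth Edition.
NAME_START_RANGES = [
    (0x3A, 0x3A),   # ":"
    (0x41, 0x5A),   # A-Z
    (0x5F, 0x5F),   # "_"
    (0x61, 0x7A),   # a-z
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D),
    (0x2070, 0x218F), (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]

# Extra ranges that NameChar allows after the first character.
NAME_EXTRA_RANGES = [
    (0x2D, 0x2E),   # "-" and "."
    (0x30, 0x39),   # 0-9
    (0xB7, 0xB7),   # middle dot
    (0x300, 0x36F),
    (0x203F, 0x2040),
]


def _in_ranges(code_point, ranges):
    """Return True if code_point falls inside any (low, high) pair."""
    return any(low <= code_point <= high for low, high in ranges)


def is_name_start_char(ch):
    return _in_ranges(ord(ch), NAME_START_RANGES)


def is_name_char(ch):
    return is_name_start_char(ch) or _in_ranges(ord(ch), NAME_EXTRA_RANGES)


def is_name_5e(text):
    """Check text against the 5e Name production (not NCName or QName)."""
    if not text:
        return False
    return is_name_start_char(text[0]) and all(is_name_char(c) for c in text[1:])
```

With this sketch, is_name_5e("\u1200") is true: U+1200 is an Ethiopic
letter, a script added to Unicode after the version the earlier
editions' name tables were based on, so it is the kind of name that
only matters on the writing side.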

If there is a version of XML that would be a pleasure to support that
would have to be XML 1.0.5. Is it really too late?

What would you like XML 1.0.5 to be? (although that's for sure in danger of getting off-topic...)

Recall that (unfortunately) many XML 1.0 processors will in practice
refuse to accept an input document with a version declaration that's
other than "1.0", so a 1.0.5 is in no better position in that regard
than XML 1.1.

Originally I went to the XML Core Working Group and suggested that they
(1) relax the rule making it a fatal error if the version in the
    version declaration is not 1.0 (which they did)
(2) produce an XML 1.2 that brought XML up to date with Unicode, and
(3) consider deprecating XML 1.1.
But they felt - justifiably - that an XML 1.2 had little or no better
chance to succeed than 1.1, even with (1) in place, and conversations
with some of the implementors seemed to bear this out.
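
The difference between the strict behaviour and the relaxation in (1)
can be sketched as follows. This is only an illustration in Python: the
function names and the "relaxed" flag are hypothetical, the regular
expression is a simplification of the XML declaration grammar, and the
relaxed rule assumes the 5e VersionNum production ('1.' followed by one
or more digits).

```python
import re

# Simplified pattern for the version pseudo-attribute at the very start
# of a document; real parsers also deal with encodings and a BOM.
_XML_DECL = re.compile(r"""^<\?xml\s+version\s*=\s*(["'])([^"']*)\1""")


def declared_version(document):
    """Return the declared version string, or None if there is no declaration."""
    match = _XML_DECL.match(document)
    return match.group(2) if match else None


def accepts_version(version, relaxed=True):
    """Decide whether a hypothetical XML 1.0 processor takes this version.

    relaxed=False models the old rule: anything but "1.0" is refused.
    relaxed=True models the relaxed rule: any "1.x" is processed as 1.0.
    """
    if version is None:
        return True  # no declaration at all is fine for XML 1.0
    if relaxed:
        return re.fullmatch(r"1\.[0-9]+", version) is not None
    return version == "1.0"
```

Under these assumptions a document declaring version="1.1" is refused
by the strict check and accepted by the relaxed one, while a literal
"1.0.5" does not match the '1.x' pattern and is refused either way.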

For my part, I don't think we should try a whole lot more tinkering
with XML, or at least with XML 1.x, and I don't think the world is
ready for an XML 2.x. But, I am still very interested in hearing from
others in this regard, and yes, I was aware of the blogs. Sometimes
what's practical to get done isn't what you'd like to see happen, and
for me at least, 5e was certainly one of those cases, although I know
I'm also paying a penalty for mistakes I (and others) made with XML 1.1.

PS: [1] - I did consider that if 5e did not get adoption, we could move
to using "unichar", in which (as I propose), the lowercase letter "u"
is followed by one or more hex digits, up to a q, so that u38q would
work a bit like &#x38;, except of course that it would be allowed in
names. A literal "u" in a name would have to be replaced by u75q.
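
A rough sketch of how such a "unichar" scheme might look, in Python.
It reads the digits as hexadecimal, as the u75q example implies. The
function names and the set of characters passed through unescaped are
assumptions made for illustration; the proposal itself does not say
which characters would need escaping, and name-start restrictions are
ignored here.

```python
import re
import string

# Assumption: only these ASCII characters pass through unescaped.
# The lowercase "u" is deliberately left out because it introduces an escape.
_SAFE = set(string.ascii_letters + string.digits + "_-.") - {"u"}

# "u", one or more hex digits, then the terminating "q".
_ESCAPE = re.compile(r"u([0-9a-fA-F]+)q")


def unichar_encode(name):
    """Escape every character outside _SAFE as u<hex code point>q."""
    return "".join(c if c in _SAFE else f"u{ord(c):x}q" for c in name)


def unichar_decode(escaped):
    """Turn each u<hex>q sequence back into the character it stands for."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), escaped)
```

Because "q" is not a hex digit and every literal "u" is itself escaped,
decoding is unambiguous: unichar_encode("unit") gives "u75qnit", and
unichar_decode turns that back into "unit".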
