Re: [xsl] Re: XPath incompatibilities

Subject: Re: [xsl] Re: XPath incompatibilities
From: Jeni Tennison <jeni@xxxxxxxxxxxxxxxx>
Date: Thu, 3 Jan 2002 18:35:52 +0000
Joerg Pietschmann wrote:
>> I guess that one possibility that would help in the case of < would
>> be to convert the nodes to whatever type they look like. If they
>> look like numbers, treat them as numbers; if they look like dates,
>> treat them as dates and so on.
>
> I beg to disagree: is 02020202 a number or a date? It is better to
> flag an error instead of trying to read minds. If some data is meant
> to be a date, it is a good idea to have this expressed somewhere,
> either in a type declaration in a schema, or by using explicit
> conversion functions in the expression. Explicit type conversions
> might also be good for optimizations.

Well, 02020202 is clearly a number - dates in XML Schema and XPath 2.0
are in the format YYYY-MM-DD.

It isn't possible to tell whether 02020202 is a float, double,
decimal or any one of the other numeric data types, but that doesn't
matter when doing comparisons between numbers. You *could* say that
02020202 is a gYear, I suppose, but again the comparison will be the
same whether you interpret it as a gYear or a number.

There are data types that aren't distinguishable from each other. The
data types that can be compared with <, according to the F&O WD
are:

  - numbers
  - durations
  - datetimes
  - strings

Numbers, datetimes and durations have mutually exclusive lexical
representations, so the only possible source of confusion between them
is the fact that the lexical representations of all three types could
be strings.  I would argue that if you're comparing two nodes whose
values both look like numbers, you're probably after a numeric
comparison, and similarly for durations and datetimes.
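
To illustrate the "convert to whatever it looks like" idea, here is a
rough sketch in Python (not XPath). The regular expressions are
simplified assumptions loosely modelled on the XML Schema lexical
forms, not the full grammars, and guess_type is a hypothetical name
used only for this example.

```python
import re

# Simplified lexical patterns (assumptions for illustration only).
NUMBER = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")
DURATION = re.compile(
    r"-?P(?=.)(\d+Y)?(\d+M)?(\d+D)?(T(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?"
)
DATETIME = re.compile(
    r"-?\d{4,}-\d{2}-\d{2}(T\d{2}:\d{2}:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2})?"
)


def guess_type(text):
    """Return the type a node's string value 'looks like'."""
    value = text.strip()
    if NUMBER.fullmatch(value):
        return "number"
    if DURATION.fullmatch(value):
        return "duration"
    if DATETIME.fullmatch(value):
        return "datetime"
    # Nothing else matched, so fall back to a plain string.
    return "string"


for sample in ["02020202", "2002-01-03", "P1Y2M", "foo"]:
    print(sample, "->", guess_type(sample))
```

With these patterns a value can match at most one of the three
non-string types: a duration has to start with P (after an optional
sign), and a date needs hyphens between its digit groups, which a
number never has.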

Also, in the case of comparisons, because the fallback is always a
comparison between strings, you wouldn't ever get an error (as you
suggest), you'd silently get a possibly unexpected result. I think
this is more dangerous than trying to "read minds" - imagine comparing
two untyped nodes whose values range between 1 and 10. Usually the
comparisons work just as expected (because lexicographic comparison
gives the same results as numeric comparisons between 1 and 9), but
when one of them is 10 then suddenly you get an unexpected result,
because "10" sorts before "2" as a string. No error is raised.
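
A quick way to see the 1-to-10 case, sketched in Python rather than
XPath (Python string comparison stands in for lexicographic
comparison here, which is an assumption made for the example):

```python
# String values "1" .. "10", as they would appear in untyped nodes.
values = [str(n) for n in range(1, 11)]

as_strings = sorted(values)             # lexicographic order
as_numbers = sorted(values, key=float)  # numeric order

print(as_strings)  # ['1', '10', '2', '3', '4', '5', '6', '7', '8', '9']
print(as_numbers)  # ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']

# The two comparisons agree for single digits but not once 10 appears.
print("9" < "10")  # False
print(9 < 10)      # True
```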

But I guess it comes down to whether you expect two nodes whose values
look like numbers to compare numerically or alphabetically. I can see
arguments both ways. The point of my suggestion was to provide a way
out of the problem of backwards incompatibility due to lexicographic
comparisons.

And having thought about it, introducing lexicographic comparisons
here at all is a problem for backwards compatibility, because if you
have:

  <range min="foo" max="bar" />

and do:

  @min > @max

then in XPath 1.0 the result is false because both 'foo' and 'bar' are
converted to NaN and compared (and NaN > NaN is false). With
lexicographic comparisons in XPath 2.0, it would be true, since 'foo'
is alphabetically after 'bar'.
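
The difference can be mimicked in Python, as a sketch only. The
helper xpath1_number is a hypothetical stand-in for the XPath 1.0
number() conversion; Python's float() accepts some forms that XPath
1.0 does not (exponents, for instance), so it is only an
approximation.

```python
import math


def xpath1_number(text):
    """Approximate XPath 1.0 number(): non-numeric strings become NaN."""
    try:
        return float(text.strip())
    except ValueError:
        return math.nan


def gt_xpath1(a, b):
    # XPath 1.0 style: convert both operands to numbers, then compare.
    return xpath1_number(a) > xpath1_number(b)


def gt_lexicographic(a, b):
    # Proposed XPath 2.0 style fallback: compare the strings themselves.
    return a > b


print(gt_xpath1("foo", "bar"))         # False, since NaN > NaN is false
print(gt_lexicographic("foo", "bar"))  # True, since 'foo' sorts after 'bar'
```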

I think that not overloading > with lexicographic comparisons is
therefore the sensible answer.

This still leaves what to do about nodes that could be numbers,
durations or datetimes, but since there's no overlap between the three
lexical representations, I don't think there's a problem with doing an
implicit conversion.

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

