RE: [xsl] Is this a sorting bug in xalan 6.4.0?

From: "Roger Glover" <glover_roger@xxxxxxxxx>
Date: Fri, 7 Feb 2003 16:20:38 -0600

Michael Kay wrote:

> > I am sure there are lots of people here who can trump my
> > experience handily, but is my expectation that spaces will be
> > significant in string sorting, at least by default, really
> > *that* parochial?
>
> We all have parochial assumptions about how strings should be sorted,
> and they are all different. So the trend is for modern software systems
> to provide localisation support.

Right.  Java, for example, has extensive support for internationalization,
localization, and (through its "Comparable" and "Comparator" interfaces and
its "Collator" class) customization of ordering.  However, the default
behavior when comparing two String objects is that every character is
significant: visible characters, whitespace, and even unprintable
characters.
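
A minimal sketch of that default, assuming nothing beyond the standard
library (the class name, method name, and word list are illustrative and
have nothing to do with Xalan itself).  Sorting by String's natural
ordering puts the space, code 0x20, ahead of every letter:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CodepointSortDemo {
    // Sort a copy of the input using String's natural ordering
    // (String.compareTo), which compares char values one by one.
    static List<String> naturalSort(List<String> words) {
        List<String> copy = new ArrayList<String>(words);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<String>();
        words.add("adiabatic");
        words.add("ad hoc");
        words.add("adhibit");
        // Prints [ad hoc, adhibit, adiabatic]: the space in "ad hoc"
        // compares lower than the 'h' in "adhibit".
        System.out.println(naturalSort(words));
    }
}
```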


> Most of the programming languages you referred to probably use
> "codepoint" collation, where strings are sorted according to their
> numeric character codes.

Right.  Donald Knuth, in his definitive book, _The Art of Computer
Programming, Vol. 3:  Sorting and Searching_, calls that ordering
"character-code-based lexicographic ordering".


> It looks as if Xalan is trying to do a more
> intelligent sort.

Right.  And Rational Rose is being "intelligent" when it resizes and
reorders diagram elements that I have not touched.  But that does not mean
that it is welcome, correct, or (especially) wise for it to do so.


> Looking in my dictionary, I find that "ad hoc" appears
> between "adhibit" and "adiabatic": so you can't simply say that it's
> wrong to ignore spaces, it's clearly what some users want.

What?!  Your dictionary doesn't have "ad hominem"?   :^)

Anyway, if the average dictionary user or library card catalog user were
writing XSLT, I could give that argument more credence.  However, I believe
that the average XSLT writer is, at some level, a programmer, no doubt with
a programmer's intuition and expectations about how strings should be
sorted.  "Intelligent" sorting mechanisms are fine and extremely useful, as
*options*.  When such mechanisms are default behavior:
    - I am unable to predict program behavior
    - I am distracted from my main task to compensate for implementation
      differences
    - I start doubting the implementation stability of other fundamental
      behaviors
Predictable and well-defined ordering/sorting behavior is *far* more
important to me than "intelligent" sorting behavior.
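
A rough sketch of the "option" idea in Java terms, under the assumption
that the caller opts in by passing a Comparator.  The class and the
IGNORE_SPACES comparator are hypothetical names invented for this
illustration; they do not describe what Xalan does internally.  With no
comparator the order stays code-based; with it, "ad hoc" lands between
"adhibit" and "adiabatic", as in the dictionary:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class OptionalOrderingDemo {
    // Hypothetical opt-in comparator: compares strings with spaces removed,
    // then falls back to the natural ordering so ties are deterministic.
    static final Comparator<String> IGNORE_SPACES = new Comparator<String>() {
        public int compare(String a, String b) {
            int c = a.replace(" ", "").compareTo(b.replace(" ", ""));
            return c != 0 ? c : a.compareTo(b);
        }
    };

    // Sort a copy; a null comparator means the default natural ordering.
    static List<String> sorted(List<String> words, Comparator<String> order) {
        List<String> copy = new ArrayList<String>(words);
        copy.sort(order);
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<String>();
        words.add("adiabatic");
        words.add("ad hoc");
        words.add("adhibit");
        System.out.println(sorted(words, null));          // default
        System.out.println(sorted(words, IGNORE_SPACES)); // explicit option
    }
}
```

The point of the design is only that the caller, not the implementation,
decides when spaces stop being significant.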


> XSLT 2.0 (and Saxon 7.3) give you extensive control over the collation,
> recognizing that there is no one answer that suits everybody.

That is very welcome news!  Thank you.


-- Roger Glover
   glover_roger@xxxxxxxxx

***   "Luckily, you have come to exactly the right place with
***    your interesting problem, for there is no such word as
***    'impossible' in my dictionary.  In fact, everything
***    between 'herring' and 'marmalade' appears to be missing."
***        -- Douglas Adams
***           _Dirk Gently's Holistic Detective Agency_



 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

