Re: [xsl] XSLT vs Perl

Subject: Re: [xsl] XSLT vs Perl
From: David Tolpin <dvd@xxxxxxxxxxxxxx>
Date: Thu, 5 Feb 2004 09:26:43 +0400 (AMT)
> > XSLT 2.0 does).
> String manipulation in XSLT 2.0 is certainly inadequate for some of
> the up-translations that you might want to do, such as parsing
> non-regular languages such as HTML or LaTeX (both of which might
> reasonably appear embedded within an XML document). In these cases,
> using Perl (or another language) to pre-parse the embedded language
> into XML structures seems a very reasonable way to proceed.
> Did you have any other tasks in mind where the string-manipulation
> support in XSLT 2.0 is inadequate?

String manipulation is inadequate in XSLT 2.0. It is taken from
a language that lets one build, compose and transform the regular expressions
themselves freely and conveniently, and inserted into another one
almost without modification. Using regular expressions in XSLT 2.0
is just like writing hexadecimal codes for the target processor.

Writing 'analyze-string' for a thing as simple as the RFC 2822 addr-spec
(160 ASCII characters in length for the regular expression) is a task
for a good regexp hacker. In Perl, C, or Java, one can compose a regexp
from parts, store the parts in a data structure accessible by keys, and
provide as many abstraction layers above parsing as required.
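
To illustrate what composing a regexp from keyed parts can look like,
here is a minimal sketch in Python (chosen only for brevity; the same
idea carries over to Perl or Java). The names PARTS and compose are
hypothetical, and the grammar is a simplified reading of the RFC 2822
addr-spec: comments, folding whitespace and the obsolete forms are
left out, so this is not a conforming validator.

```python
import re

# Leaf patterns, stored by name. These are assigned directly because
# they contain literal braces that str.format would misread.
PARTS = {
    "atext": r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]",
    "qtext": r"[\x20\x21\x23-\x5b\x5d-\x7e]",      # simplified: space allowed
    "dtext": r"[\x21-\x5a\x5e-\x7e]",
    "quoted_pair": r"\\[\x01-\x09\x0b\x0c\x0e-\x7f]",
}

def compose(name, template):
    """Build a named pattern from already-defined parts (hypothetical helper)."""
    PARTS[name] = template.format(**PARTS)

# Each layer refers to the previous ones by key, not by pasted-in text.
compose("dot_atom", r"{atext}+(?:\.{atext}+)*")
compose("quoted_string", r'"(?:{qtext}|{quoted_pair})*"')
compose("domain_literal", r"\[(?:{dtext}|{quoted_pair})*\]")
compose("local_part", r"(?:{dot_atom}|{quoted_string})")
compose("domain", r"(?:{dot_atom}|{domain_literal})")
compose("addr_spec", r"(?P<local>{local_part})@(?P<domain>{domain})")

ADDR_SPEC = re.compile(PARTS["addr_spec"])

def parse_addr_spec(text):
    """Return (local_part, domain), or None if text does not match."""
    m = ADDR_SPEC.fullmatch(text)
    return (m.group("local"), m.group("domain")) if m else None
```

Under these assumptions, parse_addr_spec("dvd@example.org") yields the
pair ("dvd", "example.org"), while an input such as "a..b@example.org"
is rejected. The point is that every part has a name and can be tested
or replaced on its own.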

A powerful language for string analysis has been available for almost
30 years; it is called 'lex' (lex(1)).
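
For readers who have not used lex, the following Python sketch imitates
its rule-table style: each rule pairs a token name with a pattern, and
the scanner tries the rules at the current position. The rule set and
the name tokenize are hypothetical. One known difference: lex picks the
longest match, whereas this sketch takes the first rule that matches.

```python
import re

# Each entry pairs a token name with a pattern, much like one rule
# line in a lex specification (illustrative rule set only).
RULES = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[-+*/=()]"),
    ("SKIP", r"\s+"),
]

# Combine the rules into one alternation of named groups.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in RULES))

def tokenize(text):
    """Split text into (token_name, lexeme) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

Adding a token kind here means adding one line to RULES, which is the
kind of convenience the comparison with lex is about.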

> Can you expand on the kinds of grouping problems that you have that
> <xsl:for-each-group> can't handle? I'm not particularly enamoured with
> <xsl:for-each-group> myself, and it sounds as if you have another
> design in mind; what is it?

Why is grouping easy in Lisp, Java, C, and Ruby? Do those languages
have grouping constructs? Why should grouping be treated as a specific
family of algorithms at all? In fact, it is not one.

To make grouping easy, very few things are required, and these few things are
- two kinds of containers, sorted and unsorted
- proper tail recursion
- cutoff (or call/cc, or return)
- first-class templates (which are already possible through a reflection hack
  in XSLT; the only thing needed is to provide a more readable
  syntax for the hack).

These four points would provide everything required to implement
an easily usable grouping library, as well as libraries for other classes
of problems commonly found in textbooks on algorithms.
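
As a rough illustration of grouping built from general-purpose pieces,
the Python sketch below uses only an unsorted container (a dict), a
sorting step, and functions passed as values. The function names are
hypothetical. Python has neither proper tail recursion nor call/cc, so
plain loops and return stand in for those two points.

```python
from collections import defaultdict

def group_by(items, key):
    """Group items by key(item); key is an ordinary first-class function."""
    groups = defaultdict(list)          # unsorted container: key -> members
    for item in items:
        groups[key(item)].append(item)
    return dict(groups)

def sorted_groups(items, key):
    """Same grouping, returned as (key, members) pairs ordered by key."""
    return sorted(group_by(items, key).items())

def group_adjacent(items, key):
    """Group only runs of consecutive items that share a key."""
    runs = []
    for item in items:
        k = key(item)
        if runs and runs[-1][0] == k:
            runs[-1][1].append(item)    # extend the current run
        else:
            runs.append((k, [item]))    # start a new run
    return runs
```

With records such as ("Yerevan", "AM") and ("Lyon", "FR"), calling
sorted_groups(records, key=lambda r: r[1]) groups cities by country
code. Both kinds of grouping are short library functions here rather
than language keywords.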

Yet the XSLT 2.0 working draft has decided to go with a dozen new
keywords just to support one particular kind of problem.

> Pragmatically, I think XPath 2.0 does the right thing by having
> date/duration arithmetic support built into the language. In EXSLT,
> the date-related functions are among the most commonly used; "how do I
> insert the current date/time into my document" is one of the most
> frequent FAQs.

The most frequent questions are 'how can I modify a variable' and
'how can I update the XML document'. Are assignments and updates
in the draft? Have I missed something?

> Note that XSLT 2.0's date support comes via XPath 2.0, which is itself
> based on XML Schema, which is based on ISO 8601. ISO 8601 dates are
> different from "English dates"; XSLT 2.0 has support for formatting
> such dates using various calendars and languages, but you have to
> write your own code (probably using <xsl:analyze-string>) to parse
> them.

XML Schema is based on ISO 8601 but fails to conform to it in many ways.
Do you really think that analyze-string is an appropriate tool for
parsing dates?

My criticism is as much about XPath 2.0 as about XSLT 2.0.
Why not just provide a library mechanism, with clearly separated
domains, and describe a few standard library modules outside
the core languages?

Why does XPath 2.0 need to repeat a good half of Perl's flow control?

David Tolpin
