After a period of serious doubt about XSLT 2 and its advantages over
XSLT 1.0, driven largely by concerns about the implications of
schema-aware processing and datatypes, I must say that I am now firmly
convinced of the value of XSLT 2.
One thing I realized is that datatyping is only an issue if you do
schema-aware XSLT processing. For a lot of work you don't need schema
awareness, so it never comes up. In particular, processing
schema-based documents does not necessarily imply schema-aware XSLT
processing (but it does enable it).[1]
Here at Innodata we've started working on a project that involves
XSLT-based data processing that is much more involved than our typical
workaday techdoc-to-FO transforms. We made an engineering decision to
use XSLT 2 and I'm glad we did. The new features in XSLT 2 have made
many difficult problems so much easier.
I must also congratulate both the XSLT committee and the editorial team
for producing a clear, easy-to-understand, and easy-to-use
specification. I've had little difficulty coming up to speed on the new
features of XSLT, many of which are quite sophisticated and non-obvious,
using just the specifications involved (although I have to admit that
the regular expression stuff in the schema spec is a little harder to
follow, but that's a side effect of the need for precision--it's not
intended to be a tutorial).
In particular, I'm finding that the for-each-group, namespace aliasing,
and string handling features are making many problems much, much easier
than they were with 1.0. In addition, the ability to create XSLT
functions makes it easier to have a more object-oriented style of code
without limiting your XSLT to a single engine or a small set of engines,
as was the case with XSLT 1. Finally, the new concept of sequence
constructors in XSLT 2 makes many operations, especially in the context
of functions, much clearer and more reliable.
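To give a flavor of the function and string handling features, here is
a minimal sketch rather than code from our project. The local:slug
function, its namespace URI, and the title and id element names are all
made up for illustration; lower-case(), normalize-space(), tokenize(),
and string-join() are standard XPath 2.0 functions.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:local="urn:example:local-functions"
  exclude-result-prefixes="xs local"
  version="2.0"
  >
  <!-- Hypothetical helper: lower-cases the text, splits it on runs of
       characters other than a-z and 0-9, drops empty tokens, and joins
       what is left with hyphens, so "Some Title, Here" becomes
       "some-title-here". -->
  <xsl:function name="local:slug" as="xs:string">
    <xsl:param name="text" as="xs:string"/>
    <!-- The body is a sequence constructor; xsl:sequence returns the
         computed string as the function result. -->
    <xsl:sequence
      select="string-join(
                tokenize(lower-case(normalize-space($text)),
                         '[^a-z0-9]+')[. ne ''],
                '-')"/>
  </xsl:function>
  <!-- Hypothetical use: derive an id value from each title element. -->
  <xsl:template match="title">
    <id><xsl:value-of select="local:slug(string(.))"/></id>
  </xsl:template>
</xsl:stylesheet>

Because xsl:function is part of the XSLT 2 language itself, a function
like this should work in any conforming XSLT 2 engine without
engine-specific extensions.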
The for-each-group stuff is just brilliant. I didn't really appreciate
just how powerful it was until I started using it. For example, here's
a script that uses for-each-group to report the set of unique element
types used in a document:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="2.0"
  >
  <xsl:template match="/">
    <xsl:message>
Unique Element Types:
    </xsl:message>
    <xsl:for-each-group select="//*"
      group-by="name()"
      >
      <xsl:sort select="name()"/>
      <xsl:message><xsl:value-of
        select="current-grouping-key()"/></xsl:message>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
I think that's pretty cool (not the script, but the fact that you can do
this with so little code).
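For comparison, here is a sketch of how the same report is usually
written in XSLT 1.0, using a key and generate-id() (the technique
commonly called Muenchian grouping). The key name elements-by-name is
just a name I picked for illustration.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0"
  >
  <!-- Index every element by its name. -->
  <xsl:key name="elements-by-name" match="*" use="name()"/>
  <xsl:template match="/">
    <xsl:message>
Unique Element Types:
    </xsl:message>
    <!-- Keep only the elements that are the first one in document
         order with their name, which gives one element per name. -->
    <xsl:for-each
      select="//*[generate-id() =
                  generate-id(key('elements-by-name', name())[1])]">
      <xsl:sort select="name()"/>
      <xsl:message><xsl:value-of select="name()"/></xsl:message>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

It works, but the grouping logic is buried in the predicate, whereas
for-each-group states it directly.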
Anyway, my hat is off to the XSLT 2 Working Group and my thanks go out
to implementors of XSLT 2 engines, especially Mike Kay for Saxon 8.
Cheers,
Eliot
-----
[1] Schema-aware XSLT processing is *not* the same as applying an XSLT to
a schema-based document where the document is validated on input to the
XSLT. In this case, the XML processor validates the document against its
schema and augments the info set passed to the XSLT engine to reflect
things like schema-defined default values, but the XSLT engine will
still treat all attribute values and element content as the generic "any
type". It's only in the case where the XSLT processor *also* accesses
the schema in order to determine the full datatype of attributes and
content that you can get weirdness in your XSLT as a result of values
being returned as types you didn't expect. The implication here is that
you write an XSLT script either as completely schema-unaware or as
completely schema-aware, and make sure you configure your XSLT processor
appropriately.
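As a minimal sketch of the kind of surprise I mean: the item element
and its qty attribute below are made up for illustration, and I'm
assuming a schema that declares qty as xs:integer. A test such as
@qty = '3' works in a schema-unaware run, where the untyped value is
compared as a string, but should raise a type error in a schema-aware
run, where it compares an integer to a string. An explicit cast
behaves the same way in both cases:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="xs"
  version="2.0"
  >
  <!-- "item" and "qty" are hypothetical names. The cast makes the
       intended type explicit whether or not the processor has
       already typed the attribute from a schema. -->
  <xsl:template match="item">
    <xsl:if test="xs:integer(@qty) = 3">
      <xsl:message>Found an item with qty 3</xsl:message>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>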
--
W. Eliot Kimber
Professional Services
Innodata Isogen
9390 Research Blvd, #410
Austin, TX 78759
(512) 372-8155
ekimber@xxxxxxxxxxxxxxxxxxx
www.innodata-isogen.com