Re: [xsl] XSLT/XPath 2.0 (was "Identifying two tags...")

From: Dan Holmsand <holmsand@xxxxxxxxxxxxx>
Date: Mon, 13 May 2002 21:22:15 +0200
Hi Stuart,

Stuart Celarier wrote:
The concern with complexity in technology is quite valid. To continue
with the C++ comparison, there is fair criticism that it is more
difficult to learn object-oriented programming from C++ than it is from
Java, Smalltalk, etc., because C++ is more complex. For this reason,
most college OOP courses are taught using Java. Much has been written
about how to teach C++ effectively in order to emphasize the OOP since
that is the hardest part for most students to comprehend.

That said, I still am going to distinguish between the audience for a
specification and the audience for the technology represented in that
specification. The former are implementers of the technology; the latter
are the users of the technology.

I (partly) agree with the audience description. But I stubbornly maintain that complexity in a specification will not magically disappear at the user level. C++ is complex because it *is* complex, not because it is explained poorly to students.


[ excellent summary of xml omitted ]

If I use the XML parser to validate an XML document against XML Schema
metadata, then the data exposed by SAX or DOM is the PSVI, the Post
Schema Validation Infoset. That says that the data in the XML document
is valid according to the metadata. In the PSVI, type has already been
associated with the data; all data in the document is valid against the
metadata.

Ahum. The SAX and DOM implementations I use do not expose any PSVI contributions from XML Schema. As far as I can tell, only one API does so far (Xerces XNI, see http://xml.apache.org/xerces2-j/javadocs/xni/org/apache/xerces/xni/psvi/package-summary.html ), unless you count the XML serialization of the PSVI, which might be a bit awkward to use.


When programming with XML, we can define programmatic types based on XML
Schema metadata. Then when we get data from the PSVI, we know that it is
valid against the metadata and we can use that data to construct an
object of that type, and do so without error. My data and all parts in
it conform to the requirements of the type.

Well, strictly speaking, this only applies if every element in the PSVI has a validity property that is equal to "valid" (if I've got this straight)...
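
To make that caveat concrete, here is a minimal sketch of the check a consumer would have to run before trusting the data. It assumes a made-up in-memory model: the ElementInfo class and the fully_valid function are hypothetical names, not part of any real PSVI API. The only thing taken from XML Schema is the set of validity values ("valid", "invalid", "notKnown").

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ElementInfo:
    """Hypothetical stand-in for an element information item in a PSVI."""
    name: str
    validity: str  # one of "valid", "invalid", "notKnown"
    children: List["ElementInfo"] = field(default_factory=list)


def fully_valid(element: ElementInfo) -> bool:
    """Return True only if this element and all its descendants are "valid"."""
    if element.validity != "valid":
        return False
    return all(fully_valid(child) for child in element.children)


# A tree where one descendant was never assessed against the schema.
order = ElementInfo("order", "valid", [
    ElementInfo("quantity", "valid"),
    ElementInfo("note", "notKnown"),
])

print(fully_valid(order))              # the "note" child spoils the guarantee
print(fully_valid(order.children[0]))  # the "quantity" subtree alone is fine
```

Under this assumed model, building a typed object "without error" is only safe when the check passes for the whole subtree being converted.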


Let's imagine that we want to write an XSLT processor. An XSLT processor
does not work with documents: it uses an XML parser to parse the
documents and present them through an API like DOM or SAX. The XSLT
processor only sees infosets. If we use a validating XML parser and the
documents have metadata in XML Schema, all the data that the XSLT
processor sees is PSVI. So it makes sense that the specifications should
be written in terms of the data provided to the application.

Again, this means that we must use a parser that somehow exposes the PSVI. But what if we don't? And what if there is no "parser"? I frequently produce SAX events myself for the benefit of the XSLT processor; so in your model I would now have to produce a full PSVI instead, according to some as yet unknown API. I do not fancy the thought of this much...
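
For illustration, this is roughly what producing SAX events by hand looks like, sketched with Python's standard xml.sax module. The Python standard library has no XSLT processor, so XMLGenerator stands in as the consumer here; the emit_order function and the order/quantity vocabulary are invented for the example.

```python
import io
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl


def emit_order(handler, order_id, quantity):
    """Fire SAX events for a tiny document; no parser is involved."""
    handler.startDocument()
    handler.startElement("order", AttributesImpl({"id": order_id}))
    handler.startElement("quantity", AttributesImpl({}))
    # Character data is passed as a plain string. The callback has no
    # parameter for a schema type or a validity outcome.
    handler.characters(str(quantity))
    handler.endElement("quantity")
    handler.endElement("order")
    handler.endDocument()


# XMLGenerator is a ContentHandler that serializes the events it receives.
buffer = io.StringIO()
emit_order(XMLGenerator(buffer, encoding="utf-8"), "A17", 3)
serialized = buffer.getvalue()
print(serialized)
```

The events carry only names, attribute strings and character data. Any PSVI contribution would have to travel through some additional channel that these callbacks do not provide.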


Returning to your observation, "the dependency on the complexities of
XML Schema gives me precious little benefit, compared with the
headaches...", I am trying to make the case that the benefit of the
specification making use of PSVI is that XSLT implementers can program
to that requirement and thus produce XSLT processors that are
interchangeable. To specify otherwise would let XSLT processor behavior
diverge, which would spell chaos. Count that as a big benefit to you.

I don't get this. How does the requirement to use the PSVI make XSLT processors interchangeable? I would, in fact, argue the opposite: the more complex the specification, the more room for errors and misinterpretations. Even with the simple XSLT 1.0 model, there have been a fair number of differences among XSLT processors. I don't see how these kinds of differences will disappear by demanding compliance with XML Schema.


XSLT 1.0 and XPath 1.0 became W3C Recommendations in November 1999. XML
Schema became a Recommendation in May 2001. That explains why XSLT 1.0
makes no reference to Schemas or the PSVI. But with XML Schema now in
place as a cornerstone of XML technology, it is important to make the
XSLT 2.0 and XPath 2.0 specifications consistent with the XML Schema.

Here it is: I don't agree that XML Schema is a cornerstone of XML. XML is quite useful as it is. RELAX NG is a good schema language. Schematron provides a good way of validating XML that is far more powerful than W3C XML Schema. In other words: XML worked fine for me both with and without XML Schema - until XPath 2 came along.


Does that make XSLT harder to describe? Not really, unless you plan to
use the XSLT specifications as a textbook. If you think that XSLT should
not be hard to describe, why not write about it yourself? The next great
book on XSLT is waiting to be written.

This is, of course, a tempting offer - but no thanks :-)


I don't think that XSLT2/XPath2 *can* be described in a simple way. I happen to think that the WG has made an effort to make the specification as readable as possible. XSLT 2 *is* complex in and of itself.

/dan


XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list


