Re: [xsl] schemas and xslt 2.0 (was something else)

From: Jeni Tennison <jeni@xxxxxxxxxxxxxxxx>
Date: Mon, 13 Sep 2004 16:44:17 +0100

Hi Bruce,

> On Sep 11, 2004, at 2:21 PM, Michael Kay wrote:
>> Saxon-SA implements the "schema-aware" facilities of the XSLT 2.0
>> working draft. This means you can:
>>
>> - validate your input documents against a schema
>
> How tied to XML Schema is this?
>
> While I generate XML Schemas, I author in RELAX NG, and often rely
> on features unsupported elsewhere (such as attribute-based
> validation).

XPath and XSLT 2.0 are pretty tightly tied to XML Schema,
unfortunately: the wording of several sections of XPath 2.0 and XSLT
2.0 rules out using any other schema language. The reasons are
largely non-technical: the XSL and XQuery WGs felt (wrongly, in my
opinion) that W3C technologies should use other W3C technologies
rather than technologies that originated elsewhere.

There are also technical reasons why supporting RELAX NG isn't
straightforward:

First, unlike XML Schema, the annotation/augmentation of documents was
a non-requirement for RELAX NG, which was instead designed primarily
for validation. (Even things like IDs and defaulted attributes are
only supported via extensions to RELAX NG.) XML Schema has particular
rules that guarantee that a validator can assign a single type to a
particular element without looking ahead to see the attributes/content
of that element; RELAX NG doesn't have those rules (which is why you
can use it for attribute-based validation), and therefore it's harder
for processors to assign types to elements and attributes.
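
To make the attribute-based case concrete, here is a minimal sketch of
the kind of RELAX NG pattern I mean. The element and attribute names
are invented for illustration. The content allowed inside <entry>
depends on the value of its type attribute, so a processor can't settle
on a single "type" for the element from its name alone:

```xml
<element name="entry" xmlns="http://relaxng.org/ns/structure/1.0">
  <choice>
    <!-- type="book": the entry must contain an isbn element -->
    <group>
      <attribute name="type"><value>book</value></attribute>
      <element name="isbn"><text/></element>
    </group>
    <!-- type="article": the entry must contain a journal element -->
    <group>
      <attribute name="type"><value>article</value></attribute>
      <element name="journal"><text/></element>
    </group>
  </choice>
</element>
```

XML Schema can't express this kind of co-occurrence constraint
directly; the usual workaround is something like xsi:type in the
instance document.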

An XPath 2.0 processor that used RELAX NG to create an XPath data
model instance would probably have to restrict the kinds of grammars
that it supported in order to make its life easier. (RelaxNGCC faces a
similar problem and solves it by only supporting unambiguous
grammars.)

Second, the semantics of RELAX NG are somewhat different from those in
XML Schema. Whereas XML Schema has the concepts of "types", split into
"complex types" and "simple types", RELAX NG has patterns against
which particular (sub)trees are matched. RELAX NG doesn't have any
equivalent for type hierarchies or substitution group hierarchies;
without them, a lot of the power of using schemas with XSLT 2.0
disappears.
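
As a rough illustration of what the type hierarchy buys you, here is a
sketch of a schema-aware XSLT 2.0 stylesheet. The namespace, the schema
location and the ex:AddressType type are all hypothetical, and it needs
a schema-aware processor such as Saxon-SA:

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:ex="http://example.com/ns">

  <!-- Hypothetical schema declaring ex:AddressType and types derived from it -->
  <xsl:import-schema namespace="http://example.com/ns"
                     schema-location="address.xsd"/>

  <!-- Matches any element annotated with ex:AddressType, or with a
       type derived from it, whatever the element is called -->
  <xsl:template match="element(*, ex:AddressType)">
    <address-found name="{name()}"/>
  </xsl:template>

</xsl:stylesheet>
```

With only RELAX NG patterns there is no derivation relationship to
test, so a rule like this would have to list every element name
explicitly.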

I think it would have been possible for XPath 2.0 and XSLT 2.0 to be
more general, had there been the will. It wouldn't have taken much to
change the emphasis and tweak a few parts of the data model and the
XPath and XSLT 2.0 languages to open the door to other schema
languages providing the same set of information as you get from XML
Schema (with RELAX NG only providing a small subset). It would have
been more work (but possible, I think) to try to come up with an
abstraction for XML documents that incorporated type information
without being tied to a particular schema language.

Anyway, it's all too late now. Your best bet is to use Trang to
transform the RELAX NG schemas into XML Schema schemas. In fact, Trang
does an amazing job of creating type and substitution group
hierarchies from RELAX NG schemas, so you will probably get more by
using the XML Schema schemas that Trang generates than you would if
XPath/XSLT 2.0 specified how RELAX NG schemas should be used.

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/
