Subject: Crazy idea
From: "Oren Ben-Kiki" <oren@xxxxxxxxxxxxx>
Date: Mon, 6 Sep 1999 11:56:16 +0200
Mulling over the namespace/XSchema issue, I came up with the following
notion. I wonder if it has ever been suggested, and what its chances are.

A DTD/XSchema, as defined today, performs two tasks. One is to validate a
document. The output of this is a single Boolean value. The other is to
"complete" the document - that is, add defaults.

The latter part seems like a transformation task to me, which raises the
question: why not specify this as an XSLT stylesheet? Once you consider
this, it becomes obvious that the first part can also be specified as a
stylesheet, given that one introduces some way for a stylesheet to indicate
an error in the input (say, <xsl:error>). Actually, that's a good idea by
itself. I already simulate it in my stylesheets by emitting an obtrusive
error boilerplate, but that's a kludge.
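
To make the "validation and completion as one transformation" notion
concrete, here is a minimal sketch. It is written in Python rather than XSLT,
only so that it runs without an XSLT processor. The document vocabulary
(<order>, <item>, the "currency" and "sku" attributes), the default value
"USD", and the names SchemaError and schema_transform are all hypothetical,
invented for illustration. The exception plays the role that the proposed
<xsl:error> would play in a stylesheet.

```python
import copy
import xml.etree.ElementTree as ET


class SchemaError(Exception):
    """Stand-in for the proposed <xsl:error>: aborts the transformation."""


def schema_transform(root):
    """Return a validated, completed copy of `root`, or raise SchemaError."""
    # Validation: the document element must be <order> (hypothetical rule).
    if root.tag != "order":
        raise SchemaError(f"expected <order>, found <{root.tag}>")

    result = copy.deepcopy(root)  # leave the input document untouched

    # Completion: add a default, much as a DTD attribute default would.
    if "currency" not in result.attrib:
        result.set("currency", "USD")

    # Validation: every child must be an <item> carrying a 'sku' attribute.
    for child in result:
        if child.tag != "item" or "sku" not in child.attrib:
            raise SchemaError("each child of <order> must be <item sku='...'>")

    return result


doc = ET.fromstring('<order><item sku="A1"/></order>')
completed = schema_transform(doc)
print(ET.tostring(completed, encoding="unicode"))
```

The point of the sketch is only that a single pass can both reject bad input
and fill in defaults. A real "schema" stylesheet would express the same rules
as templates.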

Objections:

- You need an XSLT processor instead of just an XSchema processor.

Given that an XML system is likely to have an XSLT processor anyway, this
might be viewed as removing a component instead of adding one.

- Doing full XSLT processing is more expensive than "just" XSchema
processing.

Is it? The XSchema spec is already quite complex, what with schemas
importing one another and so on. There's also the point that if we used
only XSLT, the effort that would have gone into writing XSchema
processors would go into optimizing XSLT instead.

- XSLT is "too strong". After all, it is Turing complete.

XSchemas are already stronger than DTDs, for a good reason, so this might be
an advantage. Otherwise, it might be useful to define a subset of XSLT to
use in "schema" stylesheets.

- People would abuse it.

That is, people would create "schema" sheets which would transform the
document to something completely different from what you started with. There
are several answers to this, ranging from "so what? anything good can be
abused" to "define a strict subset of XSLT which can't be abused". My
suggested compromise is to require that a "Schema" XSLT stylesheet should
only emit "fixpoint" documents - that is, applying it a second time must be
a no-op. This is trivially testable and satisfies most objections as to the
difference between "validation/completion" and "transformation".
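
The fixpoint test can be sketched as follows, again in Python as a stand-in
for running a stylesheet twice. Both transforms and all names here are
hypothetical. Note that this checks the property for one input document at a
time; it says nothing about inputs it has not been run on.

```python
import copy
import xml.etree.ElementTree as ET


def add_default_currency(root):
    """Hypothetical well-behaved 'schema' transform: only adds a default."""
    result = copy.deepcopy(root)
    if "currency" not in result.attrib:
        result.set("currency", "USD")
    return result


def wrap_in_envelope(root):
    """Hypothetical abusive transform: changes the document on every pass."""
    envelope = ET.Element("envelope")
    envelope.append(copy.deepcopy(root))
    return envelope


def canonical(root):
    """Serialize to canonical XML so the comparison ignores attribute order."""
    return ET.canonicalize(ET.tostring(root, encoding="unicode"))


def emits_fixpoint(transform, root):
    """True if applying `transform` a second time leaves its output unchanged."""
    once = transform(root)
    twice = transform(once)
    return canonical(once) == canonical(twice)


doc = ET.fromstring('<order><item sku="A1"/></order>')
print(emits_fixpoint(add_default_currency, doc))  # True
print(emits_fixpoint(wrap_in_envelope, doc))      # False
```

The first transform passes because its output already carries the default, so
a second pass has nothing left to add. The second fails because each pass
wraps the document in one more element.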

- The XSchema WG would be out of a job :-)

Someone should consider what modifications, if any, are required for using
XSLT in this capacity. We'd need <xsl:error> or something equivalent (a
construct which is useful in general). There might be a need to pick a
subset of XSLT. There may be other issues. The whole concept of using XSLT
for validation would justify a separate spec. So the Schema WG would
actually be quite busy formulating it.

The benefits are obvious:

- KISS.

Just one spec, instead of two; just one implementation, instead of two. In
fact, I'd consider this an overriding consideration. The burden of proof
should be on the other side to justify the need for a separate spec :-)

- Time to recommendation.

The recommendation could be released very soon after the XSLT one. That
should be much faster than the currently expected time for XSchema to
finalize (I think). This is possible because a lot of sticky issues in the
XSchema spec are resolved automatically: how to reuse XSchema "modules",
how to specify intelligent defaults, how to mix namespaces within a
document, and so on.

- Time to implementation.

We get a strong, flexible, powerful schema mechanism with robust
implementations, immediately. I don't doubt the ability of the XSchema WG to
come up with a good recommendation, but how long would it take until we see
it implemented properly?

- Market acceptance.

By tying schemas to XSLT, every company that uses one can automatically
make use of the other. This would increase the market penetration of
both recommendations. The schema recommendation stands to gain the most:
given that XSLT processors would be available in browsers and
servers as a matter of course, this would significantly lower the barrier
to performing validation, as compared to obtaining a separate schema
processor.

How about it?

    Oren Ben-Kiki


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

