Re: [xsl] XSLT vs Schematron Decision: Sanity Check

Subject: Re: [xsl] XSLT vs Schematron Decision: Sanity Check
From: Eliot Kimber <ekimber@xxxxxxxxxxxx>
Date: Wed, 12 Oct 2011 13:35:57 -0500
I would tend to lean toward Schematron on the principle of separation of
concerns, where the ownership of the rules for the data validation is likely
different from the implementation of the transformation rules.

Schematron allows somebody who is XML-savvy but not an XSLT engineer to
define complex validation rules.
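
As a rough illustration of how little XSLT knowledge a rule needs, here is a
minimal sketch of an ISO Schematron schema. The element and attribute names
(transaction, amount, currency) are invented for the example and are not
taken from GENERIC; only the sch: vocabulary and the XPath are real.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical rules: transaction, amount and @currency are made-up names. -->
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            queryBinding="xslt2">
  <sch:pattern id="transaction-rules">
    <!-- The rule fires once for every transaction element in the input. -->
    <sch:rule context="transaction">
      <!-- An assert reports its message when the test evaluates to false.
           A missing or non-numeric amount yields NaN, so the test fails. -->
      <sch:assert test="number(amount) gt 0">
        A transaction must have a positive amount.
      </sch:assert>
      <sch:assert test="amount/@currency">
        The amount of a transaction must state its currency.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

The rule author only has to write the context, the XPath test, and the
message, and the message text is what a documentation stylesheet can later
pick up.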

Since the main Schematron implementation works by generating an XSLT, you
should be able to make it work in a .NET environment (especially if you can
use the .NET version of Saxon so you have XPath and XSLT 2 available).
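
To give a feel for what "generating an XSLT" amounts to, the following is a
hand-written, much simplified sketch of the kind of stylesheet a single
validation rule corresponds to. It is not the actual output of the Schematron
implementation, which is more elaborate (it typically reports in SVRL), and
the names transaction and amount are invented for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Simplified, hand-written stand-in for a compiled validation rule. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Plays the role of a rule whose context is "transaction". -->
  <xsl:template match="transaction">
    <!-- Plays the role of an assertion: emit the message when the test
         is false. A missing or non-numeric amount yields NaN and fails. -->
    <xsl:if test="not(number(amount) gt 0)">
      <xsl:text>A transaction must have a positive amount.&#10;</xsl:text>
    </xsl:if>
    <!-- Keep walking the tree so nested transactions are checked too. -->
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Suppress ordinary text so that only messages reach the output. -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

Running this against an input document with any XSLT 2.0 processor prints one
line per failing transaction and nothing for a valid document. Writing rules
this way by hand is possible, but the validation intent is buried in template
plumbing, which is the separation-of-concerns argument for keeping the rules
in Schematron.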



On 10/12/11 12:25 PM, "Norm Birkett" <Norm.Birkett@xxxxxxxxx> wrote:

> I'm a bit of an XML neophyte who's jumped/fallen into the deep end of
> the pool, and so I'd like to subject a piece of my thinking to the
> informed criticism a group like this can provide--a chance to learn from
> other people's experiences rather than learning from my own mistakes.
> I'll try to distill the problem/question down to its relevant essence.
> Here goes:
> The crux: I'm trying to build a validating-and-transforming XML filter
> in such a way as to yield human-readable documentation of the input XML
> language.
> The context:
> (1) The goal here is to replace a nasty sprawl of legacy code. I'll call
> the replacement "NAI" (for "new and improved").
> (2) NAI will receive XML documents, produced by various people, systems,
> and organizations, representing a variety of more or less complex
> financial transactions in pretty gruesome detail.
> (3) The input documents are written in an XML language referred to
> locally as GENERIC (for reasons lost to history).
> (4) This GENERIC language is poorly documented.
> (5) It would be very useful if GENERIC were well documented (because it
> is constantly growing, and we are constantly adding more people and
> organizations who want to feed documents written in GENERIC to NAI).
> (6) The first thing NAI must do when it gets an input document is to
> validate it.
> (7) If the input document is valid, then the next thing NAI does is to
> convert it to a different XML language, which I will call INTERNAL. (It
> is the internal data representation of the big system into which NAI
> serves as a gateway.)
> (8) The INTERNAL language is even more poorly documented--but that is a
> subject for another day.
> The proposed design of NAI:
> (Step 1) "Loosely" validate the input document using a schema written in
> RELAX NG's compact language ("RNC").
> (Step 2) If document passes Step 1, "tightly" validate the input
> document using a schema written in Schematron.
> (Step 3) If document passes Step 2, transform the input document into
> INTERNAL.
> The proposed process to produce the human-readable documentation of the
> input language:
> (A) Translate the RNC used in Step 1 above into RELAX NG's XML language
> ("RNG").
> (B) Use XSLT to translate the RNG into a simple HTML depiction of the
> elements and structure of the GENERIC language.
> (C) Use XSLT to augment that simple HTML depiction with the tight
> validation rules (represented in Schematron--see Step 2 above) that
> further define the GENERIC language.
> (D) Use XSLT to further augment that increasingly less simple HTML
> depiction with links into the HTML depiction of the INTERNAL language.
> An important underlying assumption:
> (A1) It is an unyielding law of nature that documentation cannot
> accurately describe program code unless it is itself a part of or
> derived from that code. (And in "that code" I do not include comments,
> though I am grudgingly willing to include error messages.)
> My question:
> ===========
> In Step 2 of the proposed design of NAI, I find myself asking myself,
> "Look--you already have to acquire a lot of XSLT expertise to pull off
> the rest of this stunt. And XSLT can be used to represent the same sorts
> of validations as Schematron can represent. Why on earth do you want to
> introduce yet another language/technology into this mix? Just write your
> tight validation rules in XSLT, and have one less thing to learn and
> worry about."
> To which myself replies (you see why I need a sanity check here):
> "Schematron is tidy and small, which will make (C) [see above] much,
> much simpler. It also means that learning Schematron isn't such a big
> cost. Plus it was designed for validation, whereas XSLT is for
> transformations. Use tools for what they're designed for--validators for
> validation, transformers for transformations."
> I'm leaning pretty strongly toward the-myself-that-favors-schematron's
> view of the matter, for the reasons just articulated.
> But I can't be said to have mastered Schematron OR XSLT yet, so I'm a
> bit skeptical of my ability to compare their capabilities.
> It should be added that I'm working in a .NET environment, where the XSL
> tools seem to be a bit more numerous than the Schematron tools.
> So: What say you, XSLers? Does Step 2 sound to you like a job for
> Schematron or for XSLT? Does the .NET environment affect that decision?
> Are there any pitfalls you would urge me to watch out for? Any thoughts
> you want to share will be appreciated.
> Norm Birkett
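
On step (C) in the quoted message: because a Schematron schema is itself XML,
pulling the rules into HTML is an ordinary transformation. The sketch below
assumes an ISO Schematron schema as input and handles only plain assert
elements inside concrete rules; report elements, abstract patterns, and inline
markup inside messages are ignored here and would need their own templates.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch: render the assertions of a Schematron schema as an HTML list. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:sch="http://purl.oclc.org/dsdl/schematron"
    exclude-result-prefixes="sch">
  <xsl:output method="html" indent="yes"/>

  <xsl:template match="/sch:schema">
    <html>
      <head><title>Validation rules</title></head>
      <body>
        <xsl:apply-templates select="sch:pattern/sch:rule"/>
      </body>
    </html>
  </xsl:template>

  <!-- One section per rule, headed by the context it applies to. -->
  <xsl:template match="sch:rule">
    <h2>Rules for <code><xsl:value-of select="@context"/></code></h2>
    <ul>
      <xsl:apply-templates select="sch:assert"/>
    </ul>
  </xsl:template>

  <!-- The assertion message doubles as the documentation text. -->
  <xsl:template match="sch:assert">
    <li>
      <xsl:value-of select="normalize-space(.)"/>
      <xsl:text> (test: </xsl:text>
      <code><xsl:value-of select="@test"/></code>
      <xsl:text>)</xsl:text>
    </li>
  </xsl:template>
</xsl:stylesheet>
```

Merging this output into the HTML derived from the RNG would be a further
step, for example by matching each rule's context against the element
descriptions.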

Eliot Kimber
Senior Solutions Architect
"Bringing Strategy, Content, and Technology Together"
Main: 512.554.9368
