Re: [xsl] Verifying large XSL transform output

Subject: Re: [xsl] Verifying large XSL transform output
From: Graydon <graydon@xxxxxxxxx>
Date: Tue, 11 Feb 2014 11:15:37 -0500
On Tue, Feb 11, 2014 at 10:36:29AM -0500, Matthew Stoeffler scripsit:
> I have a question about verifying XSL transform output.  
[snip]
> The transform scripts are large.  I know my results are valid in the
> new format; I'm now trying to confirm that I'm capturing all the
> content.  I've done analysis of ID's from source to output.  I have
> contemplated ways of counting text nodes, or text string length, as
> another possible approach.

Are you trying to tell if the transformation worked to the specification
(as distinct from producing valid output!), or if you didn't lose any
text content?

If the former, and you've got any cases where elements become attributes
or vice-versa (or PIs become attributes, etc.) it gets pretty difficult
to check the results in an automated way.  I've found it can be
worthwhile to modify the transform (or to add transform steps) to add
IDs to all the source document elements, to maintain those IDs through the
transform, and to then check the result documents for those IDs, which
you then have to scrub out of the final delivered content.  No help for
attributes but it at least lets you know where your paragraphs went.
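
As a rough illustration of the ID-tracking idea, here is a minimal Python
sketch using only the standard library.  It is not the transform itself:
it assumes the XSLT copies a tracking attribute through to the result,
and the attribute name "trace-id" and the function names are hypothetical,
chosen only for this example.

```python
import xml.etree.ElementTree as ET

TRACE_ATTR = "trace-id"  # hypothetical attribute name, not from the original post


def stamp_ids(root):
    """Add a sequential tracking id to every element; return the set of ids."""
    ids = set()
    for n, elem in enumerate(root.iter(), start=1):
        tid = f"t{n}"
        elem.set(TRACE_ATTR, tid)
        ids.add(tid)
    return ids


def missing_ids(expected, result_root):
    """Return the tracking ids that never show up in the result document."""
    found = {
        elem.get(TRACE_ATTR)
        for elem in result_root.iter()
        if elem.get(TRACE_ATTR) is not None
    }
    return sorted(expected - found)


def scrub_ids(root):
    """Remove the tracking attribute before the content is delivered."""
    for elem in root.iter():
        elem.attrib.pop(TRACE_ATTR, None)
```

The intended order would be: stamp the source, run the transform on the
stamped source, call missing_ids() on the result, then scrub.  As noted
above, this says nothing about attributes, only about elements.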

Comparing all the text nodes seems to work better from the leaves up.
As soon as you've got anything that isn't a direct, one-to-one mapping
from source elements to result elements, any comparison of text nodes
can become complicated.
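
For the simple case only, a crude text-loss check might look like the
sketch below.  It assumes that whitespace differences do not matter and
that all text stays in text nodes; it compares character counts rather
than node-by-node content, so it tolerates reordering but will miss text
that was swapped for different text with the same characters.  The
function names are made up for this example.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def char_counts(xml_text):
    """Count every non-whitespace character in the document's text nodes."""
    root = ET.fromstring(xml_text)
    text = "".join(root.itertext())
    return Counter("".join(text.split()))


def lost_chars(source_xml, result_xml):
    """Characters present in the source text but absent from the result."""
    return char_counts(source_xml) - char_counts(result_xml)
```

An empty Counter from lost_chars() suggests nothing was dropped; a
non-empty one only says that something is missing, not where.  Text that
the transform moves into attributes is reported as lost.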

But in general this sort of thing is much harder than it sounds.  It can
be easier to start producing a list of assertions that you check by some
non-XSL means (Schematron or XQuery), and add to that list as problematic
cases are discovered.
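
The shape of such an assertion list can be sketched in a few lines.  This
uses Python as a stand-in for Schematron or XQuery, purely to show the
idea of a growing list of named checks; the element names (para, p,
title, h1) and the assertions themselves are hypothetical.

```python
import xml.etree.ElementTree as ET


def count(root, path):
    """Number of descendant elements matching an ElementTree path."""
    return len(root.findall(path))


# Each entry is (description, check); append new entries as problem
# cases turn up.  Element names here are illustrative only.
ASSERTIONS = [
    ("every source para becomes a result p",
     lambda src, res: count(src, ".//para") == count(res, ".//p")),
    ("every source title becomes a result h1",
     lambda src, res: count(src, ".//title") == count(res, ".//h1")),
]


def failed_assertions(source_xml, result_xml, assertions=ASSERTIONS):
    """Return the descriptions of all assertions that do not hold."""
    src = ET.fromstring(source_xml)
    res = ET.fromstring(result_xml)
    return [desc for desc, check in assertions if not check(src, res)]
```

Count-based checks like these are weak on their own, but each one is
cheap to add and documents a case that has gone wrong before.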

-- Graydon
