Subject: Re: [xsl] comparing XML document structure
From: Graydon <graydon@xxxxxxxxx>
Date: Wed, 17 Aug 2011 21:35:06 -0400

On Thu, Aug 18, 2011 at 12:44:02AM +0100, Tony Graham scripsit:
> On Wed, August 17, 2011 11:48 pm, Wendell Piez wrote:
> > It sounds like you want to infer content models on the fly and then
> > validate against them. I can imagine approaches to this, but I doubt
> > that I'd trust many algorithms that actually attempted it -- not because
> > of XSLT, but because of the problem specifying the problem.
> ...
> > But why not use a schema? There are processors such as Trang that can
> > infer schemas from documents.
>
> What Wendell said.

Using trang to generate a schema from the DTD in question has
historically tended to fail. (Not a whole lot, but some; generally
usable for creating a schema to get saxon to validate the output, but
not usable on the fly for structure.)

So I've got a relatively fixed content model, in the form of a
comprehensive DTD and a much less comprehensive example of how to use
that DTD for a particular content type.

Initially, what I want to do is eat the exemplar, use it to generate a
parent-child list -- so I'd have section/num, section/para, and
section/subsection -- and then take an output file and get the same
list from it, then compare the lists and produce a message for
mismatches.

So if a particular output file had section/num, section/subsection, and
section/list in it, for example, there should be an exception noted for
the presence of the list. (Valid, but not expected.)

> ...
> > On 8/17/2011 5:57 PM, Graydon wrote:
> ...
> >> The desired goal is to be able to programmatically pull the structure,
> >> at least to the extent of parent-child element pairs, from the
> >> semantics-defining file, and compare that to each output file in turn.
> >>
> >> So if the semantics-defining file gives an example section element,
> >> which has num, para, and subsection element children, what I want to be
> >> able to do is create a sequence of axis relationships and test the
> >> section elements of the output for axis relationships that are not
> >> members of that sequence.
>
> It would help the rest of us wrap our heads around the problem if you
> could provide a sample fragment of the "semantics-defining file" so we can
> see what you are dealing with.

It would, but the whole NDA thing rears its ugly head.

It's just a document, to the same DTD as the output. Instead of having
actual content in it, it has things like <para>This para is optional;
if present, it should contain introductory text</para> in it.

> You may be able to create the tests you want in Schematron, but it's a bit
> hard to tell without having an example to look at. (If you can generate
> Schematron from your definitions, you could directly create XSLT for the
> axis tests about as easily, but the advantage could be that there are
> tools such as XML IDEs that already understand the Schematron report
> format.)

Schematron is certainly something to look at, yes.

Thanks!
Graydon
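
For illustration only, the parent-child comparison described in this message could be sketched roughly as below. This is not code from the thread: it uses Python rather than XSLT simply to show the logic, and the function names (parent_child_pairs, unexpected_pairs) and the two sample documents are hypothetical stand-ins, since the real exemplar is not available. It assumes element names without namespaces.

```python
import xml.etree.ElementTree as ET


def parent_child_pairs(xml_text):
    """Return the set of 'parent/child' element-name pairs in a document."""
    root = ET.fromstring(xml_text)
    pairs = set()
    for parent in root.iter():      # every element, including the root
        for child in parent:        # its direct element children only
            pairs.add(f"{parent.tag}/{child.tag}")
    return pairs


def unexpected_pairs(exemplar_xml, output_xml):
    """Pairs present in the output but absent from the exemplar, sorted."""
    return sorted(parent_child_pairs(output_xml) - parent_child_pairs(exemplar_xml))


# Hypothetical stand-ins for the exemplar and one output file.
EXEMPLAR = """<section>
  <num>1</num>
  <para>This para is optional.</para>
  <subsection><para>Nested text.</para></subsection>
</section>"""

OUTPUT = """<section>
  <num>1</num>
  <subsection><para>Nested text.</para></subsection>
  <list><item>One</item></list>
</section>"""

# Report each pair the exemplar never demonstrated.
for pair in unexpected_pairs(EXEMPLAR, OUTPUT):
    print(f"Valid, but not expected: {pair}")
```

With these sample inputs the report covers section/list and also list/item, because every pair under an unexpected element is itself unexpected. The set difference runs in one direction only: pairs shown in the exemplar but missing from the output (section/para here) are not flagged, which matches the "optional" wording used in the exemplar.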