[xsl] Does the new structure include the same text content?

Subject: [xsl] Does the new structure include the same text content?
From: "ian.proudfoot@xxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 22 Jan 2021 11:28:43 -0000
Hi everyone,

I am working on a project to convert several thousand SGML files (S1000D
1.7) into a more recent XML version (S1000D 4.1). My finished XSLT
stylesheet does the job that is expected. However, during development I
ran into a problem where an error in the stylesheet omitted some content,
yet the output still passed schema validation! For me that's very bad
news, and I was lucky to notice it. Ultimately the final output will be
verified by the subject matter experts, but I really don't want to give them
any reason to doubt the reliability of the conversion.

 

This got me thinking about ways to verify the output text content against
the input despite the significantly different structure. Is there an
established way to do that? If so, what is it called and how well does it
work?

Perhaps it's something that I should build into the XSLT as it is written?
Or perhaps it could be run as a post-process batch comparison operation?

My initial thought is to output normalized text from both the input and the
output and compare the resulting text files.
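
To make that idea concrete, here is a minimal sketch in Python, not a tested
solution. It assumes the SGML input has already been turned into well-formed
XML by some other step, and it compares word counts rather than text files so
that reordered content does not raise false alarms. The function names and the
sample markup are made up for illustration, and text held in attributes is
ignored.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def word_counts(xml_text):
    """Count whitespace-separated words in all text nodes, ignoring markup."""
    root = ET.fromstring(xml_text)
    # Join text nodes with a space so words in adjacent elements stay apart.
    # Caveat: a word split by inline markup is counted as two tokens.
    text = " ".join(root.itertext())
    return Counter(text.split())


def missing_words(source_xml, result_xml):
    """Words that occur more often in the source than in the result."""
    # Counter subtraction keeps only positive counts, i.e. lost words.
    return word_counts(source_xml) - word_counts(result_xml)


# Hypothetical sample data: the converted result has dropped the note.
source = "<para>Remove the <b>four</b> bolts.<note>Wear gloves.</note></para>"
result = "<proceduralStep><para>Remove the four bolts.</para></proceduralStep>"

print(dict(missing_words(source, result)))  # {'Wear': 1, 'gloves.': 1}
```

An empty result means every source word appears at least as often in the
output. It does not prove the words are in the right place, so it would
complement the review by the subject matter experts rather than replace it.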

 

I've searched the archives, but I probably don't know the correct
terminology to get any useful results.

 

Thanks in advance for all responses.

Ian

Ian Proudfoot
Bembridge
Isle of Wight
United Kingdom
