Re: [xsl] How do you ensure that data is not altered/corrupted in a transformation?

Subject: Re: [xsl] How do you ensure that data is not altered/corrupted in a transformation?
From: "BR Chrisman brchrisman@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 19 May 2023 18:45:46 -0000
If you have <alt>12000 feet</alt> and the data may pass through third-party
transformations that are allowed to affect 'alt' itself (for example,
renaming it to 'altitude'), and you want to verify after the fact that no
text node in 'alt' was changed, maybe place hashes/checksums in attributes
in another namespace for final-step verification (presuming the additional
namespace and its attributes are passed through, which is a big
assumption).
For example: <alt check:text-node-hash="0ab4">....
If that namespace is evaluated (and probably stripped) at the end of the
transformation pipeline, you can verify that the text nodes haven't been
altered.
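
A minimal Python sketch of that idea, under stated assumptions: the
namespace URI "urn:example:check", the attribute name, the choice of SHA-256,
and the helper names stamp() and verify_and_strip() are all hypothetical and
only illustrate the stamp-then-verify flow. It hashes the concatenated text
content of each stamped element, so it also assumes the pipeline does not
normalize whitespace inside those elements.

```python
import hashlib
import xml.etree.ElementTree as ET

# Hypothetical namespace and attribute name, chosen for illustration only.
CHECK_NS = "urn:example:check"
HASH_ATTR = f"{{{CHECK_NS}}}text-node-hash"


def text_hash(elem):
    """SHA-256 of the element's concatenated text content."""
    text = "".join(elem.itertext())
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def stamp(root, names):
    """Before the pipeline: record a hash on every element whose tag is in names."""
    for elem in root.iter():
        if elem.tag in names:
            elem.set(HASH_ATTR, text_hash(elem))


def verify_and_strip(root):
    """After the pipeline: check each recorded hash, remove the attribute,
    and return the tags of elements whose text no longer matches."""
    failures = []
    for elem in root.iter():
        expected = elem.attrib.pop(HASH_ATTR, None)
        if expected is not None and expected != text_hash(elem):
            failures.append(elem.tag)
    return failures
```

Calling stamp(doc, {"alt"}) before the pipeline and verify_and_strip(doc)
after it should return an empty list as long as only the element name
changed (say, 'alt' to 'altitude') and the hash attribute survived. An
element that lost its attribute along the way is simply not checked, so a
real pipeline would also want to count the stamped elements.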
I've seen and done something similar with Exclusive Canonical XML, placing
signatures in attributes in other namespaces to validate subtrees (i.e.,
read the signature, remove the signature namespace and everything in it,
and then verify the signature).
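
The read, strip, verify sequence can be sketched roughly as follows. This is
not the setup described above: it swaps in C14N 2.0 (what Python's
xml.etree.ElementTree.canonicalize implements) for Exclusive Canonical XML,
and a keyed HMAC for a real XML signature. The namespace URI, attribute
name, and function names are hypothetical.

```python
import copy
import hashlib
import hmac
import xml.etree.ElementTree as ET

# Hypothetical signature namespace and attribute, for illustration only.
SIG_NS = "urn:example:sig"
SIG_ATTR = f"{{{SIG_NS}}}hmac"


def canonical_without_sig(elem):
    """Canonical bytes of the subtree with all signature-namespace attributes removed."""
    clone = copy.deepcopy(elem)
    clone.tail = None  # text following the element is not part of the subtree
    for node in clone.iter():
        for name in [a for a in node.attrib if a.startswith(f"{{{SIG_NS}}}")]:
            del node.attrib[name]
    xml_text = ET.tostring(clone, encoding="unicode")
    # C14N 2.0 (Python 3.8+); rewrite_prefixes makes the result prefix-independent.
    return ET.canonicalize(xml_text, rewrite_prefixes=True).encode("utf-8")


def sign_subtree(elem, key):
    """Store an HMAC-SHA256 of the canonical subtree in a signature attribute."""
    digest = hmac.new(key, canonical_without_sig(elem), hashlib.sha256)
    elem.set(SIG_ATTR, digest.hexdigest())


def verify_subtree(elem, key):
    """Read the signature, strip the signature namespace, and recompute."""
    claimed = elem.get(SIG_ATTR)
    if claimed is None:
        return False
    digest = hmac.new(key, canonical_without_sig(elem), hashlib.sha256)
    return hmac.compare_digest(claimed, digest.hexdigest())
```

Because the digest is taken over the canonical form, reserializing the
document or reordering attributes should not break verification, while a
changed text node or attribute value inside the subtree should.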
Not completely sure this helps your use case though.

On Fri, May 19, 2023 at 10:11 AM C. M. Sperberg-McQueen
cmsmcq@xxxxxxxxxxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

>
> "Roger L Costello costello@xxxxxxxxx" <
> xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> writes:
>
> > In certain domains loss of life may occur if data is altered/corrupted
> > in any way.
> >
> > ...
> >
> > I have heard of people doing a hash on the data prior to the
> > transformation, a hash on the data after the transformation, and then
> > comparing the hashes. Is that what you would do when lives are on the
> > line? What is your recommendation?
>
> When lives are on the line, I would be inclined to attempt a formal
> proof that the transformation has in fact preserved all the information
> it is critical to preserve, has dropped only information it was intended
> to drop (if any), and has added only information it was intended to add
> (if any).
>
> The general approach is described in
>
>   C. M. Sperberg-McQueen, "What constitutes successful format
>       conversion? Towards a formalization of 'intellectual content'."
>       International Journal of Digital Curation (IJDC) 6.1 (2011):
>       153-164.
>
>       http://www.ijdc.net/article/view/170/238
>       http://www.ijdc.net/article/download/170/238/0
>
> A concrete but simple example is described in
>
>   Sperberg-McQueen, C. M., Yves Marcoux and Claus Huitfeldt. "Document
>       lattices: Equivalence, compatibility, and contradiction in
>       document markup." Presented at Balisage: The Markup Conference
>       2014, Washington, DC, August 5 - 8, 2014. In Proceedings of
>       Balisage: The Markup Conference 2014. Balisage Series on Markup
>       Technologies, vol. 13
>       (2014).
>
>       https://doi.org/10.4242/BalisageVol13.Sperberg-McQueen01
>
>
>       https://balisage.net/Proceedings/vol13/html/Sperberg-McQueen01/BalisageVol13-Sperberg-McQueen01.html
>
> The papers mentioned assume that you want to be confident that a
> particular transformation of a particular dataset has not corrupted the
> information you care about.  If you want to be confident that a given
> transformation will never corrupt information you care about, the task
> will be more challenging, both because you will need to construct
> correctness proofs and because you will need to define what it means for
> a transformation to be correct, neither of which is terribly simple.
>
> In cases where lives are not on the line, it will often make sense to
> settle for a less stringent standard of confidence.  For example, as
> Michael Kay suggests, to invest a lot in testing.  And as others have
> already implicitly or explicitly suggested, a good deal of benefit --
> and possibly a good deal of effort -- will accrue already from the task
> of *identifying* exactly what information it is critical to preserve and
> what additions or omissions of information are expected or allowed.
>
> --
> C. M. Sperberg-McQueen
> Black Mesa Technologies LLC
> http://blackmesatech.com
