Subject: Re: [xsl] is there a way to hash an element?
From: "Dimitre Novatchev dnovatchev@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 10 Jun 2016 14:27:22 -0000

I would also try the standard XPath 2.0 function:

    deep-equal()

Yes, it doesn't generate an id for identifying a sub-tree, but it can
still be used for establishing equivalence classes.

Just a quick thought.

Cheers,
Dimitre

On Thu, Jun 9, 2016 at 3:51 PM, Dimitre Novatchev dnovatchev@xxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> You may not even need a hash function.
>
> Just use the standard XPath 3.0 function:
>
>     serialize()
>
> http://www.w3.org/TR/xpath-functions-30/#func-serialize
>
> Cheers,
> Dimitre
>
> On Thu, Jun 9, 2016 at 3:08 PM, Graydon graydon@xxxxxxxxx
> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>> Hello all --
>>
>> So I've got about half a gigabyte of XML messages describing various
>> health care actions. Many of these are structural duplicates of each
>> other; the top elements differ by their attribute values, but the
>> structure and values of the descendant elements are the same. The
>> amount of duplication varies from none to thousands.
>>
>> I've got an apparently useful heuristic based on descendant attribute
>> values, but would -- it is health care data -- really like to have a
>> more robust way to group the elements into sets of equivalent
>> top-level names by their structural sameness. (I can't hand-check the
>> whole data set.)
>>
>> So I find myself wanting an equivalent of sha256sum for elements so I
>> could generate a grouping key from the descendant elements and their
>> associated attributes as a unit.
>>
>> Is there such a thing? Equivalent approaches?
>>
>> Thanks!
>> Graydon

--
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
Never fight an inanimate object
-------------------------------------
To avoid situations in which you might make mistakes may be the
biggest mistake of all
------------------------------------
Quality means doing it right when no one is looking.
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play
-------------------------------------
To achieve the impossible dream, try going to sleep.
-------------------------------------
Facts do not cease to exist because they are ignored.
-------------------------------------
Typing monkeys will write all Shakespeare's works in 200 yrs. Will they
write all patents, too? :)
-------------------------------------
Sanity is madness put to good use.
-------------------------------------
I finally figured out the only reason to be alive is to enjoy it.
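
To make the serialize() suggestion from the quoted message a bit more concrete, here is a minimal sketch of how it might serve as a grouping key. It assumes an XSLT 3.0 processor and an input whose document element wraps the messages as its child elements; the output names (groups, group, size, names) are made up for illustration and do not come from the thread.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Assumption: the document element wraps all messages as child elements. -->
  <xsl:template match="/*">
    <groups>
      <!-- Grouping key: the serialized child elements of each message.
           The message's own name and attributes are not part of the key. -->
      <xsl:for-each-group select="*" group-by="serialize(*)">
        <!-- size: members in this group; names: distinct top-level names. -->
        <group size="{count(current-group())}"
               names="{distinct-values(current-group()/name())}">
          <!-- Keep one representative message per group. -->
          <xsl:copy-of select="current-group()[1]"/>
        </group>
      </xsl:for-each-group>
    </groups>
  </xsl:template>

</xsl:stylesheet>
```

Because serialize() compares lexical form, two messages that differ only in attribute order or namespace prefixes may end up in separate groups. deep-equal() ignores attribute order, so a pairwise deep-equal() comparison of the group representatives could serve as a cross-check, at the cost of not yielding a key.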