Re: [xsl] is there a way to hash an element?

Subject: Re: [xsl] is there a way to hash an element?
From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 13 Jun 2016 10:12:05 -0000
> The matching rules are that A and B are considered to be the same if and
> only if they have the same descendant elements in the same document
> order with each element in A having associated with it the same
> attributes and attribute values as the corresponding element in B.
> (There aren't any descendant text nodes, comments, or processing
> instructions.)
>

There are a few gaps in that spec, e.g. it doesn't mention element names, but
I think you could do something like this:

<xsl:function name="f:hash-of-string" as="xs:integer">
  <xsl:param name="in" as="xs:string"/>
  <!-- weight each codepoint by its position so that character order matters -->
  <xsl:sequence select="sum(for $i in 1 to string-length($in) return
    ($i * string-to-codepoints(substring($in, $i, 1))))"/>
</xsl:function>

<xsl:function name="f:hash-of-element" as="xs:integer">
  <xsl:param name="in" as="element()"/>
  <xsl:sequence select="f:hash-of-attributes($in/@*) +
    f:hash-of-children($in/*) + f:hash-of-string(local-name($in))"/>
</xsl:function>

<xsl:function name="f:hash-of-attributes" as="xs:integer">
  <xsl:param name="in" as="attribute()*"/>
  <!-- a plain sum, so the result does not depend on attribute order -->
  <xsl:sequence select="sum(for $a in $in return
    f:hash-of-string(local-name($a)) * f:hash-of-string(string($a)))"/>
</xsl:function>

<xsl:function name="f:hash-of-children" as="xs:integer">
  <xsl:param name="in" as="element()*"/>
  <!-- weight each child by its position so that document order matters -->
  <xsl:sequence select="sum(for $i in 1 to count($in) return
    f:hash-of-element($in[$i]) * $i)"/>
</xsl:function>

<xsl:function name="f:top-level-hash" as="xs:integer">
  <xsl:param name="in" as="element()"/>
  <!-- only the descendants are hashed, per the matching rules quoted above -->
  <xsl:sequence select="f:hash-of-children($in/*)"/>
</xsl:function>
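
To make the position weighting concrete, here is a small sketch that evaluates
f:hash-of-string on a few literal strings. The template name hash-demo is made
up for illustration, and the fragment assumes it sits in the same XSLT 2.0
stylesheet as the functions above, with the f and xs prefixes declared.

```xslt
<!-- Hypothetical demo template; not part of the original suggestion. -->
<xsl:template name="hash-demo">
  <!-- 'ab': 1*97 + 2*98 = 293 -->
  <xsl:message select="f:hash-of-string('ab')"/>
  <!-- 'ba': 1*98 + 2*97 = 292, so swapping characters changes the hash -->
  <xsl:message select="f:hash-of-string('ba')"/>
  <!-- 'ad': 1*97 + 2*100 = 297 -->
  <xsl:message select="f:hash-of-string('ad')"/>
  <!-- 'cc': 1*99 + 2*99 = 297, a collision with 'ad' -->
  <xsl:message select="f:hash-of-string('cc')"/>
</xsl:template>
```

The last two strings differ but hash to the same value, which is why the hash
can only narrow down the candidates.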

Of course, it's not guaranteed that if two elements have the same hash value,
then they are "the same". So after grouping by hash value, you'll need to do
an n^2 comparison within each group, using deep-equal() (or a custom
replacement), to eliminate false friends.
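
A rough sketch of that second pass, assuming XSLT 2.0 and the same stylesheet
as the functions above. The names f:same-content, group-by-content and the
<same> wrapper element are invented for illustration, and deep-equal() on the
child elements is only one possible reading of the matching rules (it compares
full element names, where the hash looks at local names only).

```xslt
<!-- Hypothetical helper: two candidates match if their child elements are
     pairwise deep-equal. The candidates' own names and attributes are
     ignored, mirroring f:top-level-hash. -->
<xsl:function name="f:same-content" as="xs:boolean">
  <xsl:param name="a" as="element()"/>
  <xsl:param name="b" as="element()"/>
  <xsl:sequence select="deep-equal($a/*, $b/*)"/>
</xsl:function>

<!-- Hypothetical template: bucket the candidates by hash, then split each
     bucket into sets of genuinely equal elements. -->
<xsl:template name="group-by-content">
  <xsl:param name="candidates" as="element()*"/>
  <xsl:for-each-group select="$candidates" group-by="f:top-level-hash(.)">
    <xsl:variable name="bucket" select="current-group()"/>
    <xsl:variable name="hash" select="current-grouping-key()"/>
    <xsl:for-each select="$bucket">
      <xsl:variable name="pos" select="position()"/>
      <xsl:variable name="me" select="."/>
      <!-- start a new set only if no earlier bucket member equals this one -->
      <xsl:if test="not(some $e in $bucket[position() lt $pos]
                        satisfies f:same-content($e, $me))">
        <same hash="{$hash}">
          <!-- this element plus every later bucket member equal to it -->
          <xsl:sequence
            select="$bucket[position() ge $pos][f:same-content(., $me)]"/>
        </same>
      </xsl:if>
    </xsl:for-each>
  </xsl:for-each-group>
</xsl:template>
```

The quadratic cost is paid per bucket, not over the whole input, so it stays
small as long as the hash spreads the candidates reasonably well.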



Michael Kay
Saxonica
