Subject: Re: [xsl] XSLT2 node comparison, wordlists
From: "G. Ken Holman" <gkholman@xxxxxxxxxxxxxxxxxxxx>
Date: Wed, 24 Oct 2007 14:06:19 -0400

At 2007-10-24 17:56 +0100, James Cummings wrote:
> I'm sure this is easy to do in XSLT2 but I've just not got my head
> wrapped around how to compare things properly in an efficient manner.

Below is a working solution ... though when I logged on I found that Mike had already given you one! I ended up using a serialization of each element's content as a hash for grouping, and I sorted the attributes by name within the hash so that the serialization is consistent regardless of the order in which the attributes appear.


Ah well, I hope the code below is of some little help anyway.

. . . . . . . . . . . Ken


t:\ftemp>type cummings.xml
<entry xml:id="let22-w27">
<form>
<orth type="hw">the</orth>
<form type="orthVar">
<orth xml:id="w72">The</orth>
<orth xml:id="w3955">The</orth>
<orth xml:id="w4513">The</orth>
<orth xml:id="w4578">The</orth>
<orth xml:id="w4650">The</orth>
<orth xml:id="w4672">The</orth>
<orth xml:id="w4703">The</orth>
<orth xml:id="w4824">The</orth>
<orth xml:id="w4830">The</orth>
<orth xml:id="w2045">the</orth>
<orth xml:id="w2079">the</orth>
<orth xml:id="w2101">the</orth>
<orth xml:id="w2112">the</orth>
<orth xml:id="w2333">the</orth>
<orth xml:id="w2400">the</orth>
<orth xml:id="w2442">the</orth>
<orth xml:id="w1402">T<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w2422">T<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w6458">T<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w7822">T<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w2097">t<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w2155">t<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w2482">t<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w5887">t<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="w5642">T<ex>h</ex>e</orth>
<orth xml:id="w5378">t<ex>h</ex>e</orth>
</form>
</form>
</entry>

t:\ftemp>type cummings.xsl
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                xmlns:c="urn:x-cummings"
                exclude-result-prefixes="xsd c"
                version="2.0">

<xsl:template match="/">
  <xsl:variable name="filename"
                select="tokenize(document-uri(.),'/')[last()]"/>
  <xsl:variable name="id-base" select="entry/@xml:id"/>
  <div>
    <xsl:apply-templates/>

    <div type="links">
      <linkGrp xml:id="{/*/@xml:id}-lg">
        <xsl:for-each-group select="entry/form/form/orth"
                            group-by="c:deep-structure-hash(.)">
          <xsl:text>
          </xsl:text>
          <link>
            <xsl:attribute name="targets">
              <xsl:text/>#<xsl:value-of select="$id-base"/>-v<xsl:text/>
              <xsl:number format="A" value="position()"/>
              <xsl:for-each select="current-group()">
                <xsl:text> </xsl:text>
                <xsl:value-of select="concat($filename,'#',@xml:id)"/>
              </xsl:for-each>
            </xsl:attribute>
          </link>
        </xsl:for-each-group>
      </linkGrp>
      <xsl:text>
      </xsl:text>
    </div>
  </div>
</xsl:template>

<xsl:template match="form[@type='orthVar']">
  <xsl:variable name="id-base" select="ancestor::entry/@xml:id"/>
  <form n="{count(distinct-values(orth))}">
    <xsl:copy-of select="@*"/>
    <xsl:for-each-group select="orth"
                            group-by="c:deep-structure-hash(.)">
      <xsl:text>
      </xsl:text>
      <orth>
        <xsl:attribute name="xml:id">
          <xsl:value-of select="$id-base"/>-v<xsl:text/>
          <xsl:number format="A" value="position()"/>
        </xsl:attribute>
        <xsl:apply-templates/>
      </orth>
    </xsl:for-each-group>
    <xsl:text>
    </xsl:text>
  </form>
</xsl:template>

<xsl:function name="c:deep-structure-hash">
  <!--create a hash based on the structure of the element's descendants-->
  <xsl:param name="node"/>
  <xsl:variable name="return">
    <xsl:apply-templates select="$node/node()" mode="c:deep-structure-hash"/>
  </xsl:variable>
  <!--return the hash as a string value-->
  <xsl:sequence select="string($return)"/>
</xsl:function>

<!--mimic markup in the creation of the deep hash-->
<xsl:template mode="c:deep-structure-hash" match="*">
  <xsl:value-of select="concat('&lt;',name(.))"/>
  <xsl:for-each select="@*">
    <xsl:sort select="name(.)"/>
    <xsl:value-of select="concat(' ',name(.),'=&quot;',.,'&quot;')"/>
  </xsl:for-each>
  <!--close the start tag once, after all of the attributes-->
  <xsl:text>></xsl:text>
  <xsl:apply-templates mode="c:deep-structure-hash"/>
  <xsl:value-of select="concat('&lt;/',name(.),'>')"/>
</xsl:template>

<xsl:template match="@*|*"><!--identity for all other nodes-->
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>
t:\ftemp>call xslt2 cummings.xml cummings.xsl cummings.out

t:\ftemp>type cummings.out
<?xml version="1.0" encoding="UTF-8"?><div><entry xml:id="let22-w27">
<form>
<orth type="hw">the</orth>
<form n="2" type="orthVar">
<orth xml:id="let22-w27-vA">The</orth>
<orth xml:id="let22-w27-vB">the</orth>
<orth xml:id="let22-w27-vC">T<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="let22-w27-vD">t<ex>h</ex><hi rend="sup">e</hi></orth>
<orth xml:id="let22-w27-vE">T<ex>h</ex>e</orth>
<orth xml:id="let22-w27-vF">t<ex>h</ex>e</orth>
</form>
</form>
</entry><div type="links"><linkGrp xml:id="let22-w27-lg">
<link targets="#let22-w27-vA cummings.xml#w72 cummings.xml#w3955 cummings.xml#w4513 cummings.xml#w4578 cummings.xml#w4650 cummings.xml#w4672 cummings.xml#w4703 cummings.xml#w4824 cummings.xml#w4830"/>
<link targets="#let22-w27-vB cummings.xml#w2045 cummings.xml#w2079 cummings.xml#w2101 cummings.xml#w2112 cummings.xml#w2333 cummings.xml#w2400 cummings.xml#w2442"/>
<link targets="#let22-w27-vC cummings.xml#w1402 cummings.xml#w2422 cummings.xml#w6458 cummings.xml#w7822"/>
<link targets="#let22-w27-vD cummings.xml#w2097 cummings.xml#w2155 cummings.xml#w2482 cummings.xml#w5887"/>
<link targets="#let22-w27-vE cummings.xml#w5642"/>
<link targets="#let22-w27-vF cummings.xml#w5378"/></linkGrp>
</div></div>
t:\ftemp>rem Done!
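
As a cut-down illustration of why the attributes are sorted before serializing, the stylesheet below is a minimal sketch and not part of the solution above: the function name c:canon and the inline sample data are made up for the example. It ignores its input document and groups three inline orth elements. The first two differ only in the order their attributes are written, so I would expect them to share a grouping key, while the third has a different rend value and should end up in a group of its own.

```xml
<?xml version="1.0" encoding="US-ASCII"?>
<!--sketch only: c:canon and the sample data are illustrative assumptions-->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:c="urn:x-cummings"
                exclude-result-prefixes="c"
                version="2.0">

<xsl:output indent="yes"/>

<!--"a" and "b" carry the same markup with attributes in a different order;
    "c" differs in the value of rend-->
<xsl:variable name="samples">
  <orth xml:id="a">t<hi rend="sup" n="1">e</hi></orth>
  <orth xml:id="b">t<hi n="1" rend="sup">e</hi></orth>
  <orth xml:id="c">t<hi rend="sub" n="1">e</hi></orth>
</xsl:variable>

<xsl:template match="/">
  <groups>
    <xsl:for-each-group select="$samples/orth" group-by="c:canon(.)">
      <!--list the ids of every member sharing this serialized key-->
      <group members="{current-group()/@xml:id}"/>
    </xsl:for-each-group>
  </groups>
</xsl:template>

<xsl:function name="c:canon">
  <!--serialize the element's content as a string to use as a grouping key-->
  <xsl:param name="node"/>
  <xsl:variable name="return">
    <xsl:apply-templates select="$node/node()" mode="c:canon"/>
  </xsl:variable>
  <xsl:sequence select="string($return)"/>
</xsl:function>

<xsl:template mode="c:canon" match="*">
  <xsl:value-of select="concat('&lt;',name(.))"/>
  <!--attributes in name order, so source order cannot affect the key-->
  <xsl:for-each select="@*">
    <xsl:sort select="name(.)"/>
    <xsl:value-of select="concat(' ',name(.),'=&quot;',.,'&quot;')"/>
  </xsl:for-each>
  <xsl:text>&gt;</xsl:text>
  <xsl:apply-templates mode="c:canon"/>
  <xsl:value-of select="concat('&lt;/',name(.),'&gt;')"/>
</xsl:template>

</xsl:stylesheet>
```

The order in which @* delivers attributes is implementation-dependent, so without the xsl:sort two processors (or two differently authored documents) could produce different keys for markup that is otherwise the same.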




--
Comprehensive in-depth XSLT2/XSL-FO1.1 classes: Austin TX,Jan-2008
World-wide corporate, govt. & user group XML, XSL and UBL training
RSS feeds:     publicly-available developer resources and training
G. Ken Holman                 mailto:gkholman@xxxxxxxxxxxxxxxxxxxx
Crane Softwrights Ltd.          http://www.CraneSoftwrights.com/s/
Box 266, Kars, Ontario CANADA K0A-2E0    +1(613)489-0999 (F:-0995)
Male Cancer Awareness Jul'07  http://www.CraneSoftwrights.com/s/bc
Legal business disclaimers:  http://www.CraneSoftwrights.com/legal
