Re: Fw: [xsl] Question on duplicate node elimination

Subject: Re: Fw: [xsl] Question on duplicate node elimination
From: Hermann Stamm-Wilbrandt <STAMMW@xxxxxxxxxx>
Date: Thu, 26 Aug 2010 16:38:25 +0200
Michael,

> ... Instead, whenever you
> are evaluating an operation that returns a node-set, represent that
> node-set as a string containing the generate-id values of the nodes in
> the node-set, space-separated. Elimination of duplicates then reduces to
> an operation on strings: not trivial, but not especially difficult
> either.

As I said in a previous email, your idea of a space-separated string of
generate-id values is really cool!

I tried to use the id() function on that string because it does the
duplicate node elimination without any help!
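
A minimal sketch of why id() removes duplicates for free: XPath 1.0 defines
id() as returning a node-set, and a node-set holds each node at most once, so
repeating a token in the argument string has no effect. The input document,
the element names doc and item, and the ID values a and b are made up for
this illustration; the internal DTD subset declares the id attribute as type
ID so that the example does not depend on xml:id support.

```xml
<!-- Hypothetical input (element names and ID values are assumptions):

     <!DOCTYPE doc [ <!ATTLIST item id ID #IMPLIED> ]>
     <doc><item id="a"/><item id="b"/></doc>
-->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- "a" appears three times and "b" twice in the token list -->
    <xsl:variable name="ids" select="'a b a b a'"/>

    <!-- id() splits the string on whitespace and returns the set of
         elements with those IDs; each element is in the set only once -->
    <xsl:value-of select="count(id($ids))"/>
  </xsl:template>
</xsl:stylesheet>
```

With a processor that reads the ID declaration from the internal DTD subset,
the count is 2 rather than 5.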

After some problems and discussions on this list over the last few days, I
learned about xml:id and about how to use id attributes in the XHTML
namespace in browsers. Yesterday I posted the idxml technique [1], which
gives every node both an xml:id and an id attribute.

Applying idcopy to the input generates a working copy of the input document.
The parentStep template below (from [2]) shows how easy duplicate
elimination becomes when your idea is combined with the id() function:
  <!--
       Demonstration of a parentStep with duplicate elimination
  -->
  <xsl:template name="parentStep">
    <xsl:param name="root"/>
    <xsl:param name="nodes"/>

    <!-- application of ".." to nodes;
         $aux might contain duplicate id strings
    -->
    <xsl:variable name="aux">
      <xsl:for-each select="exslt:node-set($nodes)/*">
        <xsl:variable name="id" select="@xml:id"/>

        <!-- set context -->
        <xsl:for-each select="$root">
          <xsl:for-each select="id($id)/..">
            <xsl:value-of select="concat(@xml:id,' ')"/>
          </xsl:for-each>
        </xsl:for-each>
      </xsl:for-each>
    </xsl:variable>

    <!-- duplicate elimination step -->
    <xsl:for-each select="$root">
      <xsl:copy-of select="id($aux)"/>
    </xsl:for-each>

  </xsl:template>
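
For comparison, the purely string-based elimination described in the quoted
suggestion ("not trivial, but not especially difficult") might look roughly
like the recursive template below. This is only a sketch under stated
assumptions: the template name dedupIds, its parameters ids and seen, and the
sample token string are hypothetical and do not come from dupelim2.xsl. It
assumes every token is followed by a single space, as produced by
concat(@xml:id,' ') in parentStep.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical helper: writes each token of $ids once, in order of
       first occurrence. $seen accumulates the tokens already visited,
       each one delimited by spaces on both sides. -->
  <xsl:template name="dedupIds">
    <xsl:param name="ids"/>
    <xsl:param name="seen" select="' '"/>

    <!-- stop when no complete (space-terminated) token is left -->
    <xsl:if test="contains($ids, ' ')">
      <xsl:variable name="head" select="substring-before($ids, ' ')"/>
      <xsl:variable name="tail" select="substring-after($ids, ' ')"/>

      <!-- emit the token only if ' token ' is not yet part of $seen -->
      <xsl:if test="$head != '' and
                    not(contains($seen, concat(' ', $head, ' ')))">
        <xsl:value-of select="concat($head, ' ')"/>
      </xsl:if>

      <!-- recurse on the rest; appending a repeated token to $seen
           is harmless -->
      <xsl:call-template name="dedupIds">
        <xsl:with-param name="ids" select="$tail"/>
        <xsl:with-param name="seen" select="concat($seen, $head, ' ')"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

  <!-- usage example with made-up id strings -->
  <xsl:template match="/">
    <xsl:call-template name="dedupIds">
      <xsl:with-param name="ids" select="'id3 id7 id3 id1 id7 '"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

For the sample string this writes "id3 id7 id1 ". The id() approach in
parentStep makes this bookkeeping unnecessary, because the node-set returned
by id() cannot contain the same node twice.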


I prepared two demos, parent.xml [3] and ancestor.xml [4].
Both display the XML input, the XPath expression to be applied, the result,
and the auxiliary idcopy structure in the browser.
Reloading the pages shows the generate-id values in the idcopy changing.
To make the demos work, the nameStep has been implemented as well.


These demos work in Firefox, Chrome and Safari (and will work in IE9).
Sadly, they do not work in Opera, although Opera supports xml:id and id.
[Opera is unable to apply the id() function to exslt:node-set(..)]


Thanks to everybody who helped in the various emails to make this work!


[1] http://www.biglist.com/lists/lists.mulberrytech.com/xsl-list/archives/201008/msg00274.html
[2] http://stamm-wilbrandt.de/en/xsl-list/dupelim2/dupelim2.xsl
[3] http://stamm-wilbrandt.de/en/xsl-list/dupelim2/parent.xml
[4] http://stamm-wilbrandt.de/en/xsl-list/dupelim2/ancestor.xml


Mit besten Gruessen / Best wishes,

Hermann Stamm-Wilbrandt
Developer, XML Compiler, L3
WebSphere DataPower SOA Appliances
----------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294



From:       Michael Kay <mike@xxxxxxxxxxxx>
To:         xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Date:       08/24/2010 02:17 PM
Subject:    Re: Fw: [xsl] Question on duplicate node elimination



  I haven't understood your logic in any detail, but I wonder if it
suggests an alternative approach to the problem: namely, avoid creating
RTFs entirely, at least for intermediate results. Instead, whenever you
are evaluating an operation that returns a node-set, represent that
node-set as a string containing the generate-id values of the nodes in
the node-set, space-separated. Elimination of duplicates then reduces to
an operation on strings: not trivial, but not especially difficult either.

Michael Kay
Saxonica
