Re: [xsl] Identical entries in different input documents should appear in the output document only once

Subject: Re: [xsl] Identical entries in different input documents should appear in the output document only once
From: Abel Braaksma <abel.online@xxxxxxxxx>
Date: Fri, 07 Sep 2007 22:27:57 +0200
G. Ken Holman wrote:
At 2007-09-07 17:54 +0200, Meyer, Roland 1. (NSN - DE/Germany - MiniMD) wrote:
I have the following problem. I have a couple of XML documents to merge
to one output document.
Each document has the same structure like this:
[..........].

Is there any other and simpler way to - let's say - memorize the already
written blocks resp. identifiers?

Mike is correct that the variable-based grouping method would work with this, but it is far slower than an XSLT 2 approach. When using XSLT 1.0 I find variable-based grouping acceptable for sub-document and multi-document grouping.


The code would work along the lines of the following, and it assumes that there is an XML structure $files with the list of all the file names:

  <xsl:variable name="items"
                select="document($files/file/@uri,.)/root/block"/>
  <xsl:for-each select="$items">
     <!--walk through all, doing work at first of each unique idTag-->
     <xsl:if test="generate-id(.)=
                   generate-id($items[idTag=current()/idTag])">
        <!--the following executes once for each unique idTag
            value across all the files-->
     </xsl:if>
  </xsl:for-each>
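
For comparison, the XSLT 2 approach mentioned above could be sketched roughly as follows. This is only an illustration, not code from the thread: it assumes the source document is the file list itself, with a hypothetical root element named files holding file elements with uri attributes, and that each input has the root/block/idTag structure used in the fragment above.

  <xsl:stylesheet version="2.0"
                  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- assumption: the source document looks like
         <files><file uri="a.xml"/><file uri="b.xml"/></files> -->
    <xsl:variable name="files" select="/files"/>

    <xsl:template match="/">
      <root>
        <!-- group the blocks of all files by the value of idTag -->
        <xsl:for-each-group
            select="document($files/file/@uri, .)/root/block"
            group-by="idTag">
          <!-- the context item is the first block of the current group,
               so each distinct idTag value is written exactly once -->
          <xsl:copy-of select="."/>
        </xsl:for-each-group>
      </root>
    </xsl:template>

  </xsl:stylesheet>

Here the processor does the grouping, so no generate-id() comparison against the whole $items set is needed.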


I hadn't seen your (Ken's) post earlier; I find it interesting to compare this approach with mine. Perhaps it runs faster than my node-set + load-all-documents-at-once method (though here, too, all documents are loaded at once). The difference is that in my approach the varying document root node is removed: by using template match + copy-of and then node-set, the blocks copied from all documents end up under one shared root node. That makes normal Muenchian grouping with a key possible, instead of look-up grouping with variables.
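
A minimal sketch of the node-set + Muenchian variant, to make the comparison concrete. It is written under assumptions: the EXSLT node-set extension is available, the source document is the file list (a hypothetical files root element with file/@uri children), and the key name block-by-id is made up for illustration. It also uses copy-of directly in a variable rather than going through a matching template.

  <xsl:stylesheet version="1.0"
                  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                  xmlns:exsl="http://exslt.org/common"
                  exclude-result-prefixes="exsl">

    <!-- index every block by the value of its idTag child -->
    <xsl:key name="block-by-id" match="block" use="idTag"/>

    <!-- assumption: the source document looks like
         <files><file uri="a.xml"/><file uri="b.xml"/></files> -->
    <xsl:variable name="files" select="/files"/>

    <xsl:template match="/">
      <!-- copy the blocks of all documents into one result tree fragment -->
      <xsl:variable name="all-rtf">
        <xsl:copy-of select="document($files/file/@uri, .)/root/block"/>
      </xsl:variable>
      <!-- after node-set() all copied blocks share a single root node -->
      <xsl:variable name="all" select="exsl:node-set($all-rtf)"/>
      <root>
        <!-- switch the context to the merged tree, because key()
             only searches the document of the context node -->
        <xsl:for-each select="$all">
          <!-- keep a block only if it is the first one for its idTag -->
          <xsl:copy-of select="block[generate-id() =
                          generate-id(key('block-by-id', idTag)[1])]"/>
        </xsl:for-each>
      </root>
    </xsl:template>

  </xsl:stylesheet>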


I hope that the speed gained by using a key outweighs the overhead of using node-set. Also, copy-of is an atomic operation and generally performs faster than applying the templates again (but of course, that is only useful when you do not need additional processing on the blocks).

Cheers,
-- Abel Braaksma
