Re: [xsl] Identical entries in different input documents should appear in the output document only once

Subject: Re: [xsl] Identical entries in different input documents should appear in the output document only once
From: "G. Ken Holman" <gkholman@xxxxxxxxxxxxxxxxxxxx>
Date: Fri, 07 Sep 2007 09:59:24 -0700
At 2007-09-07 17:54 +0200, Meyer, Roland 1. (NSN - DE/Germany - MiniMD) wrote:
I have the following problem. I have a number of XML documents to merge
into one output document.
Each document has the same structure, like this:

<root>
  <block>
    <oneTag>some value</oneTag>
    <anotherTag>another value</anotherTag>
     ...
    <idTag>setId-itemId</idTag>
  </block>
  <block>
     ...
  </block>
   ...
</root>

I have to interpret the value in the idTag (the setId-itemId) as an
identifier for the complete structure between the block tags.
Within one document this identifying value comes only once, but the same
identifying value can be found in different documents.

What I now need:
My output file should list each block only once; that is, even if the same
identifying value is present in several input documents, it should
appear only once in the output document.

I can think about some heavy procedures by checking every found
identifier value in the already processed files (because then they are
already written to the output), but this will be very time consuming (I
have around 15 files with each up to 10000 blocks).

Is there any other, simpler way to - let's say - memorize the already
written blocks or their identifiers?

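Mike is correct that the variable-based grouping method would work here, but it is far slower than an XSLT 2.0 approach such as xsl:for-each-group. In a straightforward implementation the predicate rescans the whole variable for every block, so the work grows roughly with the square of the number of blocks (up to about 150,000 in your case). When I am limited to XSLT 1.0, I find variable-based grouping acceptable for sub-document and multi-document grouping.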
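
For comparison, here is a minimal sketch of the XSLT 2.0 route. It is an illustration under assumptions, not tested against your data: it assumes the stylesheet is run against a hypothetical file-list document of the form <files><file uri="..."/></files>, and that copying the first block found for each idTag value is what you want.

  <xsl:stylesheet version="2.0"
                  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!--assumption: the input document is the list of files-->
    <xsl:variable name="files" select="/files"/>

    <xsl:template match="/">
      <root>
        <!--one group per distinct idTag value across all the files-->
        <xsl:for-each-group
             select="document($files/file/@uri,.)/root/block"
             group-by="idTag">
          <!--keep only the first block of each group-->
          <xsl:copy-of select="current-group()[1]"/>
        </xsl:for-each-group>
      </root>
    </xsl:template>

  </xsl:stylesheet>

Note that a block with no idTag child belongs to no group and so would be dropped by this sketch.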


The variable-based XSLT 1.0 code would work along the lines of the following; it assumes a variable $files holding an XML structure that lists all the file names:

  <xsl:variable name="items"
                select="document($files/file/@uri,.)/root/block"/>
  <xsl:for-each select="$items">
     <!--walk through all, doing work at first of each unique idTag-->
     <xsl:if test="generate-id(.)=
                   generate-id($items[idTag=current()/idTag])">
        <!--the following executes once for each unique idTag
            value across all the files-->
     </xsl:if>
  </xsl:for-each>
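
To show where the fragment above would sit, here is a sketch of a complete XSLT 1.0 stylesheet built around it. The file-list layout (<files><file uri="..."/></files>, supplied as the input document) and the choice to copy each first-seen block unchanged are my assumptions for illustration; adjust both to your own setup.

  <xsl:stylesheet version="1.0"
                  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!--assumption: the input document is the list of files-->
    <xsl:variable name="files" select="/files"/>

    <xsl:template match="/">
      <!--all the blocks from all the files in one node set-->
      <xsl:variable name="items"
                    select="document($files/file/@uri,.)/root/block"/>
      <root>
        <xsl:for-each select="$items">
          <!--true only for the first block carrying this idTag value-->
          <xsl:if test="generate-id(.)=
                        generate-id($items[idTag=current()/idTag])">
            <xsl:copy-of select="."/>
          </xsl:if>
        </xsl:for-each>
      </root>
    </xsl:template>

  </xsl:stylesheet>

The test relies on generate-id() taking only the first node of the set it is given, so a block passes only when it is itself the first of the blocks sharing its idTag value.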

I hope this helps.

. . . . . . . . . . . . . Ken

--
Upcoming public training: XSLT/XSL-FO Sep 10, UBL/code lists Oct 1
World-wide corporate, govt. & user group XML, XSL and UBL training
RSS feeds:     publicly-available developer resources and training
G. Ken Holman                 mailto:gkholman@xxxxxxxxxxxxxxxxxxxx
Crane Softwrights Ltd.          http://www.CraneSoftwrights.com/s/
Box 266, Kars, Ontario CANADA K0A-2E0    +1(613)489-0999 (F:-0995)
Male Cancer Awareness Jul'07  http://www.CraneSoftwrights.com/s/bc
Legal business disclaimers:  http://www.CraneSoftwrights.com/legal
