RE: [xsl] Merging XML structure by comparing elements

Subject: RE: [xsl] Merging XML structure by comparing elements
From: cknell@xxxxxxxxxx
Date: Tue, 19 Jul 2005 14:58:09 -0400
Give thanks to St. Stephen (the Muench).

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" encoding="UTF-8" />
  <xsl:strip-space elements="*" />
  <xsl:key name="t" match="mynode" use="text()" />

  <xsl:template match="/">
    <nodes>
      <xsl:apply-templates />
    </nodes>
  </xsl:template>

  <xsl:template match="nodes">
    <xsl:apply-templates select="mynode[generate-id(.)=generate-id(key('t',text()))]"/>
  </xsl:template>

  <xsl:template match="mynode">
    <xsl:copy-of select="." />
  </xsl:template>

</xsl:stylesheet>

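Muenchian grouping creates the functional equivalent of a Perl hash table. The xsl:key indexes every mynode by its text, and the generate-id() test keeps only the first node found for each key value, so the duplicates are eliminated for you.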

There are plenty of references available by Googling.
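
For the second variant in the question (comparing only the first 14 letters), the same technique should carry over by keying on a prefix instead of the whole text. The following is an untested sketch under that assumption, not a definitive answer; the key name "prefix" is made up for illustration. Note that XSLT 1.0 does not allow a variable or parameter reference in the use attribute of xsl:key, so the length has to be written as a literal in both places.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" encoding="UTF-8" />
  <xsl:strip-space elements="*" />

  <!-- Index each mynode by the first 14 characters of its string value.
       "prefix" is a hypothetical key name chosen for this sketch. -->
  <xsl:key name="prefix" match="mynode" use="substring(., 1, 14)" />

  <xsl:template match="/nodes">
    <nodes>
      <!-- Copy a mynode only if it is the first node, in document order,
           that the key returns for its own 14-character prefix. -->
      <xsl:copy-of select="mynode[generate-id(.) = generate-id(key('prefix', substring(., 1, 14))[1])]" />
    </nodes>
  </xsl:template>

</xsl:stylesheet>
```

With the sample input from the question, all three nodes share the prefix "This is a test", so only the first mynode should survive.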
-- 
Charles Knell
cknell@xxxxxxxxxx - email



-----Original Message-----
From:     Karl Koch <TheRanger@xxxxxxx>
Sent:     Tue, 19 Jul 2005 19:59:16 +0200 (MEST)
To:       "Mulberry list" <xsl-list@xxxxxxxxxxxxxxxxxxxxxx>
Subject:  [xsl] Merging XML structure by comparing elements

I would like to compare the following node elements (embedded in the higher
level nodes element) in order to create a new merged list. I would like to
do that in a number of ways and I wonder how to do that:

The original content looks like this:

<nodes>
  <mynode>This is a test text 1</mynode>
  <mynode>This is a test text 2</mynode>
  <mynode>This is a test text 1</mynode>
</nodes>


1) I want to compare the entire text of each <mynode>. Each distinct text
should be copied to the output XML file only once. As a result, the output
file should look like this:

<nodes>
  <mynode>This is a test text 1</mynode>
  <mynode>This is a test text 2</mynode>
</nodes>

Please keep in mind that the XSLT processor would somehow need to
"remember" that "...test text 1" has already been copied, so that this text
is never copied a second time later on. Also please keep in mind that my
actual content consists of about 10000 nodes, so complexity is an issue.

2) Only check the first 14 letters (or generally n letters), which would
result in an output consisting of only one node in this case (because the
first 14 letters are the same in all nodes of the example XML file):

<nodes>
  <mynode>This is a test text 1</mynode>
</nodes>

3) Can I use the last n words for that comparison, too?

Any help would be highly appreciated.

Thank you,
Karl 

