Re: [xsl] document() merge DISTINCT

Subject: Re: [xsl] document() merge DISTINCT
From: Jeni Tennison <jeni@xxxxxxxxxxxxxxxx>
Date: Thu, 20 Dec 2001 17:04:08 +0000
Hi Alex,


> Hi Jeni,

> sorry for disturbing you again!
> I wanted to try your other suggestion, using the node-set extension
> to get a unique set.
> But I cannot get it right!
> I tried this with the Microsoft extension:

> test.xsl:

> <?xml version="1.0" encoding="UTF-8" ?>
> <xsl:stylesheet version="1.0"
> xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> xmlns:msxsl="urn:schemas-microsoft-com:xslt">

> <xsl:variable name="set">
>     <xsl:for-each select="(document(/files/node/@path))">
>         <xsl:copy-of select="/doc/person" />
>     </xsl:for-each>
> </xsl:variable>
> <xsl:key name="k-id" match="person" use="@id" />
> <xsl:template name="distinct">
>     <xsl:for-each select="(msxsl:node-set($set))[generate-id(.) =
> generate-id(key('k-id',@id)[1])]">
>     <xsl:sort select="@id"/>
>         <xsl:apply-templates />
>     </xsl:for-each>
> </xsl:template>

> <xsl:template match="/">
>     <test>
>         <xsl:call-template name="distinct" />
>     </test>
> </xsl:template>

> <xsl:template match="person">
>     <x>
>         <xsl:copy-of select="."/>
>     </x>
> </xsl:template>
> </xsl:stylesheet>

> data.xml:

> <?xml version="1.0" encoding="utf-8" ?>
> <doc>
> <person name="Alex7" id="7" />
> <person name="Alex8" id="8" />
> <person name="Alex9" id="9" />
> <person name="Alex10" id="10" />
> <person name="Alex11" id="11" />
> <person name="Alex12" id="12" />
> <person name="Alex13" id="13" />
> <person name="Alex14" id="14" />
> <person name="Alex15" id="15" />
> </doc>

> -Alex


> ----- Original Message -----
> From: "Jeni Tennison" <jeni@xxxxxxxxxxxxxxxx>
> To: "Alex Schuetz" <asc@xxxxxx>
> Cc: <xsl-list@xxxxxxxxxxxxxxxxxxxxxx>
> Sent: Wednesday, December 19, 2001 3:55 PM
> Subject: Re: [xsl] document() merge DISTINCT


>> Hi Alex,
>>
>> > In each file all /person/@id are unique, but different files might
>> > contain the same @id . Now I want to produce a list of all <person>
>> > so that /person/@id is unique.
>>
>> This is certainly more difficult than trying to find distinct values
>> within a single document, because key(), preceding-sibling:: and so on
>> all work within a single document.
>>
>> One method, if you don't mind using an extension node-set() function,
>> is to generate a single result tree fragment containing all the person
>> elements, convert that to a node set, and then work with that new
>> 'document' getting distinct values in the same way as you would
>> normally (e.g. with the Muenchian method).
>>
>> That method has disadvantages because it uses the extension function
>> and because the intermediate node set that you're constructing could
>> be quite large, take up memory and therefore lead to slower
>> performance. If these don't turn out to be issues, though, it's the
>> method that I'd choose because it's easy.
>>
>> The alternative is to use a recursive method. I'd write a template
>> that takes two arguments: a node set of unique people and a node set
>> of remaining people:
>>
>> <xsl:template name="distinct">
>>   <xsl:param name="unique" select="/.." />
>>   <xsl:param name="remaining" select="/.." />
>>   ...
>> </xsl:template>
>>
>> Then work through the remaining people one by one by recursion. If
>> there are people remaining, look at the first one to see whether it
>> should be added to the unique list (its id isn't the same as an
>> existing unique person) or not, and call the template with the new
>> sets:
>>
>> <xsl:template name="distinct">
>>   <xsl:param name="unique" select="/.." />
>>   <xsl:param name="remaining" select="/.." />
>>   <xsl:choose>
>>     <xsl:when test="$remaining">
>>       <xsl:call-template name="distinct">
>>         <xsl:with-param name="unique"
>>           select="$unique | $remaining[1][not(@id = $unique/@id)]" />
>>         <xsl:with-param name="remaining"
>>                         select="$remaining[position() > 1]" />
>>       </xsl:call-template>
>>     </xsl:when>
>>     <xsl:otherwise>
>>       ...
>>     </xsl:otherwise>
>>   </xsl:choose>
>> </xsl:template>
>>
>> If there are no people remaining, then you need to do whatever you
>> want to do to the unique people - apply templates to them for example:
>>
>> <xsl:template name="distinct">
>>   <xsl:param name="unique" select="/.." />
>>   <xsl:param name="remaining" select="/.." />
>>   <xsl:choose>
>>     <xsl:when test="$remaining">
>>       <xsl:call-template name="distinct">
>>         <xsl:with-param name="unique"
>>           select="$unique | $remaining[1][not(@id = $unique/@id)]" />
>>         <xsl:with-param name="remaining"
>>                         select="$remaining[position() > 1]" />
>>       </xsl:call-template>
>>     </xsl:when>
>>     <xsl:otherwise>
>>       <xsl:apply-templates select="$unique" />
>>     </xsl:otherwise>
>>   </xsl:choose>
>> </xsl:template>
>>
>> When you call the template, I'd start off with $unique being set to
>> all the person elements in your first file, since you know that they
>> all have unique IDs. That will save some work. Something like:
>>
>>   <xsl:call-template name="distinct">
>>     <xsl:with-param name="unique"
>>       select="document('sample1.xml')/project/person" />
>>     <xsl:with-param name="remaining"
>>       select="document('sample2.xml')/project/person" />
>>   </xsl:call-template>
>>
>> The expression for $remaining could include more documents, naturally.
>>
>> I hope that helps,
>>
>> Jeni
>>
>> ---
>> Jeni Tennison
>> http://www.jenitennison.com/
>>
>>
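
A minimal sketch of how the node-set version might be made to work,
assuming the source document has the shape <files><node path="..."/></files>
(inferred from the document() call, not shown in the post) and that
MSXML's msxsl:node-set() is available. Two things in the stylesheet
quoted above look likely to go wrong. First, msxsl:node-set($set)
returns the root node of the new tree, so the xsl:for-each has to step
down to its person children before applying the key test. Second,
xsl:apply-templates without a select attribute applies templates to the
children of each person element, and there are none.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:msxsl="urn:schemas-microsoft-com:xslt"
                exclude-result-prefixes="msxsl">

  <!-- Result tree fragment holding a copy of every person element
       from every file listed in the (assumed) /files/node/@path -->
  <xsl:variable name="set">
    <xsl:copy-of select="document(/files/node/@path)/doc/person" />
  </xsl:variable>

  <!-- The key is built per document, so it also indexes the tree
       produced by msxsl:node-set() -->
  <xsl:key name="k-id" match="person" use="@id" />

  <xsl:template match="/">
    <test>
      <!-- Step from the root node of the converted fragment to its
           person children, then keep the first person for each id -->
      <xsl:for-each select="msxsl:node-set($set)/person
                              [generate-id(.) =
                               generate-id(key('k-id', @id)[1])]">
        <!-- data-type="number" assumes the ids are always numeric -->
        <xsl:sort select="@id" data-type="number" />
        <!-- Apply templates to the person itself, not its children -->
        <xsl:apply-templates select="." />
      </xsl:for-each>
    </test>
  </xsl:template>

  <xsl:template match="person">
    <x>
      <xsl:copy-of select="." />
    </x>
  </xsl:template>

</xsl:stylesheet>
```

The data-type="number" on xsl:sort is an addition; drop it to keep the
text ordering of the original stylesheet.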



I hope that helps,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

