Re: [xsl] Getting a distinct list of node names

Subject: Re: [xsl] Getting a distinct list of node names
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Mon, 15 Dec 2003 19:07:33 -0500

You're trying a bit too hard on this one. Recursion is overkill, and the set:distinct() extension function compares nodes by their string values, whereas you want to compare them by name (as nodes they are always distinct).

But de-duplicating is a common thing to do.

In fact, it's the first step in the oft-recommended Muenchian grouping technique. (De-duplicate nodes by grouping criterion, then group by the set of de-duplicated "flag-bearer" nodes.)

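A common way to do this is simply to compare a node's value to the values of the nodes before it (here it's the node's name that you care about). In XPath 1.0, name() applied to a node set reports only the first node in document order, so the comparison has to be made inside a predicate. With each child of node:definition as the current node (say, inside an xsl:for-each), test:

not(preceding-sibling::*[name() = name(current())])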

...which will have poor performance on a large set of nodes (but you get the idea).
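
As a complete stylesheet, the sibling comparison might look like the sketch below. It is only an illustration: the namespace URI bound to the node prefix is a placeholder (the question does not give the real one), and it assumes the three sample elements are children of a node:definition element. The test sits inside xsl:for-each so that current() refers to the child being examined.

```xml
<?xml version="1.0"?>
<!-- Sketch only: urn:example:node is a placeholder namespace URI;
     substitute the URI your documents actually use. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:node="urn:example:node">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="//node:definition/*">
      <!-- Keep a child only if no earlier sibling has the same name.
           Inside the predicate, name() is the sibling's name and
           name(current()) is the name of the child being tested. -->
      <xsl:if test="not(preceding-sibling::*[name() = name(current())])">
        <xsl:value-of select="name()"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

For the sample in the question this should print form:validator and form:filter, one per line. Every child rescans its preceding siblings, which is where the poor performance on large inputs comes from.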

You could also do

<xsl:key name="nodes-by-name" match="*" use="name()"/>

and then call for

node:definition/*[count(.|key('nodes-by-name',name())[1]) = 1]

which should work better on large sets of input.
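
Put together, the key-based version might look like the following sketch. Two things here are assumptions rather than part of the suggestion above: the namespace URI for the node prefix is a placeholder, and the key is narrowed to match="node:definition/*" so that same-named elements elsewhere in the document cannot be picked as the first node of a group.

```xml
<?xml version="1.0"?>
<!-- Sketch only: urn:example:node is a placeholder namespace URI. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:node="urn:example:node">

  <xsl:output method="text"/>

  <!-- Index the children of node:definition by their qualified name.
       (Narrower than match="*"; this restriction is an assumption.) -->
  <xsl:key name="nodes-by-name" match="node:definition/*" use="name()"/>

  <xsl:template match="/">
    <!-- A child is kept when it is the first node the key returns for
         its name: the union of the two then contains exactly one node. -->
    <xsl:for-each
        select="//node:definition/*[count(. | key('nodes-by-name', name())[1]) = 1]">
      <xsl:value-of select="name()"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Note that the key spans the whole document, so if there are several node:definition elements this lists each name once across all of them, not once per definition.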

Ah, it looks like Ken has another solution....


At 02:22 PM 12/15/2003, you wrote:

Maybe this cannot be accomplished with plain XSLT and my mail is OT, but
I do not know a better place to start asking.

I need to get a distinct list of the node names from all children of one
node. For example, if I have:

        <form:validator />
        <form:validator />
        <form:filter />

I want to be able to retrieve a list of the names of all tags used
within node:definition. However, it should contain each tag name only once:

('form:validator', 'form:filter')

I tried to accomplish this by building a string containing all tag
names, but this failed because of the nature of xsl:variables; recursion
did not work either (I found no way :/).

I also tried the set:distinct method from EXSLT, but it did not work
either, because you cannot specify a path like "*/name()" and so you cannot
select all the names (I guess it is because paths always only specify
nodes and a node name is a plain string?).

Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.      
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
