[xsl] Re: building a hierarchical classification out of flat and redundant data

Subject: [xsl] Re: building a hierarchical classification out of flat and redundant data
From: "mnews-xsl@xxxxxx" <mnews-xsl@xxxxxx>
Date: Tue, 25 Jul 2006 01:24:11 +0200
Hello,

The following stylesheet uses a recursive template and tracks upward and
downward pointers with keys: ptr-up indexes each document by its own id,
ptr-down by its parent's id. It also works with more than one root node.
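
To make the two keys concrete, here is the sample from the question with the
key values each document would get. The documents wrapper element is an
assumption: the question elides the surrounding markup, and the stylesheet
below matches /documents, so the real root element may need a different name.

```xml
<documents>
  <!-- ptr-up value "3"; no ptr-down value (only two children), so a root -->
  <document>
    <tag1>3</tag1>
    <tag1a>Social Sciences</tag1a>
  </document>
  <!-- ptr-up value "32"; ptr-down value "3" (child of node 3) -->
  <document>
    <tag1>3</tag1>
    <tag1a>Social Sciences</tag1a>
    <tag2>32</tag2>
    <tag2a>Politics</tag2a>
  </document>
  <!-- ptr-up value "326"; ptr-down value "32" (child of node 32) -->
  <document>
    <tag1>3</tag1>
    <tag1a>Social Sciences</tag1a>
    <tag2>32</tag2>
    <tag2a>Politics</tag2a>
    <tag3>326</tag3>
    <tag3a>Slavery</tag3a>
  </document>
</documents>
```

With this input only the first document is treated as a root, because it is
the only one whose parent id has no ptr-up entry. The recursion then nests
32 under 3 and 326 under 32.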

Regards


<?xml version="1.0" encoding="iso-8859-1"?>


<xsl:stylesheet version="1.0"
	xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes"/>

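<!-- ptr-up: each document indexed by its own id (second-to-last child) -->
<xsl:key name="ptr-up" match="document" use="*[last()-1]"/>
<!-- ptr-down: each document indexed by its parent's id (fourth-to-last child) -->
<xsl:key name="ptr-down" match="document" use="*[last()-3]"/>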

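<xsl:template match="/documents">
	<nodes>
		<!-- roots: no document carries this document's parent id
		     (also true when there is no parent id at all) -->
		<xsl:apply-templates select="document[not(key('ptr-up', *[last()-3]))]"/>
	</nodes>
</xsl:template>

<xsl:template match="document">
	<node id="{*[last()-1]}" name="{*[last()]}">
		<!-- recurse into every document whose parent id is this id -->
		<xsl:apply-templates select="key('ptr-down', *[last()-1])"/>
	</node>
</xsl:template>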

</xsl:stylesheet>
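
If the real data lists the same node in more than one document (an
assumption; the sample in the question does not show this), the stylesheet
above would emit that node once per copy. A possible sketch that keeps only
the first document per id, by comparing generate-id() against the first key
hit (the technique usually called Muenchian grouping), could look like this.
The keys are unchanged; only the two select expressions get an extra
predicate. It has not been run against the real data.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="1.0"
	xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes"/>

<!-- same keys as above: own id and parent id -->
<xsl:key name="ptr-up" match="document" use="*[last()-1]"/>
<xsl:key name="ptr-down" match="document" use="*[last()-3]"/>

<xsl:template match="/documents">
	<nodes>
		<!-- roots, restricted to the first document carrying each id -->
		<xsl:apply-templates select="document[not(key('ptr-up', *[last()-3]))]
			[generate-id() = generate-id(key('ptr-up', *[last()-1])[1])]"/>
	</nodes>
</xsl:template>

<xsl:template match="document">
	<node id="{*[last()-1]}" name="{*[last()]}">
		<!-- children, again only the first document per id -->
		<xsl:apply-templates select="key('ptr-down', *[last()-1])
			[generate-id() = generate-id(key('ptr-up', *[last()-1])[1])]"/>
	</node>
</xsl:template>

</xsl:stylesheet>
```

For data where every id occurs in exactly one document, this variant should
select the same documents as the original stylesheet.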

On 7/24/06, Georg Hohmann <georg.hohmann@xxxxxxxxx> wrote:
Dear XSLT-Community,

I have a problem with a "strange" type of data which I have to
convert to a hierarchical XML structure.

My source is a huge XML file which represents a decimal
classification. It contains so-called documents, where each document
represents one node of the classification. Furthermore, each document
also lists the parents of its node. It's a structure like this
(example taken from http://www.udcc.org):
...
<document>
       <tag1>3</tag1>
       <tag1a>Social Sciences</tag1a>
</document>
<document>
       <tag1>3</tag1>
       <tag1a>Social Sciences</tag1a>
       <tag2>32</tag2>
       <tag2a>Politics</tag2a>
</document>
<document>
       <tag1>3</tag1>
       <tag1a>Social Sciences</tag1a>
       <tag2>32</tag2>
       <tag2a>Politics</tag2a>
       <tag3>326</tag3>
       <tag3a>Slavery</tag3a>
</document>
...
As you can see, there is no hierarchical information in it apart from
the names and the sequence of the tags. In my real data I have up to 9
levels, but not always. My result should look like this (or
something similar):
...
<node id="3" name="Social Sciences">
  <node id="32" name="Politics">
     <node id="326" name="Slavery"/>
  </node>
</node>
...
I simply have no idea where to start to achieve this result. I
guess the first step would be to get rid of all the redundant
content, but I don't know how. And I can't figure out how to
build the hierarchical structure at the same time.

Does anyone have a good starting point for this?
