Subject: Re: [xsl] building a hierarchical classification out of flat and redundant data
From: "Albert Juhé" <albertjuhe@xxxxxxxxx>
Date: Tue, 25 Jul 2006 10:01:12 +0200
Last week an awkward job landed on my desk, and the problem is the same. I have this XML:
<modul>
  <unit id="1">
    <subunit>Rupturas</subunit>
    <sub-subunit>sistema </sub-subunit>
    <sub-subunit>incertidumbre</sub-subunit>
    <subunit>Megatendencias</subunit>
    <sub-subunit>Caracterizacisn</sub-subunit>
    <sub-sub-subunit>1.2.1.1.</sub-sub-subunit>
    <p>Text 1211</p>
    <param>Text 2 1211</param>
    <sub-sub-subunit>1.2.1.2.</sub-sub-subunit>
    <sub-sub-subunit>1.2.1.3.</sub-sub-subunit>
    <sub-subunit>Vectores</sub-subunit>
    <sub-sub-subunit>1.2.2.1.</sub-sub-subunit>
    <sub-sub-subunit>1.2.2.2.</sub-sub-subunit>
    <sub-sub-subunit>1.2.2.3.</sub-sub-subunit>
    <subunit>Perspectivas</subunit>
    <sub-subunit>Ideologmas</sub-subunit>
    <sub-sub-subunit>1.3.1.1.</sub-sub-subunit>
    <sub-sub-subunit>1.3.1.2.</sub-sub-subunit>
    <sub-subunit>controversia</sub-subunit>
    <sub-sub-subunit>1.3.2.1.</sub-sub-subunit>
    <sub-sub-subunit>1.3.2.2.</sub-sub-subunit>
  </unit>
  <unit id="2">
    <p>Desafmos sociolaboral</p>
    <subunit>Cantidad</subunit>
    <p>Text Cantidad</p>
    <sub-subunit>riqueza</sub-subunit>
    <sub-subunit>paramso</sub-subunit>
    <sub-subunit>materia</sub-subunit>
    <sub-subunit>panorama a las perspectivas</sub-subunit>
    <subunit>Calidad</subunit>
    <sub-subunit>Polarizacisn</sub-subunit>
    <sub-subunit>La cara</sub-subunit>
    <sub-subunit>La cruz</sub-subunit>
    <sub-subunit>Precarizacisn</sub-subunit>
    <subunit>experiencia</subunit>
    <sub-subunit>Ejes</sub-subunit>
    <sub-subunit>Condiciones</sub-subunit>
    <sub-sub-subunit>2.3.2.1.</sub-sub-subunit>
    <sub-sub-subunit>2.3.2.2.</sub-sub-subunit>
    <sub-sub-subunit>2.3.2.3.</sub-sub-subunit>
    <subunit>paradigma</subunit>
    <sub-subunit>civilizacisn</sub-subunit>
    <sub-subunit>emplemsmo</sub-subunit>
    <sub-subunit>Agenda</sub-subunit>
  </unit>
</modul>
I have to convert it into a hierarchical XML structure inside each unit element, under these conditions:
- Other elements can appear between the heading elements; these elements belong to the nearest preceding heading element.
- The hierarchy is: unit, subunit, sub-subunit and sub-sub-subunit.
The result should be:
<modul>
  <unit id="1">
    <subunit>
      <title>Rupturas</title>
      <sub-subunit>
        <title>sistema </title>
      </sub-subunit>
      <sub-subunit>
        <title>incertidumbre</title>
      </sub-subunit>
    </subunit>
    <subunit>
      <title>Megatendencias</title>
      <sub-subunit>
        <title>Caracterizacisn</title>
        <sub-sub-subunit>
          <title>1.2.1.1.</title>
          <p>Text 1211</p>
          <param>Text 2 1211</param>
        </sub-sub-subunit>
        <sub-sub-subunit>
          <title>1.2.1.2.</title>
        </sub-sub-subunit>
        <sub-sub-subunit>
          <title>1.2.1.3.</title>
        </sub-sub-subunit>
      </sub-subunit>
      <sub-subunit>
        <title>Vectores</title>
        <sub-sub-subunit>
          <title>1.2.2.1.</title>
        </sub-sub-subunit>
        <sub-sub-subunit>
          <title>1.2.2.2.</title>
        </sub-sub-subunit>
        <sub-sub-subunit>
          <title>1.2.2.3.</title>
        </sub-sub-subunit>
      </sub-subunit>
    </subunit>
    <subunit>
      <title>Perspectivas</title>
      <sub-subunit>
        <title>Ideologmas</title>
        <sub-sub-subunit>
          <title>1.3.1.1.</title>
        </sub-sub-subunit>
        <sub-sub-subunit>
          <title>1.3.1.2.</title>
        </sub-sub-subunit>
      </sub-subunit>
      <sub-subunit>
        <title>controversia</title>
        <sub-sub-subunit>
          <title>1.3.2.1.</title>
        </sub-sub-subunit>
        <sub-sub-subunit>
          <title>1.3.2.2.</title>
        </sub-sub-subunit>
      </sub-subunit>
    </subunit>
  </unit>
  <unit id="2">
    <p>Desafmos sociolaboral</p>
    <subunit>
      <title>Cantidad</title>
      <p>Text Cantidad</p>
      <sub-subunit>
        <title>riqueza</title>
      </sub-subunit>
      <sub-subunit>
        <title>paramso</title>
      </sub-subunit>
      <sub-subunit>
        <title>materia</title>
      </sub-subunit>
      <sub-subunit>
        <title>panorama a las perspectivas</title>
      </sub-subunit>
    </subunit>
    <subunit>
      <title>Calidad</title>
      <sub-subunit>
        <title>Polarizacisn</title>
      </sub-subunit>
      <sub-subunit>
        <title>La cara</title>
      </sub-subunit>
      <sub-subunit>
        <title>La cruz</title>
      </sub-subunit>
      <sub-subunit>
        <title>Precarizacisn</title>
      </sub-subunit>
    </subunit>
    <subunit>
      <title>experiencia</title>
      <sub-subunit>
        <title>Ejes</title>
      </sub-subunit>
      <sub-subunit>
        <title>Condiciones</title>
        <sub-sub-subunit>
          <title>2.3.2.1.</title>
        </sub-sub-subunit>
        <sub-sub-subunit>
          <title>2.3.2.2.</title>
        </sub-sub-subunit>
        <sub-sub-subunit>
          <title>2.3.2.3.</title>
        </sub-sub-subunit>
      </sub-subunit>
    </subunit>
    <subunit>
      <title>paradigma</title>
      <sub-subunit>
        <title>civilizacisn</title>
      </sub-subunit>
      <sub-subunit>
        <title>emplemsmo</title>
      </sub-subunit>
      <sub-subunit>
        <title>Agenda</title>
      </sub-subunit>
    </subunit>
  </unit>
</modul>
<xsl:template match="modul">
  <xsl:copy>
    <xsl:copy-of select="@*"/>
    <xsl:apply-templates/>
  </xsl:copy>
</xsl:template>

<xsl:template match="unit">
  <xsl:copy>
    <xsl:copy-of select="@*"/>
    <xsl:call-template name="process-node">
      <xsl:with-param name="node-father" select="name()"/>
    </xsl:call-template>
  </xsl:copy>
</xsl:template>

<!-- Copy elements -->
<xsl:template match="*">
  <xsl:copy>
    <xsl:copy-of select="@*"/>
    <xsl:apply-templates/>
  </xsl:copy>
</xsl:template>
<!-- Copy sibling elements one by one until the element whose generate-id()
     equals $target is reached. This template is not called by the other
     templates in this message, and no template for mode "copia" is shown. -->
<xsl:template name="get-block">
  <xsl:param name="context" select="."/>
  <xsl:param name="target"/>

  <xsl:if test="generate-id($context)!=$target">
    <xsl:apply-templates select="$context" mode="copia"/>
    <xsl:variable name="next-element" select="$context/following-sibling::*[1]"/>
    <xsl:if test="$next-element">
      <xsl:call-template name="get-block">
        <xsl:with-param name="context" select="$next-element"/>
        <xsl:with-param name="target" select="$target"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:if>
</xsl:template>
<!-- Find a heading element (subunit, sub-subunit, ...) and wrap it
     together with the content that belongs to it -->
<xsl:template name="process-node">
  <xsl:param name="context" select="*[1]"/>
  <xsl:param name="node-father"/>

  <xsl:choose>
    <xsl:when test="$context[self::unit or self::subunit or self::sub-subunit or self::sub-sub-subunit]">
      <xsl:variable name="node-type" select="name($context)"/>
      <xsl:element name="{$node-type}">
        <title><xsl:value-of select="$context"/></title>
        <xsl:call-template name="generate-block">
          <xsl:with-param name="context" select="$context/following-sibling::*[1]"/>
          <xsl:with-param name="node-type" select="$node-type"/>
        </xsl:call-template>
      </xsl:element>

      <!-- Next heading of the same type among the following siblings -->
      <xsl:variable name="seguent-node" select="$context/following-sibling::*[name()=$node-type][1]"/>

      <!-- Element name of the parent level in the hierarchy -->
      <xsl:variable name="fathers-name">
        <xsl:call-template name="get-pare">
          <xsl:with-param name="unitat" select="$node-type"/>
        </xsl:call-template>
      </xsl:variable>

      <!-- Continue only if the next node has the same type and the same parent heading -->
      <xsl:if test="$seguent-node and name($seguent-node)=$node-type and (generate-id($seguent-node/preceding-sibling::*[name()=$fathers-name][1])=generate-id($context/preceding-sibling::*[name()=$fathers-name][1]))">
        <xsl:call-template name="process-node">
          <xsl:with-param name="context" select="$seguent-node"/>
        </xsl:call-template>
      </xsl:if>
    </xsl:when>
    <xsl:otherwise>
      <!-- Not a heading: copy it and move on to the next sibling -->
      <xsl:apply-templates select="$context"/>
      <xsl:if test="$context/following-sibling::*">
        <xsl:call-template name="process-node">
          <xsl:with-param name="context" select="$context/following-sibling::*[1]"/>
        </xsl:call-template>
      </xsl:if>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
<!-- Output the content that belongs to a heading of type $node-type -->
<xsl:template name="generate-block">
  <xsl:param name="context"/>
  <xsl:param name="node-type"/>

  <xsl:if test="$context">
    <!-- Where does processing stop? At the next heading of the same or a higher level. -->
    <xsl:variable name="pares">
      <xsl:call-template name="get-ordre-unitat">
        <xsl:with-param name="unitat" select="$node-type"/>
      </xsl:call-template>
    </xsl:variable>
    <xsl:variable name="node-limit" select="contains($pares,concat('*',name($context),'*'))"/>

    <xsl:if test="not($node-limit)">
      <xsl:choose>
        <xsl:when test="$context[self::unit or self::subunit or self::sub-subunit or self::sub-sub-subunit]">
          <!-- A lower-level heading: process-node handles it and its same-level siblings -->
          <xsl:call-template name="process-node">
            <xsl:with-param name="context" select="$context"/>
          </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
          <!-- Ordinary content: copy it and continue with the next sibling -->
          <xsl:apply-templates select="$context"/>
          <xsl:call-template name="generate-block">
            <xsl:with-param name="context" select="$context/following-sibling::*[1]"/>
            <xsl:with-param name="node-type" select="$node-type"/>
          </xsl:call-template>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:if>
  </xsl:if>
</xsl:template>
<!-- Set the hierarchical order: return the given level and all levels
     above it as a '*'-delimited list -->
<xsl:template name="get-ordre-unitat">
  <xsl:param name="unitat"/>

  <xsl:choose>
    <xsl:when test="$unitat='unit'">
      <xsl:value-of select="'*unit*'"/>
    </xsl:when>
    <xsl:when test="$unitat='subunit'">
      <xsl:value-of select="'*unit*subunit*'"/>
    </xsl:when>
    <xsl:when test="$unitat='sub-subunit'">
      <xsl:value-of select="'*unit*subunit*sub-subunit*'"/>
    </xsl:when>
    <xsl:when test="$unitat='sub-sub-subunit'">
      <xsl:value-of select="'*unit*subunit*sub-subunit*sub-sub-subunit*'"/>
    </xsl:when>
  </xsl:choose>
</xsl:template>
<!-- Return the name of the parent level -->
<xsl:template name="get-pare">
  <xsl:param name="unitat"/>

  <xsl:choose>
    <xsl:when test="$unitat='unit'">
      <xsl:value-of select="''"/>
    </xsl:when>
    <xsl:when test="$unitat='subunit'">
      <xsl:value-of select="'unit'"/>
    </xsl:when>
    <xsl:when test="$unitat='sub-subunit'">
      <xsl:value-of select="'subunit'"/>
    </xsl:when>
    <xsl:when test="$unitat='sub-sub-subunit'">
      <xsl:value-of select="'sub-subunit'"/>
    </xsl:when>
  </xsl:choose>
</xsl:template>
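The templates above are posted without an enclosing stylesheet element. To try them, they would need a wrapper roughly like the following. This is only a sketch: it assumes XSLT 1.0 (the message does not state a version), and the xsl:output settings are an assumption, not part of the original message.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: indented XML output is wanted -->
  <xsl:output method="xml" indent="yes"/>

  <!-- Paste the templates from this message here, in any order:
       modul, unit, *, get-block, process-node, generate-block,
       get-ordre-unitat, get-pare -->

</xsl:stylesheet>
```

Template order does not matter here because the named templates are looked up by name and the match templates are selected by pattern priority.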
Dear XSLT-Community,
I have a problem with a "strange" type of data which I have to convert to a hierarchical XML structure.

My source is a huge XML file which represents a decimal classification. It contains so-called documents, where each document represents one node of the classification. Furthermore, each document also lists the parent nodes of its node. It's a structure like this (example taken from http://www.udcc.org):

...
<document>
  <tag1>3</tag1>
  <tag1a>Social Sciences</tag1a>
</document>
<document>
  <tag1>3</tag1>
  <tag1a>Social Sciences</tag1a>
  <tag2>32</tag2>
  <tag2a>Politics</tag2a>
</document>
<document>
  <tag1>3</tag1>
  <tag1a>Social Sciences</tag1a>
  <tag2>32</tag2>
  <tag2a>Politics</tag2a>
  <tag3>326</tag3>
  <tag3a>Slavery</tag3a>
</document>
...

As you can see, there is no hierarchical information in it apart from the names and the sequence of the tags. My real data has up to 9 levels, though not in every document. My result should look like this (or something similar):

...
<node id="3" name="Social Sciences">
  <node id="32" name="Politics">
    <node id="326" name="Slavery"/>
  </node>
</node>
...

I simply have no idea where to start to achieve this result. I guess the first step would be to get rid of all the redundant content, but I don't know how. And I can't figure out how to build the hierarchical structure at the same time.
Does anyone have a good starting point for this?
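One possible starting point for the question above is an XSLT 1.0 key that indexes each document by the id of its parent node. This is a minimal sketch under several assumptions that the question does not confirm: every document holds strictly alternating tagN/tagNa pairs in ascending order, every node id is unique across the whole classification, each node has exactly one document, and all document elements are children of a single root element. The wrapper element name classification is hypothetical.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Index every document by the id of its parent node.
       In a document, *[last() - 1] is the node's own id, *[last()] is its
       own name, and *[last() - 3] is the id of its parent. Top-level
       documents have only two children, so they get no key value. -->
  <xsl:key name="children" match="document" use="*[last() - 3]"/>

  <!-- Assumption: the documents share one root element, whose name is
       not shown in the question. -->
  <xsl:template match="/*">
    <classification> <!-- hypothetical wrapper name -->
      <!-- Start from the top-level nodes: documents with a single pair -->
      <xsl:apply-templates select="document[count(*) = 2]"/>
    </classification>
  </xsl:template>

  <!-- Emit one node per document, then recurse into the documents
       whose parent id equals this node's id -->
  <xsl:template match="document">
    <node id="{*[last() - 1]}" name="{*[last()]}">
      <xsl:apply-templates select="key('children', *[last() - 1])"/>
    </node>
  </xsl:template>
</xsl:stylesheet>
```

Because each node is built from the last pair of its own document only, the redundant ancestor pairs never need to be removed in a separate step; they are used once, as the lookup value for the key.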