Re: [xsl] reproducing the hierarchical structure of a subset of nodes from a document

Subject: Re: [xsl] reproducing the hierarchical structure of a subset of nodes from a document
From: Michael Kay <mike@xxxxxxxxxxxx>
Date: Fri, 10 Jun 2011 15:50:39 +0100
Well, I'll forget that you said XSLT 1.0. Wash your mouth out.

In XSLT 2.0, it's a recursive grouping problem (with the depth of recursion equal to the maximum depth of the original document).

First group the paths by their first token, that is, the first step of each path: the grouping keys define the elements at the first level of the hierarchy. Within the for-each-group, strip off the first token from each path and recurse until no paths remain. Something like this:


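<xsl:function name="f:group">
  <!-- $paths: path strings without the leading "/", e.g. "root[1]/foo[1]/bar[1]" -->
  <xsl:param name="paths"/>
  <!-- Group by the first step; paths that are already empty end the recursion -->
  <xsl:for-each-group select="$paths[. ne '']" group-by="tokenize(., '/')[1]">
    <!-- Drop the positional predicate ("foo[1]" becomes "foo") to get a valid element name -->
    <xsl:element name="{replace(current-grouping-key(), '\[.*$', '')}">
      <!-- Strip the first step from each path in the group and recurse -->
      <xsl:sequence select="f:group(for $p in current-group() return substring-after($p, '/'))"/>
    </xsl:element>
  </xsl:for-each-group>
</xsl:function>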
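
To make the recursion easier to trace, here is a rough sketch of the same group-by-first-step idea in Python rather than XSLT. It is only an illustration: the names group and to_xml are made up for this example, the paths are assumed to have their leading "/" removed already, and positional predicates such as [1] are dropped when element names are written out. As with the function above, it rebuilds only the element skeleton, not the text of the matched nodes.

```python
import re


def group(paths):
    """Group relative paths by their first step and recurse on the remainders.

    Returns a list of (element_name, children) tuples, where children has the
    same shape. Groups appear in order of first occurrence.
    """
    groups = {}  # first step -> remainders of the paths that start with it
    for path in paths:
        if not path:
            continue  # an exhausted path contributes nothing further
        first, _, rest = path.partition("/")
        groups.setdefault(first, []).append(rest)
    # Drop the positional predicate ("foo[1]" -> "foo") for the element name.
    return [(re.sub(r"\[.*$", "", key), group(rests))
            for key, rests in groups.items()]


def to_xml(tree):
    """Serialize the nested (name, children) tuples as an XML string."""
    parts = []
    for name, children in tree:
        if children:
            parts.append(f"<{name}>{to_xml(children)}</{name}>")
        else:
            parts.append(f"<{name}/>")
    return "".join(parts)


# The five context paths from the question, leading "/" removed.
paths = [
    "root[1]/foo[1]/bar[1]",
    "root[1]/foo[1]/foo[1]/bar[1]",
    "root[1]/foo[1]/foo[2]/bar[1]",
    "root[1]/foo[2]/bar[1]",
    "root[1]/foo[2]/foo[1]/foo[1]/bar[1]",
]
print(to_xml(group(paths)))
```

Grouping on the full step (including its predicate) and only stripping the predicate when naming the element is what keeps foo[1] and foo[2] apart as separate siblings.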


Michael Kay
Saxonica

On 10/06/2011 15:26, trubliphone wrote:
Hello,

I have an algorithmic problem I haven't been able to solve.  I was
hoping somebody on this list could offer me some advice.

I do have a solution in pure XQuery, but that requires recursion
through a potentially massive XML document which is too inefficient
for production use.  So, I am trying to come up with another way and I
wondered if XSL might do the trick.

Suppose I have some arbitrary XML file:

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <foo>
        <bar>one</bar>
        <foo>
            <bar>two</bar>
        </foo>
        <foo>
            <bar>three</bar>
        </foo>
    </foo>
    <foo>
        <bar>four</bar>
        <foo>
            <foo>
                <bar>five</bar>
            </foo>
        </foo>
    </foo>
</root>

Now, suppose there is a user-provided XPath expression to find
particular nodes in that file:

$query := "//foo/bar"

I understand that I cannot, in pure XSLT v1.0, easily evaluate that
string against the document and return the desired nodes.  That's
okay, I can do it in other languages.

After evaluating that string, I wind up with the following node sequence:

(<bar>one</bar>,<bar>two</bar>,<bar>three</bar>,<bar>four</bar>,
<bar>five</bar>)

But I need to recreate the original hierarchical structure of those
nodes.  So what I really want is this:

<bar>one
  <bar>two</bar>
  <bar>three</bar>
</bar>
<bar>four
  <bar>five</bar>
</bar>

To help, I can get the "context path" of each node as follows:

one:   /root[1]/foo[1]/bar[1]
two:   /root[1]/foo[1]/foo[1]/bar[1]
three: /root[1]/foo[1]/foo[2]/bar[1]
four:  /root[1]/foo[2]/bar[1]
five:  /root[1]/foo[2]/foo[1]/foo[1]/bar[1]

So I have the following sequence to work with that I can run an XSL template on:

(
  <node cp="/root[1]/foo[1]/bar[1]"><bar>one</bar></node>,
  <node cp="/root[1]/foo[1]/foo[1]/bar[1]"><bar>two</bar></node>,
  <node cp="/root[1]/foo[1]/foo[2]/bar[1]"><bar>three</bar></node>,
  <node cp="/root[1]/foo[2]/bar[1]"><bar>four</bar></node>,
  <node cp="/root[1]/foo[2]/foo[1]/foo[1]/bar[1]"><bar>five</bar></node>
)

My question is how to turn that into a tree that recreates the
original hierarchical structure?

Many thanks for your help.
