Re: [xsl] XSLT 2.0 multi-level grouping challenge/problem

Subject: Re: [xsl] XSLT 2.0 multi-level grouping challenge/problem
From: Abel Braaksma <abel.online@xxxxxxxxx>
Date: Thu, 24 May 2007 22:31:14 +0200
Hi Charles,

I see that you already received some answers, but this afternoon I was trying two different approaches and, even though you have resolved your issue by now, I thought I'd share my solutions with you anyway ;)

Method 1: a variant on tree traversal and grouping, pretty easy to follow; uses xsl:next-match, xsl:perform-sort and xsl:for-each-group

Method 2: an exercise in being concise, (almost) all in one XPath expression. It does not help readability, though, and is basically the same as Andrew's solution

Both methods output exactly the same content. I did the exercise because I hoped the XPath solution would turn out clearer than the XSLT solution, but looking at it now, it reminds me of my imperative programming days ;)


Cheers, -- Abel Braaksma


<!-- METHOD 1 -->


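<xsl:template match="/">
<!-- Sorted copies of the donors; in this temporary tree they are siblings in sorted order -->
<xsl:variable name="donors">
<xsl:perform-sort select="donors/donor" >
<xsl:sort select="state" />
<xsl:sort select="city" />
<xsl:sort select="organ" />
</xsl:perform-sort>
</xsl:variable>
<!-- One group per run of donors with the same state, city and organ -->
<xsl:for-each-group select="$donors/*" group-adjacent="concat(state, '|', city, '|', organ)">
<!-- Emits the state and/or city row when this group starts a new state or city -->
<xsl:apply-templates select="current-group()[1]" />
<tr>
<td /><td />
<td><xsl:value-of select="organ" /></td>
<td><xsl:value-of select="count(current-group())" /></td>
</tr>
</xsl:for-each-group>
</xsl:template>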


<xsl:template match="donor[(preceding-sibling::donor[1]/state, '')[1] != state]" priority="1">
<tr>
<td><xsl:value-of select="state" /></td>
<td /><td /><td />
</tr>
<xsl:next-match />
</xsl:template>


<xsl:template match="donor[(preceding-sibling::donor[1]/concat(state, '|', city), '')[1] != concat(state, '|', city)]">
<tr>
<td />
<td><xsl:value-of select="city" /></td>
<td /><td />
</tr>
</xsl:template>


<xsl:template match="donor" />
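
A note on the (..., '')[1] idiom used in the match patterns above: in XPath 2.0 a general comparison with an empty sequence on one side is always false, so without the '' fallback the very first donor (which has no preceding sibling) would not match, and its state and city rows would be missing. The following stylesheet is only a minimal illustration of that behaviour, not part of either method; the variable name "none" is made up for the example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" />

  <xsl:template match="/" name="main">
    <!-- Hypothetical stand-in for preceding-sibling::donor[1]/state on the first donor -->
    <xsl:variable name="none" select="()" />
    <!-- Empty sequence on one side: the general comparison is false -->
    <xsl:value-of select="$none != 'Ohio'" />
    <xsl:text>&#10;</xsl:text>
    <!-- With the fallback, '' is compared instead: the comparison is true -->
    <xsl:value-of select="($none, '')[1] != 'Ohio'" />
  </xsl:template>
</xsl:stylesheet>
```

Run against any input document (or with "main" as the initial template), this should print "false" on the first line and "true" on the second.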





<!-- METHOD 2: terseness -->

<xsl:template match="/" name="main">
<xsl:variable name="donors" select="donors/donor" />
<!-- One nested 'for' per level; each f:makerow call produces one <tr>.
     Note: distinct-values() returns its result in an implementation-dependent
     order, so the sorted order asked for in the question is not guaranteed here. -->
<xsl:copy-of select="
  for $state in distinct-values($donors/state) return (
    f:makerow($state),
    for $city in distinct-values($donors[state = $state]/city) return (
      f:makerow(('', $city)),
      for $organ in distinct-values($donors[state = $state and city = $city]/organ) return
        f:makerow(('', '', $organ,
                   count($donors[state = $state and city = $city and organ = $organ])))))" />
</xsl:template>

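   <!-- Builds one four-cell row: the values are padded with empty strings and
        cut off after the fourth item. The f prefix must be bound to a namespace
        of your choice on xsl:stylesheet. -->
   <xsl:function name="f:makerow">
       <xsl:param name="values" as="item()*" />
       <tr>
           <xsl:for-each select="($values, '', '', '')[position() le 4]">
               <td><xsl:sequence select="." /></td>
           </xsl:for-each>
       </tr>
   </xsl:function>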
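
For comparison, here is a minimal sketch of the same report written with nested xsl:for-each-group and group-by, which is probably what the question had in mind. It is not one of the two methods above, and it assumes the input looks like the donors document quoted below (once well-formed). The surrounding table element and the indent setting are assumptions added for the example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" />

  <xsl:template match="/donors">
    <table>
      <!-- Outer level: one group per state, sorted by state -->
      <xsl:for-each-group select="donor" group-by="state">
        <xsl:sort select="state" />
        <tr>
          <td><xsl:value-of select="current-grouping-key()" /></td>
          <td /><td /><td />
        </tr>
        <!-- Middle level: the cities within the current state -->
        <xsl:for-each-group select="current-group()" group-by="city">
          <xsl:sort select="city" />
          <tr>
            <td />
            <td><xsl:value-of select="current-grouping-key()" /></td>
            <td /><td />
          </tr>
          <!-- Inner level: the organs within the current city, with their count -->
          <xsl:for-each-group select="current-group()" group-by="organ">
            <xsl:sort select="organ" />
            <tr>
              <td /><td />
              <td><xsl:value-of select="current-grouping-key()" /></td>
              <td><xsl:value-of select="count(current-group())" /></td>
            </tr>
          </xsl:for-each-group>
        </xsl:for-each-group>
      </xsl:for-each-group>
    </table>
  </xsl:template>
</xsl:stylesheet>
```

Each inner xsl:for-each-group only sees the members of the enclosing group, so a city name that occurs in two states is still counted separately per state.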




cknell@xxxxxxxxxx wrote:
Consider this multi-level grouping problem for XSLT 2.0.

I think I should be able to do this with for-each-group, but I am so far frustrated. I usually find the most baroque way of accomplishing a programming task when I first start, only later learning how to pare it down.

I am told that Blaise Pascal once began a letter to a friend saying, "Please excuse this long letter. I didn't have time to write a short one."

The input document is a list of organ donors.
The donor element lists the city and state of the donor and the organ to be donated.

The output should be a table of four columns.

The table should have a new row when the value of the <state/> element changes with the value of the state appearing in column one.

The table should have a new row when the value of the <city/> element changes with the value of the city appearing in column two.

The table should have a new row when the value of the <organ/> element changes, listing the organ in column three and the count of all organs of that type in the current city and the current state in column four.

The rows of the table should be sorted by state, city, and organ.

I've been puzzling over this for several hours. I think the correct approach is to create an in-memory XML document containing one element for each state, a second in-memory XML document containing an element for each city and state combination, and a third in-memory document for each city, state, and organ combination. I would then process the successive rows in the states table, using the values as a key into the cities table, using the values as a key into the organs table, ... well, I kind of lose track here.

Surely one of you has done something like this. Would you be so kind as to explain how it's done?

Thank you.

<states>
  <state></state>
</states>

<cities>
  <city></city>
  <state></state>
</cities>

<organs>
  <organ></organ>
  <city></city>
  <state></state>
</organs>

<?xml version="1.0" encoding="windows-1252"?>
<donors>
  <donor>
    <city>Pittsburgh</city>
    <state>Pennsylvania</state>
    <organ>liver</organ>
  </donor>
  <donor>
    <city>Pittsburgh</city>
    <state>Pennsylvania</state>
    <organ>liver</organ>
  </donor>
  <donor>
    <city>Pittsburgh</city>
    <state>Pennsylvania</state>
    <organ>lungs</organ>
  </donor>
  <donor>
    <city>Pittsburgh</city>
    <state>Pennsylvania</state>
    <organ>lungs</organ>
  </donor>
  <donor>
    <city>Pittsburgh</city>
    <state>Pennsylvania</state>
    <organ>kidney</organ>
  </donor>
  <donor>
    <city>Akron</city>
    <state>Ohio</state>
    <organ>kidney</organ>
  </donor>
  <donor>
    <city>Akron</city>
    <state>Ohio</state>
    <organ>kidney</organ>
  </donor>
  <donor>
    <city>Columbus</city>
    <state>Ohio</state>
    <organ>lungs</organ>
  </donor>
  <donor>
    <city>Columbus</city>
    <state>Ohio</state>
    <organ>lungs</organ>
  </donor>
  <donor>
    <city>Columbus</city>
    <state>Ohio</state>
    <organ>lungs</organ>
  </donor>
  <donor>
    <city>Cleveland</city>
    <state>Ohio</state>
    <organ>liver</organ>
  </donor>
  <donor>
    <city>Cleveland</city>
    <state>Ohio</state>
    <organ>heart</organ>
  </donor>
</donors>

<table>
  <row> <!-- Header Row -->
    <td>State</td>
    <td>City</td>
    <td>Organ</td>
    <td>Count</td>
  </row>
  <row> <!-- First appearance of "Ohio" -->
    <td>Ohio</td>
    <td></td>
    <td></td>
    <td></td>
  </row>
  <row> <!-- First appearance of "Akron" in "Ohio" -->
    <td></td>
    <td>Akron</td>
    <td></td>
    <td></td>
  </row>
  <row> <!-- First appearance of "kidney" in "Akron" in "Ohio" -->
    <td></td>
    <td></td>
    <td>kidney</td>
    <td>2</td> <!-- count of all kidneys from Akron, Ohio -->
  </row>
  <row> <!-- First appearance of "Cleveland" in "Ohio" -->
    <td></td>
    <td>Cleveland</td>
    <td></td>
    <td></td>
  </row>
  <row> <!-- First appearance of "heart" in "Cleveland" in "Ohio" -->
    <td></td>
    <td></td>
    <td>heart</td>
    <td>1</td> <!-- count of all hearts from Cleveland, Ohio -->
  </row>
  <row> <!-- First appearance of "liver" in "Cleveland" in "Ohio" -->
    <td></td>
    <td></td>
    <td>liver</td>
    <td>1</td> <!-- count of all livers from Cleveland, Ohio -->
  </row>
  <row> <!-- First appearance of "Columbus" in "Ohio" -->
    <td></td>
    <td>Columbus</td>
    <td></td>
    <td></td>
  </row>
  <row> <!-- First appearance of "lungs" in "Columbus" in "Ohio" -->
    <td></td>
    <td></td>
    <td>lungs</td>
    <td>3</td> <!-- count of all lungs from Columbus, Ohio -->
  </row>
  <row> <!-- First appearance of "Pennsylvania" -->
    <td>Pennsylvania</td>
    <td></td>
    <td></td>
    <td></td>
  </row>
  <row> <!-- First appearance of "Pittsburgh" in "Pennsylvania" -->
    <td></td>
    <td>Pittsburgh</td>
    <td></td>
    <td></td>
  </row>
  <row> <!-- First appearance of "kidney" in "Pittsburgh" in "Pennsylvania" -->
    <td></td>
    <td></td>
    <td>kidney</td>
    <td>1</td> <!-- count of all kidneys from Pittsburgh, Pennsylvania -->
  </row>
  <row> <!-- First appearance of "liver" in "Pittsburgh" in "Pennsylvania" -->
    <td></td>
    <td></td>
    <td>liver</td>
    <td>2</td> <!-- count of all livers from Pittsburgh, Pennsylvania -->
  </row>
  <row> <!-- First appearance of "lungs" in "Pittsburgh" in "Pennsylvania" -->
    <td></td>
    <td></td>
    <td>lungs</td>
    <td>2</td> <!-- count of all lungs from Pittsburgh, Pennsylvania -->
  </row>
</table>
