Fwd: [xsl] Grouping By Column Heading (by Position, not by ID or element name)

Subject: Fwd: [xsl] Grouping By Column Heading (by Position, not by ID or element name)
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxx>
Date: Thu, 1 May 2014 03:30:03 -0000
Due to Internet weather perturbations related to ongoing list
maintenance, I forward the following, which may not have gone out or
not gone out to everyone -- with apologies for any duplication --

---------- Forwarded message ----------
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxx>
Date: Wed, Apr 30, 2014 at 11:10 AM
Subject: Re: [xsl] Grouping By Column Heading (by Position, not by ID
or element name)
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx


Hi Ted,

To review, here's your key declaration:

<xsl:key name="food-group" match="//table/tr[position() &gt; 1]"
use="ancestor-or-self::table/tr[position() = 1]/td" />

Rewritten and renamed (for clarity and concision):

<xsl:key name="tr-by-header" match="table/tr[position() &gt; 1]"
   use="ancestor-or-self::table/tr[1]/td" />

This does show me I was a bit hurried in my answer yesterday, and
didn't see a problem, namely that all the td elements in the first row
are used as key values for all the other rows. Just as you said in
your followup. (Sorry!)

This is why you need the logic here:

<xsl:for-each select="key('food-group', .)">
   <xsl:element name="food">
      <xsl:value-of select="td[position() = $i]" />
   </xsl:element>
</xsl:for-each>

The for-each is selecting all the rows, since all rows come back for
the value of any td in the first row. That td[position() = $i] is what
is selecting the correct td in each row.

In other words, the key isn't actually being used effectively: you
might as well have selected the data rows directly (say, for-each
select="../following-sibling::tr", if the context node is a td in the
first row) and skipped the key.

This doesn't mean that what you are trying to do is unreasonable (key
the td elements to the value of their column header, i.e. the td in
the same position in the first row). Or at least, it's not prima facie
a bad idea, even if it turns out we don't need it.

Note this means retrieving the td elements with the key, not the tr
elements. It also means that the logic of the key has to account for
the position of the td in the tr -- since that's what actually
(implicitly) identifies the column for each td.

So we need something like:

<xsl:key name="td-by-column-no"
  match="table/tr[position() &gt; 1]/td"
  use="count(. | preceding-sibling::td)" />

This assigns a key value representing the column number (not the
header) for each td. (Note it assumes the table is "square", i.e. no
column or row spans that throw off this count.) Of course this is what
you were already doing by using $i to select your td; only now the
logic is in the key.
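
To make that concrete, here is a small made-up input annotated with the
key value each td would receive. Only the header names 'Fruits' and
'Vegetables' come from this thread; the other cell values are invented
for illustration.

```xml
<!-- Hypothetical input; the data cell values are assumptions. -->
<table>
  <!-- Header row: tr[position() > 1] excludes it, so nothing is indexed. -->
  <tr><td>Fruits</td><td>Vegetables</td></tr>
  <!-- Data rows: each td is indexed by its column number. -->
  <tr><td>Apple</td><td>Carrot</td></tr>  <!-- key values 1 and 2 -->
  <tr><td>Pear</td><td>Leek</td></tr>     <!-- key values 1 and 2 -->
</table>
<!-- key('td-by-column-no', 2) would return the Carrot and Leek cells. -->
```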

Then we can do:

<xsl:template match="td"><!-- applied only to the tr[1]/td header cells -->
  <xsl:variable name="i" select="count(. | preceding-sibling::td)" />
  <xsl:element name="foodGroup">
    <xsl:attribute name="name"><xsl:value-of select="." /></xsl:attribute>
    <xsl:for-each select="key('td-by-column-no', $i)">
      <xsl:element name="food">
        <xsl:value-of select="." />
      </xsl:element>
    </xsl:for-each>
  </xsl:element>
</xsl:template>
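
For reference, a minimal sketch of a complete XSLT 1.0 stylesheet built
around this key and template might look as follows. The template
matching "table" and the foodGroups wrapper element are assumptions
added here so that only the header cells reach the td template; they
are not from the original stylesheet. Because keys are document-wide,
the sketch also assumes a single table per input document (with several
tables, column 1 of every table would share the key value 1).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes" />

  <!-- Index every non-header cell by its column number. -->
  <xsl:key name="td-by-column-no"
    match="table/tr[position() &gt; 1]/td"
    use="count(. | preceding-sibling::td)" />

  <!-- Assumed driver: push only the header cells through the td template. -->
  <xsl:template match="table">
    <foodGroups><!-- hypothetical wrapper element -->
      <xsl:apply-templates select="tr[1]/td" />
    </foodGroups>
  </xsl:template>

  <!-- One foodGroup per header cell, filled from the matching column. -->
  <xsl:template match="td">
    <xsl:variable name="i" select="count(. | preceding-sibling::td)" />
    <foodGroup name="{.}">
      <xsl:for-each select="key('td-by-column-no', $i)">
        <food>
          <xsl:value-of select="." />
        </food>
      </xsl:for-each>
    </foodGroup>
  </xsl:template>

</xsl:stylesheet>
```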

Now of course this doesn't do quite what you want, internally -- it
keys each td to its column number, not to the value of its header.

This can be done in XSLT 2.0 like this:

<xsl:key name="td-by-column-header"
   match="table/tr[position() &gt; 1]/td"
   use="ancestor::table/tr[1]/td[position() eq
        count(current()/(. | preceding-sibling::td))]" />

It's doubly tricky, since the position of the td among its siblings is
still used to line up the td elements into their columns; but the
indexing is done inside the key declaration, so only the values of the
column header td elements (tr[1]/td) are exposed.

We can't use the current() function in the key declaration under 1.0,
unfortunately.

Then we no longer need the $i index in our template (since it's built
into the key):

<xsl:template match="td"><!-- remember only the tr[1]/td are being processed -->
  <xsl:element name="foodGroup">
    <xsl:attribute name="name"><xsl:value-of select="." /></xsl:attribute>
    <xsl:for-each select="key('td-by-column-header', .)">
      <xsl:element name="food">
        <xsl:value-of select="." />
      </xsl:element>
    </xsl:for-each>
  </xsl:element>
</xsl:template>

Or (easier on my eyes):

<xsl:template match="td"><!-- remember only the tr[1]/td are being processed -->
  <foodGroup name="{.}">
    <xsl:for-each select="key('td-by-column-header', .)">
      <food>
        <xsl:value-of select="." />
      </food>
    </xsl:for-each>
  </foodGroup>
</xsl:template>

So: why would we do this? I don't know. As I said this is really a
grouping problem. But I can imagine a situation where you're not
simply trying to pivot a table; rather, you want to be able to grab
all the vegetables listed in the table with the key 'Vegetables', and
this would do that.
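
As a rough sketch of that use case, assuming XSLT 2.0 and a header cell
whose string value is exactly 'Vegetables' (the lookup is a plain string
comparison, so stray whitespace in the header would defeat it), the
lookup could be written like this. The output element names vegetables
and vegetable are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes" />

  <!-- The td-by-column-header key from above: each data cell is indexed
       by the header cell at the same position in the first row. -->
  <xsl:key name="td-by-column-header"
    match="table/tr[position() &gt; 1]/td"
    use="ancestor::table/tr[1]/td[position() eq
         count(current()/(. | preceding-sibling::td))]" />

  <xsl:template match="/">
    <!-- Hypothetical output vocabulary. -->
    <vegetables>
      <!-- Fetch every data cell filed under the 'Vegetables' header. -->
      <xsl:for-each select="key('td-by-column-header', 'Vegetables')">
        <vegetable>
          <xsl:value-of select="." />
        </vegetable>
      </xsl:for-each>
    </vegetables>
  </xsl:template>

</xsl:stylesheet>
```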

I hope this helps. Sorry for my haste yesterday.

Cheers, Wendell

On Tue, Apr 29, 2014 at 4:22 PM, G. T. Stresen-Reuter
<tedmasterweb@xxxxxxxxx> wrote:
> Hi Wendell,
>
> Thanks for the feedback. Please see my follow-up questions below.
>
> On Apr 29, 2014, at 7:56 PM, Wendell Piez <wapiez@xxxxxxxxxxxxxxx> wrote:
>
>> On Tue, Apr 29, 2014 at 10:50 AM, G. T. Stresen-Reuter
>> <tedmasterweb@xxxxxxxxx> wrote:
>>> 1) How is a TD element (found in the USE attribute of my Key) able to act as a unique identifier for a TR element (its parent)?
>>
>> Actually it isn't, as you would find if you had two "Fruits" columns,
>> for example.
>>
>> As you have it declared, the key is the (string) value of the tr/td[1]
>> element; it serves as a unique identifier iff that value is unique.
>
> Understood, but what is confusing for me is that each key is actually matching the same rows (and indeed this seems to be the case).
>
> Ultimately, what I'd *like* to do would be to make a key that matches ONLY the TD elements right below it, but I'm having a devil of a time figuring out how to do that and I suspect I still don't have any real mastery of keys to begin with. :-(
>
>>
>>> 2) How should this be done (assuming mine is not optimal)?
>>
>> This method is fine if you can control your columns so that their top
>> cell's values are always unique.
>>
>> It's not even a bad way to go if limited to 1.0 logic.
>>
>> In 2.0 I would probably just group on the position of the items.
>
> Unfortunately I am stuck using 1.0. I suppose that if I couldn't control the top row's values that I could do some sort of positioning filtering to get things to match as needed, but that's an implementation detail. My real question is stated above.
>
>>
>>> Please note that I realize this does not require using keys (I don't think) but I like using keys because it makes sense semantically.
>>
>> Indeed it does -- even in the sense that it holds up only as long as
>> your semantics do. :-)
>
> LOL, funny. Maybe semantically wasn't the right term. What I mean is, it makes sense to me to start by splitting things up into nuggets of like data that can be easily processed in order - pretty common need, I suspect.
>
> Thanks again for the feedback.
>
> Ted
>
>
> 
>


-- 
Wendell Piez | http://www.wendellpiez.com
XML | XSLT | electronic publishing
Eat Your Vegetables
_____oo_________o_o___ooooo____ooooooo_^
