Re: [xsl] How to properly use Key elements

Subject: Re: [xsl] How to properly use Key elements
From: Emmanuel Bégué <medusis@xxxxxxxxx>
Date: Wed, 16 Oct 2013 12:54:10 +0200
On Wed, Oct 16, 2013 at 10:16 AM, Michael Kay <mike@xxxxxxxxxxxx> wrote:
>
> Your solution is likely to have quadratic performance on most implementations

It depends on the structure of the input (which we don't know); if all
ships are in the same table then yes of course, using keys is
necessary (as suggested in my previous message).

But if there is one table per ship, the no-key solution is actually
faster. Tested with 3000 tables and one ship per table (~21 MB input
file), with Saxon 6.5.5, average of 5 runs:
- with key: 3708 ms
- without key: 2949 ms

The difference is probably the time needed to build the key; using it
yields no performance advantage in this case, because
following-sibling::tr only goes to the end of the current table and
doesn't scan the whole input every time.
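
For reference, the no-key stylesheet from the earlier message is not
repeated here; a minimal sketch of what such a lookup could look like
follows. It assumes the same layout as the keyed version (a tr with more
than 3 td starts a ship, the following shorter rows are its ports), and
the select expression is my reconstruction, not necessarily what was
timed above.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes" encoding="UTF-8"/>

<!-- Skip table headers and stray text; walk through everything else. -->
<xsl:template match="thead|text()"/>
<xsl:template match="*">
  <xsl:apply-templates/>
</xsl:template>

<!-- Assumption: a row with more than 3 cells starts a new ship. -->
<xsl:template match="tr[count(td) &gt; 3]">
  <xsl:variable name="shipId" select="generate-id(.)"/>
  <xsl:element name="{td[1]}">
    <!-- No key: look only at the later sibling rows of this table and keep
         the non-ship rows whose nearest preceding ship row is this one. -->
    <xsl:apply-templates mode="port"
      select=". | following-sibling::tr[not(count(td) &gt; 3)]
              [generate-id(preceding-sibling::tr[count(td) &gt; 3][1]) = $shipId]"/>
  </xsl:element>
</xsl:template>

<!-- The second-to-last cell is the port name, the last one its value. -->
<xsl:template match="tr" mode="port">
  <xsl:for-each select="td[last() - 1]">
    <port name="{.}">
      <xsl:value-of select="following-sibling::td[1]"/>
    </port>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

With one ship per table each scan is bounded by the table, which would
explain the timings above; with many ships in one table every ship row
scans all the later rows, which is the quadratic case Michael describes.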

- - -

There was one problem in the last solution provided (the one with a
key), though: using the ship name as the key value breaks if two ships
have the exact same name, so it's better to key on a (unique) id for
the tr containing the ship info.
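
To make the name clash concrete, here is a hypothetical input fragment
(the cell layout and all values are made up for illustration, they are
not the original poster's data) with two distinct ships that share a
name:

```xml
<!-- Hypothetical input: two different ships, both named "Aurora". -->
<root>
  <table>
    <tr><td>Aurora</td><td>FR</td><td>Brest</td><td>2013-10-01</td></tr>
    <tr><td>Lorient</td><td>2013-10-04</td></tr>
  </table>
  <table>
    <tr><td>Aurora</td><td>NO</td><td>Bergen</td><td>2013-10-02</td></tr>
    <tr><td>Oslo</td><td>2013-10-06</td></tr>
  </table>
</root>
```

Keys are document-wide, so a key whose value is the ship name would
return both the Lorient and the Oslo rows for either ship, since both
map to the string "Aurora". Keyed on generate-id() of the ship's tr,
each lookup only returns the rows that follow that particular tr.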

Here's the updated solution:

<?xml version="1.0" ?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  exclude-result-prefixes="xsl">

<xsl:output method="xml" indent="yes" encoding="UTF-8"/>

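<!-- index only the non-ship rows, so that when several ships share a
table the next ship's own row is not returned as a port row of the
previous ship -->
<xsl:key name="portsByShip" match="tr[not(count(td) &gt; 3)]"
use="generate-id(preceding-sibling::tr[count(td) &gt; 3][1])"/>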

<xsl:template match="/root">
  <xsl:apply-templates/>
  </xsl:template>

<xsl:template match="*">
  <xsl:apply-templates/>
  </xsl:template>

<xsl:template match="thead|text()"/>

<xsl:template match="tr[count(td) &gt; 3]">
  <!-- beware, the ship name should be a valid element name (no
spaces, etc.) -->
  <xsl:variable name="shipName" select="td[1]"/>
  <xsl:variable name="shipId" select="generate-id(.)"/>
  <xsl:element name="{$shipName}">
    <xsl:apply-templates select=".|key('portsByShip', $shipId)" mode="port"/>
    </xsl:element>
  </xsl:template>

<xsl:template match="tr" mode="port">
  <!-- the port name is the second-to-last cell, its value the last one -->
  <xsl:variable name="numCells" select="count(td)"/>
  <xsl:for-each select="td[$numCells - 1]">
    <port name="{.}">
      <xsl:value-of select="following-sibling::td[1]"/>
        </port>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>

Regards,
EB
