Re: [xsl] Are long XPath statements inherently bad?

Subject: Re: [xsl] Are long XPath statements inherently bad?
From: Jeni Tennison <jeni@xxxxxxxxxxxxxxxx>
Date: Thu, 28 Oct 2004 21:45:24 +0100

Hi John,

> Sorry again, I am relatively new to XSL. I am trying to put together
> of events as a table by outputting a list of items in each cell by
> executing queries such as the following (this is actually
> simplified):
>
> select="$newshome/item[@key='recurring']/item[@template='event'][( 
> @yearspecific != 1 and substring( @date, 5, 4 ) = $moday ) or (
> @yearspecific = 1 and substring( @date, 1, 8 ) = $date )] | 
> $newshome/item[@key='recurring']/item[@key=$year]/item[@template='event'][(
> @yearspecific != 1 and substring( @date, 5, 4 ) = $moday ) or ( 
> @yearspecific = 1 and substring( @date, 1, 8 ) = $date )]
>
> Is it just my syntax/logic that's bad or is there some better way I
> could/should do this?

In this case, since the predicates on the two halves of the union are
the same, you could do the union before you filter using the
predicate, like so:

  ($newshome/item[@key = 'recurring']/item[@template = 'event'] |
   $newshome/item[@key = 'recurring']/item[@key = $year]
     /item[@template = 'event'])
   [(@yearspecific != 1 and substring(@date, 5, 4) = $moday) or
    (@yearspecific = 1 and substring(@date, 1, 8) = $date)]

But I think that long XPaths like this are pretty unreadable, so I'd
try, where possible, to break them up using variables. In this case,
I'd probably use something like:

  <xsl:variable name="recurring-items"
    select="$newshome/item[@key = 'recurring']" />
  <xsl:variable name="recurring-items-in-year"
    select="$recurring-items/item[@key = $year]" />
  <xsl:variable name="recurring-events"
    select="($recurring-items | $recurring-items-in-year)
              /item[@template = 'event']" />
  <xsl:variable name="non-yearspecific-events"
    select="$recurring-events[@yearspecific != 1 and
                              substring(@date, 5, 4) = $moday]" />
  <xsl:variable name="yearspecific-events"
    select="$recurring-events[@yearspecific = 1 and
                              substring(@date, 1, 8) = $date]" />

  ... $non-yearspecific-events | $yearspecific-events ...

This is longer, of course, but easier for anyone maintaining your code
to follow (especially since you can add helpful comments between the
lines).
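
As a rough sketch of what that commenting might look like, here are
the last two variables again with comments, followed by one way of
using them in a table cell. The comments assume that characters 5-8
of @date hold the month and day, and @title is a hypothetical
attribute used only for illustration:

  <!-- annual events: only the month and day (assumed to be
       characters 5-8 of @date) have to match -->
  <xsl:variable name="non-yearspecific-events"
    select="$recurring-events[@yearspecific != 1 and
                              substring(@date, 5, 4) = $moday]" />
  <!-- one-off events: the first eight characters of @date have
       to match -->
  <xsl:variable name="yearspecific-events"
    select="$recurring-events[@yearspecific = 1 and
                              substring(@date, 1, 8) = $date]" />

  <td>
    <xsl:for-each
      select="$non-yearspecific-events | $yearspecific-events">
      <!-- @title is hypothetical -->
      <xsl:value-of select="@title" />
      <br />
    </xsl:for-each>
  </td>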

An alternative would be to use a key or a number of keys to help you
access the information quickly and easily. For example, you could
index events by the relevant part of their date:

<xsl:key name="events-by-date"
         match="item[@key = 'recurring']
                  //item[@template = 'event'][@yearspecific = 1]"
         use="substring(@date, 1, 8)" />
<xsl:key name="events-by-date"
         match="item[@key = 'recurring']
                  //item[@template = 'event'][@yearspecific != 1]"
         use="substring(@date, 5, 4)" />

and then use:

  (key('events-by-date', $moday) | key('events-by-date', $date))
    [parent::item[@key = $year or @key = 'recurring']]

This has the advantage of speed as well as brevity.
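
One thing to watch with keys is that key() only searches the document
that contains the context node. If $newshome happens to come from a
different document than the one being processed (an assumption, since
the post doesn't say where it is set), something like the following
sketch, which also assumes that $newshome holds a single node,
switches the context first:

  <xsl:for-each select="$newshome">
    <!-- the context node is now in $newshome's document, so key()
         searches that document -->
    <xsl:apply-templates
      select="(key('events-by-date', $moday) |
               key('events-by-date', $date))
                [parent::item[@key = $year or
                              @key = 'recurring']]" />
  </xsl:for-each>

If $newshome is in the source document anyway, the outer xsl:for-each
is unnecessary.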

XPath 2.0 (which I know you're not using, but I mention for interest)
has the benefit of allowing unions within steps and concise if
expressions. Rewritten in XPath 2.0, your expression looks like:

  $newshome/item[@key = 'recurring']
           /(. | item[@key = $year])
           /item[@template = 'event']
                [if (xs:boolean(@yearspecific))
                 then substring(@date, 1, 8) = $date
                 else substring(@date, 5, 4) = $moday]

I'd probably use user-defined functions (for example, to return the
date of an event, based on whether it was yearspecific or not) to
further slim down the code.
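
A minimal sketch of such a function in XSLT 2.0 might look like the
following. The name my:event-date and its namespace are hypothetical,
and it assumes that @yearspecific, where present, holds a value that
can be cast to xs:boolean (such as 1 or 0):

  <!-- assumes version="2.0" and, on xsl:stylesheet,
       xmlns:xs="http://www.w3.org/2001/XMLSchema" and a
       hypothetical xmlns:my="http://example.com/my" -->
  <xsl:function name="my:event-date" as="xs:string">
    <xsl:param name="event" as="element(item)" />
    <!-- full date for yearspecific events, otherwise just the
         four characters starting at position 5 -->
    <xsl:sequence
      select="if (xs:boolean($event/@yearspecific))
              then substring($event/@date, 1, 8)
              else substring($event/@date, 5, 4)" />
  </xsl:function>

The predicate could then shrink to a single comparison, which works
because $date and $moday have different lengths and so cannot match
the wrong kind of event:

  $newshome/item[@key = 'recurring']
           /(. | item[@key = $year])
           /item[@template = 'event']
                [my:event-date(.) = ($date, $moday)]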
  
> The requirements for including events for a date are actually pretty
> complex - this is just one of a series of queries (which I would be
> happy to share!) to get events for a date. Are long XPath statements
> always bad, other than just the possible readability issue?

I can't think of any reason why a long XPath would be bad other than
its (lack of) readability; *short* XPaths are more likely to cause
performance problems, for example through the misuse of "//".
Performance problems over large documents are usually due to
the number of nodes you visit using a path, which is why keys (which
shortcut searches by going straight to the relevant nodes) can help
enormously.
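
To illustrate the difference, here is a short path that uses "//"
next to an equivalent key lookup. This is only a sketch: the key name
events-by-moday is hypothetical, and how much it helps depends on the
processor, though most build the index for a key once per document:

  <!-- short, but walks every node in the document each time it
       is evaluated -->
  <xsl:apply-templates
    select="//item[@template = 'event']
              [substring(@date, 5, 4) = $moday]" />

  <!-- top-level declaration; 'events-by-moday' is a hypothetical
       name -->
  <xsl:key name="events-by-moday"
           match="item[@template = 'event']"
           use="substring(@date, 5, 4)" />

  <!-- inside a template: selects the same events through the
       key -->
  <xsl:apply-templates select="key('events-by-moday', $moday)" />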

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/
