[xsl] opinion re flexible style vs speed

Subject: [xsl] opinion re flexible style vs speed
From: "James A. Robinson" <jim.robinson@xxxxxxxxxxxx>
Date: Tue, 09 Jan 2007 08:26:04 -0800
Hi folks,

I've got a style question which I'd like to throw out to folks to
see what people might have to say on the topic.  We're designing
some stylesheets which extract metadata from XML documents, along
the lines of extracting Dublin Core elements (e.g., title, contributor,
publisher) from articles written in various markup.

I'm struggling with the question of whether or not I should sacrifice
speed for future flexibility.  Given that I know the current layout
of the document structure, and that I know, for example, that for
a given type of document I don't need to descend into the Body of
an Article, I could write code like this:

  <xsl:template mode="metadata" match="@*|node()">
    <xsl:apply-templates mode="metadata" select="@*|node()"/>
  </xsl:template>

  <xsl:template mode="metadata" match="Article/Body" />

  <xsl:template mode="metadata" match="Article/Authors/Author">
    ... do something useful, like extract author names ...
    <xsl:apply-templates mode="metadata" select="*" />
  </xsl:template>
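
To make the fragment above something one could actually run, here is
a minimal, self-contained sketch.  The <metadata> wrapper, the Name
child of Author, and the dc:creator output are assumptions made up for
illustration; the real documents may well be laid out differently.

  <?xml version="1.0"?>
  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:dc="http://purl.org/dc/elements/1.1/">

    <xsl:output method="xml" indent="yes"/>

    <!-- Entry point: switch into the metadata mode at the root.
         The <metadata> wrapper element is hypothetical. -->
    <xsl:template match="/">
      <metadata>
        <xsl:apply-templates mode="metadata"/>
      </metadata>
    </xsl:template>

    <!-- Default rule: walk every node and attribute, emit nothing. -->
    <xsl:template mode="metadata" match="@*|node()">
      <xsl:apply-templates mode="metadata" select="@*|node()"/>
    </xsl:template>

    <!-- The shortcut: Body holds no metadata today, so the whole
         subtree is skipped.  This rule wins over the default one
         because its default priority is higher (0.5 vs. -0.5). -->
    <xsl:template mode="metadata" match="Article/Body"/>

    <!-- Hypothetical extraction: assumes each Author has a Name child. -->
    <xsl:template mode="metadata" match="Article/Authors/Author">
      <dc:creator>
        <xsl:value-of select="normalize-space(Name)"/>
      </dc:creator>
    </xsl:template>

  </xsl:stylesheet>

Removing the single empty Article/Body template restores the full
traversal, so the speed-up and the flexibility question both hinge on
that one rule.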

As one might expect, ignoring the bulk of the document when you don't
need to process it tends to make things a bit faster.  For the case I'm
studying right now, it nets us a savings of ~200ms.

For tens of thousands or millions of documents being processed, this
can translate into a big savings when it comes to having to reprocess
large batches of articles.

I'm torn about using this technique though, since to me it will mean
somebody down the line will be stuck with examining the stylesheets if
the data model changes in such a way that we need to process the body.
It all comes down to how often the document model will change, I suppose,
and right now I don't have any way of knowing that!
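
One possible middle ground, sketched here purely as an assumption and
not something we have tried, would be to keep the shortcut but make it
an explicit, documented switch.  The parameter name process-body is
hypothetical:

  <!-- Hypothetical switch: defaults to skipping the Body. -->
  <xsl:param name="process-body" select="false()"/>

  <xsl:template mode="metadata" match="Article/Body">
    <!-- Descend into the Body only when the caller asks for it. -->
    <xsl:if test="$process-body">
      <xsl:apply-templates mode="metadata" select="@*|node()"/>
    </xsl:if>
  </xsl:template>

The parameter has to be passed as a boolean (for example, with
xsltproc, via --param process-body "true()"); a non-empty string such
as 'false' would test as true.  With the default in place the Body is
still pruned, but the assumption is at least stated in one obvious
spot for whoever inherits the stylesheets.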

Have any of you folks made a similar decision one way or the other and
come to regret it? :)

Jim

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
James A. Robinson                       jim.robinson@xxxxxxxxxxxx
Stanford University HighWire Press      http://highwire.stanford.edu/
+1 650 7237294 (Work)                   +1 650 7259335 (Fax)
