Optimising DOM (& Processing Instructions)

Subject: Optimising DOM (& Processing Instructions)
From: "Simon Hunt" <Simon_Hunt@xxxxxxx>
Date: Wed, 3 Nov 1999 07:48:43 +0000

Hi all

We have recently delivered software to a large financial institution, in which
our core data is based around XML/XSL, with transformations performed by an
in-house XSL Processor.  Now we are looking at strategies for enhancing the
run-time performance of the memory-based DOM, to speed up queries into XML
instance data and so get a decent response time for more complex queries (not a
requirement just yet, but we're second-guessing the next phase of the project!).

We have already achieved a number of speed enhancements just through
refactoring our XSL, but another strategy seems to be to embellish the DOM with
additional in-memory access structures, such as maps of lists giving access to
all instances of an element by name, and so forth.
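
To make the idea concrete, here is a minimal sketch of such a "map of lists".
It uses Python's xml.dom.minidom purely as a stand-in for the in-house DOM, and
the function name build_name_index and the sample document are illustrative
assumptions, not part of the delivered system.

```python
from collections import defaultdict
from xml.dom import minidom, Node


def build_name_index(root):
    """Map each element name to the list of elements with that name.

    Illustrative helper (hypothetical name): one pre-order walk of the tree,
    so each list is in document order.
    """
    index = defaultdict(list)
    stack = [root]
    while stack:
        node = stack.pop()
        if node.nodeType == Node.ELEMENT_NODE:
            index[node.tagName].append(node)
        # Push children in reverse so they are popped in document order.
        stack.extend(reversed(node.childNodes))
    return index


# Made-up sample data, only to exercise the index.
doc = minidom.parseString(
    "<portfolio><trade id='1'/><account><trade id='2'/></account></portfolio>"
)
index = build_name_index(doc.documentElement)

# A lookup by name is now a dictionary access rather than a tree walk.
trade_ids = [t.getAttribute("id") for t in index["trade"]]
```

The cost is one full walk plus a list entry per element, which is exactly the
overhead that is wasted on bulk data that never gets queried by name.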

The problem is that we do have some cases of bulk XML data transfer across the
Business Layer, with large numbers of repeating elements that don't need to be
queried in XSLT independently but rather through parent/grandparent
relationships.  In these cases it's wasteful of processing time and memory to
build additional access structures, but in other cases the potential performance
increase is dramatic.

(Sorry this is going on a bit.)  One possible solution seems to be to use
Processing Instructions aimed at our specific XSL Processor to identify the
elements that require categorisation for fast access, and maybe even to specify
an access policy.  We might also be able to use DTDs to identify relationships
through which we could then optimise our access structures.
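
As a rough sketch of how a PI-driven approach might look, the example below
indexes only the element names listed in a hint PI in the document prolog.  The
PI target "dom-index" and its space-separated data format are invented for
illustration (no standard defines them), and Python's xml.dom.minidom again
stands in for the in-house DOM.

```python
from collections import defaultdict
from xml.dom import minidom, Node

HINT_TARGET = "dom-index"  # hypothetical PI target, not defined by any standard


def hinted_names(doc):
    """Collect element names listed in <?dom-index ...?> PIs in the prolog."""
    names = set()
    for node in doc.childNodes:
        if (node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
                and node.target == HINT_TARGET):
            names.update(node.data.split())
    return names


def build_hinted_index(doc):
    """Index only the hinted element names; skip the walk if there are none."""
    wanted = hinted_names(doc)
    index = defaultdict(list)
    if not wanted:
        return dict(index)  # bulk data with no hints pays no indexing cost
    stack = [doc.documentElement]
    while stack:
        node = stack.pop()
        if node.nodeType == Node.ELEMENT_NODE and node.tagName in wanted:
            index[node.tagName].append(node)
        # Push children in reverse so they are popped in document order.
        stack.extend(reversed(node.childNodes))
    return dict(index)


# Made-up sample: only <account> is hinted, so <trade> is never indexed.
SAMPLE = """<?xml version="1.0"?>
<?dom-index account?>
<portfolio>
  <account id="A1"><trade id="1"/><trade id="2"/></account>
  <account id="A2"><trade id="3"/></account>
</portfolio>"""

doc = minidom.parseString(SAMPLE)
index = build_hinted_index(doc)
account_ids = [a.getAttribute("id") for a in index["account"]]
```

A processor that does not recognise the "dom-index" target simply ignores the
PI, so the data stays usable elsewhere; the hint only changes what gets indexed.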

My question really is: is this a valid use of Processing Instructions?  Sure, the
PIs will be ignored by XSL Processors that don't recognise those specific PIs,
but I'm a bit nervous about embedding implementation-specific detail and
optimisation info into data.  Also it has the limitation that you have to know a
fair bit about the data before you choose the Processing Instruction(s) to
embed.

Or is this *exactly* what Processing Instructions are supposed to do?

I'm also concerned that if this strategy is followed commercially in a browser
then we'll end up with some browsers recognising secret PIs of their own to gain
a performance boost over rivals.  I guess optimisation policy PIs would need to
be defined by a third party in order to prevent this...


Simon.



 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

