Re: [xsl] [xslt performance for big xml files]

Subject: Re: [xsl] [xslt performance for big xml files]
From: Liam Quin <liam@xxxxxx>
Date: Sat, 25 Apr 2009 21:55:45 -0400

On Sat, Apr 25, 2009 at 05:36:46PM -0400, Robert Koberg wrote:
> I assume you mean, 'use an XQuery implementation that runs against a  
> proprietary XML database', rather than the broadly defined and mostly  
> non-interoperable implementations of the standard XQuery language,  
> right?

Some (many) XQuery implementations, regardless of whether they are
proprietary or open source, speed up queries using indexes.
Typically you first "load" the XML and can then query it as often
as you like.  An example would be dbxml, an open source implementation
that comes from Sleepycat, now Oracle.
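
To make the "load once, query often" idea concrete, here is a minimal
sketch in Python. It is not how dbxml or any particular XQuery engine
works internally; the function name build_index and the sample
document are invented for illustration. It only shows why paying the
cost of building an index up front makes later lookups cheap.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_index(root, tag, attr):
    """Map each value of `attr` to the `tag` elements that carry it.

    Hypothetical helper: a stand-in for the "load" step of an XML database.
    """
    index = defaultdict(list)
    for elem in root.iter(tag):
        value = elem.get(attr)
        if value is not None:
            index[value].append(elem)
    return index


# Invented sample data, parsed once.
doc = ET.fromstring(
    "<library>"
    "<book year='1851'><title>Moby-Dick</title></book>"
    "<book year='1813'><title>Pride and Prejudice</title></book>"
    "<book year='1851'><title>The House of the Seven Gables</title></book>"
    "</library>"
)

# "Load": one full pass over the document.
by_year = build_index(doc, "book", "year")

# "Query": each lookup is a dictionary access, not another full scan.
titles_1851 = [book.findtext("title") for book in by_year["1851"]]
```

A real implementation would keep such indexes on disk and choose them
based on the queries it sees, but the trade-off is the same.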

As for interoperable, the XQuery Working Group, together with the XSL
Working Group, put a very large amount of effort into defining
a spec that would let queries move from one implementation to
another as much as possible.  I know that on my own Web site I can
change a variable to switch between three or four implementations
of XQuery without having to change the queries, although it's
true that I'm not using any extension functions, and also that
"three or four" implementations is many fewer than the fifty or
so that I've encountered.  But overall the impression I have
is that there's actually a pretty good level of interoperability.

Could you be more specific?  Maybe I haven't been hearing from
the people with problems...

> >* If you read the document once, but tend not to need to look ahead or
> > behind very far, you could split the input into smaller XML files,
> 
> most likely (and perhaps you would want to use StAX to do the split)

It's possible, although for such a small file as 30 MBytes there
are lots of ways, and if XSLT is already in use that's fine, I
expect.  Sometimes it's better to keep the number of technologies
used to a minimum and maximise human efficiency rather than maximise
CPU efficiency, although I was indeed thinking of both StAX and
the streaming XSLT work when I wrote my reply.
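
For what it's worth, the splitting step can be sketched in a few lines
with a streaming parser. The example below uses Python's
xml.etree.ElementTree.iterparse rather than StAX, purely as an
illustration; the function name split_records and its parameters are
invented, and it assumes the records are direct children of the root
element and are not nested inside one another.

```python
import io
import xml.etree.ElementTree as ET


def split_records(source, record_tag, chunk_size):
    """Yield lists of serialized record elements, chunk_size at a time.

    Hypothetical helper. Assumes every `record_tag` element is a direct
    child of the root and that records do not nest.
    """
    chunk = []
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            elem.tail = None  # drop inter-record whitespace before serializing
            chunk.append(ET.tostring(elem, encoding="unicode"))
            root.clear()  # discard processed records so memory stays bounded
            if len(chunk) == chunk_size:
                yield chunk
                chunk = []
    if chunk:
        yield chunk  # final, possibly shorter, chunk


# Invented sample input; a real run would pass a file name or open file.
sample = io.BytesIO(
    b"<root><rec id='1'>a</rec>\n<rec id='2'>b</rec>\n<rec id='3'>c</rec></root>"
)
chunks = list(split_records(sample, "rec", 2))
```

Each chunk can then be wrapped in a root element and written to its
own file, and the existing stylesheet run over the smaller files one
at a time.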

Liam

-- 
Liam Quin, W3C XML Activity Lead, http://www.w3.org/People/Quin/
http://www.holoweb.net/~liam/ * http://www.fromoldbooks.org/
