Re: [xsl] [xslt performance for big xml files]

Subject: Re: [xsl] [xslt performance for big xml files]
From: Liam Quin <liam@xxxxxx>
Date: Sat, 25 Apr 2009 22:05:39 -0400

On Sat, Apr 25, 2009 at 07:16:04PM -0400, Robert Koberg wrote:
> Of all the real world applications deployed that use XQuery (I suppose  
> I could be more specific and say as recommended by Liam, but maybe  
> probably not necessary), how many do you think would work on more than  
> one XQuery processor?

I think quite a few, although yes, you generally will have to change
the collection() and document() arguments.  Try creating a SQL
database and querying it in Oracle, DB2, MySQL, PostgreSQL and you'll
generally find you have to change the code at least a little, but
that does not make SQL completely non-interoperable.  It's a case
of managing expectations, and of "the application was ported in
a week" vs "we would need to rewrite millions of lines of code from
scratch".

> [...] XQuery as used/promoted by the XML DBs tend to favor their
> own extensions in documentation and lists  (though there seems to be
> more caveats on the lists lately, though).

I don't actually remember which implementations I suggested -- most
likely MarkLogic, Qizx and dbxml, since I've used them.  I've not
had major problems moving queries between them, though, once the
files are indexed, which is a separate (although not unfair) question.

We didn't standardise collection() -- at some point you have to
say, "this is the scope of our spec" and stop.  Maybe for XQuery 1.1
we could consider an optional directory-of-files-as-collection()
function, but then people would say they needed options to say whether
to re-run indexes, what collation sequences and file encodings to
assume, whether to follow shortcuts and symbolic links... and pretty
soon it'd be a huge mess.  Or at least that's been a difficulty in
the past.  Relational database schemas aren't entirely portable
either, and neither are filenames (e.g. between MS Windows and
Solaris and OS X the character sets, lengths, and default encodings
differ).
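
To make the option creep concrete, here is a minimal sketch in Python
(not XQuery, and not taken from any spec or product) of what a
directory-of-files collection might look like.  The function name
directory_collection and its pattern and follow_symlinks parameters are
hypothetical, chosen only for illustration; indexing, collations and
file encodings are deliberately left out, and each of those would
likely need yet another parameter.

```python
import os
import xml.etree.ElementTree as ET
from pathlib import Path


def directory_collection(root, pattern="*.xml", follow_symlinks=False):
    """Parse every matching file under `root` and return the root elements.

    Hypothetical helper: each keyword argument stands for a decision that
    a standardised collection() would have to fix or expose as an option.
    """
    docs = []
    for dirpath, dirnames, filenames in os.walk(root, followlinks=follow_symlinks):
        dirnames.sort()  # visit subdirectories in a deterministic order
        for name in sorted(filenames):
            path = Path(dirpath) / name
            if not path.match(pattern):
                continue  # not part of the collection
            if path.is_symlink() and not follow_symlinks:
                continue  # skip linked files unless asked to follow them
            docs.append(ET.parse(str(path)).getroot())
    return docs
```

Even this toy version already has two knobs, and it says nothing yet
about what happens when a file fails to parse.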

You're right that extension functions are a problem -- that's true
for XSLT as well, of course, and XPath, and for that matter C and
Perl and Python....

Liam

-- 
Liam Quin, W3C XML Activity Lead, http://www.w3.org/People/Quin/
http://www.holoweb.net/~liam/ * http://www.fromoldbooks.org/
