RE: [xsl] The collection() function

Subject: RE: [xsl] The collection() function
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Tue, 15 May 2007 22:16:09 +0100

This should really be discussed on the Saxon list, not here.

In general when performing XSLT processing, if a document has a DTD then the
XML parser must read it, even if not validating, because the DTD may contain
entity definitions and may define ID attributes. If you want to reduce the
cost, the usual answer is to use an EntityResolver that redirects the DTD
references to a local file, which is either a local copy of the DTD or a
dummy DTD.
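
As a rough illustration of the EntityResolver approach, here is a minimal
sketch using the standard SAX and JAXP interfaces. The class name
DummyDtdResolver and the parse() helper are made up for this example, and
matching on a ".dtd" suffix is only an assumption about what the system
identifiers look like. The dummy here is an empty DTD, so a document that
relies on entities declared in the real DTD (for example &nbsp; in XHTML)
would fail to parse; in that case return a local copy of the DTD instead.
How the resolver is registered with a particular XSLT processor is not
shown.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical resolver: answers every request for a *.dtd resource
// with an empty (dummy) DTD, so nothing is fetched over the network.
class DummyDtdResolver implements EntityResolver {

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        if (systemId != null && systemId.endsWith(".dtd")) {
            // Assumption: an empty external subset is good enough.
            // To use a local copy instead, return an InputSource
            // that points at the local file.
            return new InputSource(new StringReader(""));
        }
        // Returning null tells the parser to resolve the entity itself.
        return null;
    }

    // Hypothetical helper: parses an XML string with the resolver installed.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setEntityResolver(new DummyDtdResolver());
        return builder.parse(new InputSource(new StringReader(xml)));
    }
}
```

With this in place, a call such as DummyDtdResolver.parse(xhtmlText) should
parse an XHTML document that has a DOCTYPE declaration without the parser
going to the network for the DTD.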

When parsing many documents in a collection that have the same DTD, it would
be nice if the XML parser only processed the DTD once and cached the
results. I don't know if that's possible. I don't see any Xerces options
that allow it.

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: jesper.tverskov@xxxxxxxxx 
> [mailto:jesper.tverskov@xxxxxxxxx] On Behalf Of Jesper Tverskov
> Sent: 15 May 2007 21:07
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: Re: [xsl] The collection() function
> 
> I have a feeling that the implementation of collection() in 
> Saxon could be improved for documents with a DTD. I don't 
> think their loading can be turned off?
> 
> If a directory contains just a handful of XHTML documents 
> that load their DTDs over the net, even the simplest use of 
> collection(), such as generating a list of the filenames in a 
> directory, can take minutes. When I delete the DTD 
> declarations as a test, the function generates the same 
> list in a few seconds.
> 
> Cheers,
> 
> Jesper Tverskov
> http://www.xmlplease.com/xslt
