Subject: Re: [xsl] Improving Performance of XSLT on large files
From: "Michael Beddow" <mbnospam@xxxxxxxxxxx>
Date: Wed, 29 Aug 2001 16:39:41 +0100
Perhaps I ought to leave this one to the CompSci folks, but here goes anyway:

[..]
> Adding relevant spacer characters to any variable containers
> in the XML to ensure that the records in the XML repeat at
> mathematically recognisable character positions
> throughout the file.
[..]

This looks to me as though you're imagining that the XSLT processor operates like a serial filter on an input stream. But it doesn't. It parses the entire input stream into an internal tree representation of the data before it does anything else, and that's a resource-intensive thing to do. No doubt a clever processor could then detect repeating patterns and use appropriate shortcuts to process them, but the full tree representation has to be built first.

There are some Perl modules (e.g. XML::Twig) that try to address the problems this creates for large files by allowing you to extract and handle subtrees without a complete deserialisation of the entire file. If your data really is so ill-matched to the way XSLT processors normally proceed, you should check them out.

Michael
---------------------------------------------------------
Michael Beddow          http://www.mbeddow.net/
XML and the Humanities page: http://xml.lexilog.org.uk/
---------------------------------------------------------

----- Original Message -----
From: "gary cor" <stuff4gary@xxxxxxxxxxx>
To: <xsl-list@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Wednesday, August 29, 2001 12:26 PM
Subject: [xsl] Improving Performance of XSLT on large files

> Dear All,
>
> I have recently started working with XSLT and cannot see why it needs
> to be so slow with a large XML file (i.e. 70MB+). I will try processing
> without ordering etc. later and will run a lot of tests to see if this
> helps, but it just seems massively too slow!! Possibly I don't
> understand, because my XML files are just like the infinitely
> complicated DNA structure and DO always contain repeating substructures
> at many different levels as well.
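To make the contrast concrete, here is a minimal sketch (in Python rather than Perl, since XML::Twig's idea of handling one subtree at a time maps directly onto `xml.etree.ElementTree.iterparse`). The `<record>` element name and the tiny in-memory document are purely illustrative stand-ins for a large file of repeating records:

```python
# Stream over repeating records one subtree at a time with iterparse,
# instead of building the whole document tree first the way an XSLT
# processor must. Element names here are hypothetical examples.
import xml.etree.ElementTree as ET
from io import BytesIO

# Stand-in for a large file of repeating records.
doc = b"<records>" + b"<record><id>1</id></record>" * 3 + b"</records>"

ids = []
for event, elem in ET.iterparse(BytesIO(doc), events=("end",)):
    if elem.tag == "record":
        ids.append(elem.findtext("id"))
        elem.clear()  # drop this record's parsed content once handled

print(ids)
```

The point of the `elem.clear()` call is that each record's subtree can be discarded as soon as it has been handled, so the working set stays roughly one record deep rather than growing to the whole document, which is exactly the trade-off the twig-style modules offer.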
> So, would ensuring the following actually help me in any way?
>
> * That all elements, attributes, etc. are compulsory in long record
>   sets, so they are totally repeating units.
>
> * Then adding relevant spacer characters to any variable containers in
>   the XML, to ensure that the records in the XML repeat at
>   mathematically recognisable character positions throughout the file.
>
> Then possibly I could optimise this process for a DNA validation! And
> apply a mathematical function to the pointer so it knows specifically
> where to read element data from the document without even having to
> look at any irrelevant bits. Then feed only the relevant bits of XML
> into the XSLT processor as a secondary process... Where do I, and where
> don't I, get a performance advantage doing something like this? And
> does anything do this already?
>
> Any comments on this subject would be very much appreciated!
>
> Kind Regards
>
> Gary Cornelius

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
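For what the padding scheme in the question would buy outside of XSLT, here is a hedged sketch: if every record really were padded to a fixed byte width, record n would start at a computable offset and could be read with a single seek, no parsing. All names, the record size, and the space-padding format are assumptions for illustration, not anything an XSLT processor actually does:

```python
# Sketch of the fixed-width-record idea: pad each record to exactly
# RECORD_SIZE bytes so record n can be fetched by arithmetic + seek.
import io

HEADER = b"<records>"
RECORD_SIZE = 40  # hypothetical fixed width per padded record

def make_record(n):
    rec = f"<record><id>{n}</id></record>".encode()
    return rec.ljust(RECORD_SIZE)  # space-pad to the fixed width

data = HEADER + b"".join(make_record(n) for n in range(100)) + b"</records>"
f = io.BytesIO(data)  # stands in for a large file on disk

def fetch_record(f, n):
    f.seek(len(HEADER) + n * RECORD_SIZE)  # O(1) jump, no scanning
    return f.read(RECORD_SIZE).strip()     # fragment to hand on for processing

print(fetch_record(f, 42))
```

This is essentially the "secondary process" the question describes: the seek-based extractor pulls out only the relevant fragment, and only that fragment would then be given to the XSLT processor. As the reply notes, though, the processor itself cannot exploit the padding; any gain comes entirely from shrinking its input.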