Subject: RE: [xsl] Processing large XML Documents [> 50MB]
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Tue, 23 Feb 2010 08:28:12 -0000
> We have a need to process XML documents that could be over 50 
> megabytes in size.
> Due to the huge size of the documents, XSLT processing is getting 
> tough in the environment we are running in.

Actually, 50Mb isn't really that big nowadays. Some people are transforming
1Gb or more.
> Basically, the nature of the data processing is
> a) assemble around 30-40 XML documents [each with a common 
> header and its own lines] into one single XML document, with 
> the common header and all the lines
> b) Update the assembled document in specific locations
> c) generate multiple XML document fragments from the huge XML 
> document based on query criteria. Each XML fragment is created 
> by mapping specific fields in the big document. Each document 
> is created for a specific key element value in the huge document.
> I am puzzled as to how to handle this one efficiently.
> Any comments are welcome.
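
Step (a) maps naturally onto the XSLT 2.0 document() function. A minimal
sketch, assuming each small document has a root <Batch> element with one
<Header> and repeated <Line> children, and that the document URIs are
supplied as a stylesheet parameter (all names here are illustrative, not
taken from your data):

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- URIs of the 30-40 small documents (illustrative parameter name) -->
  <xsl:param name="docs" as="xs:string*"/>

  <xsl:template name="main">
    <Batch>
      <!-- take the common header from the first document -->
      <xsl:copy-of select="document($docs[1])/Batch/Header"/>
      <!-- then append the lines from every document -->
      <xsl:copy-of select="for $d in $docs return document($d)/Batch/Line"/>
    </Batch>
  </xsl:template>

</xsl:stylesheet>
```

Step (c) maps equally naturally onto <xsl:for-each-group>, grouping on the
key element, with <xsl:result-document> to write one output file per key
value.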

It's not entirely clear why you are creating the one big document: it's
perfectly possible to work directly with the 30-40 small ones. Perhaps the
main advantage of building the big document is that you can then use a key
to search across all the data. But if you use a processor like Saxon-EE that
optimizes searches by means of implicit indexing, this might not be
necessary.
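
To illustrate the trade-off (element, key, and variable names are all
illustrative), an explicit key over the assembled document looks like this:

```xml
<!-- With the big document: an explicit key across all the data -->
<xsl:key name="line-by-id" match="Line" use="@id"/>

<xsl:template match="query">
  <!-- $big-doc: the assembled document, supplied as a parameter -->
  <xsl:sequence select="key('line-by-id', @ref, $big-doc)"/>
</xsl:template>

<!-- Without the big document: a plain filter over the small documents
     ($docs holding their URIs); a processor with implicit indexing,
     such as Saxon-EE, may build an index for this automatically -->
<xsl:template match="query" mode="direct">
  <xsl:sequence
      select="(for $d in $docs return document($d))//Line[@id = current()/@ref]"/>
</xsl:template>
```

Note that the three-argument form of key(), which lets you search a
document other than the context document, requires XSLT 2.0.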

Is the 50Mb the size of the combined document, or the size of the individual
documents?
