Subject: RE: [xsl] Processing large XML Documents [> 50MB]
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Tue, 23 Feb 2010 08:28:12 -0000
> We have a need to process XML Documents that could be over 50
> megs in size.
>
> Due to the huge size of the document, XSLT is getting tough,
> with the environment we are running in.

Actually, 50Mb isn't really that big nowadays. Some people are transforming 1Gb or more.

> Basically, the nature of the data processing is
>
> a) assemble around 30-40 XML documents [each with a common
> header and its own lines] into one single XML document, with
> the common header and all the lines
> b) update the assembled document in specific locations
> c) generate multiple XML document fragments from the huge XML
> document based on query criteria. Each XML fragment is created
> by mapping specific fields in the big document. Each document
> is created for a specific key element value in the huge document.
>
> Am puzzled how to handle this one efficiently.
> Any comments are welcome.

It's not entirely clear why you are creating the one big document: it's perfectly possible to work directly with the 30-40 small ones. Perhaps the main advantage of building the big document is that you can then use a key to search across all the data. But if you use a processor like Saxon-EE that optimizes searches by means of implicit indexing, this might not be necessary.

Is the 50Mb the size of the combined document, or the size of the individual pieces?
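To illustrate working directly with the small documents, something along these lines might do: read all the parts with collection(), then write one result document per distinct key value with xsl:result-document. This is only a sketch under assumptions - the element names (doc, header, line, key-value), the directory URI, and the ?select= collection syntax (a Saxon convention) are all placeholders for whatever your real vocabulary and processor provide.

```xml
<!-- Sketch only (XSLT 2.0). Hypothetical names: doc, header, line,
     key-value, and the collection URI; adapt to your actual data. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Gather the 30-40 small documents without building one big file.
       The ?select=*.xml query is Saxon's collection() convention. -->
  <xsl:variable name="parts"
      select="collection('file:///data/parts?select=*.xml')"/>

  <xsl:template match="/">
    <!-- One output document per distinct key value, drawing lines
         from every part. Saxon-EE can index the [key-value = current()]
         predicate implicitly, so no explicit xsl:key is needed. -->
    <xsl:for-each select="distinct-values($parts/doc/line/key-value)">
      <xsl:result-document href="fragment-{.}.xml">
        <fragment key="{.}">
          <!-- common header, taken from the first part only -->
          <xsl:copy-of select="($parts/doc/header)[1]"/>
          <xsl:copy-of select="$parts/doc/line[key-value = current()]"/>
        </fragment>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Run it against any source document (even one of the parts); the real input arrives through collection(). Step (b), updating specific locations, would be a separate identity-template pass over each part before or during this transform.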