Subject: Re: [xsl] Transforming large XML docs in small amounts of memory
From: Ronan Klyne <ronan.klyne@xxxxxxxxxxx>
Date: Mon, 30 Apr 2007 11:00:31 +0100

Andrew Welch wrote:
> Much can be done, but your available options all depend on the
> processor and environment you're running, and how flexible you are -
> is it a pure XSLT 1.0/2.0 solution you're after, or can you use
> extensions or modify the processing pipeline?

It's purely XSLT 1.0, using Saxon (on Linux and Windows, if that matters...), although suggestions to change this would not be shunned. The input XML is the only real fixed quantity, due to the amount of work that would be required to change the code generating it, given that it already 'works'.

> Also you need to let us know:
>
> - Is the input uniform chunks of data in a single file? (likely if
> its a "data-centric" xml file) or does the processing require access
> to the whole input for the whole transform?

The majority of the XSL draws on data from all over the input document, which I suspect will be constraining. There are substantial sections of the input document which could be described as uniform, but I would not say that the term applies to the document as a whole.

> - What is your current memory usage? Whats the limit, what is an
> acceptable bound? etc..

The servers we're using have several GB of memory in them, but my objective is to increase the potential for concurrency by reducing the resource requirements of each transform. I think that transforming 150MB of data in 400MB of RAM would be a sensible target (is this sensible?).

> - How are you measuring memory usage? Is it simply the input XML that
> is using up all available memory, or do other parts of the pipeline
> use a lot of memory too?

I'm measuring it by increasing the maximum amount of memory available to Java until the transform runs without throwing OutOfMemory errors (to solve the immediate problem). The larger transforms (150MB of input) are taking ~1GB of memory to run. I'm not sure how to tell what proportion of the memory is used for the input DOM, output DOM, etc...
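
A rough way to get a number instead of bisecting the heap limit might be to ask the JVM for its peak heap usage around a single transform. The following is only a minimal sketch, assuming the transform is run through the standard JAXP API (which Saxon implements) and a Java 5 or later JVM. The class name HeapProbe, the helper names and the tiny inline stylesheet and input are hypothetical stand-ins for the real files. Summing the per-pool peaks can overstate the true peak somewhat, because the pools need not peak at the same moment, and it does not separate input tree from output tree.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class HeapProbe {
    // Hypothetical stand-ins for the real stylesheet and the real 150MB input.
    static final String SAMPLE_XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'><xsl:value-of select='count(//item)'/></xsl:template>"
      + "</xsl:stylesheet>";
    static final String SAMPLE_XML = "<root><item/><item/><item/></root>";

    /** Reset the recorded peak of every memory pool to its current usage. */
    static void resetPeaks() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            pool.resetPeakUsage();
        }
    }

    /** Sum of the peak usage of all heap pools since the last reset, in bytes. */
    static long peakHeapBytes() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                MemoryUsage peak = pool.getPeakUsage();
                if (peak != null) {   // null means the pool is no longer valid
                    total += peak.getUsed();
                }
            }
        }
        return total;
    }

    /** Run one transform through JAXP and return the serialized result. */
    static String transform(String xslt, String xml) throws Exception {
        // Uses whichever XSLT processor JAXP finds on the classpath.
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        resetPeaks();
        String result = transform(SAMPLE_XSLT, SAMPLE_XML);
        long peak = peakHeapBytes();
        System.out.println("result: " + result);
        System.out.println("approx. peak heap: " + (peak / (1024 * 1024)) + " MB");
    }
}
```

With the real stylesheet and input substituted, running this once with a generous -Xmx should give an approximate upper figure for one transform, which could then be compared against the 400MB target.
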

Which reminds me, I should mention that the output document is under ~1MB.

# r

-- 
Ronan Klyne
Business Collaborator Developer
Tel: +44 (0)870 163 2555
ronan.klyne@xxxxxxxxxxx
www.groupbc.com