Re: [xsl] "Heap" of trouble handling input file of 500 MByte

Subject: Re: [xsl] "Heap" of trouble handling input file of 500 MByte
From: Michael Kay <mike@xxxxxxxxxxxx>
Date: Mon, 21 Feb 2011 23:20:49 +0000
On 21/02/2011 21:24, thehulk@xxxxxxxxxxx wrote:
Thanks for all these suggestions. I tried to use Saxon but ran into typical problems. I have not found an endorsed dir anywhere, and after looking at about a dozen webpages, I am ready to give up and ask: how to put it into the endorsed dir? Also very usable: how to make this one program use the Saxon classes?
I did download files saxon8.jar and saxon9he.jar .


If your application is using the JAXP interfaces, and if you have access to the source code, then by far the simplest way to switch it to using Saxon is to change the line

TransformerFactory f = TransformerFactory.newInstance();

to say instead

TransformerFactory f = new net.sf.saxon.TransformerFactoryImpl();

If you don't have access to the source code, then the simplest is to set the Java system property javax.xml.transform.TransformerFactory to the value "net.sf.saxon.TransformerFactoryImpl".
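
As a rough sketch of the system-property route (not a definitive recipe), the property can also be set programmatically before the first call to TransformerFactory.newInstance(). The class name FactorySelector and the fallback behaviour are assumptions made for illustration. The fallback only exists so that the sketch still runs when the Saxon jar is not on the classpath.

```java
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.TransformerFactoryConfigurationError;

// Hypothetical helper, for illustration only.
class FactorySelector {
    static final String PROPERTY = "javax.xml.transform.TransformerFactory";
    static final String SAXON_FACTORY = "net.sf.saxon.TransformerFactoryImpl";

    // Asks JAXP for Saxon via the system property; if the Saxon class
    // cannot be loaded, restores the previous setting and returns the
    // default factory instead.
    static TransformerFactory create() {
        String previous = System.getProperty(PROPERTY);
        System.setProperty(PROPERTY, SAXON_FACTORY);
        try {
            return TransformerFactory.newInstance();
        } catch (TransformerFactoryConfigurationError e) {
            // Saxon is not on the classpath: undo the property change.
            if (previous == null) {
                System.clearProperty(PROPERTY);
            } else {
                System.setProperty(PROPERTY, previous);
            }
            return TransformerFactory.newInstance();
        }
    }

    public static void main(String[] args) {
        // Prints the implementation class that was actually selected.
        System.out.println(create().getClass().getName());
    }
}
```

When the source cannot be changed at all, the same property can be passed on the command line with -Djavax.xml.transform.TransformerFactory=net.sf.saxon.TransformerFactoryImpl, with saxon9he.jar added to the classpath.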

However, if you are processing 500MB input documents, you are still going to need rather more than the 1GB of heap space that you seem able to allocate: I would recommend a minimum of 4x the input document size (about 2GB here), and it can be more than that depending on the nature of the XML source.
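
To make the arithmetic concrete, here is a small sketch that applies the 4x rule of thumb to a file size and compares the result with the heap the JVM was actually given. The class name HeapCheck and its methods are hypothetical, and the factor of 4 is only the minimum suggested above, not a guarantee.

```java
import java.io.File;

// Hypothetical helper, for illustration only.
class HeapCheck {
    // Rule-of-thumb multiplier from the advice above (a minimum, not a bound).
    static final long FACTOR = 4;

    // Minimum heap suggested for an input of the given size.
    static long recommendedHeapBytes(long inputBytes) {
        return Math.multiplyExact(inputBytes, FACTOR);
    }

    // True if the available heap meets the rule-of-thumb minimum.
    static boolean heapLooksSufficient(long inputBytes, long maxHeapBytes) {
        return maxHeapBytes >= recommendedHeapBytes(inputBytes);
    }

    public static void main(String[] args) {
        // Use the file named on the command line, or assume 500MB.
        long inputBytes = args.length > 0
                ? new File(args[0]).length()
                : 500L * 1024 * 1024;
        long maxHeap = Runtime.getRuntime().maxMemory();
        long mb = 1024L * 1024;
        System.out.printf("input: %d MB, suggested heap: %d MB, JVM max heap: %d MB%n",
                inputBytes / mb, recommendedHeapBytes(inputBytes) / mb, maxHeap / mb);
        System.out.println(heapLooksSufficient(inputBytes, maxHeap)
                ? "Heap meets the 4x minimum."
                : "Heap is below the 4x minimum; consider raising -Xmx.");
    }
}
```

The maximum heap is set with the JVM's -Xmx option when the application is started, for example -Xmx2g or higher for a 500MB input.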

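A surprisingly simple measure that is often overlooked is to add <xsl:strip-space elements="*"/> to the stylesheet, so that whitespace-only text nodes in the source are dropped rather than stored in the tree. However, it doesn't make much difference on recent versions of Saxon, because Saxon will compress the whitespace anyway.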
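
For illustration, the following sketch shows where the xsl:strip-space declaration sits in a stylesheet and runs it through JAXP on a tiny document. The identity-copy stylesheet, the sample input and the class name StripSpaceDemo are all assumptions made for the example. It uses whatever TransformerFactory JAXP selects, so it does not depend on Saxon being installed.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class, for illustration only.
class StripSpaceDemo {
    // Identity transform with xsl:strip-space declared at the top level.
    static final String STYLESHEET =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:strip-space elements=\"*\"/>"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\" indent=\"no\"/>"
        + "<xsl:template match=\"@*|node()\">"
        + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML text and returns the result.
    static String transform(String xml) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer t = factory.newTransformer(
                    new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // The whitespace-only text nodes around <b> are candidates for stripping.
        System.out.println(transform("<a>\n  <b>x</b>\n</a>"));
    }
}
```

Text nodes that contain anything other than whitespace, such as the "x" inside b, are not affected by the declaration.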

Michael Kay
Saxonica
