Subject: RE: [xsl] memory usage of xslt processing
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Wed, 19 Apr 2006 13:59:08 +0100

XSLT processors generally read the whole document into memory. Some products may be able to avoid this under certain circumstances, for example see http://www.saxonica.com/documentation/sourcedocs/serial.html for Saxon.

Running one transformation per row is certainly feasible in principle, though there may be a significant start-up overhead - you'll only find out by measurement. Alternatively, why not retrieve the data from the database in transformer-sized chunks?

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: Thomas Porschberg [mailto:thomas.porschberg@xxxxxxxxx]
> Sent: 19 April 2006 13:36
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] memory usage of xslt processing
>
> Hi,
>
> I have the following task:
> Create an arbitrarily formatted file (XML/HTML/CSV, whatever)
> based on a Select from a database.
>
> As a constraint, the amount of data fetched from the database
> cannot be stored in memory as a whole.
> Another constraint is that I cannot use XML functionality in
> the database; I have to implement the functionality on top of
> our database access framework. This database access framework
> fetches the records one after another.
> And I have to use Java and Xalan.
>
> My idea was to decorate every fetched row from the database
> with simple generic XML and fire this to Xalan.
>
> Let me give an example:
> If my result set from the database looks like:
>
> ID Name  Description
> -- ----  -----------
> 1  "dog" "an animal may be dangerous"
> 2  "cat" "an animal likes milk"
>
> I create the following XML:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <dataset>
>   <row>
>     <value>1</value>
>     <value>dog</value>
>     <value>an animal may be dangerous</value>
>   </row>
>   <row>
>     <value>2</value>
>     <value>cat</value>
>     <value>an animal likes milk</value>
>   </row>
> </dataset>
>
> I create this XML by firing SAX events from a Java
> class [StringArrayXMLReader], which implements the
> org.xml.sax.XMLReader interface.
> I have three methods:
>
> public void init() throws SAXException {
>     ch.startDocument();
>     ch.startElement("", "dataset", "dataset", EMPTY_ATTR);
> }
>
> public void close() throws SAXException {
>     ch.endElement("", "dataset", "dataset");
>     ch.endDocument();
> }
>
> public void parse(String[] input) throws SAXException {
>     ch.startElement("", "row", "row", EMPTY_ATTR);
>     for (int i = 0; i < input.length; ++i) {
>         ch.startElement("", "value", "value", EMPTY_ATTR);
>         ch.characters(input[i].toCharArray(), 0, input[i].length());
>         ch.endElement("", "value", "value");
>     }
>     ch.endElement("", "row", "row");
> }
>
> The parse method creates the <row>...</row> entries for the
> String array passed to it.
> The StringArrayXMLReader is associated with a
> TransformerHandler, which uses an XSL stylesheet to transform
> the XML to the desired output.
>
> What happens here is that when the fetch from the database
> starts I call init() (and thus startDocument()) and at
> last, after the fetch has finished, I call close() (and thus
> endDocument()).
> I observed that the xslt processing starts when endDocument()
> is called.
> This is not acceptable for me because I fear the xslt
> processor reads all the rows into memory until endDocument()
> is called, and in this case I risk running into OutOfMemory.
>
> My second idea was to eliminate the init()/close() methods
> and to consider one <row>...</row> section as the complete
> document input for the processor. This has the disadvantage
> that I have to create the head and tail of the document
> manually (and in my example I get a NullPointerException when
> the transformer is called twice).
>
> I have the following questions:
> Is it possible to create the output without having the whole
> data in memory?
> The basic XML for xslt processing
>
> <dataset>
>   <row><value>...
>   <row><value>...
> </dataset>
>
> looks very simple and the supplied XSL stylesheets will
> not be complex, so my hope is to get it working.
> I also think that the task in general - produce formatted
> output from a potentially very large data pool - should be a common one.
> Unfortunately I did not do much xslt processing in the past,
> so I lack the experience (a bit of libxslt, which I feed a DOM tree).
> If someone has some useful links I would be very glad to hear of them.
> My test code is available at:
>
> http://randspringer.de/sax_row.tar and
> http://randspringer.de/sax.tar
>
> If someone could have a look at it I would really appreciate it.
>
> Thomas
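
The "transformer-sized chunks" suggestion in the reply above could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the thread: the class name ChunkedRowTransform, the method transformInChunks, the inline CSV stylesheet and the chunk size are all made up for the example. It uses only the standard JAXP interfaces (SAXTransformerFactory, Templates, TransformerHandler) and has not been checked against any particular Xalan release. Each chunk is sent as its own small <dataset> document, so the processor only ever holds chunkSize rows at once. The stylesheet is compiled once and a fresh TransformerHandler is created per chunk.

```java
import java.io.PrintWriter;
import java.io.StringReader;
import java.io.Writer;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical helper, named for this example only.
class ChunkedRowTransform {

    // Assumed stylesheet: writes every <row> as one comma-separated text line.
    static final String CSV_XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/dataset'><xsl:apply-templates select='row'/></xsl:template>"
      + "<xsl:template match='row'>"
      + "<xsl:for-each select='value'>"
      + "<xsl:if test='position() &gt; 1'>,</xsl:if><xsl:value-of select='.'/>"
      + "</xsl:for-each><xsl:text>&#10;</xsl:text>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Runs one transformation per chunk of at most chunkSize rows,
    // appending each chunk's output to the same Writer.
    static void transformInChunks(Iterator<String[]> rows, int chunkSize, Writer out)
            throws Exception {
        SAXTransformerFactory factory =
            (SAXTransformerFactory) TransformerFactory.newInstance();
        // Compile the stylesheet once; Templates can be reused for every chunk.
        Templates templates =
            factory.newTemplates(new StreamSource(new StringReader(CSV_XSL)));
        AttributesImpl noAttrs = new AttributesImpl();

        while (rows.hasNext()) {
            // A new handler (and transformer) for each chunk document.
            TransformerHandler handler = factory.newTransformerHandler(templates);
            handler.setResult(new StreamResult(out));
            handler.startDocument();
            handler.startElement("", "dataset", "dataset", noAttrs);
            for (int n = 0; n < chunkSize && rows.hasNext(); n++) {
                handler.startElement("", "row", "row", noAttrs);
                for (String value : rows.next()) {
                    handler.startElement("", "value", "value", noAttrs);
                    handler.characters(value.toCharArray(), 0, value.length());
                    handler.endElement("", "value", "value");
                }
                handler.endElement("", "row", "row");
            }
            handler.endElement("", "dataset", "dataset");
            handler.endDocument(); // ends this chunk's document
        }
        out.flush();
    }

    public static void main(String[] args) throws Exception {
        // The two rows from the example result set in the original post.
        List<String[]> rows = Arrays.asList(
            new String[] {"1", "dog", "an animal may be dangerous"},
            new String[] {"2", "cat", "an animal likes milk"});
        transformInChunks(rows.iterator(), 1, new PrintWriter(System.out));
    }
}
```

With a text output method the per-chunk results can simply be concatenated. For XML or HTML output, the head and tail of the result would still have to be written by the caller around the loop, as noted in the original post, and the stylesheet would have to emit only the row fragments. The right chunk size is something to find by measurement.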