Subject: RE: [xsl] 10,000 document()'s
From: Peter Binkley <Peter.Binkley@xxxxxxxxxxx>
Date: Thu, 10 Apr 2003 12:26:51 -0600

Thanks to all who sent suggestions. It looked like David's suggestion of using Xalan's no-caching feature would let me move forward, but the thing still ground to a halt. So for now I'm following Charles' line and have written a PHP script to do the job. Too bad, though; I was looking forward to being able to say I'd done the whole job in XSL. Ultimately I see I need to get further into JAXP and learn to do these things properly. My conclusion is that even where XSL isn't the right tool for full-scale production, it's an awfully handy prototyping tool.

Peter

Peter Binkley
Digital Initiatives Technology Librarian
Information Technology Services
4-30 Cameron Library
University of Alberta Libraries
Edmonton, Alberta
Canada T6G 2J8
Phone: (780) 492-3743
Fax: (780) 492-9243
e-mail: peter.binkley@xxxxxxxxxxx

> -----Original Message-----
> From: Michael Kay [mailto:mhk@xxxxxxxxx]
> Sent: Wednesday, April 09, 2003 1:39 PM
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: RE: [xsl] 10,000 document()'s
>
> I would suggest writing a SAX filter that invokes the XSLT
> transformations (one transformation for each file) via JAXP,
> gets the result back in a StringWriter, and adds an element
> containing the word count to the output stream.
>
> Michael Kay
> Software AG
> home: Michael.H.Kay@xxxxxxxxxxxx
> work: Michael.Kay@xxxxxxxxxxxxxx
>
> > -----Original Message-----
> > From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx] On Behalf Of
> > Peter Binkley
> > Sent: 08 April 2003 17:06
> > To: 'xsl-list@xxxxxxxxxxxxxxxxxxxxxx'
> > Subject: [xsl] 10,000 document()'s
> >
> > I need advice on how to tackle this problem: I've got a file
> > that contains a list of about 10,000 other files, and I want
> > to process the list so as to add a wordcount for each of the
> > external files. Something like this:
> >
> > Input:
> >
> > <files>
> >   <file>
> >     <filename>/path/to/file/2844942.xml</filename>
> >   </file>
> >   <file> .... </file>
> > </files>
> >
> > Output:
> >
> > <files>
> >   <file>
> >     <filename>/path/to/file/2844942.xml</filename>
> >     <wordcount>2938</wordcount>
> >   </file>
> >   <file> .... </file>
> > </files>
> >
> > The obvious approach is to use a for-each loop that includes
> > a variable that opens the external file using a document()
> > call. The problem is that the process inevitably runs out of
> > memory, both with Saxon and Xalan. It seems that the
> > variables are passing out of scope and being destroyed as
> > they should, but I gather from a posting by Michael Kay
> > (http://www.biglist.com/lists/xsl-list/archives/200212/msg00507.html)
> > that all of those document() source trees are remaining in
> > memory throughout the transformation, adding up to megabytes
> > of data.
> >
> > Can anyone suggest a strategy? The process doesn't have to be
> > fast, it just has to finish.
> >
> > Peter Binkley

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
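
For anyone wanting to try the JAXP route that Michael Kay describes in the quoted message, here is a rough sketch of the core idea: run one transformation per file, capture the result in a StringWriter, and emit a wordcount element. It is a simplification, not his exact design: it uses a plain loop instead of a SAX filter, and it takes the file names as command-line arguments rather than reading them from the list file. The class name WordCounter, the embedded text-only stylesheet, and the whitespace-based definition of a "word" are all assumptions made for illustration.

```java
import java.io.File;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class WordCounter {
    // Hypothetical stylesheet: writes every text node followed by a space,
    // so the result is the document's text content as one string.
    private static final String TEXT_ONLY_XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='text()'>"
      + "<xsl:value-of select='.'/><xsl:text> </xsl:text>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Compiled once and reused for every file.
    private final Templates templates;

    WordCounter() throws TransformerException {
        templates = TransformerFactory.newInstance()
            .newTemplates(new StreamSource(new StringReader(TEXT_ONLY_XSLT)));
    }

    // One transformation per source; the source tree is only needed for
    // the duration of this call.
    int countWords(StreamSource source) throws TransformerException {
        Transformer transformer = templates.newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(source, new StreamResult(out));
        String text = out.toString().trim();
        // Assumption: a "word" is any run of non-whitespace characters.
        return text.isEmpty() ? 0 : text.split("\\s+").length;
    }

    // Builds the <files> output shown in the original question.
    String process(List<String> filenames) throws TransformerException {
        StringBuilder sb = new StringBuilder("<files>\n");
        for (String name : filenames) {
            int count = countWords(new StreamSource(new File(name)));
            sb.append("  <file>\n")
              .append("    <filename>").append(escape(name)).append("</filename>\n")
              .append("    <wordcount>").append(count).append("</wordcount>\n")
              .append("  </file>\n");
        }
        return sb.append("</files>\n").toString();
    }

    // Minimal escaping for element content.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;");
    }

    public static void main(String[] args) throws TransformerException {
        System.out.print(new WordCounter().process(Arrays.asList(args)));
    }
}
```

Because each file gets its own transformation, nothing in the code holds on to a source tree after countWords returns, which is the point of the suggestion as I read it; whether that is enough for 10,000 files in practice would still need to be tested with the processor in use.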