Re: [xsl] Transform a million XML documents

Subject: Re: [xsl] Transform a million XML documents
From: "Eliot Kimber ekimber@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 13 Feb 2017 15:23:22 -0000
I can report that collection() worked fine on my smaller test set of about 50K
documents. I will run a test against the full set of 1 million documents in the
next day or two. Again, this is a Saxon-specific feature.

Cheers,

E.

--
Eliot Kimber
http://contrext.com



On 2/13/17, 8:39 AM, "Matthew Stoeffler matthew.stoeffler@xxxxxxxxxx"
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

    I've done this on a smaller scale: about 44,000 input documents, minimum
of 2K per doc.  I chose to loop with the collection() function, send each input
node to a result tree written out with xsl:result-document to a temporary
working directory, and generate directly from the loop a shell script that then
moved all the temp files to a final location. I did this because I had a lot of
related asset files that also needed to move.  I was able to run this with
Saxon PE. I don't remember the run time, but it didn't seem excessive.
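
A minimal sketch of the loop described above, assuming XSLT 2.0 on Saxon. The
parameter names (input-dir, temp-dir), the directory URIs, the template name
"main", and the identity template standing in for the real transformation are
all hypothetical. The "?select=...;recurse=yes" query on the collection URI is
Saxon-specific, and the shell-script generation step is left out.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical locations; replace with the real directories. -->
  <xsl:param name="input-dir" select="'file:///data/in'"/>
  <xsl:param name="temp-dir" select="'file:///data/tmp'"/>

  <!-- Entry point: run as a named initial template (Saxon option -it:main). -->
  <xsl:template name="main">
    <!-- Saxon-specific: treat the directory as a collection, recursing
         into subdirectories and selecting only *.xml files. -->
    <xsl:for-each
        select="collection(concat($input-dir, '?select=*.xml;recurse=yes'))">
      <!-- Reuse the last path segment of the source URI as the result name. -->
      <xsl:variable name="name"
          select="tokenize(document-uri(.), '/')[last()]"/>
      <!-- Write one result file per input document. -->
      <xsl:result-document href="{$temp-dir}/{$name}">
        <xsl:apply-templates select="."/>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

  <!-- Placeholder transformation: an identity copy. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Because this sketch flattens every result into one directory, two inputs with
the same file name in different subdirectories would collide; a real run would
need to carry some of the source path into the result URI.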



    m./





    > On Feb 10, 2017, at 4:52 PM, Michael Kay mike@xxxxxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

    >>
    >> Here is a summary of the ensuing discussion.
    >>
    >> Scenario: There are a million XML documents that need to be
    >> transformed. Each file is in the 1-4KB range. The files are organized
    >> into directories about 4 or 5 deep and some directories have 100s or
    >> 1000s of files.
    >>
    >> Transforming a million files is easily handled by Saxon-EE,
    >
    > That is in no way a summary of what I wrote on that thread. I wrote,
    > much more cautiously "I can't see any particular reason why collection()
    > shouldn't handle it".
    >
    > Michael Kay
    > Saxonica
