Re: [xsl] Running the same transformation on many input files, optimisation possible?

Subject: Re: [xsl] Running the same transformation on many input files, optimisation possible?
From: "Dimitre Novatchev dnovatchev@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sun, 15 Dec 2019 19:07:28 -0000
Looking at my answer, I think it would be useful to have another
overload of the standard `collection()` function. For reasons we all
know, this would have to be a Saxon extension function:

saxon:collection($arg as xs:string?, $processWith as function(*),
$multiThread as xs:boolean) as item()*
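
To make the idea concrete, here is a minimal Java sketch of what such an
overload might do internally. It is only an analogue written against the
standard JAXP API, not an existing Saxon function: the class name
`CollectionSketch` and the methods `compile`, `transform` and `collection`
are hypothetical names chosen for illustration. The stylesheet is compiled
once into a `Templates` object, and a fresh `Transformer` is created for each
document, because JAXP specifies `Templates` as thread-safe and `Transformer`
as not.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical illustration only; none of these names come from Saxon.
final class CollectionSketch {

    // Compile the stylesheet once. The resulting Templates object can be
    // shared by all threads and reused for every input document.
    static Templates compile(String xslt) {
        try {
            return TransformerFactory.newInstance()
                    .newTemplates(new StreamSource(new StringReader(xslt)));
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    // Run one document (given as a string) through the compiled stylesheet.
    static String transform(Templates templates, String xml) {
        try {
            // A Transformer must not be shared between threads: one per document.
            Transformer t = templates.newTransformer();
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    // Analogue of the proposed overload: apply processWith to every item,
    // in parallel if multiThread is true. Results keep the input order.
    static <T, R> List<R> collection(List<T> items,
                                     Function<T, R> processWith,
                                     boolean multiThread) {
        Stream<T> s = multiThread ? items.parallelStream() : items.stream();
        return s.map(processWith).collect(Collectors.toList());
    }
}
```

With this shape, `collection(docs, d -> transform(templates, d), true)` would
run every document through the same compiled stylesheet inside one JVM, and
the boolean decides whether the work is spread across the available cores.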



Cheers,
Dimitre

On Sun, Dec 15, 2019 at 9:49 AM Dimitre Novatchev <dnovatchev@xxxxxxxxx>
wrote:

> I would definitely use the `collection()` function, and then try to
> process the documents in parallel using the `saxon:threads` extension
> attribute with a value dependent on the number of cores on the machine.
>
>
>
> https://www.saxonica.com/html/documentation/extensions/attributes/threads.html
>
>
> Trying to generalize this a little bit further, if we have N machines, we
> could send N HTTP requests (why not by using the `document()` function?),
> giving each machine a non-overlapping pattern for the set of files it
> should process.
>
> Of course, besides using `collection()` extensively, I haven't ever tried
> the other stuff I proposed above -- would be really interesting to try.
>
> Cheers,
> Dimitre
>
> On Sun, Dec 15, 2019 at 1:02 AM Trevor Nicholls trevor@xxxxxxxxxxxxxxxxxx
> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
>> Hi
>>
>>
>>
>> An application I am working on contains a large number of source
>> documents which are all run through the same series of transformations.
>> While initially the build process didn't take long, the cost of repeatedly
>> initialising the XSL processor soon adds up, so I am looking at ways to
>> streamline it.
>>
>>
>>
>> Our processor of choice is Saxon (currently we are using 8.7.3) so I can
>> shift this question to the Saxon list if there are extensions there that
>> are relevant.
>>
>>
>>
>> So the question: given a script that essentially includes the following:
>>
>>
>>
>> cd documents
>> for d in `cat dlist`; do
>>   cd $d
>>   for f in `cat flist`; do
>>     java -jar $SAXONDIR/saxon8.jar  -o  $f.new.xml  $f.xml \
>>       $SCRIPTDIR/transform.xsl  doc=$d  file=$f
>>   done
>> done
>>
>>
>>
>> is there a mechanism which would allow a single Java process to perform
>> the equivalent?
>>
>>
>>
>> Thanks
>>
>> T
>>
>>
>>
>
>
> --
> Cheers,
> Dimitre Novatchev
> ---------------------------------------
> Truly great madness cannot be achieved without significant intelligence.
> ---------------------------------------
> To invent, you need a good imagination and a pile of junk
> -------------------------------------
> Never fight an inanimate object
> -------------------------------------
> To avoid situations in which you might make mistakes may be the
> biggest mistake of all
> ------------------------------------
> Quality means doing it right when no one is looking.
> -------------------------------------
> You've achieved success in your field when you don't know whether what
> you're doing is work or play
> -------------------------------------
> To achieve the impossible dream, try going to sleep.
> -------------------------------------
> Facts do not cease to exist because they are ignored.
> -------------------------------------
> Typing monkeys will write all Shakespeare's works in 200yrs.Will they
> write all patents, too? :)
> -------------------------------------
> Sanity is madness put to good use.
> -------------------------------------
> I finally figured out the only reason to be alive is to enjoy it.
>
>


