Subject: Re: [xsl] Running the same transformation on many input files, optimisation possible?
From: "Rolf Kleef rolf@xxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 16 Dec 2019 14:43:52 -0000
The way I do this is indeed with Ant: Ant compiles the XSLT once, then applies it to every input file whose output file is older than the input file or doesn't exist yet (which may provide another optimisation).

I use a build.xml like this, and run `ant transform-files`:

    <project>
      <target name="transform-files">
        <xslt basedir="/workspace/input/"
              includes="*.xml"
              destdir="/workspace/tmp"
              extension=".new.xml"
              style="transform.xslt" />
      </target>
    </project>

Instead of the basedir and includes attributes, you should be able to create "filelist" or "fileset" collections of files to be processed inside the <xslt> tags. There are ways to combine these, to end up with a single list of input files and benefit from a single XSLT compilation.

https://ant.apache.org/manual/Types/filelist.html
https://ant.apache.org/manual/Types/fileset.html

~~Rolf.

On Sun, 2019-12-15 at 22:12 +0000, Michael Kay mike@xxxxxxxxxxxx wrote:
> Note that there's a double overhead here: firstly you're bringing up
> a new Java VM for each transformation, and secondly you're
> recompiling the stylesheet for each transformation.
>
> You can avoid the Java loading overhead by using ant or XProc, but
> I'm not sure either of them will avoid the overhead of recompiling
> the stylesheet; though if you use a recent Saxon version, you could
> achieve that by reloading the stylesheet from a pre-compiled SEF
> (stylesheet export file).
>
> You could write your own Java application to control the process,
> invoking Saxon via the JAXP or s9api APIs - both allow you to compile
> a stylesheet once and execute it repeatedly.
>
> You might be able to write the control loop in XSLT, for example by
> using the collection() function, or functions in the EXPath file
> module. However, this could require stylesheet changes if your XSLT
> code binds global variables to values derived from the source
> document.
>
> In very simple cases you can take advantage of the fact that the -s
> option for the Saxon command line can be a directory, in which case
> all the input files are transformed to corresponding files in the -o
> directory.
>
> Michael Kay
> Saxonica
>
> > On 15 Dec 2019, at 09:03, Trevor Nicholls trevor@xxxxxxxxxxxxxxxxxx
> > <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > Hi
> >
> > An application I am working on contains a large number of source
> > documents which are all run through the same series of
> > transformations. While initially the build process didn't take long
> > the cost of repeatedly initialising the XSL processor soon adds up,
> > so I am looking at ways to streamline it.
> >
> > Our processor of choice is Saxon (currently we are using 8.7.3) so
> > I can shift this question to the Saxon list if there are extensions
> > there that are relevant.
> >
> > So the question; given a script that essentially includes the
> > following:
> >
> >   cd documents
> >   for d in `cat dlist`; do
> >     cd $d
> >     for f in `cat flist`; do
> >       java -jar $SAXONDIR/saxon8.jar -o $f.new.xml $f.xml $SCRIPTDIR/transform.xsl doc=$d file=$f
> >     done
> >   done
> >
> > is there a mechanism which would allow a single Java process to
> > perform the equivalent?
> >
> > Thanks
> > T
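
For anyone who wants to try the "write your own Java application" route mentioned in the quoted reply, here is a minimal sketch using the standard JAXP API (javax.xml.transform). It is not code from anyone in this thread. The class name BatchTransform, the method transformAll, the three-argument command line and the ".new.xml" output naming are assumptions made for illustration, and only the "file" parameter from the original script is passed through. Which XSLT processor runs depends on what JAXP finds on the classpath: with a Saxon jar present it would normally be Saxon, otherwise the JDK's built-in XSLT 1.0 processor.

```java
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class: one JVM, one stylesheet compilation, many inputs.
public class BatchTransform {

    /**
     * Compiles the stylesheet once, then applies it to every *.xml file
     * found directly in inputDir, writing NAME.new.xml into outputDir.
     * Assumes outputDir is a different directory from inputDir, so that
     * generated files are not picked up again as inputs.
     * Returns the number of files transformed.
     */
    public static int transformAll(Path stylesheet, Path inputDir, Path outputDir)
            throws TransformerException, IOException {
        TransformerFactory factory = TransformerFactory.newInstance();

        // The expensive step, done a single time: Templates is the
        // compiled, reusable form of the stylesheet.
        Templates compiled = factory.newTemplates(new StreamSource(stylesheet.toFile()));

        Files.createDirectories(outputDir);
        int count = 0;
        try (DirectoryStream<Path> inputs = Files.newDirectoryStream(inputDir, "*.xml")) {
            for (Path input : inputs) {
                String name = input.getFileName().toString();
                String base = name.substring(0, name.length() - ".xml".length());
                Path output = outputDir.resolve(base + ".new.xml");

                // A Transformer is cheap to create from Templates; use a
                // fresh one per document so parameters do not leak between runs.
                Transformer transformer = compiled.newTransformer();
                transformer.setParameter("file", base);
                transformer.transform(new StreamSource(input.toFile()),
                                      new StreamResult(output.toFile()));
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("usage: BatchTransform STYLESHEET INPUT_DIR OUTPUT_DIR");
            System.exit(2);
        }
        int n = transformAll(Paths.get(args[0]), Paths.get(args[1]), Paths.get(args[2]));
        System.out.println("Transformed " + n + " file(s)");
    }
}
```

Something like `java -cp saxon8.jar:. BatchTransform transform.xsl documents/d1 out/d1` would then replace the inner shell loop for one directory; the classpath and directory names here are only placeholders.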