[xsl] Re: [saxon] Batching into files using streaming

Subject: [xsl] Re: [saxon] Batching into files using streaming
From: "David Rudel fwqhgads@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 16 Sep 2016 18:58:30 -0000
It seems to me that you can use xsl:iterate with streaming to do this.

The syntax is discussed here:
http://www.saxonica.com/html/documentation/sourcedocs/streaming/stream-with-iterate.html

So you could iterate with two counters: one tracks which batch you are
on, for file-naming purposes, and the other counts from 1 to
batch_size. When the second counter hits batch_size, you call
<xsl:result-document> to write out that chunk of output, reset that
counter, and increment the batch counter. Any leftover partial batch
(the last 4 colleges in your example) can be written out from
xsl:on-completion.
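
A minimal, untested sketch of that idea follows. Several things in it are
my assumptions rather than anything from your stylesheet: the relative
output name CollegeBatch{$batch}.xml (which needs a base output URI to
resolve against), the parameter names $batch and $pending, and the use of
count() on the buffered elements in place of an explicit second counter.
It only batches the College elements and does not yet copy the
surrounding elements into each file. Streaming needs a processor that
supports it, such as Saxon-EE.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:mode streamable="yes"/>
  <xsl:output method="xml" indent="yes"/>

  <xsl:param name="BatchSize" as="xs:integer" select="10"/>

  <xsl:template match="/">
    <xsl:iterate select="Country/University/Colleges/College">
      <!-- $batch: number of the file being filled (assumed name) -->
      <xsl:param name="batch" as="xs:integer" select="1"/>
      <!-- $pending: copies of the Colleges buffered so far (assumed name) -->
      <xsl:param name="pending" as="element(College)*" select="()"/>

      <!-- After the last College, flush a partial batch if one remains -->
      <xsl:on-completion>
        <xsl:if test="exists($pending)">
          <xsl:result-document href="CollegeBatch{$batch}.xml">
            <BatchedColleges>
              <xsl:sequence select="$pending"/>
            </BatchedColleges>
          </xsl:result-document>
        </xsl:if>
      </xsl:on-completion>

      <!-- The only consuming step: copy the current streamed College -->
      <xsl:variable name="buffered" as="element(College)+"
                    select="($pending, copy-of(.))"/>

      <xsl:choose>
        <xsl:when test="count($buffered) = $BatchSize">
          <!-- Batch is full: write it, then start the next one empty -->
          <xsl:result-document href="CollegeBatch{$batch}.xml">
            <BatchedColleges>
              <xsl:sequence select="$buffered"/>
            </BatchedColleges>
          </xsl:result-document>
          <xsl:next-iteration>
            <xsl:with-param name="batch" select="$batch + 1"/>
            <xsl:with-param name="pending" select="()"/>
          </xsl:next-iteration>
        </xsl:when>
        <xsl:otherwise>
          <!-- Batch not full yet: keep buffering, $batch is unchanged -->
          <xsl:next-iteration>
            <xsl:with-param name="pending" select="$buffered"/>
          </xsl:next-iteration>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:iterate>
  </xsl:template>

</xsl:stylesheet>
```

At most $BatchSize copied elements are held in memory at a time, and the
input is read once, so there is no nested xsl:stream over the same file.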

I think the more recent versions of Saxon support the snapshot()
function when streaming, so you could also recover ancestor
information if you need to put that into the output.
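
As a rough illustration (untested, and assuming the XSLT 3.0 snapshot()
function is available in your processor's streaming mode), the sketch
below takes a snapshot of each College and reads attributes from the
copied ancestors. The output element and attribute names are made up for
the example.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:mode streamable="yes"/>
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <Colleges>
      <xsl:for-each select="Country/University/Colleges/College">
        <!-- snapshot(.) copies the College together with its ancestors
             and their attributes, but not the ancestors' other children -->
        <xsl:variable name="snap" select="snapshot(.)"/>
        <!-- country/university are hypothetical output attribute names -->
        <College country="{$snap/ancestor::Country/@code}"
                 university="{$snap/ancestor::University/@name}">
          <xsl:value-of select="$snap"/>
        </College>
      </xsl:for-each>
    </Colleges>
  </xsl:template>

</xsl:stylesheet>
```

Note that a snapshot does not include siblings of the ancestors, so it
would not by itself give you country-info, politics and so on.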



On Fri, Sep 16, 2016 at 11:48 AM, Mailing Lists Mail
<daktapaal@xxxxxxxxx> wrote:
>
> Hello All,
>
> I have this requirement :
>
> I have to write an XSLT which will create files of a specific batch size.
> For example
> For 44 college elements and a batch size of 10, the XSLT will produce 5 files
> with a max of 10 colleges in each file... There are elements preceding and
> following <College> (country-info, politics, sports, etc.) and these will
> have to be copied as is into each batch file
>
> XML Sample
>
> <?xml version="1.0" encoding="UTF-8"?>
>
> <Country code="GB">
>
>             <country-info>
>
>                         <tourism/>
>
>                         <population/>
>
>                         <counties/>
>
>             </country-info>
>
>             <University name="Oxford University">
>
>                         <Colleges>
>
>                                     <College>All Souls College</College>
>
>                                     <College>Balliol College</College>
>
>                                     <College>Blackfriars</College>
>
>                                     <College>Brasenose College</College>
>
>                                     <College>Campion Hall</College>
>
>                                     <College>Christ Church</College>
>
>                                     <College>Corpus Christi
> College</College>
>
>                                     <College>Exeter College</College>
>
>                                     <College>Green Templeton
> College</College>
>
>                                     <College>Harris Manchester
> College</College>
>
>                                     <College>Hertford College</College>
>
>                                     <College>Jesus College</College>
>
>                                     <College>Keble College</College>
>
>                                     <College>Kellogg College</College>
>
>                                     <College>Lady Margaret Hall</College>
>
>                                     <College>Linacre College</College>
>
>                                     <College>Lincoln College</College>
>
>                                     <College>Magdalen College</College>
>
>                                     <College>Mansfield College</College>
>
>                                     <College>Merton College</College>
>
>                                     <College>New College</College>
>
>                                     <College>Nuffield College</College>
>
>                                     <College>Oriel College</College>
>
>                                     <College>Pembroke College</College>
>
>                                     <College>The Queen's College</College>
>
>                                     <College>Regent's Park College</College>
>
>                                     <College>St Anne's College</College>
>
>                                     <College>St Antony's College</College>
>
>                                     <College>St Benet's Hall</College>
>
>                                     <College>St Catherine's
> College</College>
>
>                                     <College>St Cross College</College>
>
>                                     <College>St Edmund Hall</College>
>
>                                     <College>St Hilda's College</College>
>
>                                     <College>St Hugh's College</College>
>
>                                     <College>St John's College</College>
>
>                                     <College>St Peter's College</College>
>
>                                     <College>St Stephen's House</College>
>
>                                     <College>Somerville College</College>
>
>                                     <College>Trinity College</College>
>
>                                     <College>University College</College>
>
>                                     <College>Wadham College</College>
>
>                                     <College>Wolfson College</College>
>
>                                     <College>Worcester College</College>
>
>                                     <College>Wycliffe Hall</College>
>
>                         </Colleges>
>
>             </University>
>
>             <politics/>
>
>             <sports/>
>
>             <airports/>
>
>             <science-relegion/>
>
> </Country>
>
>
> ..
>
>
>
> I tried the following code but I realized I am breaking the Streaming rules...
>
>
>
> <xsl:transform version="3.0"
> xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> xmlns:xs="http://www.w3.org/2001/XMLSchema">
>
>             <xsl:mode name="batch" streamable="yes"
> on-no-match="shallow-copy"/>
>
>             <xsl:output method="xml" indent="yes"/>
>
>
>             <xsl:param name="fileHref"
> select="'file:///E:/stylesheets/TestBed/InputSource/University.xml'"/>
>
>             <xsl:param name="BatchSize" select="10"/>
>
>             <xsl:template match="/">
>
>                         <xsl:stream href="{$fileHref}">
>
>                                     <xsl:sequence>
>
>                                                 <xsl:for-each-group
> select="/Country/University/Colleges/College" group-adjacent="(position()
> -1) idiv $BatchSize">
>
> <xsl:result-document href="file:///E:/stylesheets/TestBed/Result/CollegeBatch{position()}.xml">
>
>
> <xsl:stream href="{$fileHref}">
>
>
> <xsl:sequence>
>
>
> <xsl:apply-templates mode="batch">
>
>
> <xsl:with-param name="current-group" select="current-group()" tunnel="yes"/>
>
>
> </xsl:apply-templates>
>
>
> </xsl:sequence>
>
>
> </xsl:stream>
>
>
> </xsl:result-document>
>
>                                                 </xsl:for-each-group>
>
>                                     </xsl:sequence>
>
>                         </xsl:stream>
>
>             </xsl:template>
>
>             <xsl:template match="*:Colleges" mode="batch">
>
>                         <xsl:param name="current-group" tunnel="yes"/>
>
>                         <BatchedColleges>
>
>                                     <xsl:copy-of select="$current-group"/>
>
>                         </BatchedColleges>
>
>             </xsl:template>
>
> </xsl:transform>
>
>
>
> I tried to change for-each-group to
> <xsl:for-each-group select="/Country/University/Colleges/College/copy-of(.)"
> group-adjacent="(position() -1) idiv $BatchSize">
>
>
>
> Which works but does not copy the right Colleges... or ends up completely
> messing up the numbers. Where did I go wrong?
>
>
> Your help is appreciated.
> DakTapaal
>
>
>
> ------------------------------------------------------------------------------
>
> _______________________________________________
> saxon-help mailing list archived at http://saxon.markmail.org/
> saxon-help@xxxxxxxxxxxxxxxxxxxxx
> https://lists.sourceforge.net/lists/listinfo/saxon-help



--

"A false conclusion, once arrived at and widely accepted is not
dislodged easily, and the less it is understood, the more tenaciously
it is held." - Cantor's Law of Preservation of Ignorance.
