[xsl] How To Use Streaming To Group Elements in a Flat List?

Subject: [xsl] How To Use Streaming To Group Elements in a Flat List?
From: "Eliot Kimber ekimber@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 2 May 2017 20:55:04 -0000
I have some very large (100s of MBs) XML database dump docs that I want to
break into smaller docs. This is an easy application of for-each-group or of a
simple tail recursion approach but I wanted to use this as an opportunity to
learn more about XSLT 3 streaming.

I've read through the XSLT 3 spec and I think I generally understand the
options, but it's still not clear either how or how best to do this type of
grouping so that it's streamable. I didn't find any examples of this
specific use case searching on "xslt streaming with grouping" (other than
older items that don't actually work).

If my source looks like this:

<ROWDATA>
    <ROW>
        <SRVC_CAT_ID>54</SRVC_CAT_ID>
        <PARENT_ID>3</PARENT_ID>
        <SRVC_CAT_NAME>Exterior Lights</SRVC_CAT_NAME>
        <PARENT_NAME>Accessories and Body, Cab</PARENT_NAME>
    </ROW>
    <ROW>
        <SRVC_CAT_ID>53</SRVC_CAT_ID>
        <PARENT_ID>3</PARENT_ID>
        <SRVC_CAT_NAME>Exterior Body Panels</SRVC_CAT_NAME>
        <PARENT_NAME>Accessories and Body, Cab</PARENT_NAME>
    </ROW>
    <ROW>
        <SRVC_CAT_ID>51</SRVC_CAT_ID>
        <PARENT_ID>3</PARENT_ID>
        <SRVC_CAT_NAME>Entertainment Systems</SRVC_CAT_NAME>
        <PARENT_NAME>Accessories and Body, Cab</PARENT_NAME>
    </ROW>
    <ROW>
        <SRVC_CAT_ID>40</SRVC_CAT_ID>
        <PARENT_ID>3</PARENT_ID>
        <SRVC_CAT_NAME>Door Locks &amp; Anti-Theft Systems</SRVC_CAT_NAME>
        <PARENT_NAME>Accessories and Body, Cab</PARENT_NAME>
    </ROW>
    ... lots more rows ...
</ROWDATA>

I'd like to generate result files containing 1000 records each, each wrapped
in the same root element.

The non-streaming for-each-group is simple enough:

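    <xsl:template match="ROWDATA">
        <!-- $outdir is a parameter declared elsewhere in the stylesheet -->
        <xsl:variable name="resultURIbase" as="xs:string"
            select="concat($outdir, '/rowdata-')"/>
        <xsl:variable name="rootname" as="xs:string" select="name(.)"/>

        <!-- "mod 1000 = 1" starts a group at rows 1, 1001, 2001, ... so every
             group (except possibly the last) holds exactly 1000 rows;
             "= 0" would leave only 999 rows in the first group -->
        <xsl:for-each-group select="ROW"
            group-starting-with="*[(position() mod 1000) = 1]">
            <xsl:result-document
                href="{concat($resultURIbase, generate-id(), '.xml')}">
                <xsl:element name="{$rootname}">
                    <xsl:copy-of select="current-group()"/>
                </xsl:element>
            </xsl:result-document>
        </xsl:for-each-group>

    </xsl:template>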


But I'm not seeing how to do this using, e.g., xsl:iterate. As is often the case
with XSLT, I feel like I'm missing the obvious.

Is it in fact possible to do what I want in a streamable way?
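
For reference, here is a minimal sketch of one formulation that may be
streamable. It is untested here and assumes an XSLT 3.0 processor with
streaming support. It swaps group-starting-with for group-adjacent, keyed on
the motionless expression (position() - 1) idiv 1000, because a pattern with a
positional predicate is unlikely to satisfy the streamability rules. The
default value of $outdir and the use of the grouping key in the file name are
assumptions made for illustration, not part of the stylesheet above.

    <xsl:stylesheet version="3.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        exclude-result-prefixes="xs">

        <!-- Assumed parameter: directory that receives the result files -->
        <xsl:param name="outdir" as="xs:string" select="'out'"/>

        <!-- Process the unnamed mode in streaming fashion -->
        <xsl:mode streamable="yes" on-no-match="shallow-copy"/>

        <xsl:template match="ROWDATA">
            <!-- name(.) only inspects the current node, so it does not
                 consume the stream -->
            <xsl:variable name="rootname" as="xs:string" select="name(.)"/>

            <!-- Rows 1-1000 get key 0, rows 1001-2000 get key 1, and so on -->
            <xsl:for-each-group select="ROW"
                group-adjacent="(position() - 1) idiv 1000">
                <xsl:result-document
                    href="{$outdir}/rowdata-{current-grouping-key()}.xml">
                    <xsl:element name="{$rootname}">
                        <xsl:copy-of select="current-group()"/>
                    </xsl:element>
                </xsl:result-document>
            </xsl:for-each-group>
        </xsl:template>

    </xsl:stylesheet>

Whether a given processor accepts this as guaranteed-streamable would need to
be checked against its own streamability analysis.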

Thanks,

Eliot

--
Eliot Kimber
http://contrext.com
