Re: [xsl] How To Use Streaming To Group Elements in a Flat List?

Subject: Re: [xsl] How To Use Streaming To Group Elements in a Flat List?
From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 2 May 2017 22:25:28 -0000
Running your code on Saxon 9.7, I get

  XTSE3430: Template rule is declared streamable but it does not satisfy the
streamability rules.
  * The xsl:for-each-group/@group-starting-with pattern is not motionless

That's because the predicate in *[(position() mod 1000) = 0] involves counting
preceding siblings. Or, to look at it another way, the pattern can't be
evaluated simply by looking at the node in isolation; it has to examine its
position relative to other nodes in the document.

But there's an easy workaround: use group-adjacent="(position() - 1) idiv
1000". With this formulation, position() is counting the items being grouped,
not the number of siblings they have.
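
As a rough illustration of the arithmetic only (written in Python rather than
XPath so it can be run on its own; the function name group_key is made up for
this sketch and is not part of any XSLT API), the grouping key stays constant
for each run of 1000 consecutive positions:

```python
def group_key(position: int, size: int = 1000) -> int:
    """Mirror the XPath expression (position() - 1) idiv size.

    Assumes a 1-based position, as position() returns in XPath.
    For positive integers Python's // gives the same result as idiv.
    """
    return (position - 1) // size


# Print the key at the boundaries between groups.
for pos in (1, 999, 1000, 1001, 2000, 2001):
    print(pos, group_key(pos))
```

Positions 1 to 1000 share key 0, positions 1001 to 2000 share key 1, and so on,
so group-adjacent closes one group and opens the next each time the key changes.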

Here's the full stylesheet:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="3.0">

    <xsl:mode streamable="yes"/>

    <xsl:template match="ROWDATA">
        <xsl:variable name="resultURIbase" as="xs:string"
            select="concat('out', '/rowdata-')"
        />
        <xsl:variable name="rootname" as="xs:string" select="name(.)"/>

        <xsl:for-each-group select="ROW"
            group-adjacent="(position() - 1) idiv 1000">
            <xsl:result-document
                href="{concat($resultURIbase, generate-id(), '.xml')}">
                <xsl:element name="{$rootname}">
                    <xsl:copy-of select="current-group()"/>
                </xsl:element>
            </xsl:result-document>
        </xsl:for-each-group>

    </xsl:template>

</xsl:stylesheet>


> On 2 May 2017, at 21:55, Eliot Kimber ekimber@xxxxxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> I have some very large (100s of MBs) XML database dump docs that I want to
> break into smaller docs. This is an easy application of for-each-group or of a
> simple tail recursion approach but I wanted to use this as an opportunity to
> learn more about XSLT 3 streaming.
>
> I've read through the XSLT 3 spec and I think I generally understand the
> options but it's still not clear either how or how best to do this type of
> grouping so that it's streamable. I didn't find any examples of this
> specific use case searching on "xslt streaming with grouping" (other than
> older items that don't actually work).
>
> If my source looks like this:
>
> <ROWDATA>
> <ROW><SRVC_CAT_ID>54</SRVC_CAT_ID><PARENT_ID>3</PARENT_ID><SRVC_CAT_NAME>Exterior Lights</SRVC_CAT_NAME><PARENT_NAME>Accessories and Body, Cab</PARENT_NAME></ROW>
> <ROW><SRVC_CAT_ID>53</SRVC_CAT_ID><PARENT_ID>3</PARENT_ID><SRVC_CAT_NAME>Exterior Body Panels</SRVC_CAT_NAME><PARENT_NAME>Accessories and Body, Cab</PARENT_NAME></ROW>
> <ROW><SRVC_CAT_ID>51</SRVC_CAT_ID><PARENT_ID>3</PARENT_ID><SRVC_CAT_NAME>Entertainment Systems</SRVC_CAT_NAME><PARENT_NAME>Accessories and Body, Cab</PARENT_NAME></ROW>
> <ROW><SRVC_CAT_ID>40</SRVC_CAT_ID><PARENT_ID>3</PARENT_ID><SRVC_CAT_NAME>Door Locks &amp; Anti-Theft Systems</SRVC_CAT_NAME><PARENT_NAME>Accessories and Body, Cab</PARENT_NAME></ROW>
> ... lots more rows ...
> </ROWDATA>
>
> I'd like to generate result files containing 1000 records each, each
> wrapped in the same root element.
>
> The non-stream for-each-group is simple enough:
>
>    <xsl:template match="ROWDATA">
>        <xsl:variable name="resultURIbase" as="xs:string"
>            select="concat($outdir, '/rowdata-')"
>        />
>        <xsl:variable name="rootname" as="xs:string" select="name(.)"/>
>
>        <xsl:for-each-group select="ROW"
>            group-starting-with="*[(position() mod 1000) = 0]">
>            <xsl:result-document
>                href="{concat($resultURIbase, generate-id(), '.xml')}">
>                <xsl:element name="{$rootname}">
>                    <xsl:copy-of select="current-group()"/>
>                </xsl:element>
>            </xsl:result-document>
>        </xsl:for-each-group>
>
>    </xsl:template>
>
>
> But I'm not seeing how to do this using, e.g., xsl:iterate. As is often the
> case with XSLT, I feel like I'm missing the obvious.
>
> Is it in fact possible to do what I want in a streamable way?
>
> Thanks,
>
> Eliot
>
> --
> Eliot Kimber
> http://contrext.com
