Re: [xsl] accumulators and continuous numbering

Subject: Re: [xsl] accumulators and continuous numbering
From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 18 Oct 2019 20:16:30 -0000
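Yes, that's how accumulators work: each tree is traversed separately, and the
accumulator variable starts again from its initial value for each traversal.
To quote the spec: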

Informally, an accumulator is evaluated by traversing a tree, as follows.

Each node is visited twice, once before processing its descendants, and once
after processing its descendants....

Before the traversal starts, a variable (called the accumulator variable) is
initialized to the value of the expression given as the initial-value
attribute....

Each node is labeled with a pre-descent value for the accumulator, which is
the value of the accumulator variable immediately after processing the first
visit to that node, and with a post-descent value for the accumulator, which
is the value of the accumulator variable immediately after processing the
second visit.
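
As an illustration of the per-tree traversal described above, here is a
minimal, self-contained sketch. It is not taken from the thread: the
accumulator name `count` and the `doc`, `p`, and `bucket` elements are made up
for the example, and it uses small in-memory trees rather than the OOXML
sources. It evaluates one accumulator over two separate trees and then over a
single tree that wraps copies of both.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output indent="yes"/>

  <!-- Hypothetical accumulator: counts p elements within ONE tree. -->
  <xsl:accumulator name="count" as="xs:integer" initial-value="0">
    <xsl:accumulator-rule match="p" select="$value + 1"/>
  </xsl:accumulator>

  <!-- Two separate trees, each with its own document node. -->
  <xsl:variable name="docs" as="document-node()+">
    <xsl:document><doc><p/><p/></doc></xsl:document>
    <xsl:document><doc><p/><p/></doc></xsl:document>
  </xsl:variable>

  <!-- One tree holding copies of both doc elements under a wrapper. -->
  <xsl:variable name="bucket" as="document-node()">
    <xsl:document>
      <bucket><xsl:copy-of select="$docs/*"/></bucket>
    </xsl:document>
  </xsl:variable>

  <xsl:template name="xsl:initial-template">
    <result>
      <!-- Pre-descent values for each p, tree by tree. -->
      <separate values="{$docs//p ! accumulator-before('count')}"/>
      <!-- Pre-descent values for each p in the single wrapped tree. -->
      <combined values="{$bucket//p ! accumulator-before('count')}"/>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

Run with the initial template as the entry point (for example via Saxon's -it
option). The `separate` attribute should read `1 2 1 2` and `combined` should
read `1 2 3 4`, since the accumulator variable is initialized once per tree.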


Michael Kay
Saxonica


> On 18 Oct 2019, at 20:08, Graydon graydon@xxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> So I'm converting a bunch of OOXML into DITA using XSLT 3 in oXygen with Saxon
> 9.8.0.12 PE.
>
> There's a requirement to use short (4 digit) unique identifiers for the
> resulting files and for these to be unique across the content set.  There are
> many resulting DITA files per source OOXML document.
>
> "A chance to try accumulators!" I thought; I can pre-process the whole thing
> with file numbers in attributes.  This isn't an attempt to use streaming; I've
> got a few tens of MB of source.
>
> The accumulator works, BUT the number sequence restarts if I try to use the
> accumulator across a sequence of document nodes or a sequence of element nodes.
>
> To produce a single continuous sequence of monotonically increasing integers
> with the accumulator, I have to get the whole content set into the same tree
> before creating the numbering attributes.
>
> Is that the expected behaviour?
>
> What I have defines the accumulator,
>
> <xsl:accumulator as="xs:integer" initial-value="0" name="documentNumber">
>  <xsl:accumulator-rule
>    match="w:p[w:pPr/w:pStyle/@w:val = $listOfFileStyles]"
>    select="$value + 1">
>  </xsl:accumulator-rule>
> </xsl:accumulator>
>
> pulls the content set into a variable using the file and arch extensions:
>
> <xsl:variable as="element(w:document)+" name="contentSet">
>  <xsl:for-each select="$OOXMLsrc"> <!-- a sequence of path strings -->
>    <xsl:sequence
>      select="(file:path-to-uri(.) => file:read-binary() =>
>              arch:extract-text('word/document.xml') => parse-xml())/*" />
>  </xsl:for-each>
> </xsl:variable>
>
> then applies the numbering template
>
> <xsl:variable as="element(w:document)+" name="numberedSrc">
>  <xsl:apply-templates mode="fileNumber" select="$contentSet" />
> </xsl:variable>
>
> <xsl:mode name="fileNumber" on-no-match="shallow-copy" />
>
> <xsl:template
>    match="w:p[w:pPr/w:pStyle/@w:val = $listOfFileStyles]"
>    mode="fileNumber">
>  <xsl:copy>
>    <xsl:apply-templates mode="fileNumber" select="@*" />
>    <xsl:attribute name="fileNumber"
>      select="accumulator-before('documentNumber') => format-number('0000')" />
>    <xsl:apply-templates mode="fileNumber" select="node()" />
>  </xsl:copy>
> </xsl:template>
>
> It works; the correct paragraphs are numbered.  But the sequence restarts for
> each new member of the sequence.
>
> If instead I use:
>
> <xsl:variable as="element(bucket)+" name="contentSet">
>  <bucket>
>    <xsl:for-each select="$OOXMLsrc"> <!-- a sequence of path strings -->
>      <xsl:sequence
>        select="(file:path-to-uri(.) => file:read-binary() =>
>                arch:extract-text('word/document.xml') => parse-xml())/*" />
>    </xsl:for-each>
>  </bucket>
> </xsl:variable>
>
> <xsl:variable as="element(bucket)" name="numberedSrc">
>  <xsl:apply-templates mode="fileNumber" select="$contentSet" />
> </xsl:variable>
>
> I get one continuous sequence of numbers.
>
> Is there a better way to get the single continuous sequence of numbers?
>
> Thanks!
>
> -- Graydon
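
One possible alternative that keeps the trees separate, sketched here under
assumptions that are not stated in the thread: give each tree an offset equal
to the number of matching paragraphs in the trees before it, and add that
offset to the per-tree accumulator value. The `offset` tunnel parameter, the
accumulator name `count`, the mode name `number`, and the toy `doc`/`p` input
are hypothetical stand-ins for the real `fileNumber` mode and `w:p` match.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output indent="yes"/>

  <!-- Hypothetical accumulator: restarts at 0 for every tree. -->
  <xsl:accumulator name="count" as="xs:integer" initial-value="0">
    <xsl:accumulator-rule match="p" select="$value + 1"/>
  </xsl:accumulator>

  <!-- Toy input: two separate trees with 2 and 1 matching elements. -->
  <xsl:variable name="docs" as="document-node()+">
    <xsl:document><doc><p/><p/></doc></xsl:document>
    <xsl:document><doc><p/></doc></xsl:document>
  </xsl:variable>

  <xsl:mode name="number" on-no-match="shallow-copy"/>

  <xsl:template match="p" mode="number">
    <!-- Number of matches in all earlier trees (0 for the first tree). -->
    <xsl:param name="offset" as="xs:integer" tunnel="yes" select="0"/>
    <xsl:copy>
      <xsl:apply-templates select="@*" mode="number"/>
      <xsl:attribute name="fileNumber"
        select="format-number($offset + accumulator-before('count'), '0000')"/>
      <xsl:apply-templates select="node()" mode="number"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template name="xsl:initial-template">
    <result>
      <xsl:for-each select="$docs">
        <xsl:variable name="pos" select="position()"/>
        <xsl:apply-templates select="." mode="number">
          <!-- Count matches in the trees that precede this one. -->
          <xsl:with-param name="offset" tunnel="yes"
            select="count(subsequence($docs, 1, $pos - 1)//p)"/>
        </xsl:apply-templates>
      </xsl:for-each>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

With this toy input the three `p` elements should come out with `fileNumber`
values 0001, 0002, and 0003. The sketch recounts the earlier trees for every
tree, which is simple but does repeated work; whether that is preferable to
the wrapper element is a judgment call for the real content set.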
