Re: [xsl] Alphabetical index: unstreamable?

Subject: Re: [xsl] Alphabetical index: unstreamable?
From: "Michael Müller-Hillebrand mmh@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 28 May 2014 17:49:58 -0000
Hi Dimitre,

Do I understand correctly that this could be as "simple" as defining an accumulator
that incrementally builds up a map? If the source contains <indexterm>
elements I could maybe do something similar to

<xsl:accumulator name="indexterms" as="map(xs:string, element(indexterm))"
    initial-value="map{}">
    <xsl:accumulator-rule match="indexterm"
      new-value=" map:put($value, generate-id(), .) "/>
</xsl:accumulator>

and at the end process the content of the accumulator?

That would be awesome,

- Michael

[I am aware that I have not tried to store information about the original
location of <indexterm> which would be needed to create an alphabetical
index.]
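
For illustration, here is a minimal, untested sketch of the "accumulator that builds a map" idea, written in the syntax of the final XSLT 3.0 Recommendation (where, as far as I can tell, the rule attribute is select rather than the draft's new-value). Several things are assumptions made only for this example: each <indexterm> contains a single text node, the map is keyed by the upper-cased first letter, and the output element names index, group and term are hypothetical. Strings are stored instead of the element itself, on the assumption that streamed nodes cannot be retained. Whether a given processor accepts this as guaranteed-streamable has not been verified.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    exclude-result-prefixes="xs map">

  <!-- Skip all input by default; only the accumulator observes it. -->
  <xsl:mode streamable="yes" use-accumulators="indexterms"
      on-no-match="shallow-skip"/>

  <xsl:output indent="yes"/>

  <!-- Map from first letter to the terms seen so far (strings, not nodes). -->
  <xsl:accumulator name="indexterms" as="map(xs:string, xs:string*)"
      initial-value="map{}" streamable="yes">
    <!-- Assumption: the term text is a single text node inside indexterm. -->
    <xsl:accumulator-rule match="indexterm/text()"
        select="let $term := normalize-space(.),
                    $key  := upper-case(substring($term, 1, 1))
                return map:put($value, $key, ($value($key), $term))"/>
  </xsl:accumulator>

  <xsl:template match="/">
    <index>
      <!-- Read the whole document so the accumulator sees every term. -->
      <xsl:apply-templates/>
      <!-- Only now is the final accumulator value available. -->
      <xsl:variable name="terms" select="accumulator-after('indexterms')"/>
      <xsl:for-each select="map:keys($terms)">
        <xsl:sort select="."/>
        <group letter="{.}">
          <xsl:for-each select="$terms(.)">
            <xsl:sort select="lower-case(.)"/>
            <term><xsl:value-of select="."/></term>
          </xsl:for-each>
        </group>
      </xsl:for-each>
    </index>
  </xsl:template>
</xsl:stylesheet>
```

Only the collected strings are held in memory, not the source tree. Location information is still missing here; it would have to be added to the map values in the same rule.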


On 28.05.2014 at 16:06, Dimitre Novatchev wrote:

> Without having any details of the source XML document and the required
> properties/formatting of the index, I would use a map to hold the
> entries of the index and would continuously "update" this map with new
> entries as the document is streamed.
>
> This is the main principle also in using accumulators, or just
> functions like fold-left().
>
> On Wed, May 28, 2014 at 5:35 AM, Michael Müller-Hillebrand wrote:
>> Dear all,
>>
>> In documents the content of what will end up in an alphabetical index is
>> usually authored in the section to which the index term belongs. That is,
>> index terms are usually all over the place.
>>
>> When it is time to create an alphabetical index, I see that the XSLT
>> handling this uses something like
>>
>> <xsl:call-template name="index">
>>  <xsl:with-param name="terms"  select="//indexterm" />
>> </xsl:call-template>
>>
>> and inside the called template all the sorting and grouping is handled.
>>
>> This is not streamable because there is more than a single downward select
>> (and it is easy to see that you need everything in memory to create the
>> sequence of all <indexterm>).
>>
>> How would you tackle this (in XSLT) if the source data does not fit in
>> memory?
>>
>> Thanks a lot for hints,
>>
>> - Michael Müller-Hillebrand
