Subject: Re: [xsl] Alphabetical index: unstreamable?
From: "Michael Müller-Hillebrand mmh@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 2 Jun 2014 18:43:13 -0000

Abel,

Thanks a lot for the additional hints. That will help us a lot to avoid pitfalls. Just recently I studied your transcribed talk from Prague (thanks to Roger C.) and learned a lot about the streaming restrictions.

Let me put it this way: this thread convinced one of our Java developers to stop implementing a Java solution for this XSLT problem. It looks like we would rather wait for the final XSLT 3.0 spec.

Thanks,

- Michael

PS: Unfortunately, one cannot attend every cool XML conference.

On 02.06.2014 at 16:47, Abel Braaksma wrote:
>
> On 28-5-2014 19:50, Michael Müller-Hillebrand mmh@xxxxxxxxx wrote:
>> Hi Dimitre,
>>
>> Do I understand correctly this could be as "simple" as defining an
>> accumulator that incrementally builds up a map? If the source contains
>> <indexterm> elements I could maybe do something similar to
>>
>> <xsl:accumulator name="indexterms" as="map(xs:string, element(indexterm))"
>>     initial-value="map{}">
>>   <xsl:accumulator-rule match="indexterm"
>>       new-value=" map:put($value, generate-id(), .) "/>
>> </xsl:accumulator>
>>
>> and at the end process the content of the accumulator?
>
> Yes, that is essentially how it is supposed to be done. However, there
> are a few caveats with the code snippet above:
>
> - accumulators must be motionless, they cannot consume the current node
> - you cannot store references to nodes, here you use ".", which is not
>   allowed
> - childless nodes, such as text(), can be consumed, which comes in handy
>   here
> - map:put was dropped, but it seems to re-emerge, see Public XSLT Spec
>   Bug 24726 (https://www.w3.org/Bugs/Public/show_bug.cgi?id=24726)
>
> To create an accumulator for indexterm elements, we need to reverse the
> match pattern, so that the focus of the accumulator is on a non-element
> leaf-node (a childless node).
> For simplicity, let's assume a DocBook
> <indexterm> as follows:
>
> <indexterm>
>   <primary>prim</primary>
>   <secondary>beginning</secondary>
> </indexterm>
>
> Then your accumulator could look like this:
>
> <xsl:accumulator name="indexterms"
>     as="map(xs:string, xs:string+)"
>     initial-value="map{}">
>   <xsl:accumulator-rule
>       match="text()[parent::primary |
>           parent::secondary][ancestor::indexterm]"
>       new-value="map:put(
>           $value,
>           generate-id(ancestor::indexterm),
>           ($value(generate-id(ancestor::indexterm)), string(.)))" />
>
> </xsl:accumulator>
>
> This matches on the text node, and consuming the text node is allowed
> (it will always be childless). The fn:string(.) is still required (or
> use fn:data, or fn:copy-of), because even though it is a childless node,
> you cannot store its reference in a map.
>
> The accumulator above will create a sequence of terms mapped to the
> indexterm element, where the first term will be the <primary> element's
> content and the second in the sequence will be the <secondary>, if any.
>
> The expression inside the new-value attribute can quickly become
> unmanageable, but you can write a stylesheet function to write it
> declaratively.
>
> Note that you must be careful with fn:generate-id in a streaming
> scenario. With streaming it is likely you will have places where you use
> fn:copy-of or fn:snapshot. The IDs of these nodes will be different
> from the ones on the streamed nodes of the input stream.
>
> Note also that this won't help you if you want to place the resulting
> index prior to the nodes to be processed, i.e. a TOC at the beginning of
> a document cannot be created this way.
>
> If you plan to attend the XML London 2014 conference this weekend, my
> talk will be about Streaming Design Patterns, common programming
> scenarios encountered in XSLT 2.0 and how to write them in a streamable
> way.
> They range from easy (such as matching patterns that depend on the child
> axis), to intermediate (such as working out following-sibling scenarios),
> to advanced (such as a streamable way to do sorting in a maximum of two
> passes).
>
> Cheers,
> Abel
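
Abel's remark that a stylesheet function can keep the new-value expression manageable could be sketched roughly as below. This is only an illustration, not code from the thread: the my: prefix, its namespace URI and the function name my:add-term are hypothetical, the map namespace URI is the one I would expect a current processor to use, and the attribute names follow the draft syntax quoted in this thread, so they may differ in the XSLT 3.0 version your processor implements. It has not been checked against a streaming processor.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    xmlns:my="urn:example:index"
    exclude-result-prefixes="#all">

  <!-- Hypothetical helper: append one term to the sequence stored
       under $key and return the updated map. Looking up a missing key
       yields the empty sequence, so the first term starts the sequence. -->
  <xsl:function name="my:add-term" as="map(xs:string, xs:string+)">
    <xsl:param name="terms" as="map(xs:string, xs:string+)"/>
    <xsl:param name="key" as="xs:string"/>
    <xsl:param name="term" as="xs:string"/>
    <xsl:sequence select="map:put($terms, $key, ($terms($key), $term))"/>
  </xsl:function>

  <!-- Same accumulator as in the thread, with the map bookkeeping
       moved into the helper. Only strings are passed on, never node
       references. The new-value attribute is the draft syntax used in
       this thread (assumption: your processor accepts it). -->
  <xsl:accumulator name="indexterms"
      as="map(xs:string, xs:string+)"
      initial-value="map{}">
    <xsl:accumulator-rule
        match="text()[parent::primary | parent::secondary][ancestor::indexterm]"
        new-value="my:add-term($value,
                               generate-id(ancestor::indexterm),
                               string(.))"/>
  </xsl:accumulator>

</xsl:stylesheet>
```

The helper only ever receives the current map and atomized strings, which is the point of the caveats in the message: nothing that refers back to a streamed node ends up in the accumulator value.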