Subject: Re: [xsl] Question on streaming and grouping with nested keys
From: "Felix Sasaki felix@xxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 14 Jul 2017 13:02:18 -0000

2017-07-14 14:41 GMT+02:00 Martin Honnen martin.honnen@xxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>:

> On 14.07.2017 14:05, Felix Sasaki felix@xxxxxxxxxxxxxx wrote:
>
>> I tried the example from Martin with
>>
>> <xsl:template match="TRANSACTION-LIST">
>>   <xsl:copy>
>>     <xsl:for-each-group select="copy-of(TRANSACTION)"
>>         group-by="ITEM2/SUBITEM2/GROUPING-KEY">
>>       <xsl:copy>
>>         <item1-sum><xsl:value-of select="sum(current-group()/ITEM2/SUBITEM2.1)"/></item1-sum>
>>
>> ...
>>
>> It gives me an out of memory error. The input file is 160 MB, but the
>> individual transactions are rather small (around 20+ elements). The error
>> also appears if I remove "<xsl:copy>".
>
> 160 MB doesn't sound like a file you need streaming for at all. Does that
> suggestion above cause memory problems only when using streaming (e.g. when
> you have <xsl:mode streamable="yes"/>) or also without streaming?

Without streaming it works.

> Have you tried increasing the memory for Saxon/Java?

No.

> As you mention Saxon EE, let's hope Michael Kay comes across this thread
> and can certainly tell you more on how to tackle that problem with his
> product.
>
>> I have a working solution using an accumulator and maps, see below, but
>> here I did not manage to use streaming. If I set the accumulator to
>> streamable="yes", Saxon EE tells me
>>
>> "The xsl:accumulator-rule/@select expression for a streaming accumulator
>> must be motionless"
>>
>> Although I am using copy-of() as in Martin's example.
>>
>> <xsl:accumulator name="gather-values" as="map(xs:anyAtomicType, node())"
>>     initial-value="map{}">
>>   <xsl:accumulator-rule match="TRANSACTION">
>>     <xsl:variable name="current" select="copy-of()"/>
>
> As far as I understand it, you can't use copy-of() in an accumulator you
> want to be streamable. Working with streaming and accumulating values
> requires a change of the usual coding habits with XSLT, I think. For
> instance, to capture the key you have with an accumulator and streaming,
> you would need to use e.g.
>
>   <xsl:accumulator-rule match="TRANSACTION/ITEM2/SUBITEM2.2/GROUPING-KEY/text()"
>       select="string()"/>
>
> as only on the text node are you able to read out that value while
> streaming through the document.
>
> So to try to solve that problem with accumulators and streaming, I think
> you need several of them: one counting ITEM1, one summing up
> SUBITEM2.1/text(), the one above for the key, and then you need to combine
> them to store the data together.

Thanks. Working without accumulators is fine; I am just trying to understand
the issue. Other input files are a bit bigger, up to 1.5 GB, so having a
streaming solution would be nice, but it's not mandatory.

- Felix
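
For the grouped sum discussed above, one pattern that may stream without accumulators is xsl:iterate over per-transaction copies, carrying a map of running sums as a parameter. The following is only a sketch: it was not posted in this thread and has not been tested against Saxon EE. The element paths are taken from the for-each-group attempt above; the output element name "group" and its "key" and "sum" attributes are made up for illustration, and each TRANSACTION is assumed to hold exactly one grouping key and one numeric SUBITEM2.1.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    exclude-result-prefixes="xs map">

  <xsl:mode streamable="yes"/>

  <xsl:template match="TRANSACTION-LIST">
    <xsl:copy>
      <!-- Each TRANSACTION is copied into memory on its own, so the body
           below works on a small grounded tree, one transaction at a time. -->
      <xsl:iterate select="TRANSACTION/copy-of(.)">
        <!-- Running sum per grouping key, carried between iterations. -->
        <xsl:param name="sums" as="map(xs:string, xs:double)" select="map{}"/>

        <!-- Runs once after the last TRANSACTION. The order of map:keys()
             is implementation-dependent. "group", "key" and "sum" are
             hypothetical output names. -->
        <xsl:on-completion>
          <xsl:for-each select="map:keys($sums)">
            <group key="{.}" sum="{$sums(.)}"/>
          </xsl:for-each>
        </xsl:on-completion>

        <!-- Assumption: exactly one key and one value per TRANSACTION. -->
        <xsl:variable name="key" as="xs:string"
            select="string(ITEM2/SUBITEM2/GROUPING-KEY)"/>
        <xsl:variable name="value" as="xs:double"
            select="number(ITEM2/SUBITEM2.1)"/>

        <xsl:next-iteration>
          <xsl:with-param name="sums"
              select="map:put($sums, $key,
                        (if (map:contains($sums, $key))
                         then $sums($key) else 0) + $value)"/>
        </xsl:next-iteration>
      </xsl:iterate>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

With this shape, only the current transaction and the map of sums are held at any time, so memory use should grow with the number of distinct keys rather than with the size of the input file.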