Subject: Re: [xsl] Question on streaming and grouping with nested keys
From: "Felix Sasaki felix@xxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 14 Jul 2017 13:02:18 -0000
2017-07-14 14:41 GMT+02:00 Martin Honnen martin.honnen@xxxxxx <
xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>:
> On 14.07.2017 14:05, Felix Sasaki felix@xxxxxxxxxxxxxx wrote:
>
>> I tried the example from Martin with
>>
>> <xsl:template match="TRANSACTION-LIST">
>> <xsl:copy>
>> <xsl:for-each-group select="copy-of(TRANSACTION)"
>> group-by="ITEM2/SUBITEM2/GROUPING-KEY">
>> <xsl:copy>
>> <item1-sum><xsl:value-of
>> select="sum(current-group()/ITEM2/SUBITEM2.1)"/></item1-sum>
>>
>> ...
>>
>> It gives me an out of memory error. The input file is 160MB, but the
>> individual transactions are rather small (around 20+ elements). The error
>> also appears if I remove "<xsl:copy>".
>>
>
> 160 MB doesn't sound like a file you need streaming for at all. Does that
> suggestion above cause memory problems only when using streaming (e.g. when
> you have <xsl:mode streamable="yes"/>) or also without streaming?
Without streaming it works.
> Have you tried increasing the memory for Saxon/Java?
>
No.
>
> As you mention Saxon EE, let's hope Michael Kay comes across this thread
> and can certainly tell you more on how to tackle that problem with his
> product.
>
>> I have a working solution using an accumulator and maps, see below, but
>> here I did not manage to use streaming. If I set the accumulator to
>> streamable="yes", Saxon EE tells me
>>
>>
>> "The xsl:accumulator-rule/@select expression for a streaming accumulator
>> must be motionless"
>>
>>
>> Although I am using copy-of() as in Martin's example.
>>
>>
>> <xsl:accumulator name="gather-values" as="map(xs:anyAtomicType,
>> node())" initial-value="map{}">
>> <xsl:accumulator-rule match="TRANSACTION">
>> <xsl:variable name="current" select="copy-of()"/>
>>
>
> As far as I understand it, you can't use copy-of() in an accumulator you
> want to be streamable. Working with streaming and accumulating values
> requires a change of the usual coding habits with XSLT, I think, for
> instance to capture the key you have with an accumulator and streaming you
> would need to use e.g.
> <xsl:accumulator-rule match="TRANSACTION/ITEM2/SUBITEM2.2/GROUPING-KEY/text()"
> select="string()"/>
> as only on the text node you are able to read out that value while
> streaming through the document.
>
> So to try to solve that problem with accumulators and streaming I think
> you need several of them, one counting ITEM1, one summing up
> SUBITEM2.1/text(), the above for the key and then you need to combine them
> to store the data together.
>
Thanks. Working without accumulators is fine, just trying to understand the
issue. Other input files are a bit bigger, up to 1.5 GB, so having a
streaming solution would be nice but it's not mandatory.
- Felix
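
For reference, here is a minimal sketch of one way to keep memory use low while streaming. Instead of grouping copies of all transactions at once, it uses xsl:iterate to walk over one small TRANSACTION copy at a time and keeps only a running sum per grouping key in a map. This is not the solution from the thread and has not been run against Saxon EE. It assumes that each TRANSACTION has exactly one ITEM2/SUBITEM2/GROUPING-KEY and numeric ITEM2/SUBITEM2.1 values, as in the snippet quoted above. The output names "group" and "key" are made up for illustration.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    exclude-result-prefixes="xs map">

  <!-- Stream the input; skip anything no template matches. -->
  <xsl:mode streamable="yes" on-no-match="shallow-skip"/>

  <xsl:template match="TRANSACTION-LIST">
    <xsl:copy>
      <!-- copy-of() materializes one small TRANSACTION at a time. -->
      <xsl:iterate select="TRANSACTION/copy-of()">
        <!-- Running sum per grouping key; the only state kept between iterations. -->
        <xsl:param name="sums" as="map(xs:string, xs:double)" select="map{}"/>
        <xsl:on-completion>
          <!-- The order of map:keys() is implementation-dependent. -->
          <xsl:for-each select="map:keys($sums)">
            <!-- "group" and "key" are hypothetical output names. -->
            <group key="{.}">
              <item1-sum><xsl:value-of select="$sums(.)"/></item1-sum>
            </group>
          </xsl:for-each>
        </xsl:on-completion>
        <!-- Assumption: exactly one GROUPING-KEY per TRANSACTION. -->
        <xsl:variable name="key" as="xs:string"
            select="string(ITEM2/SUBITEM2/GROUPING-KEY)"/>
        <!-- The second argument of sum() keeps the result an xs:double even when empty. -->
        <xsl:variable name="amount" as="xs:double"
            select="sum(ITEM2/SUBITEM2.1 ! xs:double(.), 0e0)"/>
        <xsl:next-iteration>
          <xsl:with-param name="sums"
              select="map:put($sums, $key, ($sums($key), 0e0)[1] + $amount)"/>
        </xsl:next-iteration>
      </xsl:iterate>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Memory use in this sketch should grow with the number of distinct keys rather than with the number of transactions, which may matter for the 1.5 GB inputs. If whole transactions have to be copied into each group, the sums alone are not enough and this approach does not apply as written.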