Re: [xsl] Problem sorting incoming node list

Subject: Re: [xsl] Problem sorting incoming node list
From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 8 May 2019 17:18:13 -0000
I haven't had a chance to study all of this, but please note that if you want
to use the XPath 3.1 function then you should be using a more recent Saxon
release than 9.3. The current version is 9.9. Also, only the 1-argument
version of sort() will work in Saxon-HE; the more useful versions of the
function (that allow a user-defined sort key) are higher order functions and
therefore require Saxon-PE or Saxon-EE. The syntax would be

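    <xsl:for-each-group select="sort(entry[location=$undulator], (),
        function($entry) {data($entry/(isodate,time))})"
        group-by="statistics_category">

(In the 3-argument form of sort() the second argument is the collation, here
left empty, and the third is the sort key function.)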
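
For anyone who has to stay on Saxon-HE, one possible alternative is to sort
with xsl:perform-sort (plain XSLT 2.0, no higher-order functions) and group
the sorted variable. The following is only a minimal sketch, not tested
against the original stylesheet. The variable name "sorted", the driver
template with its fixed 'SASE1' parameter, and the printed timestamps are
assumptions made for illustration; in the real template the sum over
f:duration-in-hours() would go where the timestamps are printed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Demo driver (illustration only): summarise a single location. -->
  <xsl:template match="list">
    <xsl:call-template name="statistics_summary">
      <xsl:with-param name="undulator" select="'SASE1'"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="statistics_summary">
    <xsl:param name="undulator"/>

    <!-- Sort first. as="element(entry)*" makes the variable hold the
         original entry nodes in sorted order, not copies of them. -->
    <xsl:variable name="sorted" as="element(entry)*">
      <xsl:perform-sort select="entry[location = $undulator]">
        <xsl:sort select="isodate" order="descending"/>
        <xsl:sort select="time" order="descending"/>
      </xsl:perform-sort>
    </xsl:variable>

    <!-- Group the sorted population; each group keeps the sorted order. -->
    <xsl:for-each-group select="$sorted" group-by="statistics_category">
      <xsl:value-of select="current-grouping-key()"/>
      <xsl:text>: </xsl:text>
      <!-- Placeholder output: list the timestamps of the group members. -->
      <xsl:value-of
          select="for $e in current-group()
                  return concat($e/isodate, 'T', $e/time)"
          separator=", "/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

Whether sorting the population is enough depends on how f:duration-in-hours()
finds the "next" entry, which is not shown in this thread. The variable holds
the original nodes, so axes such as following-sibling still see document
order inside that function.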

Michael Kay
Saxonica

> On 8 May 2019, at 18:08, Raimund Kammering raimund.kammering@xxxxxxx
> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hello,
>
> I have a simple sorting problem but am nonetheless completely stuck!
>
> I know the standard techniques for sorting and how and where to apply them -
> at least I thought so! But for the given set of transformations I fail to
> fit the sorting into one part of the transformations!
>
> Problem is as follows:
>
> The XML looks like this (sorry for the long example, but I need such a
> sequence to make the awkward ordering clear):
>
> <list>
> <entry>
> <severity>NONE</severity>
> <isodate>2019-05-06</isodate>
> <time>07:01:03</time>
> <location>not set</location>
> <category>USERLOG</category>
> </entry>
> <entry>
> <severity>STATISTICS</severity>
> <isodate>2019-05-06</isodate>
> <time>07:00:00</time>
> <location>SASE1</location>
> <statistics_category>SASE_delivery</statistics_category>
> <SASE_tuning>not_set</SASE_tuning>
> <Linac_setup>not_set</Linac_setup>
> <Down>not_set</Down>
> <category>STATISTICS</category>
> </entry>
> <entry>
> <severity>STATISTICS</severity>
> <isodate>2019-05-06</isodate>
> <time>07:00:00</time>
> <location>SASE2</location>
> <statistics_category>SASE_delivery</statistics_category>
> <SASE_tuning>not_set</SASE_tuning>
> <Linac_setup>not_set</Linac_setup>
> <Down>not_set</Down>
> <category>STATISTICS</category>
> </entry>
> <entry>
> <severity>STATISTICS</severity>
> <isodate>2019-05-06</isodate>
> <time>07:00:00</time>
> <location>SASE3</location>
> <statistics_category>SASE_delivery</statistics_category>
> <SASE_tuning>not_set</SASE_tuning>
> <Linac_setup>not_set</Linac_setup>
> <Down>not_set</Down>
> <category>STATISTICS</category>
> </entry>
> <entry>
> <severity>NONE</severity>
> <isodate>2019-05-06</isodate>
> <time>07:15:53</time>
> <location>not set</location>
> <category>USERLOG</category>
> </entry>
> ...
> <entry>
> <severity>NONE</severity>
> <isodate>2019-05-06</isodate>
> <time>08:57:47</time>
> <category>USERLOG</category>
> </entry>
> <entry>
> <severity>STATISTICS</severity>
> <isodate>2019-05-06</isodate>
> <time>08:30:00</time>    <-- This <entry> is NOT sorted according to time!
> <location>ACCELERATOR</location>
> <statistics_category>Linac_setup</statistics_category>
> <Linac_setup>not_set</Linac_setup>
> <SASE_tuning>not_set</SASE_tuning>
> <Down>not_set</Down>
> <category>STATISTICS</category>
> </entry>
> <entry>
> <severity>NONE</severity>
> <isodate>2019-05-06</isodate>
> <time>09:08:45</time>
> <category>USERLOG</category>
> </entry>
> </list>
>
> "Normal" processing of this is no problem with the following XSLT:
>
> <xsl:template match="list">
> ...
>  <xsl:apply-templates select="entry">
>    <xsl:sort order="descending" select="isodate"/>
>    <xsl:sort order="descending" select="time"/>
>  </xsl:apply-templates>
> ...
> </xsl:template>
>
> This correctly produces one-to-one processing of the standard entries
> (category=USERLOG). But now I need to do some bulk processing to evaluate some
> time differences. The logic is roughly like this:
>
> 1. Get the (first) category="STATISTICS" entry
> 2. Find the next one (smaller times, since descending) with the same location
> 3. Calculate the time difference (done by a function f:duration-in-hours())
> 4. Advance to the newly found time step
> 5. Repeat
>
> This is done for all possible locations and even allows easily summing up
> the partial time spans like this:
>
> <xsl:template name="statistics_summary">
> <xsl:param name="undulator"/>
>  <xsl:for-each-group select="entry[location=$undulator]"
>      group-by="statistics_category">
>    <xsl:value-of select="current-grouping-key()"/>
>    <xsl:text>: &#160;</xsl:text>
>    <xsl:value-of
>      select="format-number(sum(current-group()/f:duration-in-hours(.,$undulator)), '0.00'), ' h'"/>
>  </xsl:for-each-group>
> </xsl:template>
>
> This template is called from the list template like this:
>
> <xsl:template match="list">
> ...
>  <xsl:call-template name="statistics_summary">
>     <xsl:with-param name="undulator">ACCELERATOR</xsl:with-param>
>  </xsl:call-template>
>  <xsl:call-template name="statistics_summary">
>     <xsl:with-param name="undulator">SASE1</xsl:with-param>
>  </xsl:call-template>
>  <xsl:call-template name="statistics_summary">
>     <xsl:with-param name="undulator">SASE2</xsl:with-param>
>  </xsl:call-template>
>  <xsl:call-template name="statistics_summary">
>     <xsl:with-param name="undulator">SASE3</xsl:with-param>
>  </xsl:call-template>
> ...
>  <xsl:apply-templates select="entry">
>    <xsl:sort order="descending" select="isodate"/>
>    <xsl:sort order="descending" select="time"/>
>  </xsl:apply-templates>
> ...
> </xsl:template>
>
> The whole transformation works well, except that the incorrect order of the
> <time> values makes the result incorrect!
>
> So the idea is clear: sort the entries according to the <time> element before
> processing them with for-each-group. But here I completely failed!
> My first idea was to use apply-templates instead of call-template to allow
> the use of xsl:sort, but this breaks all the logic implemented for the time
> span calculations. The next would be to sort within the statistics_summary
> template, but here I guess I'm on the wrong axis - right?
> I even tried to do a copy-of to create a small node set containing only the
> relevant entries and sort these before passing them to the for-each-group, but
> did not manage to do so, and it might not be very clever anyhow. Naively
> thinking, I would love to have a kind of XPath sort to allow something like
> this:
>
> <xsl:for-each-group select="sort(entry[location=$undulator],
>     (isodate,time))" group-by="statistics_category">
>
> and found this to be present in XPath 3.1. But here I ran into transformer
> exceptions concerning the syntax (using Saxon 9.3.0.5 HE).
>
> So I'm now completely stuck trying to find a reasonable solution
> (without reworking the whole statistics calculation)! Any help would be much
> appreciated!
>
> Greetings,
> Raimund