Subject: Re: [xsl] To determine the distinct elements in a sequence of 46,656 elements takes 5 hours of XSLT processing From: "Imsieke, Gerrit, le-tex" <gerrit.imsieke@xxxxxxxxx> Date: Mon, 20 Aug 2012 01:21:02 +0200 |
<?xml version="1.0" encoding="utf-8"?> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:f="f" version="2.0" exclude-result-prefixes="f xs" >
<xsl:template match="maps"> <xsl:for-each-group select="map" group-by="f:signature(.)"> <xsl:sequence select="." /> </xsl:for-each-group> </xsl:template>
<?xml version="1.0" encoding="utf-8"?> <maps> <map id="Planes-Enroute-to-Airports"> <singletonMap from="F16" to="DFW"/> <singletonMap from="B707" to="ORD"/> <singletonMap from="F35" to="MIA"/> <singletonMap from="S340" to="LAX"/> <singletonMap from="A320" to="SFO"/> <singletonMap from="MD90" to="DEN"/> </map> <map id="Planes-Enroute-to-Airports"> <singletonMap from="F16" to="DFW"/> <singletonMap from="B707" to="BER"/> <singletonMap from="F35" to="MIA"/> <singletonMap from="S340" to="LAX"/> <singletonMap from="A320" to="SFO"/> <singletonMap from="MD90" to="DEN"/> </map> <map id="Planes-Enroute-to-Airports"> <singletonMap from="F16" to="DFW"/> <singletonMap from="B707" to="ORD"/> <singletonMap from="F35" to="MIA"/> <singletonMap from="S340" to="LAX"/> <singletonMap from="A320" to="SFO"/> <singletonMap from="MD90" to="DEN"/> </map> <map id="Planes-Enroute-to-Airports"> <singletonMap from="F16" to="DFW"/> <singletonMap from="F35" to="MIA"/> <singletonMap from="S340" to="LAX"/> <singletonMap from="A320" to="SFO"/> <singletonMap from="MD90" to="DEN"/> </map> <map id="Planes-Enroute-to-Airports"> <singletonMap from="F16" to="DFW"/> <singletonMap from="B767" to="ORD"/> <singletonMap from="F35" to="MIA"/> <singletonMap from="S340" to="LAX"/> <singletonMap from="A320" to="SFO"/> <singletonMap from="MD90" to="DEN"/> </map> </maps>
<?xml version="1.0" encoding="UTF-8"?> <maps> <map id="Planes-Enroute-to-Airports"> <singletonMap from="F16" to="DFW"/> <singletonMap from="B707" to="ORD"/> <singletonMap from="F35" to="MIA"/> <singletonMap from="S340" to="LAX"/> <singletonMap from="A320" to="SFO"/> <singletonMap from="MD90" to="DEN"/> </map> <map id="Planes-Enroute-to-Airports"> <singletonMap from="F16" to="DFW"/> <singletonMap from="B707" to="BER"/> <singletonMap from="F35" to="MIA"/> <singletonMap from="S340" to="LAX"/> <singletonMap from="A320" to="SFO"/> <singletonMap from="MD90" to="DEN"/> </map> <map id="Planes-Enroute-to-Airports"> <singletonMap from="F16" to="DFW"/> <singletonMap from="F35" to="MIA"/> <singletonMap from="S340" to="LAX"/> <singletonMap from="A320" to="SFO"/> <singletonMap from="MD90" to="DEN"/> </map> <map id="Planes-Enroute-to-Airports"> <singletonMap from="F16" to="DFW"/> <singletonMap from="B767" to="ORD"/> <singletonMap from="F35" to="MIA"/> <singletonMap from="S340" to="LAX"/> <singletonMap from="A320" to="SFO"/> <singletonMap from="MD90" to="DEN"/> </map> </maps>
On 20/08/2012 00:11, Michael Kay wrote:Try the following:
count(distinct-values(map/f:signature(.))Sorry, I misread the question as just wanting the number of distinct maps. If you want the first map in each set of duplicates, use instead
<xsl:for-each-group select="map" group-by="f:signature(.)"> <xsl:sequence select="current-group()[1]"/> </xsl:for-each-group>
Michael Kay Saxonica
where f:signature is
<xsl:function name="f:signature"> <xsl:param name="map" as="element(map)"> <xsl:variable name="sorted-map" as="element(singletonMap)*"> <xsl:perform-sort select="$map/*"> <xsl:sort select="@from"/> <xsl:sort select="@to"/> </xsl:perform-sort> </xsl:variable> <xsl:sequence select="string-join($sorted-map/concat(@from, '~', @to), '|')"/> </xsl:function>
Michael Kay Saxonica
On 19/08/2012 23:43, Costello, Roger L. wrote:Hi Folks,
I have a sequence of 46,656 elements that I call "maps."
Here is one map:
<map id="Planes-Enroute-to-Airports"> <singletonMap from="F16" to="DFW"/> <singletonMap from="B707" to="ORD"/> <singletonMap from="F35" to="MIA"/> <singletonMap from="S340" to="LAX"/> <singletonMap from="A320" to="SFO"/> <singletonMap from="MD90" to="DEN"/> </map>
I wrote a function to return all of the distinct maps.
Unfortunately it takes about 5 hours of XSLT processing.
Perhaps my XSLT program is inefficient.
I am hoping that you can show me a more efficient program or identify where my program is inefficient.
I am using XSLT 2.0.
Here is my function to return all distinct maps: <xsl:function name="ct:distinct" as="element(map)*"> <xsl:param name="maps" /> <xsl:variable name="new-maps"> <maps> <xsl:sequence select="$maps" /> </maps> </xsl:variable> <xsl:for-each select="$new-maps/maps/map"> <xsl:if test="not(ct:contained-within(., ./following-sibling::map))"><xsl:sequence select="." /></xsl:if> </xsl:for-each> </xsl:function>
The following function determines if a map is contained within a sequence of maps; it uses a binary divide-and-conquer approach: <xsl:function name="ct:contained-within" as="xs:boolean"> <xsl:param name="map" as="element(map)"/> <xsl:param name="maps" as="element(map)*"/> <xsl:variable name="cnt" select="count($maps)" /> <xsl:choose> <xsl:when test="$cnt eq 0"><xsl:value-of select="false()" /></xsl:when> <xsl:when test="ct:equal($map, $maps[1])"><xsl:value-of select="true()" /></xsl:when> <xsl:when test="$cnt eq 1"><xsl:value-of select="false()" /></xsl:when> <xsl:otherwise> <xsl:variable name="half" select="$cnt idiv 2" /> <xsl:choose> <xsl:when test="ct:contained-within($map, $maps[position() = (2 to $half)])"><xsl:value-of select="true()" /></xsl:when> <xsl:otherwise><xsl:value-of select="ct:contained-within($map, $maps[position() = (($half + 1) to last())])" /></xsl:otherwise> </xsl:choose> </xsl:otherwise> </xsl:choose> </xsl:function>
Two maps are equal iff for each singletonMap in map1 there is a singletonMap in map2 with the same value for @to and @from: <xsl:function name="ct:equal" as="xs:boolean"> <xsl:param name="map1" as="element(map)"/> <xsl:param name="map2" as="element(map)"/> <xsl:choose> <xsl:when test="count($map1/*) ne count($map2/*)"><xsl:value-of select="false()" /></xsl:when> <xsl:otherwise> <xsl:variable name="result"> <xsl:for-each select="$map1/singletonMap"> <xsl:variable name="here" select="." /> <xsl:if test="$map2/singletonMap[@from eq $here/@from and @to ne $here/@to]">false</xsl:if> </xsl:for-each> </xsl:variable> <xsl:choose> <xsl:when test="contains($result, 'false')"><xsl:value-of select="false()" /></xsl:when> <xsl:otherwise><xsl:value-of select="true()" /></xsl:otherwise> </xsl:choose> </xsl:otherwise> </xsl:choose> </xsl:function>
/Roger
-- Gerrit Imsieke Geschdftsf|hrer / Managing Director le-tex publishing services GmbH Weissenfelser Str. 84, 04229 Leipzig, Germany Phone +49 341 355356 110, Fax +49 341 355356 510 gerrit.imsieke@xxxxxxxxx, http://www.le-tex.de
Registergericht / Commercial Register: Amtsgericht Leipzig Registernummer / Registration Number: HRB 24930
Geschdftsf|hrer: Gerrit Imsieke, Svea Jelonek, Thomas Schmidt, Dr. Reinhard Vvckler
Current Thread |
---|
|
<- Previous | Index | Next -> |
---|---|---|
Re: [xsl] To determine the distinct, Michael Kay | Thread | Re: [xsl] To determine the distinct, Imsieke, Gerrit, le- |
Re: [xsl] To determine the distinct, Michael Kay | Date | Re: [xsl] To determine the distinct, Imsieke, Gerrit, le- |
Month |