Subject: Re: [xsl] Grouping elements that have at least one common value From: "John Lumley john@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> Date: Fri, 16 Jun 2023 13:29:24 -0000 |
Doesnbt the xsl:iterate solution need an xsl:break under the xsl:when test=bempty($next-nodes)b, Mike? John Lumley Sent from my iPad > On 16 Jun 2023, at 13:00, Michael Kay michaelkay90@xxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote: > > o;?This feels to me as being not so much a grouping problem, as a transitive closure problem. There's a relation between two elements ("have at least one value in common") and you're looking for the transitive closure of that relation to form one group. > > For efficiency you first need to find a way of evaluating the basic relation efficiently, and then you need an efficient way of finding its transitive closure. > > For the basic relation, start by defining a key > > <xsl:key name="k" match="GRCHOIX" use="CHOIX/@CODE"/> > > Then key('k', 'choix-1') will find all GRCHOIX elements that have 'choix-1' as one of their values. > > And key('k', //GRCHOIX/@CODE[.="grchoix-1"]/CHOIX/@CODE) will find all GRCHOIX elements that have either 'choix-1' or 'choix-2' as one of their values. > > So the base relation is defined by a function: > > <xsl:function name="f:step"> as="element(GRCHOIX)*"> > <xsl:param name="start" as="element(GRCHOIX)"/> > <xsl:sequence select="key('k', $start/CHOIX/@CODE)"/> > </xsl:function> > > Then you need to find the transitive closure of this step. > > As it happens we've just three days ago defined a transitive closure function for inclusion in XPath 4.0, which you can find here: > > https://qt4cg.org/pr/533/xpath-functions-40/Overview.html#func-transitive-clo sure > > Unfortunately, you need it now! You can implement it yourself, but it's not particularly easy. One way might be like this: > > <xsl:iterate select="1 to 100000000"> > <xsl:param name="result" as="node()*" select="()"/> > <xsl:param name="origin" as="node()*" select="$start-node"/> > <xsl:variable name="next-nodes" select="($origin ! f:step(.)) except $result"/> > <xsl:choose> > <xsl:when test="empty($next-nodes)"> > <xsl:sequence select="$result"/> > </xsl:when> > <xsl:otherwise> > <xsl:next-iteration> > <xsl:with-param name="result" select="$result | $next-ndoes"/> > <xsl:with-param name="origin" select="$next-nodes"/> > </ > </ > </ > > which basically keeps looking for more nodes satisfying the relation until no more are found. > > This gives you one group starting from one specific $start-node, which you presumably select at random. > > Then repeat, starting from a different $start-node, one that has not already been allocated to a group; until no more nodes are left. > > NOT TESTED! > > Michael Kay > Saxonica > > > >> grchoix-1 > > > >> On 16 Jun 2023, at 12:09, Matthieu Ricaud-Dussarget ricaudm@xxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote: >> >> Hi all, >> >> I need to group elements that have at least one common value : >> >> <FORMS> >> <!--CASE 1--> >> <GRCHOIX CODE="grchoix-1"> >> <CHOIX CODE="choix-1"/> >> <CHOIX CODE="choix-2"/> >> </GRCHOIX> >> <GRCHOIX CODE="grchoix-2"> >> <CHOIX CODE="choix-1"/> >> <CHOIX CODE="choix-2"/> >> <CHOIX CODE="choix-3"/> >> </GRCHOIX> >> <GRCHOIX CODE="grchoix-3"> >> <CHOIX CODE="choix-2"/> >> <CHOIX CODE="choix-3"/> >> <CHOIX CODE="choix-4"/> >> </GRCHOIX> >> <!--CASE 2--> >> <GRCHOIX CODE="grchoix-A"> >> <CHOIX CODE="choix-a"/> >> <CHOIX CODE="choix-b"/> >> </GRCHOIX> >> <GRCHOIX CODE="grchoix-B"> >> <CHOIX CODE="choix-b"/> >> <CHOIX CODE="choix-c"/> >> </GRCHOIX> >> <GRCHOIX CODE="grchoix-C"> >> <CHOIX CODE="choix-d"/> >> <CHOIX CODE="choix-e"/> >> </GRCHOIX> >> <GRCHOIX CODE="grchoix-D"> >> <CHOIX CODE="choix-c"/> >> <CHOIX CODE="choix-d"/> >> </GRCHOIX> >> </FORMS> >> >> * grchoix-1 and grchoix-2 have 2 common values : choix-1, choix-2 b they must belong the same group >> * grchoix-2 and grchoix-3 have 2 common values : choix-2, choix-3 b they must belong the same group >> b At the end : grchoix-1, grchoix-2 and grchoix-3 must belong the same group >> >> * grchoix-A and grchoix-B have 1 common value choix-b b they must belong the same group >> * grchoix-B and grchoix-C have 0 common values >> * but as grchoix-D have >> - 1 common value (choix-c) with grchoix-B >> - 1 one common value (choix-d) with grchoix-C >> Then grchoix-B and grchoix-C must also belong the same GROUP >> b At the end : grchoix-A, grchoix-B, grchoix-C and grchoix-D must belong the same group >> >> The expected XML output is 2 Groups : >> <FORMS> >> <!--CASE 1--> >> <GROUP> >> <GRCHOIX CODE="grchoix-1"> >> <CHOIX CODE="choix-1"/> >> <CHOIX CODE="choix-2"/> >> </GRCHOIX> >> <GRCHOIX CODE="grchoix-2"> >> <CHOIX CODE="choix-1"/> >> <CHOIX CODE="choix-2"/> >> <CHOIX CODE="choix-3"/> >> </GRCHOIX> >> <GRCHOIX CODE="grchoix-3"> >> <CHOIX CODE="choix-2"/> >> <CHOIX CODE="choix-3"/> >> <CHOIX CODE="choix-4"/> >> </GRCHOIX> >> </GROUP> >> <!--CASE 2--> >> <GROUP> >> <GRCHOIX CODE="grchoix-A"> >> <CHOIX CODE="choix-a"/> >> <CHOIX CODE="choix-b"/> >> </GRCHOIX> >> <GRCHOIX CODE="grchoix-B"> >> <CHOIX CODE="choix-b"/> >> <CHOIX CODE="choix-c"/> >> </GRCHOIX> >> <GRCHOIX CODE="grchoix-C"> >> <CHOIX CODE="choix-d"/> >> <CHOIX CODE="choix-e"/> >> </GRCHOIX> >> <GRCHOIX CODE="grchoix-D"> >> <CHOIX CODE="choix-c"/> >> <CHOIX CODE="choix-d"/> >> </GRCHOIX> >> </GROUP> >> </FORMS> >> >> I'm really not sure how to go ahead with such a grouping problem. >> I've tried something that work on my sample but it's probably too complicated (maybe it doesnt' work all cases) and most of all it has drastic bad performances ! >> I'm using fro-each-group on multiple values, wich causes all combination to be executed and make the internal variables huge (from A 8Mo input I get a 500 Mo size of intermediate step) >> At this point I get without surprise a saxon internal error on my real (rather big) XML input. >> >> See my (so) bad solution in a next message (I get a message size to long when I copy it here) >> >> I'm pretty sure there's something quite more elegant to do, did you ever had to do such a grouping ? >> BTW I'm using XSLT 3. >> >> Any suggestion/help appreciated >> >> Cheers, >> >> Matthieu Ricaud-Dussarget >> >> >> >> XSL-List info and archive >> EasyUnsubscribe (by email) > > XSL-List info and archive > EasyUnsubscribe (by email)
Current Thread |
---|
|
<- Previous | Index | Next -> |
---|---|---|
Re: [xsl] Grouping elements that ha, Michael Kay michaelk | Thread | Re: [xsl] Grouping elements that ha, Michael Kay michaelk |
Re: [xsl] Grouping elements that ha, Michael Kay michaelk | Date | Re: [xsl] Grouping elements that ha, Michael Kay michaelk |
Month |