Subject: Re: [xsl] exercise in complex grouping
From: "David Carlisle d.p.carlisle@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 12 May 2020 10:53:34 -0000

I have assumed there are no complicated overlapping cases here, but this
works for the one case I tried, which includes both A before B and B before A.

<x>
blah blah blah
<d><e>blah</e> blah
<B target="#A1">blort</B>
<f>monkey</f> shines
<A xml:id="A1">snort</A>
blah
<A xml:id="A2">snort</A>
<q/>
<l>zzz</l>
<B target="#A2">blort</B>
<kkkk/>
</d>
zzz
</x>

----

<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <!-- Copy everything that is not handled by a more specific template. -->
 <xsl:template match="node()">
  <xsl:copy>
   <xsl:copy-of select="@*"/>
   <xsl:apply-templates/>
  </xsl:copy>
 </xsl:template>

 <!-- Index each B by the id it points to (@target without the '#'). -->
 <xsl:key name="b" match="B" use="substring(@target,2)"/>

 <xsl:template match="d">
  <xsl:copy>
   <xsl:copy-of select="@*"/>
   <!-- Key is true for every B and every A that some B points to. -->
   <xsl:for-each-group select="node()"
                       group-adjacent="self::B or self::A[key('b',@xml:id)]">
    <xsl:choose>
     <xsl:when test="current-grouping-key()">
      <!-- Paired A/B elements are emitted by the neighbouring group. -->
     </xsl:when>
     <xsl:otherwise>
      <!-- Nearest elements just before and just after this run of nodes. -->
      <xsl:variable name="a" select="preceding-sibling::*[1]"/>
      <xsl:variable name="b"
                    select="current-group()[last()]/following-sibling::*[1]"/>
      <xsl:choose>
       <xsl:when test="concat('#',$a/@xml:id)=$b/@target or
                       concat('#',$b/@xml:id)=$a/@target">
        <xsl:text>&#10;</xsl:text><C><xsl:text>&#10;</xsl:text>
        <xsl:copy-of select="$a"/>
        <xsl:text>&#10;</xsl:text>
        <xsl:copy-of select="current-group()"/>
        <xsl:text>&#10;</xsl:text>
        <xsl:copy-of select="$b"/>
        <xsl:text>&#10;</xsl:text></C><xsl:text>&#10;</xsl:text>
       </xsl:when>
       <xsl:otherwise>
        <xsl:copy-of select="current-group()"/>
       </xsl:otherwise>
      </xsl:choose>
     </xsl:otherwise>
    </xsl:choose>
   </xsl:for-each-group>
  </xsl:copy>
 </xsl:template>

</xsl:stylesheet>

---

produces

<x>
blah blah blah
<d><e>blah</e> blah
<C>
<B target="#A1">blort</B>
<f>monkey</f> shines
<A xml:id="A1">snort</A>
</C>
blah
<C>
<A xml:id="A2">snort</A>
<q/>
<l>zzz</l>
<B target="#A2">blort</B>
</C>
<kkkk/>
</d>
zzz
</x>

On Tue, 12 May 2020 at 10:33, Syd Bauman s.bauman@xxxxxxxxxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

> I have a moderately sizable TEI file (~31,000 text nodes with ~100,400
> "words" or ~688,000 characters; ~20,000
> elements, ~15,000 attributes).
> Somewhere in all that mess there are a few pairs of elements for which
> I need some special processing.
>
> Say each pair is an <A> and a <B>. I can find each <B> by XPath quite
> trivially. In addition, for every pair, <B> has a @target that points
> to the corresponding <A> via a bare name identifier URL. Furthermore,
> every <B> in the document is part of such a pair. (Which is why it is
> so trivial to find them via XPath. The same cannot be said for <A>:
> there are *lots* of <A> elements that are not part of an <A>-<B>
> pair; but none, of course, that bear that particular @xml:id, so they
> can be found by XPath. It's just easy, not trivial. :-)
>
> In general, there can be other nodes between <A> and <B>, and there
> will be cases in which <B> precedes rather than follows the <A> it
> points to. E.g.,
>
>   blah blah blah
>   <d><e>blah</e> blah
>   <B target="#A1">blort</B>
>   <f>monkey</f> shines
>   <A xml:id="A1">snort</A>
>   blah</d>
>
> I want to be able to handle these cases, too.
>
> For the foreseeable future, there will never be another <B> in between
> a <B> and the <A> it points to, and each <B> will be a child of the
> same element as the <A> it points to. (I.e., no overlap problems.) But
> as soon as I say these complications will never happen, the very next
> day the editors will gleefully send e-mail saying they have found such a
> case. But for now, if needed, I'm willing to write code that presumes
> it won't happen.
>
> What I want for output is to be able to wrap the <B> with the <A> it
> points to, *and everything in between*, in a <C>.
>
>   blah blah blah
>   <d><e>blah</e> blah
>   <C xml:id="A1Container">
>     <B target="#A1">blort</B>
>     <f>monkey</f> shines
>     <A xml:id="A1">snort</A>
>   </C>
>   blah</d>
>
> I am 90% confident I can write some messy XSLT 1.0 Muenchian grouping
> code that does this.
> (Although I suspect it would take two passes,
> one for <A> precedes <B>, another for <B> precedes <A>; but I don't
> care about two passes at all, and would not even care if it took N
> passes.[1]) But I am equally confident there is a much better
> <xsl:for-each-group> method that, at the moment, I simply can't wrap
> my head around.
>
> Thanks for any thoughts, pointers, code, or advice.
>
> Note
> ----
> [1] Where N is proportional to the number of <A>-<B> pairs.
>
> --
> Syd Bauman, NRP (he/him/his)
> Senior XML Programmer/Analyst
> Northeastern University Women Writers Project
> s.bauman@xxxxxxxxxxxxxxxx or
> Syd_Bauman@xxxxxxxxxxxxxxxx
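
Syd's target output gives each wrapper an xml:id built from the <A> it
contains (A1Container), whereas the <C> produced by the stylesheet above
carries no attributes. What follows is a minimal sketch of one way to add it,
under the same no-overlap assumption. It repeats the stylesheet with a single
xsl:attribute instruction added inside <C>. The "Container" suffix comes from
Syd's example; treating it as a general naming rule is an assumption.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <!-- Copy everything that is not handled by a more specific template. -->
 <xsl:template match="node()">
  <xsl:copy>
   <xsl:copy-of select="@*"/>
   <xsl:apply-templates/>
  </xsl:copy>
 </xsl:template>

 <!-- Index each B by the id it points to (@target without the '#'). -->
 <xsl:key name="b" match="B" use="substring(@target,2)"/>

 <xsl:template match="d">
  <xsl:copy>
   <xsl:copy-of select="@*"/>
   <xsl:for-each-group select="node()"
                       group-adjacent="self::B or self::A[key('b',@xml:id)]">
    <xsl:choose>
     <xsl:when test="current-grouping-key()">
      <!-- Paired A/B elements are emitted by the neighbouring group. -->
     </xsl:when>
     <xsl:otherwise>
      <xsl:variable name="a" select="preceding-sibling::*[1]"/>
      <xsl:variable name="b"
                    select="current-group()[last()]/following-sibling::*[1]"/>
      <xsl:choose>
       <xsl:when test="concat('#',$a/@xml:id)=$b/@target or
                       concat('#',$b/@xml:id)=$a/@target">
        <xsl:text>&#10;</xsl:text>
        <C>
         <!-- Assumed naming rule: id of whichever neighbour is the A,
              plus the suffix 'Container'. Must precede any child content. -->
         <xsl:attribute name="xml:id"
                        select="concat(($a, $b)[self::A]/@xml:id, 'Container')"/>
         <xsl:text>&#10;</xsl:text>
         <xsl:copy-of select="$a"/>
         <xsl:text>&#10;</xsl:text>
         <xsl:copy-of select="current-group()"/>
         <xsl:text>&#10;</xsl:text>
         <xsl:copy-of select="$b"/>
         <xsl:text>&#10;</xsl:text>
        </C>
        <xsl:text>&#10;</xsl:text>
       </xsl:when>
       <xsl:otherwise>
        <xsl:copy-of select="current-group()"/>
       </xsl:otherwise>
      </xsl:choose>
     </xsl:otherwise>
    </xsl:choose>
   </xsl:for-each-group>
  </xsl:copy>
 </xsl:template>

</xsl:stylesheet>
```

On the sample input above this should give <C xml:id="A1Container"> and
<C xml:id="A2Container">. As in the original, a <B> and its <A> with no node
at all between them fall into a single adjacent group and would be dropped,
so that case would still need separate handling.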