Subject: Re: [xsl] exercise in complex grouping
From: "Geert Bormans geert@xxxxxxxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 12 May 2020 10:40:08 -0000
Hi Syd,

This might not be appealing if you are looking for beautiful code...

I would make it a two-pass transformation:

1st pass: group-starting-with B, and make the C group when you find the matching A in the same group (you could do that with a group-ending-with matching A on the current-group()).

2nd pass: group-ending-with B, and roughly do the opposite of the above.

Or am I stating the obvious?

In a single pass you can group-starting-with the A that have a following-sibling matching B AND on B having a following-sibling matching A. Maybe find the match through a key in a function, so you do not end up with very costly behaviour when finding the matching nodes.

Not sure it is worth the cost just to avoid two passes.

Maybe you can throw us an example file so we can test this.

Best regards,

Geert Bormans

----- Original message -----
From: "Abel Braaksma, (Exselt) abel@xxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
To: "xsl-list" <xsl-list@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Tuesday, 12 May 2020 11:33:54
Subject: [xsl] exercise in complex grouping

I have a moderately sizable TEI file (~31,000 text nodes with ~100,400 "words" or ~688,000 characters; ~20,000 elements, ~15,000 attributes). Somewhere in all that mess there are a few pairs of elements for which I need some special processing. Say each pair is an <A> and a <B>.

I can find each <B> by XPath quite trivially. In addition, for every pair, <B> has a @target that points to the corresponding <A> via a bare name identifier URL. Furthermore, every <B> in the document is part of such a pair. (Which is why it is so trivial to find them via XPath. The same cannot be said for <A>: there are *lots* of <A> elements that are not part of an <A>-<B> pair; but none, of course, that bear that particular @xml:id, so they can be found by XPath. It's just easy, not trivial. :-)

In general, there can be other nodes between <A> and <B>, and there will be cases in which <B> precedes rather than follows the <A> it points to.
E.g.,

  blah blah blah <d><e>blah</e> blah <B target="#A1">blort</B> <f>monkey</f> shines <A xml:id="A1">snort</A> blah</d>

I want to be able to handle these cases, too.

For the foreseeable future, there will never be another <B> in between a <B> and the <A> it points to, and each <B> will be a child of the same element as the <A> it points to. (I.e., no overlap problems.) But as soon as I say these complications will never happen, the very next day the editors will gleefully send e-mail saying they have found such a case. But for now, if needed, I'm willing to write code that presumes it won't happen.

What I want for output is to be able to wrap the <B> with the <A> it points to, *and everything in between*, in a <C>.

  blah blah blah <d><e>blah</e> blah
  <C xml:id="A1Container">
    <B target="#A1">blort</B> <f>monkey</f> shines <A xml:id="A1">snort</A>
  </C>
  blah</d>

I am 90% confident I can write some messy XSLT 1.0 Muenchian grouping code that does this. (Although I suspect it would take two passes, one for <A> precedes <B>, another for <B> precedes <A>; but I don't care about two passes at all, and would not even care if it took N passes.[1]) But I am equally confident there is a much better <xsl:for-each-group> method that, at the moment, I simply can't wrap my head around.

Thanks for any thoughts, pointers, code, or advice.

Note
----
[1] Where N is proportional to the number of <A>-<B> pairs.

--
Syd Bauman, NRP (he/him/his)
Senior XML Programmer/Analyst
Northeastern University Women Writers Project
s.bauman@xxxxxxxxxxxxxxxx or Syd_Bauman@xxxxxxxxxxxxxxxx
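
For readers following the thread, the single-pass idea in the reply (start a group at whichever member of a pair comes first, then end it at the other member, finding the match through keys inside a function) might be sketched roughly as below. This is only a sketch under the assumptions stated in the original question: <A> and <B> are siblings, pairs neither nest nor overlap, and the input is a complete document. The `f` namespace, the `f:partner` function and both key names are made up for illustration and do not come from any post in the thread.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:f="urn:example:local-functions"
    exclude-result-prefixes="f">

  <!-- Keys so that finding the other half of a pair is a cheap lookup. -->
  <xsl:key name="a-by-id" match="A[@xml:id]" use="@xml:id"/>
  <xsl:key name="b-by-target" match="B[@target]"
           use="substring-after(@target, '#')"/>

  <!-- Hypothetical helper: the sibling paired with $n, or empty if $n is
       not part of an A/B pair. Assumes $n lives in a document tree. -->
  <xsl:function name="f:partner" as="element()?">
    <xsl:param name="n" as="node()"/>
    <xsl:variable name="p" as="element()*" select="
        if ($n/self::B)
          then key('a-by-id', substring-after($n/@target, '#'), root($n))
        else if ($n/self::A/@xml:id)
          then key('b-by-target', $n/@xml:id, root($n))
        else ()"/>
    <!-- Keep only a partner with the same parent (assumed: siblings). -->
    <xsl:sequence select="$p[.. is $n/..][1]"/>
  </xsl:function>

  <!-- Identity template: copy everything else unchanged. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Any element with a B child: regroup its children. -->
  <xsl:template match="*[B]">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- A new group starts at a node whose partner comes later. -->
      <xsl:for-each-group select="node()"
          group-starting-with="node()[f:partner(.) >> .]">
        <xsl:choose>
          <xsl:when test="f:partner(.) >> .">
            <!-- Whichever of the two is the A supplies the container id. -->
            <xsl:variable name="a" select="(., f:partner(.))[self::A]"/>
            <!-- Split at the node whose partner comes earlier. -->
            <xsl:for-each-group select="current-group()"
                group-ending-with="node()[f:partner(.) &lt;&lt; .]">
              <xsl:choose>
                <xsl:when test="position() = 1">
                  <C xml:id="{$a/@xml:id}Container">
                    <xsl:apply-templates select="current-group()"/>
                  </C>
                </xsl:when>
                <xsl:otherwise>
                  <xsl:apply-templates select="current-group()"/>
                </xsl:otherwise>
              </xsl:choose>
            </xsl:for-each-group>
          </xsl:when>
          <xsl:otherwise>
            <!-- Leading nodes before the first pair: pass through. -->
            <xsl:apply-templates select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

On the sample fragment from the question, the outer grouping should split the children of <d> just before <B>, and the inner grouping should split just after <A>, so the <C> wrapper ends up around <B> through <A> whichever of the two comes first. Whitespace-only text nodes immediately before the first member and after the second stay outside <C>. If the editors do find nested or overlapping pairs, this sketch would need rethinking, so the two-pass approach may still be the safer choice.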