Re: [xsl] Comparing direct ancestors

Subject: Re: [xsl] Comparing direct ancestors
From: Ihe Onwuka <ihe.onwuka@xxxxxxxxxxxxxx>
Date: Sun, 27 May 2012 22:23:24 +0100
On Thu, May 24, 2012 at 4:40 PM, Spencer Tickner
<spencertickner@xxxxxxxxx> wrote:
> Hi List,
>
> Thanks in advance for any help.
>
> I've been banging my head around this problem for awhile now and could
> use any advice anyone is willing to give. I have the following xml:
>
> <?xml version="1.0"?>
> <root>
>        <change-begin/>
>        <a>
>                <p>Foo<change-end/></p>
>        </a>
>        <change-begin/>
>        <b>
>                <d>
>                        <t>Bar</t>
>                </d>
>                <d>
>                        <t>Foo</t>
>                </d>
>        </b>
>        <change-end/>
>        <p>Nothing <change-begin/>to worry<change-end/> about</p>
> </root>
>
>
> So far it doesn't work and is fast becoming a bunch of spaghetti
> <choose> statements. If anyone has any ideas I would love to hear
> them.
>

OK. Lets start this again. This is long because it's meant to pedagogic.

When this problem initially appeared David Carlisle posted an almost
instantaneous solution.  I was still half heartedly trying to grok the
problem when the solution appeared and was completely awestruck at
David's solution. Actually dumbstruck would be more apt because I
didn't understand it and I had no idea how he had derived it. I gave
the code a passing glance and thought to myself - well you're  not in
his  league, don't even try to figure it out and the OP has a
solution.

As mentioned before though I wanted to see how far I could get by
applying Felleisen et al's Design Recipe to the problem.

I'm going to step through it again - this time to conclusion. I stress
I arrived at my solution independently to his but it will be
instructive to compare the answer the process yields to David's.

The purpose of the post is pedagogic, so lets slowly and methodically
step through the process right from the start. If you didn't consider
yourself capable of solving this problem this is for you.

Step 1 - Let us get a skeleton output where everything that is in it
is known to be in the right place.
Simplest way to do that is to remove all the change-begin and
change-end elements - see the bit commented STEP ONE

<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"; version="2.0"
    exclude-result-prefixes="">

   <xsl:template match="@*|node()">
       <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
       </xsl:copy>
   </xsl:template>

   <!-- STEP ONE Remove all change-begin and change-end elements -->
    <xsl:template match="change-begin|change-end"/>
</xsl:stylesheet>

resulting output.

<?xml version="1.0" encoding="UTF-8"?>
<root>

       <a>
               <p>Foo
      </p>
       </a>

       <b>
               <d>
                       <t>Bar</t>
               </d>
               <d>
                       <t>Foo</t>
               </d>
       </b>

       <p>Nothing to worry about</p>
</root>

Everything that is there is in the right place but some things that
were in the right place have been removed.
Lets put them back. Additional template rule - commented STEP TWO does that.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform";
version="2.0"
                exclude-result-prefixes="">

   <xsl:template match="@*|node()">
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
   </xsl:template>

   <!-- STEP ONE Remove all change-begin and change-end elements -->
    <xsl:template match="change-begin|change-end"/>

    <!-- STEP TWO Restore all change-begin and change-ends that are
already in the correct place -->
    <xsl:template
match="p/change-begin|p/change-end|t/change-begin|t/change-end">
       <xsl:copy-of select="."/>
    </xsl:template>

</xsl:stylesheet>

That gives us this

<?xml version="1.0" encoding="UTF-8"?><root>

       <a>
               <p>Foo<change-end/></p>
       </a>

       <b>
               <d>
                       <t>Bar</t>
               </d>
               <d>
                       <t>Foo</t>
               </d>
       </b>

       <p>Nothing <change-begin/>to worry<change-end/> about</p>
</root>

So the change-begin and the change-end that started off in the right
place have been put back there.

Right heres where the last attempt went off the rails. We need to pull
in the stray change-begin's and change-ends into the nearest allowable
element and the original attempt not worth repeating was to try and
come up with an XPath expression to do that. It's tough to do in a
push style, (see the previous post for the attempt and explanation).

Now is the time to step back and remember that all methodical
approaches have limitations and can only take you so far. Mind you
they can take you alot farther than you might think and they are often
abandoned too early. The Design Recipe has it's roots in Jackson
Structured Programming (very annoying that I can't just use the
acronym).

We cannot process wholly in push style because as at the time we
encounter the first change-begin we have not yet read the rest of the
nodes much less know where it needs to be inserted. The problem we are
coming up with here looks like what Jackson would have called a
structure clash.
Suffice to say that it warrants looking at different techniques the
language offers because like JSP (oops thats torn it) the basics
methodology only entails using a very small subset of the language.
That's not to say we willy nilly pluck any old idea from the language.
Rather we want to extend our problem solving toolset in a controlled
way.

Thats when I thought of using xsl:keys. By doing a lookahead (in the
case of change-behind) and lookbehind in the case of change end, Can
we associate  the change elements with the allowable elements, worth a
try isn't it. Sounds easier than trying to figure out the XPath's to
do the same thing (although YMMV on that one depending on your
abilities).

Lookahead and Lookbehind seem to be fairly disciplined activities (i.e
as opposed to goto like hacks). Let me stress that I arrived at the
idea to use keys independently (and alot slower) than David did. I did
not look at his code to see what or how had done it. I vaguely
recollected him using keys in his solution and that merely suggested
there was mileage in pursuing that angle.

OK step THREE is to define the key and step FOUR is to use it. In case
we are wrong we will just do the change-begins.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform";
version="2.0"
                exclude-result-prefixes="">

    <!-- STEP THREE Define a key to pair up a change to the allowable
element it should go into -->
    <xsl:key name="changes" match="*[not (self::p or self::t)]/change-begin"
            use="generate-id(following::*[self::p or self::t][1])"/>

   <xsl:template match="@*|node()">
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
   </xsl:template>

   <!-- STEP ONE Remove all change-begin and change-end elements -->
    <xsl:template match="change-begin|change-end"/>

    <!-- STEP TWO Restore all change-begin and change-ends that are
already in the correct place -->
    <xsl:template
match="p/change-begin|p/change-end|t/change-begin|t/change-end">
       <xsl:copy-of select="."/>
    </xsl:template>

    <!-- STEP FOUR USE the key in STEP THREE to insert the
change-begin's in the right place -->
    <xsl:template match="p|t">
	    <xsl:copy>
		       <xsl:copy-of select="key('changes', generate-id())"/>
	        <xsl:apply-templates/>
	    </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

The key says for every strange change-begin look ahead for the first
allowable element that should house it.  As to how I came up with that
xsl:key, well knowledge of keys should be part of  your arsenal as an
XSLT developer.

Step Four exploits implicit priorities on the template match rule to
identify allowable elements that could take a change-begin.

Lets run it.

<?xml version="1.0" encoding="UTF-8"?><root>

       <a>
               <p><change-begin/>Foo<change-end/>
      </p>
       </a>

       <b>
               <d>
                       <t><change-begin/>Bar</t>
               </d>
               <d>
                       <t>Foo</t>
               </d>
       </b>

       <p>Nothing <change-begin/>to worry<change-end/> about</p>
</root>

Voila, all the change begin's are in the right place.

Step 5 is to handle the change-end's. There's a symmetry to the
additional key definition that it entails that often emerges when you
are on the right track to a solution.


<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"; version="2.0"
    exclude-result-prefixes="">

    <!-- STEP THREE Define a key to pair up a change to the allowable
element it should go into -->
    <xsl:key name="changes" match="*[not (self::p or
self::t)]/change-begin" use="generate-id(following::*[self::p or
self::t][1])"/>

    <!-- STEP FIVE By symmetry extend the key to pair up change-ends
to allowable elements -->
    <xsl:key name="changes" match="*[not (self::p or
self::t)]/change-end" use="generate-id(preceding::*[self::p or
self::t][1])"/>

<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

   <!-- STEP ONE Remove all change-begin and change-end elements -->
    <xsl:template match="change-begin|change-end"/>

    <!-- STEP TWO Restore all change-begin and change-ends that are
already in the correct place -->
    <xsl:template
match="p/change-begin|p/change-end|t/change-begin|t/change-end">
       <xsl:copy-of select="."/>
    </xsl:template>

    <!-- STEP FOUR USE the key in STEP THREE to insert the
change-begin's in the right place -->
    <xsl:template match="p|t">
	    <xsl:copy>
		    <xsl:copy-of select="key('changes', generate-id())"/>
	    <xsl:apply-templates/>
	    </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

Running this  does give the answer that the OP was seeking. I won't
paste the answer so as to not take up even more space.

Now I have no idea what process David went through but having got the
answer I thought it would be worth taking a peek at his code.

Again I was awestruck at the similarities in the final solution' given
there had not been any mental synergy in their construction.

Awestruck but this time, happily, not dumbfounded. :)

Current Thread