[xsl] Re: Upward projection?

Subject: [xsl] Re: Upward projection?
From: "Imsieke, Gerrit, le-tex gerrit.imsieke@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sun, 16 Aug 2020 21:15:51 -0000
Replying to xsl-list with Wendell's permission, for the upward projection connoisseurs out there. See Wendell's message that I'm replying to below.

tl;dr: Wendell invented a slightly different upward projection method in 2013, and his method's performance for large chunks might be better than mine. The performance needs to be systematically measured for the same test set that I used in my XML Prague paper.


Hi Wendell,

Trying to wrap my head around what the stylesheet is doing.

So you are grouping leaf nodes from a chunking root, each group starting with a milestone. I do this, too.

In the 'build' template, you then group the leaves according to their ancestors' generated IDs, topmost ancestors first (not all ancestors though, only those that are descendants of the splitting root).

At $level=1, (ancestor::* except $from/ancestor-or-self::*)[$level] will give the child of $here that is an ancestor of the current leaf.

So each group collects the adjacent leaves that share the same top-level ancestor below $here.
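
To make the expression concrete, here is a rough Python model of what (ancestor::* except $from/ancestor-or-self::*)[$level] selects. It is only a sketch, not the stylesheet's code: the function name ancestor_at_level and the sample document are made up for illustration, and $from and $here are assumed to be the same node, as at $level=1 above.

```python
import xml.etree.ElementTree as ET

def ancestor_at_level(here, leaf, level):
    """Return the level-th ancestor of `leaf` below `here`
    (1 = the child of `here`), or None if there is none."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent = {child: p for p in here.iter() for child in p}
    chain = []
    node = parent[leaf]
    while node is not here:      # stop before `here` and its own ancestors
        chain.append(node)
        node = parent[node]
    chain.reverse()              # document order: outermost ancestor first
    return chain[level - 1] if level <= len(chain) else None

# Hypothetical sample document with a page-break milestone.
doc = ET.fromstring("<body><div><p><w>a</w><pb/><w>b</w></p></div></body>")
pb = doc.find(".//pb")
```

With this sample, level 1 yields the div, level 2 the p, and level 3 nothing, which corresponds to the empty $copying case discussed below.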

Line 41: $copying is this top-level ancestor below $here (at least for $level=1)

If it is empty ($here is current node's parent), the grouping key for group-adjacent is the zero-length string, so the complete group from the group-starting-with will be output.

If it isn't empty, the top-level ancestor will be copied.

Then the build template will be called for each group with $level=2, that is, the grandchildren of $here will be reproduced for each current group of leaves.
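
The recursion described so far can be modelled roughly as follows. This is a Python sketch of my reading of the 'build' template, not Wendell's XSLT: the names build and split_at_milestones are hypothetical, itertools.groupby stands in for group-adjacent, element identity stands in for generate-id(), and leaves are assumed to be childless elements (ElementTree has no separate text nodes).

```python
import xml.etree.ElementTree as ET
from itertools import groupby

def build(leaves, chain_of, level=1):
    """Rebuild the hierarchy above a run of leaves, one level per call."""
    def key(leaf):
        chain = chain_of[leaf]
        # Ancestor at this level, or None once the leaf's parent is reached.
        return chain[level - 1] if level <= len(chain) else None

    out = []
    for ancestor, group in groupby(leaves, key=key):  # like group-adjacent
        group = list(group)
        if ancestor is None:
            for leaf in group:                        # emit the leaves themselves
                copy = ET.Element(leaf.tag, dict(leaf.attrib))
                copy.text = leaf.text
                out.append(copy)
        else:
            copy = ET.Element(ancestor.tag, dict(ancestor.attrib))
            copy.extend(build(group, chain_of, level + 1))
            out.append(copy)
    return out

def split_at_milestones(here, milestone_tag):
    parent = {child: p for p in here.iter() for child in p}

    def chain(leaf):
        nodes, node = [], parent[leaf]
        while node is not here:
            nodes.append(node)
            node = parent[node]
        return nodes[::-1]                            # outermost ancestor first

    leaves = [e for e in here.iter() if len(e) == 0 and e is not here]
    chain_of = {leaf: chain(leaf) for leaf in leaves}
    chunks = []
    for leaf in leaves:                               # like group-starting-with
        if leaf.tag == milestone_tag or not chunks:
            chunks.append([])
        chunks[-1].append(leaf)
    result = []
    for chunk in chunks:
        copy = ET.Element(here.tag, dict(here.attrib))
        copy.extend(build(chunk, chain_of))
        result.append(copy)
    return result

doc = ET.fromstring(
    "<body><div><p><w>a</w><w>b</w><pb/><w>c</w></p><p><w>d</w></p></div></body>")
chunks = [ET.tostring(c, encoding="unicode") for c in split_at_milestones(doc, "pb")]
```

For this made-up input, the first chunk holds the two words before the pb, and the second chunk re-creates div and both p elements around the pb and the remaining words. Each call only looks at the leaves of its own group, so the work per chunk depends on the chunk's leaves and their depth, not on the size of the whole subtree.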

Ok, I think I get it.

I'd say this is an alternative way to do upward projection, and it is definitely independent discovery. We have been using this method since 2010, but we didn't mention it publicly before 2014.

And I can imagine that your solution doesn't suffer from the performance degradation for large chunk sizes that I described in the paper (https://archive.xmlprague.cz/2019/files/xmlprague-2019-proceedings.pdf#page=360).

What my method did was to transform $here with a tunneled parameter, $restricted-to, that contains the generated IDs of all descendants of $here that are also ancestors-or-self of the leaves in the current group. For each node, it decides whether the node is in the current $restricted-to set, and only then is it output. These comparisons become increasingly expensive as the chunks grow (the more strings $restricted-to contains). This may be a Saxon implementation detail, and we should measure the results for Saxon 10, too.
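
For comparison, the $restricted-to idea can be modelled like this. Again, this is a simplified Python sketch under stated assumptions, not the code from the paper: project and split_restricted are hypothetical names, Python's id() stands in for generate-id(), and the set also contains $here itself to keep the recursion simple.

```python
import xml.etree.ElementTree as ET

def project(node, restricted_to):
    """Copy `node` only if it is in the restricted set, then recurse."""
    if id(node) not in restricted_to:
        return None
    copy = ET.Element(node.tag, dict(node.attrib))
    if len(node) == 0:
        copy.text = node.text            # leaves keep their text
    for child in node:
        projected = project(child, restricted_to)
        if projected is not None:
            copy.append(projected)
    return copy

def split_restricted(here, milestone_tag):
    parent = {child: p for p in here.iter() for child in p}
    leaves = [e for e in here.iter() if len(e) == 0 and e is not here]
    chunks = []
    for leaf in leaves:                  # like group-starting-with
        if leaf.tag == milestone_tag or not chunks:
            chunks.append([])
        chunks[-1].append(leaf)
    result = []
    for chunk in chunks:
        # IDs of every ancestor-or-self of the chunk's leaves, up to `here`.
        restricted_to = {id(here)}
        for leaf in chunk:
            node = leaf
            while node is not here:
                restricted_to.add(id(node))
                node = parent[node]
        # The whole subtree of `here` is visited once per chunk.
        result.append(project(here, restricted_to))
    return result

doc = ET.fromstring(
    "<body><div><p><w>a</w><w>b</w><pb/><w>c</w></p><p><w>d</w></p></div></body>")
chunks = [ET.tostring(c, encoding="unicode") for c in split_restricted(doc, "pb")]
```

A Python set makes each membership test cheap, so this sketch does not reproduce the growing comparison cost described above. It only shows the shape of the approach, in which every chunk triggers a filtered transformation of $here.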

We should re-implement the code (https://subversion.le-tex.de/common/presentations/2019-02-09_xmlprague_xslt-upward-projection/examples/pb/) with your method and carry out the same measurements. If your method proves to be more efficient for large chunks, it will justify another XML Prague presentation, at least for a 15-minute slot. Or at the very least another post to this list.


P.S.: No deer here, only cat & dog. Also no rain, only sunshine at max. 30 °C (86 °F) during the day.

On 16.08.2020 17:07, Wendell Piez wrote:
Holla Gerrit,

Is this an example of upward projection? (see template name='build')


I am curious if it is a case of independent discovery. In any case, as
you well know, the problem occasionally rears its ugly head in the TEI
context, indeed this is actually a species of overlap (a natural
enough occurrence that is found to be 'unnatural' by XML).

Greetings from Maryland, where today it is raining and we have four
deer in our backyard this morning.

Cheers, Wendell
