Re: [jats-list] Tagging page information in flow of a document

Subject: Re: [jats-list] Tagging page information in flow of a document
From: "Imsieke, Gerrit, le-tex gerrit.imsieke@xxxxxxxxx" <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 1 Feb 2022 22:01:58 -0000
On 01.02.2022 18:24, Kevin Hawkins kevin.s.hawkins@xxxxxxxxxxxxxxxxxx wrote:
But on this point ...

On 2/1/22 7:05 AM, Imsieke, Gerrit, le-tex gerrit.imsieke@xxxxxxxxx wrote:
Calculating the page number that a given piece of rendered content is on can also be a bit expensive when dealing with milestone elements. If you use XSLT 2+ (or XQuery 1+) for rendering, the >> and << operators can be used as in ($all-page-milestones[. >> current()])[1] to find which of the pagination milestones comes next. This should be sufficiently efficient.

... this method gets complicated if you ever have footnotes that span pages since then you will have two milestones for "the same" page break: one in the main flow of the text and another in the affected footnote.

Or if <boxed-text>, too, may stretch across pages, separate from the main text flow. It will get even more complicated if <boxed-text> contains footnotes that are rendered below the <boxed-text> columns...


Considering only the footnotes vs. main text flow case: Here's a small gist with an XSLT that adds page numbers to text nodes:

https://gist.github.com/gimsieke/739f185a0bb5d032f2999aa5e439a163

By convention, the page numbers are encoded in IDs of the form 'p-X' (where X is a placeholder, not a Roman numeral...) or 'fnp-Y' for page breaks in footnotes.

If the text node is in a footnote, the preceding page break that comes last in document order, *whether regular or in a footnote*, gives the page number. For text nodes in the main text flow, only the last *regular* page break is considered.
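
To make the rule concrete outside of XSLT, here is a minimal Python sketch. It is not the code from the gist, only a model of the selection logic. It assumes that each page break has been reduced to a hypothetical document-order index (`pos`) plus its ID, and that the caller already knows whether the text node sits inside a footnote. The names `PageBreak` and `page_of` are made up for this illustration.

```python
from __future__ import annotations
from typing import NamedTuple, Optional


class PageBreak(NamedTuple):
    pos: int  # hypothetical document-order index of the milestone
    id: str   # 'p-X' in the main flow, 'fnp-Y' inside a footnote


def page_of(text_pos: int, in_footnote: bool,
            breaks: list[PageBreak]) -> Optional[str]:
    """Return the page label for a text node, or None if no break precedes it."""
    candidates = [
        b for b in breaks
        if b.pos < text_pos
        # Footnote text may use any preceding break; main-flow text
        # only considers regular ('p-...') breaks.
        and (in_footnote or b.id.startswith("p-"))
    ]
    if not candidates:
        return None
    last = max(candidates, key=lambda b: b.pos)  # last in document order
    return last.id.split("-", 1)[1]              # strip the 'p-' / 'fnp-' prefix
```

With breaks `p-1`, `p-2`, `fnp-3`, `p-3` in that document order, a text node in the footnote after `fnp-3` gets page 3, while main-flow text between `fnp-3` and `p-3` still gets page 2.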

In the interest of efficiency, it is important not to use preceding::target[1] or (preceding::target)[last()].

Well, the XSLT processor might optimize preceding::target[1] in a way that it will stop at the first target it encounters when looking back, so this approach doesn't have to be too inefficient.

Still, computing all page breaks once and passing them in a tunnel parameter (or in a global variable, but this isn't optimal for other reasons) will improve performance for larger documents. The XSLT processor then only needs to determine which of the *known* page breaks precede the current node; it need not examine the preceding axis afresh for each text node.
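
The "compute once, look up many times" idea can be sketched in Python as follows. This is only a model of the approach, not a statement about how any XSLT processor evaluates the expression internally. It assumes page breaks are given as hypothetical `(pos, id)` pairs, where `pos` is a document-order index; `build_index` and `last_break_before` are names invented for this example.

```python
from bisect import bisect_left


def build_index(breaks):
    """Sort (pos, id) pairs once and return parallel lists of positions and IDs."""
    ordered = sorted(breaks)
    return [p for p, _ in ordered], [i for _, i in ordered]


def last_break_before(pos, positions, ids):
    """Return the ID of the last known break strictly before pos, or None."""
    k = bisect_left(positions, pos)  # number of breaks with position < pos
    return ids[k - 1] if k else None
```

One index would hold all page breaks (for footnote text) and a second one only the regular `p-` breaks (for the main flow). Each lookup is then a binary search over the precomputed list rather than a fresh scan backwards through the document.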

But this is probably something I had better discuss on xsl-list.

Gerrit



Kevin
