Re: [EXTERNAL] Re: [jats-list] Tagging page information in flow of a document

Subject: Re: [EXTERNAL] Re: [jats-list] Tagging page information in flow of a document
From: "Imsieke, Gerrit, le-tex gerrit.imsieke@xxxxxxxxx" <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 1 Feb 2022 15:52:59 -0000
Of course, I didn't think of target, thank you, Martin & Jen.

I did intend to use only milestone-start without milestone-end, and I suggested using named-content as a quasi-atomic element with no substructure. So all three suggestions are roughly equivalent in their utility and in their (lack of) semantics. But I admit that target might be the most idiomatic approach.

Gerrit

On 01.02.2022 16:41, Jennifer Flint jen@xxxxxxxxxxxxxxxxxx wrote:
I admit to being one of those who uses an empty <target target-type="pagenumber" id="page-1"/> to identify the start of each PDF page. Using <milestone-start> ... <milestone-end> is tricky when you are dealing with floating objects (tables, boxes, figures) as they will be unintentionally trapped within your <milestone-start> and <milestone-end> elements when they may in fact be placed on a subsequent page in the layout. I find <target> fits in most locations where it is required.

Jen

jflintcreative.com

On Tue, Feb 1, 2022, at 16:18, Latterner, Martin (NIH/NLM/NCBI) [E] latternm@xxxxxxxxxxxxxxxx wrote:
To share an observation, not a recommendation:

I oftentimes see <target> being used for page numbers: <target
target-type="page" id="p6">VI</target>

Martin

-----Original Message-----
From: Imsieke, Gerrit, le-tex gerrit.imsieke@xxxxxxxxx
<jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Tuesday, February 01, 2022 8:05 AM
To: jats-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: [EXTERNAL] Re: [jats-list] Tagging page information in flow of
a document


Hi Jan,


Maybe you can use something like <milestone-start rationale="page"
id="p-14"/> and establish a convention that the page number is encoded
in the ID.

Alternatively you can put the page number in <named-content
content-type="page-number" id="p-14">14</named-content> and make sure
that it isn't treated as part of the normal text flow.

Both elements can only be used inline, so you need to put them, for
example, at the beginning of the section title if the page starts with
this section, rather than between two <sec> elements.

When full-text indexing the content, you might want to strip away these
elements before tokenizing. Or you might want to move these pagination
milestones to the end of each token so that you know on which page each
word was. Or for rendering HTML, if you convert the milestones to <a
id="..."> anchor elements, it might be advisable to move them from the
middle of a word to the beginning. Or you could establish a practice of
placing the page break information not between the exact characters
where the page break occurred but at the beginning or end of the word.
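
To illustrate the indexing idea, here is a minimal Python sketch that keeps
page markers out of the token stream and records the page each word falls
on. It assumes markers of the form <target target-type="page"
id="p6">VI</target>, with the page label as element content. The names
words_with_pages and start_page are made up for illustration.

```python
import xml.etree.ElementTree as ET


def words_with_pages(root, start_page=None):
    """Return (word, page) pairs in document order, skipping page markers."""
    page = start_page
    out = []

    def emit(text):
        # Whitespace tokenization only; real indexing would do more.
        out.extend((word, page) for word in (text or "").split())

    def visit(el):
        nonlocal page
        if el.tag == "target" and el.get("target-type") == "page":
            # Assumed convention: label in the content, ID as a fallback.
            page = (el.text or "").strip() or el.get("id")
            return
        emit(el.text)
        for child in el:
            visit(child)
            emit(child.tail)  # text that follows the child inside el

    visit(root)
    return out
```

A word that is split by a marker comes out as two tokens on different
pages, which is one reason to move markers to word boundaries first.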

Calculating the page number that a given piece of rendered content is
on can also be a bit expensive when dealing with milestone elements. If
you use XSLT 2+ (or XQuery 1+) for rendering, the >> and << operators
can be used as in ($all-page-milestones[. >> current()])[1] to find
which of the pagination milestones comes next. This should be
sufficiently efficient.
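
Outside XSLT, a comparable lookup can be done for all elements at once
with a single pass in document order. This is a minimal Python sketch
under stated assumptions, not the XPath approach itself. It assumes
markers of the form <milestone-start rationale="page" id="p-14"/> with the
page number encoded after the hyphen in the ID. The names start_pages and
first_page are hypothetical. It records the most recent marker before each
element's start tag, rather than the next marker.

```python
import xml.etree.ElementTree as ET


def start_pages(root, first_page=None):
    """Map each non-marker element to the page on which it starts."""
    pages = {}
    current = first_page
    # Element.iter() walks the tree in document order (pre-order).
    for el in root.iter():
        if el.tag == "milestone-start" and el.get("rationale") == "page":
            # Assumed ID convention: "p-14" means page 14.
            current = el.get("id", "").split("-", 1)[-1]
        else:
            pages[el] = current
    return pages
```

Because a marker inside a section title comes after the <sec> start tag,
the section itself is still attributed to the previous page here; only
the elements that follow the marker pick up the new page.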

Gerrit




On 01.02.2022 13:14, Jan Driesen jan.driesen@xxxxxxxxxxx wrote:
Dear all,

Is there a good practice for tagging information on the print page
number in JATS and / or BITS? Not just the start and end page at the
metadata level, but rather "along the flow" in the body text?

While the XML may be intended to be medium independent, it is
customary to cite statements in another work using the exact page
number in the print/PDF version. Print/PDF has this "page" feature
obviously exposed when consulting it, and users tend to consider this
paged format as a version of record for citing because of this.
Enabling other output formats to deliver these features (e.g. in a
side-bar or overlay for HTML) contributes to the capability of citing
them, without the need to revert to a paged medium version of the content.

I understand this touches on how digital or un-paged media can (or
should) be cited in other ways, but as things stand today, users of
non-paged media feel a need to refer to the print page number (of a
certain parallel paged version). If there is a good way to code this
in JATS/BITS, we could offer this information in other formats too,
so it is easier to cite the work...

Kind regards,

Jan Driesen

JATS-List info and archive
<http://www.mulberrytech.com/JATS/JATS-List/>