Re: [jats-list] Tagging page information in flow of a document

Subject: Re: [jats-list] Tagging page information in flow of a document
From: "Imsieke, Gerrit, le-tex gerrit.imsieke@xxxxxxxxx" <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 1 Feb 2022 13:05:20 -0000
Hi Jan,

Maybe you can use something like <milestone-start rationale="page" id="p-14"/> and establish a convention that the page number is encoded in the ID.

Alternatively you can put the page number in <named-content content-type="page-number" id="p-14">14</named-content> and make sure that it isn't treated as part of the normal text flow.

Both elements can only be used inline, so you need to put them, for example, at the beginning of the section title if the page starts with this section, rather than between two <sec> elements.

When full-text indexing the content, you might want to strip these elements before tokenizing. Or you might want to move the pagination milestones to the end of each token so that you know which page each word is on. For rendering HTML, if you convert the milestones to <a id="..."> anchor elements, it might be advisable to move them from the middle of a word to its beginning. Alternatively, you could establish a practice of not putting the page break information between the exact characters where the page break occurred, but at the beginning or end of the word.
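
As a rough illustration of the "move the milestone to the beginning of the word" idea, here is a minimal Python sketch using the standard library's ElementTree (Python rather than XSLT, purely so that it is easy to run). The function name is hypothetical, and it assumes the <milestone-start rationale="page"> convention suggested above. It only looks at the text node directly before the milestone, so a word that is split across other inline elements is not handled.

```python
import re
import xml.etree.ElementTree as ET


def is_page_milestone(el):
    # Assumed convention: <milestone-start rationale="page" id="p-14"/>
    return el.tag == "milestone-start" and el.get("rationale") == "page"


def move_milestones_to_word_start(root):
    """Shift page milestones that sit inside a word to the start of that word."""
    for parent in root.iter():
        children = list(parent)
        for i, child in enumerate(children):
            if not is_page_milestone(child):
                continue
            # Text directly before the milestone is either the parent's
            # leading text or the tail of the preceding sibling.
            before = (parent.text if i == 0 else children[i - 1].tail) or ""
            after = child.tail or ""
            # Only act when there is a non-space character on both sides.
            if not before or before[-1].isspace():
                continue
            if not after or after[0].isspace():
                continue
            head = re.search(r"\S+$", before).group(0)  # first half of the word
            kept = before[: len(before) - len(head)]
            if i == 0:
                parent.text = kept
            else:
                children[i - 1].tail = kept
            child.tail = head + after
    return root


p = ET.fromstring(
    '<p>A long hyphen<milestone-start rationale="page" id="p-14"/>'
    "ation example</p>"
)
move_milestones_to_word_start(p)
# The milestone now precedes the whole word.
print(ET.tostring(p, encoding="unicode"))
# For indexing, itertext() skips the empty milestone element entirely.
print("".join(p.itertext()))
```

Because the milestone is an empty element, joining the text nodes already yields the unbroken word for a tokenizer. Moving the element matters mainly when it becomes an HTML anchor.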

Calculating the page number that a given piece of rendered content is on can also be a bit expensive when dealing with milestone elements. If you use XSLT 2+ (or XQuery 1+) for rendering, the >> and << operators can be used as in ($all-page-milestones[. >> current()])[1] to find which of the pagination milestones comes next. This should be sufficiently efficient.
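
Outside XSLT, the same lookup can be done in a single pass in document order, remembering the most recent milestone. The following is only a sketch in Python with ElementTree, under several assumptions that are not part of JATS itself: each milestone marks the start of a page, the page number is encoded in the ID as "p-<number>", paragraphs carry an id attribute, and the function names and the first_page parameter are made up for illustration.

```python
import xml.etree.ElementTree as ET


def page_number(el):
    """Return the page encoded in a page milestone's ID, or None."""
    if el.tag == "milestone-start" and el.get("rationale") == "page":
        # Assumed ID convention: "p-14" means page 14.
        return int(el.get("id").split("-", 1)[1])
    return None


def paragraph_start_pages(root, first_page=None):
    """Map each <p> id to the page on which that paragraph starts."""
    current = first_page
    pages = {}
    for el in root.iter():  # iter() walks the tree in document order
        n = page_number(el)
        if n is not None:
            current = n
        elif el.tag == "p":
            start = current
            # A milestone that opens the paragraph (no text before it)
            # follows <p> in document order but still marks its start page.
            if len(el) and not (el.text or "").strip():
                lead = page_number(el[0])
                if lead is not None:
                    start = lead
            pages[el.get("id")] = start
    return pages


body = ET.fromstring(
    "<body>"
    '<p id="a">Starts on 13 <milestone-start rationale="page" id="p-14"/>'
    "and continues.</p>"
    '<p id="b">Entirely on page 14.</p>'
    '<p id="c"><milestone-start rationale="page" id="p-15"/>Opens page 15.</p>'
    "</body>"
)
print(paragraph_start_pages(body, first_page=13))
```

Note that this records the page on which a paragraph begins, which is the nearest preceding milestone, whereas the XPath expression above selects the next one. Which of the two you need depends on whether your milestones mark page starts or page ends.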

Gerrit




On 01.02.2022 13:14, Jan Driesen jan.driesen@xxxxxxxxxxx wrote:
Dear all,

Is there a good practice for tagging information on the print page number in JATS and/or BITS? Not just the start and end page at the metadata level, but rather "along the flow" in the body text?

While the XML may be intended to be medium independent, it is customary to cite statements in another work using the exact page number in the print/PDF version. Print/PDF obviously exposes this "page" feature when consulted, and users tend to consider the paged format the version of record for citing because of this. Enabling other output formats to deliver these features (e.g. in a sidebar or overlay for HTML) contributes to the capability of citing them, without the need to revert to a paged version of the content.

I understand this touches on how digital or un-paged media can (or should) be cited in other ways, but as things stand today, users of non-paged media feel a need to refer to the print page number (of a certain parallel paged version). If there is a good way to code this in JATS/BITS, we could offer this information in other formats too, so it is easier to cite the work…

Kind regards,

Jan Driesen

JATS-List info and archive <http://www.mulberrytech.com/JATS/JATS-List/>

--
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
gerrit.imsieke@xxxxxxxxx, http://www.le-tex.de

Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930

Geschäftsführer / Managing Directors:
Gerrit Imsieke, Svea Jelonek, Thomas Schmidt
