Re: [jats-list] Translations for section titles

Subject: Re: [jats-list] Translations for section titles
From: "Lizzi, Vincent vincent.lizzi@xxxxxxxxxxxxxxxxxxxx" <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 6 May 2022 17:25:39 -0000
Hi Gareth,

Thank you for sharing your comments!

What you've described doing with articles whose alternative language versions
are published in separate documents makes perfect sense, and it is one of the
use cases that the committee considered. The IDs provide a synchronization
between the two documents that contain alternative language versions of some
content. The @lang-group attribute can be used for synchronization between
documents, to describe more explicitly what you are already doing with the
@id attribute.
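
A minimal sketch of how that might look across two separate files. The file
names, IDs, group names, and text are all hypothetical; only the attribute
names come from this thread:

```xml
<!-- article-en.xml (hypothetical file name) -->
<p id="p1" xml:lang="en" lang-group="p1" lang-variant="original">English text</p>

<!-- article-de.xml (hypothetical file name) -->
<p id="p1" xml:lang="de" lang-group="p1" lang-variant="translation">Deutscher Text</p>
```

The shared @lang-group value states that the two paragraphs are language
variants of one another, while the matching @id values stay available as
synchronization points.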

There are also scenarios where the entire text of a journal article, or
substantial portions of the text, is published with alternative language
versions in one document. Both approaches (one document or separate documents)
have arguments for and against them. JATS users might choose either approach
and JATS can support both approaches.

In the example of a German technical paper that quotes some English language
content, the document would declare German as the primary language at the root
element <article xml:lang="de"> and the English language quote would have
@xml:lang="en" on the element that contains the quote. If there is a
translation of the English quote into German then the original and the
translation can be joined using the language attributes. For example:

<disp-quote xml:lang="en" lang-variant="original" lang-group="quote1"
id="quote1"><p>English text</p></disp-quote>
<p>das heißt übersetzt</p>
<disp-quote lang-variant="translation" lang-group="quote1"><p>Deutscher
Text</p></disp-quote>

If instead of translating the quote the author has summarized the English
quote in German then the German variant could be tagged as an interpretation
of the quote.

<p lang-variant="interpretation" lang-group="quote1">Zusammenfassung</p>

I'm using Google Translate so my apologies if the German text in this
example is incorrect.

Kind regards,
Vincent

_____________________________________________
Vincent M. Lizzi
Head of Information Standards | Taylor & Francis Group
vincent.lizzi@xxxxxxxxxxxxxxxxxxxx




From: Gareth Oakes goakes@xxxxxxx <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Friday, May 6, 2022 7:32 AM
To: jats-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [jats-list] Translations for section titles

Hi all,

I must admit to ignorance, having not attended the JATS-Con presentations, but
it seems a mistake to try and overload document elements with multi-lingual
features. In simple cases I'm sure it works out, but more complicated cases
may require each language to be in its own document. For example, an English
document with an index where the Mandarin equivalent has the index entries in
an entirely different sort order.

What we've done in complex cases is use unique IDs to identify relevant
structural elements across the individual language variants. These IDs
logically provide "synchronisation" points in each document where we know
the content will align. The IDs are unique within the source document but
shared across the language variants. A trivial example would be a French
document with 8 paragraphs and two headings, with the English equivalent
having only 6 paragraphs plus two headings. You can align at each heading and
give it a unique ID for "sync" purposes. You can even ID the paragraphs
(although two IDs will be missing in the English example). Some XML
legislation works this way, e.g. in the Canadian legal tradition that aligns
English and French.
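
As a rough sketch of that ID scheme, shortened to a single section. The file
names, ID values, and text are made up for illustration, and the element
names simply borrow from JATS:

```xml
<!-- doc-fr.xml (hypothetical): French variant -->
<sec id="s1">
  <title id="s1-title">Premier titre</title>
  <p id="s1-p1">Paragraphe 1</p>
  <p id="s1-p2">Paragraphe 2</p>
  <p id="s1-p3">Paragraphe 3</p>
</sec>

<!-- doc-en.xml (hypothetical): English variant, same IDs where content aligns -->
<sec id="s1">
  <title id="s1-title">First heading</title>
  <p id="s1-p1">Paragraph 1</p>
  <!-- no s1-p2 here: that paragraph has no English equivalent -->
  <p id="s1-p3">Paragraph 3</p>
</sec>
```

Each ID is unique within its own file, and a processor can pair elements
across the two files by matching ID values, skipping any ID that is present
in only one variant.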

Where the element overloading idea makes a lot of sense is when you have
documents authored in one language that contain another language as embedded
content. For example, a German technical paper may quote some English
language content. The "other language" elements may act as either block or
inline.

This is speaking from general experience; I'm not sure how much of it directly
relates to JATS. I guess at the end of the day it depends on what you want to
do with the multi-lingual data anyway. I hope some of the ideas are helpful
for the purposes of this discussion.

// Gareth Oakes
// Chief Project Officer, GPSL
// www.gpsl.co

From: "Nikos Markantonatos nikos@xxxxxxxxxx" <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Reply to: jats-list@xxxxxxxxxxxxxxxxxxxxxx
Date: Friday, 6 May 2022 at 17:48
To: jats-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [jats-list] Translations for section titles

Hi Gerrit.

Thanks for sharing this illustrative example. I personally find the collection
of all languages under the same <title> element by means of several
<named-content> elements semantically inelegant and simply unacceptable. I
suspect that the multi-lingual approach described by Vincent at JATS-Con will
need to be accompanied by a corresponding relaxing of several models in JATS
to allow for "zero or more" instances rather than the stricter "zero or one"
logic the models allow for today.

So <label> and <title> under <sec> will have to allow for "zero or more"
instances. They only allow for "zero or one" today. Same goes for <label> and
<title> under <list>. A similar logic must be applied to <label> and <caption>
under <table-wrap>. And the list goes on and on.
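
To illustrate what such a relaxed model would permit, here is a hypothetical
fragment. It is not valid against JATS as it stands today, since <sec> allows
at most one <title>; the IDs and lang-group values are assumptions adapted
from Gerrit's example:

```xml
<!-- Hypothetical: assumes <sec> were relaxed to allow "zero or more" <title> -->
<sec id="sec1">
  <title lang-group="sec1-title" xml:lang="en">English Title</title>
  <title lang-group="sec1-title" xml:lang="fr">Titre français</title>
  <title lang-group="sec1-title" xml:lang="de">Deutscher Titel</title>
  <p lang-group="sec1-p1" xml:lang="en">Paragraph</p>
  <p lang-group="sec1-p1" xml:lang="fr">Paragraphe</p>
  <p lang-group="sec1-p1" xml:lang="de">Absatz</p>
</sec>
```

With this shape the titles would be tagged the same way as the paragraphs,
without wrapping each translation in <named-content>.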

Vincent, I wonder whether there has been any thought on how many models in
JATS will need to be relaxed to allow for true side-by-side multi-lingual
encoding.

Nikos

On 5/4/22 9:42 PM, Imsieke, Gerrit, le-tex gerrit.imsieke@xxxxxxxxx wrote:
Hi Vincent (and list),

Yesterday after your presentation and during the JATS-Con social (or happy)
hour I complained that if you have a multiple-language document in which you
choose to tag your content with block-level items and their translations
side-by-side, you can't have translations for section titles.

I was thinking of someone who followed the "content items in two or more
languages" approach
(https://www.ncbi.nlm.nih.gov/books/NBK579699/#lizzi-content-items-in-two-or-more-languages)
by adding a translation for each figure, table-wrap, p etc. element that is
allowed within a section, side by side.

While you can have multiple figures, tables, paragraphs etc. in a row, you
can't provide translation elements for elements that are allowed exactly once
in a section, in particular, for the title element.

But there's a workaround: You can have the title translations as inline
content in the title element, like so:

<sec id="sec1">
  <title><named-content content-type="title-content" lang-group="sec1-content"
xml:lang="en">English Title</named-content>
    <named-content content-type="title-content" lang-group="sec1-content"
xml:lang="fr">Titre français</named-content>
    <named-content content-type="title-content" lang-group="sec1-content"
xml:lang="de">Deutscher Titel</named-content></title>
  <p lang-group="sec1-p1" xml:lang="en">Paragraph</p>
  <p lang-group="sec1-p1" xml:lang="fr">Paragraphe</p>
  <p lang-group="sec1-p1" xml:lang="de">Absatz</p>
</sec>

So I think it's feasible by and large to pursue a block-level translation
approach, with minor sacrifices where you need to process inline translations.

(Note: I'm not advocating such an approach, but it came up occasionally when
discussing how to tag multilingual content with customers.)

Gerrit

--
Nikos Markantonatos | Atypon, Greece Operations Head
Leoforos Ethnikis Antistaseos 39A, 3rd floor, Nea Ionia, 14234, Greece
office +302110133003 | mobile +306974302945 |
nikos@xxxxxxxxxx