Re: [jats-list] Translations for section titles

Subject: Re: [jats-list] Translations for section titles
From: "Imsieke, Gerrit, le-tex gerrit.imsieke@xxxxxxxxx" <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sat, 7 May 2022 09:55:20 -0000
If you decide to make the language switch at the top-level sections, you'll have a German, an English, and a French section 5 side by side in the body.

Where do you put the mixed-language table that is supposed to span all three columns?

Maybe mark one of the three top-level section 5s with lang-focus="primary", put the mixed-language table in this section and omit it in the others? You might not even need to mark it as primary.

Instruct the renderer to render block-level content that is only present in the primary language variant centered and across all columns?

But what if this content is only relevant for the primary language and should be rendered only in its own column?

You can modify the instructions to the renderer as follows:

Block-level content
  - that is only present in one language
  - but that contains inline lang-groups for all languages
should be rendered across all columns.

I think this might work.
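
The rule above could be sketched in Python roughly as follows. This is only an illustration under assumptions: lang-group is the attribute proposed in this thread (not part of released JATS), "only present in one language" is approximated as "no other element in the document shares the block's lang-group", and the sample document, ids, and function name are made up.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# ElementTree exposes xml:lang under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical document: language switch on the top-level sections,
# with the mixed-language table stored only in the German variant.
SAMPLE = """
<article xml:lang="mul">
  <sec xml:lang="de" lang-group="sec-5" id="sec-5-de">
    <p lang-group="sec-5-p1">Absatz</p>
    <table-wrap id="t1">
      <table><tbody><tr>
        <td><p xml:lang="de" lang-group="t1-c1">Begriff</p></td>
        <td><p xml:lang="en" lang-group="t1-c1">Term</p></td>
        <td><p xml:lang="fr" lang-group="t1-c1">Terme</p></td>
      </tr></tbody></table>
    </table-wrap>
    <p id="de-only">Nationale Anmerkung</p>
  </sec>
  <sec xml:lang="en" lang-group="sec-5" id="sec-5-en">
    <p lang-group="sec-5-p1">Paragraph</p>
  </sec>
  <sec xml:lang="fr" lang-group="sec-5" id="sec-5-fr">
    <p lang-group="sec-5-p1">Paragraphe</p>
  </sec>
</article>
"""


def spans_all_columns(block, all_langs, group_sizes):
    """Apply the rule: present in one language only, but containing
    lang-grouped content for every language of the document."""
    group = block.get("lang-group")
    only_in_one_language = group is None or group_sizes[group] == 1
    inline_langs = {
        el.get(XML_LANG)
        for el in block.iter()
        if el is not block and el.get("lang-group") is not None
    }
    return only_in_one_language and all_langs <= inline_langs


root = ET.fromstring(SAMPLE)
all_langs = {sec.get(XML_LANG) for sec in root.findall("sec")}
group_sizes = Counter(
    el.get("lang-group") for el in root.iter() if el.get("lang-group") is not None
)

# ids of the blocks a renderer would set across all columns
wide = [
    block.get("id")
    for sec in root.findall("sec")
    for block in sec
    if spans_all_columns(block, all_langs, group_sizes)
]
print(wide)
```

With this sample, the table qualifies, while the German-only paragraph (no inline lang-groups) stays in its own column, which is the distinction the two questions above are about.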

It should be noted that these side-by-side renditions (and manuscripts: content was not only rendered, but also edited in multiple columns) have been deprecated within DIN for several years now. DIN 820-2, of which the image that I shared was a screenshot (albeit of the 2004 version), is an exception. The brand new DIN 820-2 manuscript that we are currently processing still has, at its core, a two-column German and English version of "CEN/CENELEC Internal Regulations -- Part 3, Principles and rules for the structure and drafting of CEN and CENELEC documents (ISO/IEC Directives -- Part 2:2021, modified)", augmented by a single-column German national appendix.

In the two-column part, there are no artisanally merged tables like the one I shared; that is, this part might already have been merged automatically from individual single-language sources in a straightforward manner.


On 06.05.2022 19:54, B Tommie Usdin btusdin@xxxxxxxxxxxxxxxx wrote:
I don't see the problem.

As I see it, you have a section at a level we don't see in this example. It contains:
   - some stuff that comes before this location in the document,
   - a table, which contains content in several languages, tagged at the phrase or paragraph or section level inside the table
   - a section in German, with an @id, an @xml:lang (de), and a @lang-group, and inside it:
      - a label
      - a title
      - a paragraph
      - a paragraph
   - a section in English, with an @id (not the same as the one on the German section), an @xml:lang (en), and a @lang-group
    (the SAME as the one on the German section), and inside it:
      - a label
      - a title
      - a paragraph
      - a paragraph
   - a section in French, with an @id (not the same as the one on the German or English sections), an @xml:lang (fr), and a @lang-group
    (the SAME as the one on the German section), and inside it:
      - a label
      - a title
      - a paragraph
      - a paragraph
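
As a rough illustration of the structure listed above, the following Python sketch collects sections by their lang-group and checks that each language variant has its own id. It assumes the proposed lang-group attribute discussed in this thread; the sample content, ids, and the function name collect_groups are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# ElementTree exposes xml:lang under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical enclosing section: intro text, a mixed-language table,
# then one section per language, all sharing the same lang-group.
SAMPLE = """
<sec id="sec-5-2-4">
  <p>Some stuff that comes before.</p>
  <table-wrap id="t1"><table><tbody><tr>
    <td xml:lang="de">Begriff</td>
    <td xml:lang="en">Term</td>
    <td xml:lang="fr">Terme</td>
  </tr></tbody></table></table-wrap>
  <sec id="s1-de" xml:lang="de" lang-group="s1">
    <label>1</label><title>Titel</title><p>Absatz</p><p>Absatz</p>
  </sec>
  <sec id="s1-en" xml:lang="en" lang-group="s1">
    <label>1</label><title>Title</title><p>Paragraph</p><p>Paragraph</p>
  </sec>
  <sec id="s1-fr" xml:lang="fr" lang-group="s1">
    <label>1</label><title>Titre</title><p>Paragraphe</p><p>Paragraphe</p>
  </sec>
</sec>
"""


def collect_groups(elem, inherited_lang=None, groups=None):
    """Map each lang-group value to {language: <sec> element}.

    xml:lang is inherited from the nearest ancestor that declares it,
    so nested sections need not repeat it.
    """
    if groups is None:
        groups = defaultdict(dict)
    lang = elem.get(XML_LANG, inherited_lang)
    if elem.tag == "sec" and elem.get("lang-group") is not None:
        groups[elem.get("lang-group")][lang] = elem
    for child in elem:
        collect_groups(child, lang, groups)
    return groups


groups = collect_groups(ET.fromstring(SAMPLE))
titles = {lang: sec.findtext("title") for lang, sec in groups["s1"].items()}
ids = [sec.get("id") for sec in groups["s1"].values()]

# Each language variant must carry its own id.
assert len(set(ids)) == len(ids)
print(titles)
```

The table sits outside the grouped sections, so it is not assigned to any single language column in this sketch.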

What am I missing here?

-- Tommie

On May 6, 2022, at 12:34 PM, Imsieke, Gerrit, le-tex gerrit.imsieke@xxxxxxxxx <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

If you decide to make the language divide at a top-level section, you can't have different languages side by side at lower levels.

Consider the table in

It contains the three language variants on a paragraph or table cell level.

If you did the language split on Section 5.2.4 (the section that contains the table), you can't have this multi-language table within the section (unless you repeated it in each language version). The same is true if you did the language split on Section 5 -- then you cannot have multilingual objects within any subsection unless you repeat them in each language variant of the section.

My point is that having a language switch on a given level prevents encoding fine-grained language-dependent code switches on lower levels.


On 06.05.2022 17:41, B Tommie Usdin btusdin@xxxxxxxxxxxxxxxx wrote:
Hi Gerrit --
I don't understand why you say "only if you decide to provide side-by-side language content on the topmost section level, not below". My example is at a high level, but I think this could be done at any level. Can you show an example where this would not work?
-- Tommie
On May 6, 2022, at 11:36 AM, Imsieke, Gerrit, le-tex gerrit.imsieke@xxxxxxxxx <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

Hi Tommie,

This of course works, but only if you decide to provide side-by-side language content on the topmost section level, not below. This feels only marginally better for block-level advocates than to have a distinct document for each language. If you have the language switch/group as coarse-grained as on the topmost section level you might as well have it on the article level -- without lang-groups, but with the assumption that all language versions should align by identical IDs.

As I said, I'm not advocating side-by-side alignment of translations on a block level granularity, but there are people who want to have it and the examples in the paper suggest that it's attainable. As Nikos pointed out, with repeatable title and label elements this would be possible, but it's a huge stretch grammar-wise.

I observe that the utility or universal applicability of the lang-group feature is significantly hampered by the grammatical constraints (that is, at most one title per section), but I don't know whether it makes lang-group largely useless (probably not).

All I can say at the moment is that I have the feeling that this discussion will lead us somewhere.


On 06.05.2022 17:09, B Tommie Usdin btusdin@xxxxxxxxxxxxxxxx wrote:
Hi Gerrit --
I'm jumping in because your example of interleaved titles and paragraphs in multiple languages makes me very uncomfortable. I think there is a much more graceful way to encode what I think you are describing and to provide the linking at multiple levels that I think you need.
The example below shows:
   - several levels of sections, each with a label, title, and some content at the bottom level.
   - The "same" text is provided in 3 languages. (Please don't get caught in the details of the language content;
     it was created by an auto-translator and it is probably horrible. I hope it is good enough to make my point.)
   - each of the language versions is structurally coherent, that is, it is a section with its identified language
     and content as a JATS user would expect to see it
   - at all levels the structures are associated with the other language versions of the same structure, which should
     allow side by side alignment should that be desired
I don't know if the list will mangle this example; if it does and anyone reading it wants a clean copy, please send me email at btusdin@xxxxxxxxxxxxxxxx and I'll be happy to send a zipped copy to you directly.
-- Tommie
<?xml version="1.0" encoding="UTF-8"?>
<!--<!DOCTYPE article SYSTEM "some-imaginary-future-JATS-journalpublishing1-X.dtd">-->
<article xml:lang="mul">
  <processing-meta lang-grouping="yes"> </processing-meta>
    <sec xml:lang="en" lang-group="sec-1" id="sec-1">
     <!-- In English -->
     <label>Part I.</label>
     <title>Rules of Order.</title>
     <sec lang-group="sec-1-1" id="sec-1-1">
      <label>Art. I.</label>
      <title>Introduction of Business. [§§ 1-5.]</title>
      <sec lang-group="sec-1-1-1" id="sec-1-1-1">
       <p>All business should be brought before the assembly by a
        motion of a member, or by the presentation of a communication
        to the assembly. It is not usual, however, to make a motion to
        receive the reports of committees [§ 30] or communications to
        the assembly; and in many other cases in the ordinary routine
        of business, the formality of a motion is dispensed with; but
        should any member object, a regular motion becomes
        necessary.</p>
      </sec>
      <sec lang-group="sec-1-1-2" id="sec-1-1-2">
       <p>Before a member can make a motion or address the assembly
        upon any question, it is necessary that he obtain the floor;
        that is, he must rise and address the presiding officer by his
        title, thus: "Mr. Chairman" [§ 34], who will then announce the
        member's name. Where two or more rise at the same time the
        Chairman must decide who is entitled to the floor, which he
        does by announcing that member's name. From this decision,
        however, an appeal [§ 14] can be taken; though if there is any
        doubt as to who is entitled to the floor, the Chairman can at
        the first allow the assembly to decide the question by a
        vote, the one getting the largest vote being entitled to the
        floor.</p>
      </sec>
     </sec>
    </sec>
    <sec xml:lang="nl" lang-group="sec-1">
     <!-- In Dutch -->
     <label>Deel I.</label>
     <sec lang-group="sec-1-1">
      <label>Kunst. I.</label>
      <title>Introductie van het bedrijfsleven. [§§ 1-5.]</title>
      <sec lang-group="sec-1-1-1">
       <p>Alle zaken dienen aan de vergadering te worden voorgelegd
        door een motie van een lid of door de presentatie van een
        mededeling aan de vergadering. Het is echter niet gebruikelijk
        om een motie in te dienen om de verslagen van commissies [§ 30]
        of mededelingen aan de vergadering te ontvangen; en in veel
        andere gevallen in de gewone gang van zaken wordt de
        formaliteit van een motie achterwege gelaten; maar mocht een
        lid bezwaar maken, dan wordt een reguliere motie</p>
      </sec>
      <sec lang-group="sec-1-1-2">
       <p>Voordat een lid een motie kan indienen of de vergadering kan
        toespreken over een vraag, is het noodzakelijk dat hij het
        woord krijgt; dat wil zeggen, hij moet opstaan en de voorzitter
        aanspreken met zijn titel, dus: "Meneer de Voorzitter" [§ 34],
        die dan de naam van het lid zal aankondigen. Wanneer twee of
        meer tegelijk opstaan, moet de voorzitter beslissen wie het
        woord mag voeren, hetgeen hij doet door de naam van dat lid
        bekend te maken. Tegen deze beslissing kan hij echter beroep
        instellen [§ 14]; Maar als er enige twijfel bestaat over wie
        het woord heeft, kan de voorzitter bij de eerste gelegenheid de
        vergadering toestaan om de kwestie door middel van een stemming
        te beslissen; degene die de grootste stem krijgt, heeft het
        woord.</p>
      </sec>
     </sec>
    </sec>
    <sec xml:lang="el" lang-group="sec-1">
     <!-- In Greek -->
     <label>Μέρος I.</label>
     <title>Κανόνες τάξης.</title>
     <sec lang-group="sec-1-1">
      <label>Τέχνη. I.</label>
      <title>Εισαγωγή της επιχείρησης. [§§ 1-5.]</title>
      <sec lang-group="sec-1-1-1">
       <p><!-- Greek text of the first paragraph: not recoverable from the archived message --></p>
      </sec>
      <sec lang-group="sec-1-1-2">
       <p><!-- Greek text of the second paragraph: not recoverable from the archived message --></p>
      </sec>
     </sec>
    </sec>
</article>

On May 4, 2022, at 2:42 PM, Imsieke, Gerrit, le-tex gerrit.imsieke@xxxxxxxxx <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

Hi Vincent (and list),

Yesterday after your presentation and during the JATS-Con social (or happy) hour I complained that if you have a multiple-language document in which you choose to tag your content with block-level items and their translations side-by-side, you can't have translations for section titles.

I was thinking of someone who followed the "content items in two or more languages" approach ( by adding a translation for each figure, table-wrap, p etc. element that is allowed within a section, side by side.

While you can have multiple figures, tables, paragraphs etc. in a row, you can't provide translation elements for elements that are allowed exactly once in a section, in particular, for the title element.

But there's a workaround: You can have the title translations as inline content in the title element, like so:

<sec id="sec1">
  <title><named-content content-type="title-content" lang-group="sec1-content" xml:lang="en">English Title</named-content>
    <named-content content-type="title-content" lang-group="sec1-content" xml:lang="fr">Titre français</named-content>
    <named-content content-type="title-content" lang-group="sec1-content" xml:lang="de">Deutscher Titel</named-content></title>
  <p lang-group="sec1-p1" xml:lang="en">Paragraph</p>
  <p lang-group="sec1-p1" xml:lang="fr">Paragraphe</p>
  <p lang-group="sec1-p1" xml:lang="de">Absatz</p>
</sec>

So I think it's feasible by and large to pursue a block-level translation approach, with minor sacrifices where you need to process inline translations.
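
To give an idea of what "processing inline translations" might involve, here is a small Python sketch that pulls the per-language titles out of the named-content wrappers and assembles one language column. It is only an illustration: lang-group is the attribute proposed in this discussion, and the function names title_by_language and column are made up.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

SAMPLE = """
<sec id="sec1">
  <title><named-content content-type="title-content" lang-group="sec1-content" xml:lang="en">English Title</named-content>
    <named-content content-type="title-content" lang-group="sec1-content" xml:lang="fr">Titre français</named-content>
    <named-content content-type="title-content" lang-group="sec1-content" xml:lang="de">Deutscher Titel</named-content></title>
  <p lang-group="sec1-p1" xml:lang="en">Paragraph</p>
  <p lang-group="sec1-p1" xml:lang="fr">Paragraphe</p>
  <p lang-group="sec1-p1" xml:lang="de">Absatz</p>
</sec>
"""


def title_by_language(sec):
    """Return {language: title text} from lang-grouped inline wrappers."""
    title = sec.find("title")
    if title is None:
        return {}
    return {
        nc.get(XML_LANG): "".join(nc.itertext()).strip()
        for nc in title.findall("named-content")
        if nc.get("lang-group") is not None
    }


def column(sec, lang):
    """Collect the title and the block-level paragraphs for one language."""
    texts = [title_by_language(sec).get(lang, "")]
    texts += [
        "".join(p.itertext()) for p in sec.findall("p") if p.get(XML_LANG) == lang
    ]
    return texts


sec = ET.fromstring(SAMPLE)
for lang in ("en", "fr", "de"):
    print(lang, column(sec, lang))
```

The title needs its own code path because its translations are inline, while the paragraphs can be picked by their own xml:lang. That extra branch is the "minor sacrifice" in question.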

(Note: I'm not advocating such an approach, but it came up occasionally when discussing how to tag multilingual content with customers.)


-- 
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
gerrit.imsieke@xxxxxxxxx,

Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930

Geschäftsführer / Managing Directors:
Gerrit Imsieke, Svea Jelonek, Thomas Schmidt

B. Tommie Usdin			mailto: btusdin@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.
Phone: 301/315-9631
Mulberry Technologies, Inc.: A Consultancy Specializing in XML for Prose Documents
