Re: [jats-list] Markup for linguistics (glossed text)

From: Michael Boudreau <mboudreau@xxxxxxxxxxxxxxxxxx>
Date: Wed, 20 Nov 2013 16:56:24 +0000

Thanks, everyone, for these comments. I should have mentioned that we're
currently using graphics, like so (highly simplified):

   <p>Some text precedes an example:</p>
   <p><graphic href="example1.tiff"/></p>
   <p>And the text continues.</p>

This can be converted by our host to a readable HTML presentation. The
downside is that the content of the graphic is not searchable by the
user's browser (though the site's search engine can build its index from
the PDF version, so all is not lost), and the graphic's visual quality is
relatively low, particularly on mobile devices.

To answer Nikos's question, I don't have a current project that requires a
particular type of markup for such examples, but the examples in their
context just don't strike me as "tabular"--but I'm not a linguist and
would defer to the journal editors if they deemed table markup
appropriate. I think <ruby> is closer to the mark; I'd have to do
extensive testing to see if it could handle examples with multiple layers
of glossing on the base text (sometimes there are 2 or 3 or more). (I
tremble to think what it would take to train our typesetting vendors to
apply either <table> or <ruby> markup to these examples.)
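
For illustration only, a single layer of glossing in <ruby> might look
something like the sketch below. It assumes the JATS 1.1d1 elements Debbie
describes (<rb> for the base, <rt> for the annotation); "word1" and
"GLOSS1" are placeholders, not text from the journal, and the fragment has
not been validated against the DTD:

   <p>
     <ruby><rb>word1</rb><rt>GLOSS1</rt></ruby>
     <ruby><rb>word2</rb><rt>GLOSS2</rt></ruby>
   </p>

Each base word carries its own gloss, so the sentence is not split into
table cells; a second or third gloss layer is the part that would need
testing.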

I hadn't thought of <array>, which actually might help solve a processing
problem on our vendor's side even while still using <graphic>.
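
A rough sketch of the <array> idea, using the same highly simplified
graphic markup as above and not checked against the DTD. Placing <array>
between the paragraphs rather than inside <p> is an assumption here:

   <p>Some text precedes an example:</p>
   <array><graphic href="example1.tiff"/></array>
   <p>And the text continues.</p>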


--
Michael R. Boudreau
Electronic Publishing Technology Manager
The University of Chicago Press
1427 E. 60th Street
Chicago, IL 60637
(773) 753-3298
www.journals.uchicago.edu





On 11/20/13, 9:14 AM, "Alexander Schwarzman" <aschwarzman@xxxxxxxxx> wrote:

>Or, perhaps, use <array>, with either <graphic>, as Nikos suggested,
>or with <tbody> inside...
>
>--Sasha
>
>Alexander ('Sasha') Schwarzman, Content Technology Architect
>phone: +1.202.416.1979 | e-mail: aschwarzman@xxxxxxx
>
>The Optical Society (OSA)
>2010 Massachusetts Ave., NW
>Washington, DC 20036 USA
>www.osa.org
>
>
>On Wed, Nov 20, 2013 at 5:01 AM, Nikos Markantonatos <nikos@xxxxxxxxxx>
>wrote:
>> Hi Michael,
>>
>> The question that arises of course out of the "semantically reasonable"
>> encoding of such difficult pieces of text is why you need it. Are you
>> planning to draw some logic across different types of such linguistic
>> representations? In that case, JATS alone will hardly offer you a
>> solution. JATS often resorts to other known standards for the
>> representation of "tough" textual pieces, such as mathematical equations
>> (MathML) and tables (XHTML, OASIS). If there were a corresponding XML
>> encoding standard for linguistic representations, one could make the
>> case for embedding it into JATS.
>>
>> Otherwise, you are left to choose between the encoding options suggested
>> by Debbie, or to capture it as an image (my favorite option), or even
>> attempt to represent it in TeX/LaTeX or MathML.
>>
>> Best regards,
>> Nikos Markantonatos
>> Atypon
>>
>>
>> On 11/19/2013 11:47 PM, Debbie Lapeyre wrote:
>>>
>>> Dear Michael--
>>>
>>> Ouch! No, you are not overlooking anything obvious. The problem
>>> is that, although you ask for "semantically reasonable", you
>>> really want presentation markup. By design, JATS does not do
>>> presentation, at least not very well.
>>>
>>>   - My first thought is a table, which this certainly looks like
>>>     to me. But I do see your problem.
>>>
>>>   - If it has to present EXACTLY this way, another obvious
>>>     (but less than perfect) choice is <preformat>. That would
>>>      - force this into a monofont (sorry about that)
>>>      - preserve all your alignments and whitespace
>>>      - let you include the italics, bold, and stuff.
>>>
>>>   - Another possibility (not in NLM 3.0, but in the brand new
>>>     JATS 1.1d1) is using <ruby>, which has a base (<rb>) and a
>>>     ruby text annotation (<rt>) traditionally displayed atop the
>>>     base (<rb>), or inside parentheses after the base for browsers
>>>     that cannot handle Ruby. Ruby is part of HTML5, as well as
>>>     part of JATS. Ruby markup is intended for textual annotation,
>>>     and might fit this case very well.
>>>
>>> But I've got to tell you, I found this example incredibly hard to
>>> human parse and be sure what went with what and why were these 2
>>> clusters parallel and that one all alone? When the top line and the
>>> bottom line both had values, I was fine, but sometimes... Whatever
>>> you decide, a few horizontal lines or just more white space between
>>> the lines and/or less between the line and its gloss, would help
>>> me to separate.
>>>
>>> --Debbie
>>>
>>>
>>> On Nov 19, 2013, at 4:17 PM, Michael Boudreau
>>> <mboudreau@xxxxxxxxxxxxxxxxxx> wrote:
>>>
>>>> Greetings,
>>>>
>>>> Has anyone tackled the problem of marking up textual illustrations that
>>>> require multiple points of vertical alignment--the sort of thing for
>>>> which you'd set tab stops on a typewriter or word processor?
>>>>
>>>> I'm working on a linguistics journal that has lots of glossed text
>>>> illustrations that are typeset like the items labeled (3) and (4) on
>>>> this page image:
>>>>
>>>>    http://mss.uchicago.edu:81/mrb/linguistics.png
>>>>
>>>> We're using the NLM Journal Publishing 3.0 DTD, and I'm at a loss for a
>>>> markup solution that seems semantically reasonable and illustrates the
>>>> relationships between the chunks of text that the typesetting makes
>>>> obvious. I've considered table markup, but I don't want to break a
>>>> single sentence or other unit of meaning into multiple table cells
>>>> across a row. When I consider how our online host would convert XML into
>>>> HTML, I see only the same bad option.
>>>>
>>>> Am I overlooking something obvious?
>>>>
>>>>
>>>
>>>
>>> ================================================================
>>> Deborah A Lapeyre              mailto:dalapeyre@xxxxxxxxxxxxxxxx
>>> Mulberry Technologies, Inc.      http://www.mulberrytech.com
>>> 17 West Jefferson Street         Phone: 301-315-9631 (USA)
>>> Suite 207                        Fax:   301-315-8385
>>> Rockville, MD 20850
>>> ----------------------------------------------------------------
>>> Mulberry Technologies: Consultancy for XML, XSLT, and Schematron
>>> ================================================================
