Re: [jats-list] Markup for linguistics (glossed text)

From: Michael Boudreau <mboudreau@xxxxxxxxxxxxxxxxxx>
Date: Thu, 21 Nov 2013 22:01:04 +0000
For what it's worth, our hosting platform informs me that the only way to
get these images to display at a consistent size is to submit the
<graphic> element as a child of <disp-formula>. They were not sympathetic
to my pointing out that these are not math.
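
For illustration only, the workaround described above would look roughly
like this. It is a sketch, assuming the xlink namespace is declared on the
article element; the id value is made up, and the file name is the one
from the simplified sample quoted below:

   <p>Some text precedes an example:</p>
   <disp-formula id="gloss1">
     <graphic xlink:href="example1.tiff"/>
   </disp-formula>
   <p>And the text continues.</p>

The image is tagged as a display formula purely to get the host's sizing
behavior, not because the content is mathematical.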

--
Michael R. Boudreau
Electronic Publishing Technology Manager
The University of Chicago Press
1427 E. 60th Street
Chicago, IL 60637
(773) 753-3298
www.journals.uchicago.edu





On 11/20/13, 10:56 AM, "Michael Boudreau" <mboudreau@xxxxxxxxxxxxxxxxxx>
wrote:

>Thanks, everyone, for these comments. I should have mentioned that we're
>currently using graphics, like so (highly simplified):
>
>   <p>Some text precedes an example:</p>
>   <p><graphic href="example1.tiff"/></p>
>   <p>And the text continues.</p>
>
>This can be converted by our host to a readable HTML presentation. The
>downside is that the content of the graphic is not searchable by the
>user's browser (though the site's search engine can build its index from
>the PDF version, so all is not lost), and the graphic's visual quality is
>relatively low, particularly on mobile devices.
>
>To answer Nikos's question, I don't have a current project that requires a
>particular type of markup for such examples, but the examples in their
>context just don't strike me as "tabular"--but I'm not a linguist and
>would defer to the journal editors if they deemed table markup
>appropriate. I think <ruby> is closer to the mark; I'd have to do
>extensive testing to see if it could handle examples with multiple layers
>of glossing on the base text (sometimes there are 2 or 3 or more). (I
>tremble to think what it would take to train our typesetting vendors to
>apply either <table> or <ruby> markup to these examples.)
>
>I hadn't thought of <array>, which actually might help solve a processing
>problem on our vendor's side even while still using <graphic>.
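>
>For illustration only, a sketch of what wrapping the existing image in
><array> might look like (assuming the xlink namespace is declared on the
>article element; the file name is the one from the simplified sample
>above):
>
>   <p>Some text precedes an example:</p>
>   <array>
>     <graphic xlink:href="example1.tiff"/>
>   </array>
>   <p>And the text continues.</p>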
>
>
>On 11/20/13, 9:14 AM, "Alexander Schwarzman" <aschwarzman@xxxxxxxxx>
>wrote:
>
>>Or, perhaps, use <array>, with either <graphic>, as Nikos suggested,
>>or with <tbody> inside...
>>
>>--Sasha
>>
>>Alexander ('Sasha') Schwarzman, Content Technology Architect
>>phone: +1.202.416.1979 | e-mail: aschwarzman@xxxxxxx
>>
>>The Optical Society (OSA)
>>2010 Massachusetts Ave., NW
>>Washington, DC 20036 USA
>>www.osa.org
>>
>>
>>On Wed, Nov 20, 2013 at 5:01 AM, Nikos Markantonatos <nikos@xxxxxxxxxx>
>>wrote:
>>> Hi Michael,
>>>
>>> The question that arises of course out of the "semantically reasonable"
>>> encoding of such difficult pieces of text is why you need it. Are you
>>> planning to draw some logic across different types of such linguistic
>>> representations? In that case, JATS alone will hardly offer you a
>>> solution. JATS often resorts to other known standards for the
>>> representation of "tough" textual pieces, such as mathematical
>>> equations (MathML) and tables (XHTML, OASIS). If there was a
>>> corresponding XML encoding standard for linguistic representations,
>>> one could make the case for embedding it into JATS.
>>>
>>> Otherwise, you are left to choose between the encoding options
>>> suggested by Debbie, or to capture it as an image (my favorite
>>> option), or even attempt to represent it in TeX/LaTeX or MathML.
>>>
>>> Best regards,
>>> Nikos Markantonatos
>>> Atypon
>>>
>>>
>>> On 11/19/2013 11:47 PM, Debbie Lapeyre wrote:
>>>>
>>>> Dear Michael--
>>>>
>>>> Ouch! No you are not overlooking anything obvious. The problem
>>>> is that, although you ask for "semantically reasonable", you
>>>> really want presentation markup. JATS does not do presentation,
>>>> by design or very well.
>>>>
>>>>   - My first thought is a table, which this certainly looks like
>>>>     to me. But I do see your problem.
>>>>
>>>>   - If it has to present EXACTLY this way, another obvious
>>>>     (but less than perfect) choice is <preformat>. That would
>>>>      - force this into a monofont (sorry about that)
>>>>      - preserve all your alignments and whitespace
>>>>      - let you include the italics, bold, and stuff.
>>>>
>>>>   - Another possibility (not in NLM 3.0, but in the brand new
>>>>     JATS 1.1d1) is using <ruby>, which has a base (<rb>) and a
>>>>     ruby text annotation (<rt>) traditionally displayed atop the
>>>>     base, or inside parentheses after the base for browsers
>>>>     that cannot handle Ruby. Ruby is part of HTML5, as well as
>>>>     part of JATS. Ruby markup is intended for textual annotation,
>>>>     and might fit this case very well.
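>>>>
>>>> A rough sketch of the <ruby> option, for illustration only. The
>>>> Spanish phrase and its glosses are invented here and are not taken
>>>> from the page image; each word carries a single gloss:
>>>>
>>>>    <p>
>>>>      <ruby><rb>los</rb><rt>the.PL</rt></ruby>
>>>>      <ruby><rb>gatos</rb><rt>cat.PL</rt></ruby>
>>>>      <ruby><rb>duermen</rb><rt>sleep.3PL</rt></ruby>
>>>>    </p>
>>>>
>>>> Each gloss stays attached to its word, so nothing depends on
>>>> whitespace alignment. A second or third gloss layer is not covered
>>>> by this sketch.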
>>>>
>>>> But I've got to tell you, I found this example incredibly hard to
>>>> human parse and be sure what went with what and why were these 2
>>>> clusters parallel and that one all alone? When the top line and the
>>>> bottom line both had values, I was fine, but sometimes... Whatever
>>>> you decide, a few horizontal lines or just more white space between
>>>> the lines and/or less between the line and its gloss, would help
>>>> me to separate.
>>>>
>>>> --Debbie
>>>>
>>>>
>>>> On Nov 19, 2013, at 4:17 PM, Michael Boudreau
>>>> <mboudreau@xxxxxxxxxxxxxxxxxx> wrote:
>>>>
>>>>> Greetings,
>>>>>
>>>>> Has anyone tackled the problem of marking up textual illustrations
>>>>> that require multiple points of vertical alignment--the sort of thing
>>>>> for which you'd set tab stops on a typewriter or word processor?
>>>>>
>>>>> I'm working on a linguistics journal that has lots of glossed text
>>>>> illustrations that are typeset like the items labeled (3) and (4) on
>>>>> this page image:
>>>>>
>>>>>    http://mss.uchicago.edu:81/mrb/linguistics.png
>>>>>
>>>>> We're using the NLM Journal Publishing 3.0 DTD, and I'm at a loss for
>>>>> a markup solution that seems semantically reasonable and illustrates
>>>>> the relationships between the chunks of text that the typesetting
>>>>> makes obvious. I've considered table markup, but I don't want to break
>>>>> a single sentence or other unit of meaning into multiple table cells
>>>>> across a row. When I consider how our online host would convert XML
>>>>> into HTML, I see only the same bad option.
>>>>>
>>>>> Am I overlooking something obvious?
>>>>>
>>>>
>>>>
>>>> ================================================================
>>>> Deborah A Lapeyre              mailto:dalapeyre@xxxxxxxxxxxxxxxx
>>>> Mulberry Technologies, Inc.      http://www.mulberrytech.com
>>>> 17 West Jefferson Street         Phone: 301-315-9631 (USA)
>>>> Suite 207                        Fax:   301-315-8385
>>>> Rockville, MD 20850
>>>> ----------------------------------------------------------------
>>>> Mulberry Technologies: Consultancy for XML, XSLT, and Schematron
>>>> ================================================================
