Subject: Re: [jats-list] Markup for linguistics (glossed text) From: "Imsieke, Gerrit, le-tex" <gerrit.imsieke@xxxxxxxxx> Date: Sat, 23 Nov 2013 00:48:37 +0100
Hi again,
Sorry, I take it back: since the line breaks in the samples appear to be arbitrary, 'ruby' might be a better choice than tables after all (although this is also a "creative" use of Ruby, which has generally been for phonological transcription AFAIK). Still not as fun as your own markup.
Cheers, Wendell
Wendell Piez | http://www.wendellpiez.com XML | XSLT | electronic publishing Eat Your Vegetables _____oo_________o_o___ooooo____ooooooo_^
On Fri, Nov 22, 2013 at 3:20 PM, Wendell Piez <wapiez@xxxxxxxxxxxxxxx> wrote:
Hi again,
Also, I'd prefer plain-old tables (however ornate) to 'ruby' following the "Principle of Least Surprise".
Cheers, Wendell
Wendell Piez | http://www.wendellpiez.com XML | XSLT | electronic publishing Eat Your Vegetables _____oo_________o_o___ooooo____ooooooo_^
On Fri, Nov 22, 2013 at 2:56 PM, Wendell Piez <wapiez@xxxxxxxxxxxxxxx> wrote:
Hi,
My nominations for alternatives:
(1) If there are a lot of these, and real benefit to be gained, then design and use a little markup language for them. Then, format as you like, probably via tables.
Disadvantage: time and expertise required. Dependence on specialists' knowhow. (But that could be an advantage.)
(2) Custom-designed tables, validated via Schematron. JATS provides @content-type for labeling them. This is just as much work, since you'd be doing all the same work as in (1), but the tables could be made to validate as JATS without extending it.
Advantage: relatively quick and dirty to get something started. Disadvantage: the XML would be relatively hard to maintain compared to (1). Also, this is schema design without a schema, so relatively fragile and not scalable to complexity.
(Such a table could also be used to represent (1) in JATS when interfacing with JATS-based systems.)
(3) SVG. Similar disadvantages, many advantages of its own. They could be very pretty. :-)
It sounds like graphics made from SVGs might be the preferred choice of your vendor (and I don't blame them). But as Debbie points out, they're not searchable. (If the SVGs were available they'd be sort of searchable.)
What my choice would be would depend on my goals, long-term and short-term resources, and the frequency with which it occurs or number of them. Having a finite number of these things (i.e. I'd never expect to see more of these than I already have) or having them very infrequently would argue for (2) or (3). The more of these there are and the more interesting/important the semantics they could expose, the more I'd do (1).
Designing and specifying a well-controlled, clean descriptive format (1) would also be really fun. (2) and (3) are also natural spin-offs for (1), not exclusive of it -- although you could also skip to them directly (and specialists in CSS and SVG might prefer to do so).
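For illustration only, option (2) might look something like the sketch below. The content-type value "gloss" is a hypothetical convention, the German sentence is an invented example rather than one from the journal, and the Schematron rule is just one plausible check (it ignores column spans).

```xml
<!-- JATS side: an ordinary table, labeled with a hypothetical
     content-type so that Schematron and stylesheets can find it. -->
<table-wrap content-type="gloss">
  <table>
    <tbody>
      <!-- Row 1: source text, one word per cell -->
      <tr><td>Ich</td><td>sehe</td><td>den</td><td>Hund</td></tr>
      <!-- Row 2: word-by-word gloss, aligned by column -->
      <tr><td>I</td><td>see.1SG</td><td>the.ACC</td><td>dog</td></tr>
    </tbody>
  </table>
</table-wrap>
```

```xml
<!-- Schematron side: every row of a gloss table must have as many
     cells as the first row, so words and glosses stay aligned. -->
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:pattern>
    <sch:rule context="table-wrap[@content-type='gloss']//tr">
      <sch:assert test="count(td) = count(../tr[1]/td)">
        Each row of a gloss table must have the same number of cells as the first row.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

The point of the Schematron layer is that the JATS schema itself knows nothing about glosses; all the real constraints live in rules like this one, which is why the approach is fragile as complexity grows.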
Cheers, Wendell
Wendell Piez | http://www.wendellpiez.com XML | XSLT | electronic publishing Eat Your Vegetables _____oo_________o_o___ooooo____ooooooo_^
On Thu, Nov 21, 2013 at 5:01 PM, Michael Boudreau <mboudreau@xxxxxxxxxxxxxxxxxx> wrote:
For what it's worth, our hosting platform informs me that the only way to get these images to display at a consistent size is to submit the <graphic> element as a child of <disp-formula>. They were not sympathetic to my pointing out that these are not math.
-- Michael R. Boudreau Electronic Publishing Technology Manager The University of Chicago Press 1427 E. 60th Street Chicago, IL 60637 (773) 753-3298 www.journals.uchicago.edu
On 11/20/13, 10:56 AM, "Michael Boudreau" <mboudreau@xxxxxxxxxxxxxxxxxx> wrote:
Thanks, everyone, for these comments. I should have mentioned that we're currently using graphics, like so (highly simplified):
<p>Some text precedes an example:</p>
<p><graphic href="example1.tiff"/></p>
<p>And the text continues.</p>
This can be converted by our host to a readable HTML presentation. The down-side is that the content of the graphic is not searchable by the user's browser (though the site's search engine can build its index from the PDF version, so all is not lost), and the graphic's visual quality is relatively low, particularly on mobile devices.
To answer Nikos's question, I don't have a current project that requires a particular type of markup for such examples, but the examples in their context just don't strike me as "tabular"--but I'm not a linguist and would defer to the journal editors if they deemed table markup appropriate. I think <ruby> is closer to the mark; I'd have to do extensive testing to see if it could handle examples with multiple layers of glossing on the base text (sometimes there are 2 or 3 or more). (I tremble to think what it would take to train our typesetting vendors to apply either <table> or <ruby> markup to these examples.)
I hadn't thought of <array>, which actually might help solve a processing problem on our vendor's side even while still using <graphic>.
-- Michael R. Boudreau Electronic Publishing Technology Manager The University of Chicago Press 1427 E. 60th Street Chicago, IL 60637 (773) 753-3298 www.journals.uchicago.edu
On 11/20/13, 9:14 AM, "Alexander Schwarzman" <aschwarzman@xxxxxxxxx> wrote:
Or, perhaps, use <array>, with either <graphic>, as Nikos suggested, or with <tbody> inside...
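A rough sketch of both <array> variants follows, under some assumptions: the content-type value "gloss" is hypothetical, the file name is reused from Michael's simplified example, and the German words are invented for illustration.

```xml
<!-- Variant A: <array> wrapping the existing image.
     The xlink namespace is declared locally so the fragment stands alone. -->
<array content-type="gloss">
  <graphic xmlns:xlink="http://www.w3.org/1999/xlink"
           xlink:href="example1.tiff"/>
</array>

<!-- Variant B: <array> wrapping table rows directly,
     without the <table-wrap>/<table> apparatus. -->
<array content-type="gloss">
  <tbody>
    <tr><td>den</td><td>Hund</td></tr>
    <tr><td>the.ACC</td><td>dog</td></tr>
  </tbody>
</array>
```

Variant A keeps the content as an image (so it is still not searchable) but gives the processing chain a distinct container to key on; variant B makes the text itself available.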
--Sasha
Alexander ('Sasha') Schwarzman, Content Technology Architect phone: +1.202.416.1979 | e-mail: aschwarzman@xxxxxxx
The Optical Society (OSA) 2010 Massachusetts Ave., NW Washington, DC 20036 USA www.osa.org
On Wed, Nov 20, 2013 at 5:01 AM, Nikos Markantonatos <nikos@xxxxxxxxxx> wrote:
Hi Michael,
The question that arises of course out of the "semantically reasonable" encoding of such difficult pieces of text is why you need it. Are you planning to draw some logic across different types of such linguistic representations? In that case, JATS alone will hardly offer you a solution. JATS often resorts to other known standards for the representation of "tough" textual pieces, such as mathematical equations (MathML) and tables (XHTML, OASIS). If there was a corresponding XML encoding standard for linguistic representations, one could make the case for embedding it into JATS.
Otherwise, you are left to choose between the encoding options suggested by Debbie, or to capture it as an image (my favorite option), or even attempt to represent it in TeX/LaTeX or MathML.
Best regards, Nikos Markantonatos Atypon
On 11/19/2013 11:47 PM, Debbie Lapeyre wrote:
Dear Michael--
Ouch! No you are not overlooking anything obvious. The problem is that, although you ask for "semantically reasonable", you really want presentation markup. JATS does not do presentation, by design or very well.
- My first thought is a table, which this certainly looks like to me. But I do see your problem.
- If it has to present EXACTLY this way, another obvious (but less than perfect) choice is <preformat>. That would
  - force this into a monofont (sorry about that)
  - preserve all your alignments and whitespace
  - let you include the italics, bold, and stuff.
- Another possibility (not in NLM 3.0, but in the brand new JATS 1.1d1) is using <ruby>, which has a base (<rb>) and a ruby text annotation (<rt>) traditionally displayed atop the base (<rb>), or inside parentheses after the base for browsers that cannot handle Ruby. Ruby is part of HTML5, as well as part of JATS. Ruby markup is intended for textual annotation, and might fit this case very well.
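As a minimal sketch of the <ruby> idea, assuming one <ruby> per word with the source word as the base and its gloss as the annotation (the German sentence is an invented illustration, not taken from the journal):

```xml
<!-- One <ruby> per word: <rb> holds the source word, <rt> its gloss.
     Because each pair is inline, the line can break anywhere between
     words without losing the word-to-gloss alignment. -->
<p>
  <ruby><rb>Ich</rb><rt>I</rt></ruby>
  <ruby><rb>sehe</rb><rt>see.1SG</rt></ruby>
  <ruby><rb>den</rb><rt>the.ACC</rt></ruby>
  <ruby><rb>Hund</rb><rt>dog</rt></ruby>
</p>
```

This shows a single gloss layer only; whether examples with two or three gloss layers can be expressed this way would need to be checked against the JATS 1.1d1 content model.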
But I've got to tell you, I found this example incredibly hard to human parse and be sure what went with what and why were these 2 clusters parallel and that one all alone? When the top line and the bottom line both had values, I was fine, but sometimes... Whatever you decide, a few horizontal lines or just more white space between the lines and/or less between the line and its gloss, would help me to separate.
--Debbie
On Nov 19, 2013, at 4:17 PM, Michael Boudreau <mboudreau@xxxxxxxxxxxxxxxxxx> wrote:
Greetings,
Has anyone tackled the problem of marking up textual illustrations that require multiple points of vertical alignment--the sort of thing for which you'd set tab stops on a typewriter or word processor?
I'm working on a linguistics journal that has lots of glossed text illustrations that are typeset like the items labeled (3) and (4) on this page image:
http://mss.uchicago.edu:81/mrb/linguistics.png
We're using the NLM Journal Publishing 3.0 DTD, and I'm at a loss for a markup solution that seems semantically reasonable and illustrates the relationships between the chunks of text that the typesetting makes obvious. I've considered table markup, but I don't want to break a single sentence or other unit of meaning into multiple table cells across a row. When I consider how our online host would convert XML into HTML, I see only the same bad option.
Am I overlooking something obvious?
-- Michael R. Boudreau Electronic Publishing Technology Manager The University of Chicago Press 1427 E. 60th Street Chicago, IL 60637 (773) 753-3298 www.journals.uchicago.edu
================================================================ Deborah A Lapeyre mailto:dalapeyre@xxxxxxxxxxxxxxxx Mulberry Technologies, Inc. http://www.mulberrytech.com 17 West Jefferson Street Phone: 301-315-9631 (USA) Suite 207 Fax: 301-315-8385 Rockville, MD 20850 ---------------------------------------------------------------- Mulberry Technologies: Consultancy for XML, XSLT, and Schematron ================================================================
-- Gerrit Imsieke Geschäftsführer / Managing Director le-tex publishing services GmbH Weissenfelser Str. 84, 04229 Leipzig, Germany Phone +49 341 355356 110, Fax +49 341 355356 510 gerrit.imsieke@xxxxxxxxx, http://www.le-tex.de
Registergericht / Commercial Register: Amtsgericht Leipzig Registernummer / Registration Number: HRB 24930
Geschäftsführer: Gerrit Imsieke, Svea Jelonek, Thomas Schmidt, Dr. Reinhard Vöckler