Subject: Re: [jats-list] Markup for linguistics (glossed text)
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxx>
Date: Fri, 22 Nov 2013 15:44:03 -0500

Hi again,

Sorry, I take it back: since the line breaks in the samples appear to be
arbitrary, 'ruby' might be a better choice than tables after all (although
this is also a "creative" use of Ruby, which has generally been used for
phonological transcription, AFAIK).

Still not as fun as your own markup.

Cheers, Wendell

Wendell Piez | http://www.wendellpiez.com
XML | XSLT | electronic publishing
Eat Your Vegetables
_____oo_________o_o___ooooo____ooooooo_^

On Fri, Nov 22, 2013 at 3:20 PM, Wendell Piez <wapiez@xxxxxxxxxxxxxxx> wrote:
> Hi again,
>
> Also, I'd prefer plain-old tables (however ornate) to 'ruby' following
> the "Principle of Least Surprise".
>
> Cheers, Wendell
>
> Wendell Piez | http://www.wendellpiez.com
> XML | XSLT | electronic publishing
> Eat Your Vegetables
> _____oo_________o_o___ooooo____ooooooo_^
>
>
> On Fri, Nov 22, 2013 at 2:56 PM, Wendell Piez <wapiez@xxxxxxxxxxxxxxx> wrote:
>> Hi,
>>
>> My nominations for alternatives:
>>
>> (1) If there are a lot of these, and real benefit to be gained, then
>> design and use a little markup language for them. Then, format as you
>> like, probably via tables.
>>
>> Disadvantage: time and expertise required. Dependence on specialists'
>> knowhow. (But that could be an advantage.)
>>
>> (2) Custom-designed tables, validated via Schematron. JATS provides
>> @content-type.
>> Just as much work, and you'd be doing all the same work as (1), but
>> they could be made to validate as JATS without extending it.
>>
>> Advantage: relatively quick and dirty to get something started.
>> Disadvantage: the XML would be relatively hard to maintain compared to
>> (1). Also, this is schema design without a schema, so relatively
>> fragile and not scalable to complexity.
>>
>> (Such a table could also be used to represent (1) in JATS when
>> interfacing with JATS-based systems.)
>>
>> (3) SVG. Similar disadvantages, many advantages of its own. They could
>> be very pretty.
:-)
>>
>> It sounds like graphics made from SVGs might be the preferred choice
>> of your vendor (and I don't blame them). But as Debbie points out,
>> they're not searchable. (If the SVGs were available they'd be sort of
>> searchable.)
>>
>> What my choice would be would depend on my goals, long-term and
>> short-term resources, and the frequency with which it occurs or number
>> of them. Having a finite number of these things (i.e. I'd never expect
>> to see more of these than I already have) or having them very
>> infrequently would argue for (2) or (3). The more of these there are
>> and the more interesting/important the semantics they could expose,
>> the more I'd do (1).
>>
>> Designing and specifying a well-controlled, clean descriptive format
>> (1) would also be really fun. (2) and (3) are also natural spin-offs
>> for (1), not exclusive of it -- although you could also skip to them
>> directly (and specialists in CSS and SVG might prefer to do so).
>>
>> Cheers, Wendell
>>
>> Wendell Piez | http://www.wendellpiez.com
>> XML | XSLT | electronic publishing
>> Eat Your Vegetables
>> _____oo_________o_o___ooooo____ooooooo_^
>>
>>
>> On Thu, Nov 21, 2013 at 5:01 PM, Michael Boudreau
>> <mboudreau@xxxxxxxxxxxxxxxxxx> wrote:
>>> For what it's worth, our hosting platform informs me that the only way to
>>> get these images to display at a consistent size is to submit the
>>> <graphic> element as a child of <disp-formula>. They were not sympathetic
>>> to my pointing out that these are not math.
>>>
>>> --
>>> Michael R. Boudreau
>>> Electronic Publishing Technology Manager
>>> The University of Chicago Press
>>> 1427 E. 60th Street
>>> Chicago, IL 60637
>>> (773) 753-3298
>>> www.journals.uchicago.edu
>>>
>>>
>>> On 11/20/13, 10:56 AM, "Michael Boudreau" <mboudreau@xxxxxxxxxxxxxxxxxx>
>>> wrote:
>>>
>>>>Thanks, everyone, for these comments.
I should have mentioned that we're
>>>>currently using graphics, like so (highly simplified):
>>>>
>>>>  <p>Some text precedes an example:</p>
>>>>  <p><graphic href="example1.tiff"/></p>
>>>>  <p>And the text continues.</p>
>>>>
>>>>This can be converted by our host to a readable HTML presentation. The
>>>>down-side is that the content of the graphic is not searchable by the
>>>>user's browser (though the site's search engine can build its index from
>>>>the PDF version, so all is not lost), and the graphic's visual quality is
>>>>relatively low, particularly on mobile devices.
>>>>
>>>>To answer Nikos's question, I don't have a current project that requires a
>>>>particular type of markup for such examples, but the examples in their
>>>>context just don't strike me as "tabular"--but I'm not a linguist and
>>>>would defer to the journal editors if they deemed table markup
>>>>appropriate. I think <ruby> is closer to the mark; I'd have to do
>>>>extensive testing to see if it could handle examples with multiple layers
>>>>of glossing on the base text (sometimes there are 2 or 3 or more). (I
>>>>tremble to think what it would take to train our typesetting vendors to
>>>>apply either <table> or <ruby> markup to these examples.)
>>>>
>>>>I hadn't thought of <array>, which actually might help solve a processing
>>>>problem on our vendor's side even while still using <graphic>.
>>>>
>>>>
>>>>--
>>>>Michael R. Boudreau
>>>>Electronic Publishing Technology Manager
>>>>The University of Chicago Press
>>>>1427 E. 60th Street
>>>>Chicago, IL 60637
>>>>(773) 753-3298
>>>>www.journals.uchicago.edu
>>>>
>>>>
>>>>On 11/20/13, 9:14 AM, "Alexander Schwarzman" <aschwarzman@xxxxxxxxx>
>>>>wrote:
>>>>
>>>>>Or, perhaps, use <array>, with either <graphic>, as Nikos suggested,
>>>>>or with <tbody> inside...
>>>>>
>>>>>--Sasha
>>>>>
>>>>>Alexander ('Sasha') Schwarzman, Content Technology Architect
>>>>>phone: +1.202.416.1979 | e-mail: aschwarzman@xxxxxxx
>>>>>
>>>>>The Optical Society (OSA)
>>>>>2010 Massachusetts Ave., NW
>>>>>Washington, DC 20036 USA
>>>>>www.osa.org
>>>>>
>>>>>
>>>>>On Wed, Nov 20, 2013 at 5:01 AM, Nikos Markantonatos <nikos@xxxxxxxxxx>
>>>>>wrote:
>>>>>> Hi Michael,
>>>>>>
>>>>>> The question that arises of course out of the "semantically reasonable"
>>>>>> encoding of such difficult pieces of text is why you need it. Are you
>>>>>> planning to draw some logic across different types of such linguistic
>>>>>> representations? In that case, JATS alone will hardly offer you a
>>>>>> solution. JATS often resorts to other known standards for the
>>>>>> representation of "tough" textual pieces, such as mathematical
>>>>>> equations (MathML) and tables (XHTML, OASIS). If there was a
>>>>>> corresponding XML encoding standard for linguistic representations,
>>>>>> one could make the case for embedding it into JATS.
>>>>>>
>>>>>> Otherwise, you are left to choose between the encoding options
>>>>>> suggested by Debbie, or to capture it as an image (my favorite
>>>>>> option), or even attempt to represent it in TeX/LaTeX or MathML.
>>>>>>
>>>>>> Best regards,
>>>>>> Nikos Markantonatos
>>>>>> Atypon
>>>>>>
>>>>>>
>>>>>> On 11/19/2013 11:47 PM, Debbie Lapeyre wrote:
>>>>>>>
>>>>>>> Dear Michael--
>>>>>>>
>>>>>>> Ouch! No you are not overlooking anything obvious. The problem
>>>>>>> is that, although you ask for "semantically reasonable", you
>>>>>>> really want presentation markup. JATS does not do presentation,
>>>>>>> by design or very well.
>>>>>>>
>>>>>>> - My first thought is a table, which this certainly looks like
>>>>>>> to me. But I do see your problem.
>>>>>>>
>>>>>>> - If it has to present EXACTLY this way, another obvious
>>>>>>> (but less than perfect) choice is <preformat>.
That would
>>>>>>>     - force this into a monofont (sorry about that)
>>>>>>>     - preserve all your alignments and whitespace
>>>>>>>     - let you include the italics, bold, and stuff.
>>>>>>>
>>>>>>> - Another possibility (not in NLM 3.0, but in the brand new
>>>>>>> JATS 1.1d1) is using <ruby>, which has a base (<rb>) and a
>>>>>>> ruby text annotation (<rt>) traditionally displayed atop the
>>>>>>> base (<rb>), or inside parentheses after the base for browsers
>>>>>>> that cannot handle Ruby. Ruby is part of HTML5, as well as
>>>>>>> part of JATS. Ruby markup is intended for textual annotation,
>>>>>>> and might fit this case very well.
>>>>>>>
>>>>>>> But I've got to tell you, I found this example incredibly hard to
>>>>>>> human parse and be sure what went with what and why were these 2
>>>>>>> clusters parallel and that one all alone? When the top line and the
>>>>>>> bottom line both had values, I was fine, but sometimes... Whatever
>>>>>>> you decide, a few horizontal lines or just more white space between
>>>>>>> the lines and/or less between the line and its gloss, would help
>>>>>>> me to separate.
>>>>>>>
>>>>>>> --Debbie
>>>>>>>
>>>>>>>
>>>>>>> On Nov 19, 2013, at 4:17 PM, Michael Boudreau
>>>>>>> <mboudreau@xxxxxxxxxxxxxxxxxx> wrote:
>>>>>>>
>>>>>>>> Greetings,
>>>>>>>>
>>>>>>>> Has anyone tackled the problem of marking up textual illustrations
>>>>>>>> that require multiple points of vertical alignment--the sort of
>>>>>>>> thing for which you'd set tab stops on a typewriter or word
>>>>>>>> processor?
>>>>>>>>
>>>>>>>> I'm working on a linguistics journal that has lots of glossed text
>>>>>>>> illustrations that are typeset like the items labeled (3) and (4) on
>>>>>>>> this page image:
>>>>>>>>
>>>>>>>> http://mss.uchicago.edu:81/mrb/linguistics.png
>>>>>>>>
>>>>>>>> We're using the NLM Journal Publishing 3.0 DTD, and I'm at a loss for
>>>>>>>> a markup solution that seems semantically reasonable and illustrates
>>>>>>>> the relationships between the chunks of text that the typesetting
>>>>>>>> makes obvious. I've considered table markup, but I don't want to
>>>>>>>> break a single sentence or other unit of meaning into multiple table
>>>>>>>> cells across a row. When I consider how our online host would convert
>>>>>>>> XML into HTML, I see only the same bad option.
>>>>>>>>
>>>>>>>> Am I overlooking something obvious?
>>>>>>>>
>>>>>>>> --
>>>>>>>> Michael R. Boudreau
>>>>>>>> Electronic Publishing Technology Manager
>>>>>>>> The University of Chicago Press
>>>>>>>> 1427 E. 60th Street
>>>>>>>> Chicago, IL 60637
>>>>>>>> (773) 753-3298
>>>>>>>> www.journals.uchicago.edu
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ================================================================
>>>>>>> Deborah A Lapeyre            mailto:dalapeyre@xxxxxxxxxxxxxxxx
>>>>>>> Mulberry Technologies, Inc.  http://www.mulberrytech.com
>>>>>>> 17 West Jefferson Street     Phone: 301-315-9631 (USA)
>>>>>>> Suite 207                    Fax: 301-315-8385
>>>>>>> Rockville, MD 20850
>>>>>>> ----------------------------------------------------------------
>>>>>>> Mulberry Technologies: Consultancy for XML, XSLT, and Schematron
>>>>>>> ================================================================
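
For anyone comparing the <ruby> and table options discussed in this thread, here is a minimal Python sketch of what generating each kind of markup from aligned gloss tiers might look like. It is only an illustration under stated assumptions, not something any poster proposed in this form. It assumes every tier has one entry per base word, and it assumes one <rb> plus one <rt> per <ruby>, which should be checked against the JATS tag library before use. The function names, the content-type value "gloss", and the Swahili sample words are all made up for this sketch; they are not taken from the page image linked in the thread.

```python
import xml.etree.ElementTree as ET


def gloss_to_ruby(words, glosses):
    """Pair each base word with a single gloss as <ruby><rb/><rt/></ruby> inside <p>.

    Handles only one gloss layer, which is the limitation raised in the thread.
    """
    if len(words) != len(glosses):
        raise ValueError("each base word needs exactly one gloss")
    p = ET.Element("p")
    last = len(words) - 1
    for i, (word, gloss) in enumerate(zip(words, glosses)):
        ruby = ET.SubElement(p, "ruby")
        ET.SubElement(ruby, "rb").text = word
        ET.SubElement(ruby, "rt").text = gloss
        if i < last:
            ruby.tail = " "  # keep a word space between ruby groups
    return ET.tostring(p, encoding="unicode")


def gloss_to_table(tiers, content_type="gloss"):
    """Write any number of aligned tiers as table rows, one column per word.

    tiers[0] is the base text; later tiers are gloss layers.
    The content_type value is a hypothetical hook for Schematron rules.
    """
    if not tiers:
        raise ValueError("at least one tier is required")
    width = len(tiers[0])
    if any(len(tier) != width for tier in tiers):
        raise ValueError("all tiers must have the same number of entries")
    table = ET.Element("table", {"content-type": content_type})
    tbody = ET.SubElement(table, "tbody")
    for tier in tiers:
        tr = ET.SubElement(tbody, "tr")
        for cell in tier:
            ET.SubElement(tr, "td").text = cell
    return ET.tostring(table, encoding="unicode")


# Invented sample: Swahili "ninasoma kitabu" ("I am reading a book").
words = ["ni-na-soma", "kitabu"]
glosses = ["1SG-PRS-read", "book"]

print(gloss_to_ruby(words, glosses))
print(gloss_to_table([words, glosses]))
```

The ruby version keeps each word and its gloss together as one inline unit, so the sentence is not split across table cells, but it carries only one annotation layer per word. The table version accepts two, three, or more tiers at the cost of breaking the sentence into cells, which is the trade-off the thread keeps returning to.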