Re: [jats-list] html fragments and JATS

Subject: Re: [jats-list] html fragments and JATS
From: "Gareth Oakes goakes@xxxxxxx" <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 9 Mar 2016 20:41:13 -0000

Hi Chris,

Interesting thought about the use of data URLs. Is this something in active
use in the JATS community? (I've not come across it outside of HTML yet.)

I guess there are two comments on that approach: (1) certain characters will
need escaping in attribute values; (2) I've seen XML processors in the past
that have fixed limits on the length of attribute values. Obviously such XML
processors should be fixed, but in production scenarios it is sometimes
difficult to effect such changes. Just something to be aware of.
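
To make point (1) concrete, here is a minimal Python sketch of building such a data URL and letting an XML library do the attribute escaping. The helper names (html_to_data_url, make_inline_graphic) are hypothetical, and whether any particular JATS renderer accepts a data URL in xlink:href is an assumption, not something the Tag Library is known to promise.

```python
from urllib.parse import quote, unquote
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("xlink", XLINK_NS)


def html_to_data_url(html: str) -> str:
    """Percent-encode an HTML fragment into a data: URL (hypothetical helper)."""
    # safe="" percent-encodes everything except letters, digits and "_.-~",
    # so no "<", '"' or "&" is left to clash with XML attribute syntax.
    return "data:text/html;charset=utf-8," + quote(html, safe="")


def make_inline_graphic(html: str) -> str:
    """Serialize an <inline-graphic> whose xlink:href carries the data URL."""
    elem = ET.Element("inline-graphic")
    # ElementTree escapes attribute values itself when serializing.
    elem.set(f"{{{XLINK_NS}}}href", html_to_data_url(html))
    return ET.tostring(elem, encoding="unicode")


fragment = '<span class="ABC">a & b</span>'
url = html_to_data_url(fragment)
xml = make_inline_graphic(fragment)

# Round trip: decoding the payload gives back the original fragment.
decoded = unquote(url.split(",", 1)[1])

print(xml)
print(len(url), "characters in the attribute value")
```

Printing the length is a reminder of point (2): percent-encoding makes the value longer than the fragment itself, which matters for processors with attribute-length limits.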

// Gareth Oakes
// Chief Architect, GPSL
// www.gpsl.co

On 10/03/2016, 01:35, "Maloney, Christopher (NIH/NLM/NCBI) [C]
maloneyc@xxxxxxxxxxxxxxxx" <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

>There is another alternative: you could use data URLs. They are pretty
>common on the web nowadays, often for CSS background images and that sort
>of thing. I don't see any reason why they couldn't be used in JATS -- they
>are basically a way of embedding an external resource into a document.
>Something like this:
>
>        <inline-formula>
>          <alternatives>
>            <inline-graphic xlink:href="data:text/html;charset=utf-8,&lt;h1>The Sun!&lt;/h1>"></inline-graphic>
>          </alternatives>
>        </inline-formula>
>
>Renderers would have to know what to do with this, though, and it would
>depend on the output format. Here's a jsfiddle showing data URLs being
>used in HTML, to include HTML and SVG:
>https://jsfiddle.net/klortho/tmk3rzse/
>
>The question of CDATA vs entity references is really a question about the
>lexical layer of XML, and your XML tools and libraries should take care of
>that, *hopefully*. In my opinion, "CDATA" is a broken concept, and should
>be avoided. The problem is that people tend to use it to produce XML
>documents with tools that don't understand XML, and just write unescaped
>markup into it, assuming it will parse. But problems ensue if the
>unescaped markup itself contains CDATA, like this:
>
><textual-form><![CDATA[
>  Here's some unescaped markup: <![CDATA[Happy gardens forever!]]>
>]]></textual-form>
>
>It happens!
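
The nested-CDATA failure described above can be reproduced with a short Python sketch. It assumes the standard-library ElementTree parser (backed by expat); the variable names are illustrative only. The first half pastes text into a CDATA section by string concatenation, the second half hands the same text to the library and lets it escape.

```python
import xml.etree.ElementTree as ET

# Text that happens to contain its own CDATA section.
inner = "Here's some unescaped markup: <![CDATA[Happy gardens forever!]]>"

# Naive approach: wrap the text in CDATA by string concatenation.
# The first "]]>" closes the outer section early, leaving a stray "]]>"
# in element content, which is not well-formed XML.
naive = "<textual-form><![CDATA[" + inner + "]]></textual-form>"


def parses(xml_text: str) -> bool:
    """Return True if the string is well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


naive_ok = parses(naive)

# Library approach: set the text on an element and let the serializer
# escape "<", ">" and "&" as entity references.
elem = ET.Element("textual-form")
elem.text = inner
safe = ET.tostring(elem, encoding="unicode")
round_tripped = ET.fromstring(safe).text

print("naive CDATA wrapper parses:", naive_ok)
print(safe)
print("round trip preserved the text:", round_tripped == inner)
```

The escaped form is less pleasant to read in the raw file, but it survives any content, including content that already contains CDATA markers.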
>
>--
>Chris Maloney
>NIH/NLM/NCBI (Contractor)
>Building 45, 4AN36D-12
>301-594-2842
>
>"Alexander Schwarzman aschwarzman@xxxxxxxxx" wrote:
>
>>An HTML fragment could be tagged with either <textual-form> or <code>
>>-- and thus it would be nice if the Tag Library provided guidance on
>>the use of <textual-form> vs. <code>, especially within
>><alternatives>. Also, whether it is <textual-form> or <code>, in order
>>to represent angle brackets one could use escaped characters &lt;
>>and &gt; or the CDATA section instead, as Gareth has suggested. The
>><code> examples in the Tag Library use the escaped characters, but it
>>is unclear if the use of CDATA is deprecated or not.
>>
>>Alexander ('Sasha') Schwarzman, Content Technology Architect
>>phone: +1.202.416.1979 | e-mail: aschwarzman@xxxxxxx
>>
>>The Optical Society (OSA)
>>2010 Massachusetts Ave., NW
>>Washington, DC 20036 USA
>>www.osa.org
>>
>>On Wed, Mar 9, 2016 at 4:28 AM, Peter Krautzberger
>>peter.krautzberger@xxxxxxxxxxx
>><jats-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>> Hi Gareth,
>>>
>>> Thanks for the quick reply!
>>>
>>> Option 1) sounds good -- I didn't think of (ab)using it this way.
>>>
>>> Option 2) is good to know. I don't think it's necessary for me as I'll
>>> always have MathML (which the HTML is created from).
>>>
>>> Best regards,
>>> Peter.
>>>
>>> On Wed, Mar 9, 2016 at 10:23 AM, Gareth Oakes goakes@xxxxxxx
>>> <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>>>
>>>> Hi Peter,
>>>>
>>>> The JATS doctype doesn't include XHTML so definitely no way to store HTML
>>>> fragments as-is. You do have a number of options but it depends on the
>>>> various users of your data as to what makes sense. I see most options as
>>>> falling into one of two categories.
>>>>
>>>> 1. Most simply you wrap everything up as CDATA:
>>>> <disp-formula><alternatives><textual-form><![CDATA[<span
>>>> class="ABC">text</span>]]></textual-form>...</disp-formula>
>>>>
>>>> 2. Otherwise you translate the HTML to something JATS-y (carefully
>>>> capturing all attributes):
>>>> <disp-formula><alternatives><textual-form><styled-content
>>>> style-type="ABC">text</styled-content></textual-form>...</disp-formula>
>>>>
>>>> First option is quick and easy. Second option lets you do more with the
>>>> content when it is in JATS format.
>>>>
>>>> I hope the thought process, at least, helps.
>>>>
>>>> // Gareth Oakes
>>>> // Chief Architect, GPSL
>>>> // www.gpsl.co
>>>>
>>>> From: "Peter Krautzberger peter.krautzberger@xxxxxxxxxxx"
>>>> <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
>>>> Reply-To: "jats-list@xxxxxxxxxxxxxxxxxxxxxx"
>>>> <jats-list@xxxxxxxxxxxxxxxxxxxxxx>
>>>> Date: Wednesday, 9 March 2016 at 19:00
>>>> To: "jats-list@xxxxxxxxxxxxxxxxxxxxxx"
>>>> <jats-list@xxxxxxxxxxxxxxxxxxxxxx>
>>>> Subject: [jats-list] html fragments and JATS
>>>>
>>>> Dear list members,
>>>>
>>>> I feel I have to apologize in advance. This is my first posting and it was
>>>> difficult to search the archives for such a generic-sounding question. I'm
>>>> sorry if I missed any earlier discussions on the topic.
>>>>
>>>> I'm wondering if there is any way to include (x)HTML-fragments in a JATS
>>>> document.
>>>>
>>>> More precisely, I'm looking to include such fragments as (an alternative
>>>> within) inline/display-formulas.
>>>>
>>>> The HTML fragments are just a number of nested <span> elements with
>>>> typical HTML attributes (class, style, role, aria-label etc).
>>>>
>>>> I'm relatively certain that this is not possible (in a valid way) but I
>>>> wanted to make sure I didn't miss anything.
>>>>
>>>> Thanks in advance for any pointers!
>>>>
>>>> Best regards,
>>>> Peter Krautzberger.
>>>> JATS-List info and archive
>>>> EasyUnsubscribe (by email)