Re: [jats-list] html fragments and JATS

Subject: Re: [jats-list] html fragments and JATS
From: "Maloney, Christopher (NIH/NLM/NCBI) [C] maloneyc@xxxxxxxxxxxxxxxx" <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 9 Mar 2016 15:35:24 -0000
There is another alternative: you could use data URLs. They are pretty
common on the web nowadays, often for CSS background images and that sort
of thing. I don't see any reason why they couldn't be used in JATS; they
are basically a way of embedding an external resource into a document.
Something like this:

        <inline-formula>
          <alternatives>
            <inline-graphic xlink:href="data:text/html;utf8,&lt;h1>The Sun!&lt;/h1>"></inline-graphic>
          </alternatives>
        </inline-formula>


Renderers would have to know what to do with this, though, and it would
depend on the output format. Here's a jsfiddle showing data URLs being
used in HTML, to include HTML and SVG:
https://jsfiddle.net/klortho/tmk3rzse/
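
As a rough illustration, here is a minimal Python sketch of building such a
data URL so that it can be dropped into an XML attribute safely. It is only
a sketch under assumptions: the helper name html_data_url is hypothetical,
the "charset=utf-8" parameter form is used for the media type, and the
payload is percent-encoded so that no raw "<", quote, or space ends up in
the attribute value.

```python
from urllib.parse import quote, unquote
from xml.sax.saxutils import quoteattr


def html_data_url(fragment: str) -> str:
    """Return a data: URL carrying an HTML fragment (hypothetical helper)."""
    # safe="" percent-encodes every reserved character, including "/" and "<".
    return "data:text/html;charset=utf-8," + quote(fragment, safe="")


fragment = "<h1>The Sun!</h1>"
url = html_data_url(fragment)

# quoteattr() adds the surrounding quotes and escapes anything XML-special.
element = "<inline-graphic xlink:href=%s/>" % quoteattr(url)

print(url)
print(element)
```

Percent-encoding keeps the attribute value free of markup characters, so the
XML layer never has to care what the payload contains; a renderer would
decode the URL before using it.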

The question of CDATA vs entity references is really a question about the
lexical layer of XML, and your XML tools and libraries should take care of
that, *hopefully*. In my opinion, "CDATA" is a broken concept, and should
be avoided. The problem is that people tend to use it to produce XML
documents with tools that don't understand XML, and just write unescaped
markup into it, assuming it will parse. But problems ensue if the
unescaped markup itself contains CDATA, like this:

<textual-form><![CDATA[
  Here's some unescaped markup: <![CDATA[Happy gardens forever!]]>
]]></textual-form>


It happens!
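
To make the failure concrete, here is a small Python sketch using the
standard library's xml.etree.ElementTree. It is an illustration only: the
variable names are made up, and <textual-form> is used as a bare element
outside any real JATS document. The first half pastes text into a CDATA
section by string concatenation, the way a non-XML-aware tool might; the
second half lets the library do the escaping.

```python
import xml.etree.ElementTree as ET

# Text that happens to contain a CDATA section of its own.
inner = "Here's some unescaped markup: <![CDATA[Happy gardens forever!]]>"

# Naive approach: wrap the text in CDATA by concatenating strings.
# The outer section ends at the first "]]>", leaving a stray "]]>" behind.
naive = "<textual-form><![CDATA[" + inner + "]]></textual-form>"
try:
    ET.fromstring(naive)
    naive_ok = True
except ET.ParseError:
    naive_ok = False

# Library approach: set the text and let the serializer escape it.
elem = ET.Element("textual-form")
elem.text = inner
safe = ET.tostring(elem, encoding="unicode")
round_trip = ET.fromstring(safe).text

print("naive document parses:", naive_ok)
print("serialized by library:", safe)
print("round trip preserved: ", round_trip == inner)
```

The serializer writes the angle brackets as entity references, so the
content survives a round trip no matter what it contains.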


--
Chris Maloney
NIH/NLM/NCBI (Contractor)
Building 45, 4AN36D-12
301-594-2842







"Alexander Schwarzman aschwarzman@xxxxxxxxx" wrote:

>An HTML fragment could be tagged with either <textual-form> or <code>
>-- and thus it would be nice if the Tag Library provided guidance on
>the use of <textual-form> vs. <code>, especially within
><alternatives>. Also, whether it is <textual-form> or <code>, in order
>to represent angular brackets one could use escaped characters &lt;
>and &gt; or the CDATA section instead, as Gareth has suggested. The
><code> examples in the Tag Library use the escaped characters, but it
>is unclear if the use of CDATA is deprecated or not.
>
>Alexander ('Sasha') Schwarzman, Content Technology Architect
>phone: +1.202.416.1979 | e-mail: aschwarzman@xxxxxxx
>
>The Optical Society (OSA)
>2010 Massachusetts Ave., NW
>Washington, DC 20036 USA
>www.osa.org
>
>On Wed, Mar 9, 2016 at 4:28 AM, Peter Krautzberger
>peter.krautzberger@xxxxxxxxxxx
><jats-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>> Hi Gareth,
>>
>> Thanks for the quick reply!
>>
>> Option 1) sounds good -- I didn't think of (ab)using it this way.
>>
>> Option 2) is good to know. I don't think it's necessary for me as I'll
>> always have MathML (which the HTML is created from).
>>
>> Best regards,
>> Peter.
>>
>> On Wed, Mar 9, 2016 at 10:23 AM, Gareth Oakes goakes@xxxxxxx
>> <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>>
>>> Hi Peter,
>>>
>>> The JATS doctype doesn't include XHTML so definitely no way to store
>>> HTML fragments as-is. You do have a number of options but it depends on
>>> the various users of your data as to what makes sense. I see most
>>> options as falling into one of two categories.
>>>
>>> 1. Most simply you wrap everything up as CDATA:
>>> <disp-formula><alternatives><textual-form><![CDATA[<span
>>> class="ABC">text</span>]]></textual-form>...</disp-formula>
>>>
>>> 2. Otherwise you translate the HTML to something JATS-y (carefully
>>> capturing all attributes):
>>> <disp-formula><alternatives><textual-form><styled-content
>>> style-type="ABC">text</styled-content></textual-form>...</disp-formula>
>>>
>>> First option is quick and easy. Second option lets you do more with the
>>> content when it is in JATS format.
>>>
>>> I hope the thought process, at least, helps.
>>>
>>> // Gareth Oakes
>>> // Chief Architect, GPSL
>>> // www.gpsl.co
>>>
>>> From: "Peter Krautzberger peter.krautzberger@xxxxxxxxxxx"
>>> <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
>>> Reply-To: "jats-list@xxxxxxxxxxxxxxxxxxxxxx"
>>> <jats-list@xxxxxxxxxxxxxxxxxxxxxx>
>>> Date: Wednesday, 9 March 2016 at 19:00
>>> To: "jats-list@xxxxxxxxxxxxxxxxxxxxxx"
>>> <jats-list@xxxxxxxxxxxxxxxxxxxxxx>
>>> Subject: [jats-list] html fragments and JATS
>>>
>>> Dear list members,
>>>
>>> I feel I have to apologize in advance. This is my first posting and it
>>> was difficult to search the archives for such a generic-sounding
>>> question. I'm sorry if I missed any earlier discussions on the topic.
>>>
>>> I'm wondering if there is any way to include (x)HTML-fragments in a
>>> JATS document.
>>>
>>> More precisely, I'm looking to include such fragments as (an
>>> alternative within) inline/display-formulas.
>>>
>>> The HTML fragments are just a number of nested <span> elements with
>>> typical HTML attributes (class, style, role, aria-label, etc.).
>>>
>>> I'm relatively certain that this is not possible (in a valid way) but I
>>> wanted to make sure I didn't miss anything.
>>>
>>> Thanks in advance for any pointers!
>>>
>>> Best regards,
>>> Peter Krautzberger.
>>> JATS-List info and archive
>>> EasyUnsubscribe (by email)
