RE: [jats-list] How/When do you produce a JATS-XML version of you publication within your publication workflow

From: "Lizzi, Vincent" <Vincent.Lizzi@xxxxxxxxxxxxxxxxxxxx>
Date: Thu, 25 Oct 2012 14:37:24 -0400
Another way is to use the <self-uri> element, which is sometimes used to identify the PDF file for an article. You could tag:

<self-uri content-type="html" xlink:href="index.html" />

With <self-uri> placed in <article-meta>. This would leave the <body> element available for possible future conversion of the article content to XML.
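
As a rough sketch of where that element would sit (not validated against any JATS DTD; the [...] placeholders stand for the rest of the metadata, and the xlink namespace declaration on <article> is an assumption):

```xml
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>[...]</journal-meta>
    <article-meta>
      [...]
      <!-- points at the HTML rendition kept in a separate file -->
      <self-uri content-type="html" xlink:href="index.html"/>
      [...]
    </article-meta>
  </front>
  <!-- no <body> yet; it can be added if the content is later converted to XML -->
  <back>[...]</back>
</article>
```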

The difference from the Atom format mentioned is that here the HTML is in a separate file, whereas Atom includes the HTML in the same document. If you can manage the mixed vocabularies, you might be able to place the HTML in CDATA, or redefine <body> to allow content in the HTML namespace. Using a separate file for the HTML version is probably much easier.
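
For comparison, the inline option might look roughly like the following. This is only a sketch: it assumes a locally customized schema in which <body> accepts elements from the XHTML namespace, so it would not validate against the standard JATS DTDs, and the paragraph text is a placeholder.

```xml
<article xmlns:html="http://www.w3.org/1999/xhtml">
  <front>[...]</front>
  <body>
    <!-- assumes <body> has been redefined to allow XHTML content -->
    <html:div>
      <html:p>Placeholder paragraph from the HTML version.</html:p>
    </html:div>
  </body>
</article>
```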

Vincent



-----Original Message-----
From: eaton.alf@xxxxxxxxx [mailto:eaton.alf@xxxxxxxxx] On Behalf Of Alf Eaton
Sent: Tuesday, October 23, 2012 10:10 AM
To: jats-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [jats-list] How/When do you produce a JATS-XML version of you publication within your publication workflow

On 21 October 2012 17:26, Kevin Hawkins
<kevin.s.hawkins@xxxxxxxxxxxxxxxxxx> wrote:
> You clearly understand the problems here, and there are many points to respond to.  Let me make an attempt at this.  I'm sure others will have things to add.
>
> You've already decided to use OJS, so the first question is whether to bother producing XML at all.

This raises an interesting question, which I was thinking about during JATS-CON. If an article is authored in HTML, it's not entirely straightforward to convert the body of the article to JATS markup. In that case, is it possible to use JATS for the metadata (front and back) of the article, but maintain the body of the article in the original source format?

The source format might be HTML (in this case), PDF (as in the J-STAGE converter <http://www.ncbi.nlm.nih.gov/books/NBK100490/> which only extracts metadata so far), an image (from OCR), or even something else.

An analogous element is Atom's "content" element, which has a "type" attribute[1] containing the MIME type of the content within the element.
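
For reference, a minimal sketch of that Atom pattern (the paragraph text is a placeholder; per RFC 4287 the "type" attribute may be "text", "html", "xhtml", or a MIME media type):

```xml
<entry xmlns="http://www.w3.org/2005/Atom">
  [...]
  <!-- type="xhtml": the content is a single XHTML div carried inline -->
  <content type="xhtml">
    <div xmlns="http://www.w3.org/1999/xhtml">
      <p>Placeholder article text.</p>
    </div>
  </content>
</entry>
```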

The JATS "body" element only has a "specific-use" attribute, and its definition would have to be extended to allow non-JATS content. Instead, maybe a solution is to provide the main content of the article in a separate file and link to it from within the body element, like this:

<article>
  <front>[...]</front>
  <body>
    <media mimetype="text" mime-subtype="html" xlink:href="index.html" xlink:actuate="onLoad"/>
  </body>
  <back>[...]</back>
</article>

A big question, though, is whether PubMed Central would be willing to archive content when the body of the document is in a format other than JATS XML (possibly in a separate, accompanying file).

Alf

[1] http://tools.ietf.org/html/rfc4287.html#section-4.1.3.1
