Re: [jats-list] How/When do you produce a JATS-XML version of you publication within your publication workflow

Subject: Re: [jats-list] How/When do you produce a JATS-XML version of you publication within your publication workflow
From: Kevin Hawkins <kevin.s.hawkins@xxxxxxxxxxxxxxxxxx>
Date: Sun, 21 Oct 2012 12:26:10 -0400

Hello Matthias,

You clearly understand the problems here, and there are many points to respond to. Let me make an attempt at this. I'm sure others will have things to add.

You've already decided to use OJS, so the first question is whether to bother producing XML at all. You could use OJS in such a way that the "layout" stage does not involve XML: instead, a typesetter uses Word, LibreOffice, InDesign, or another desktop-publishing program to create HTML and PDF versions of each article for delivery through OJS. However, it seems you are interested in having an XML workflow, perhaps not only for the ease of deriving HTML and PDF but also because it will allow you to produce other formats (EPUB, MOBI, etc.) and to provide the XML to anyone who wants to do further manipulation or data mining on journal articles. While it takes more work to set up such a workflow than one without XML, there are long-term benefits.

Next is the choice between TEI and JATS. If you're committed to using OJS (as opposed to having your journal hosted on Revues.org, which uses TEI), then you'll want to choose JATS. There is a plugin included with OJS called "XML Galleys" that allows an NLM Blue version 2.3 document to be loaded into OJS and then dynamically rendered as HTML and PDF. See:

http://pkp.sfu.ca/support/forum/viewtopic.php?f=9&t=7327 (as noted here, Lemon8 is no longer being actively developed)

http://pkp.sfu.ca/support/forum/viewtopic.php?f=2&t=7374
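To give a feel for what "rendering XML as HTML" involves, here is a minimal sketch in Python. It is not how the XML Galleys plugin works (real workflows typically use XSLT stylesheets); the sample document, the function name render_html, and the choice of HTML headings are all assumptions made for illustration, and only a handful of JATS/NLM element names are handled.

```python
import xml.etree.ElementTree as ET
from html import escape

# A deliberately tiny, hand-written article using a few JATS/NLM element names.
# This is illustrative sample data, not a complete or validated document.
SAMPLE = """<article>
  <front>
    <article-meta>
      <title-group><article-title>A Sample Article</article-title></title-group>
    </article-meta>
  </front>
  <body>
    <sec>
      <title>Introduction</title>
      <p>First paragraph.</p>
      <p>Second paragraph &amp; more.</p>
    </sec>
  </body>
</article>"""


def render_html(jats_xml: str) -> str:
    """Render the article title, section titles, and paragraphs as HTML.

    Hypothetical helper: handles only top-level <sec> elements in <body>.
    """
    root = ET.fromstring(jats_xml)
    parts = []
    title = root.findtext(
        "front/article-meta/title-group/article-title", default=""
    )
    parts.append(f"<h1>{escape(title)}</h1>")
    for sec in root.findall("body/sec"):
        heading = sec.findtext("title")
        if heading:
            parts.append(f"<h2>{escape(heading)}</h2>")
        for p in sec.findall("p"):
            # itertext() flattens inline markup; a real stylesheet would keep it.
            parts.append(f"<p>{escape(''.join(p.itertext()))}</p>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(render_html(SAMPLE))
```

The point is only that once the article exists as structured XML, each output format is a mechanical mapping from element names to presentation.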

But, as you ask, how should you produce the XML?

Commercial publishers generally send edited manuscripts to offshore vendors who convert to XML and/or do typesetting. If you use OJS with the XML Galleys plugin, all you need is conversion to JATS. There are many vendors that will do this for you.

Some publishers instead have their staff or typesetters use eXtyles (from a company called Inera) to apply styles in Word and convert to JATS. As you've already discovered, there are some free alternatives:

a) Lemon8-XML (which is no longer being actively developed) essentially involves applying styles in a word processor.

b) Microsoft's Article Authoring Add-in for Word, which is a "Technology Preview" (essentially, software in beta) but was released in June under an Apache license: http://authoring.codeplex.com/

c) the Norm module of mPach ( http://www.lib.umich.edu/mpach ), which is being actively developed, also involves applying styles in a word processor. In fact, mPach is planned to work with OJS so that you can send manuscripts from OJS directly into the Prepper interface (which includes Norm) for creation of JATS. So your choice of OJS will position you to take advantage of these mPach components once the software is released. For now mPach developers are focusing on the DOCX format but hope to support ODT in the future.
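The idea common to these style-based tools can be sketched roughly as follows, in Python. This is not the implementation of eXtyles, Lemon8-XML, or Norm; the style names ("Heading 1", "Body Text"), the mapping table, and the function name are hypothetical, and the input is assumed to have already been extracted from the word-processor file as (style name, text) pairs.

```python
import xml.etree.ElementTree as ET

# Hypothetical style names mapped to JATS element names;
# real tools define their own style sets.
STYLE_TO_ELEMENT = {"Heading 1": "title", "Body Text": "p"}


def styled_paragraphs_to_jats_body(paragraphs):
    """Build a JATS-like <body> from (style_name, text) pairs.

    Each "Heading 1" paragraph starts a new <sec>; other mapped styles are
    appended to the current section (or directly to <body> if no section
    has started yet). Unmapped styles raise an error so that they are
    labeled by a person rather than silently guessed.
    """
    body = ET.Element("body")
    current = body
    for style, text in paragraphs:
        tag = STYLE_TO_ELEMENT.get(style)
        if tag is None:
            raise ValueError(f"unmapped style: {style!r}")
        if tag == "title":
            current = ET.SubElement(body, "sec")
        ET.SubElement(current, tag).text = text
    return ET.tostring(body, encoding="unicode")


if __name__ == "__main__":
    manuscript = [
        ("Heading 1", "Introduction"),
        ("Body Text", "Hello."),
    ]
    print(styled_paragraphs_to_jats_body(manuscript))
```

Because the structural information comes from the styles a person applied, the conversion step itself can stay simple and predictable.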

I would not recommend an XML authoring tool like <oXygen/>'s XML Author mode because your authors are unlikely to compose a document using this tool, and XML authoring tools don't work as well as eXtyles and the three tools listed above for converting an author manuscript written using a word processor to JATS. If you used an XML authoring tool, you would end up pasting components of the document into the authoring tool rather than going through the word-processor document and labeling the components.

Hope that helps,

Kevin

On 10/21/12 7:04 AM, Einbrodt Matthias wrote:
Hello,

my name is Matthias Einbrodt and I work for a relatively small university library in Bozen-Bolzano. We are currently planning to support a research institution in our region in setting up and publishing an OA online journal. We will use Open Journal Systems for this task.

What has emerged so far from our first analysis is that we will most probably be in charge of producing and laying out an (X)HTML and PDF version of each article. Our idea is to start from an XML-based representation of the article (using JATS or TEI) and apply XSLT / XSL-FO transformations to get the (X)HTML and PDF representations.

The most interesting question to us, however, is how to produce the XML itself. We came up with the following possibilities:

(1) We burden the author with the task. Although it seems as if there has been some development regarding user-friendly WYSIWYG approaches to XML editing (e.g. Microsoft Word Article Authoring Add-In), we assume that the learning curve is still too high for the authors to accept such an approach.

(2) Completely re-create an XML representation of the article based upon the manuscript the authors send to us using:

(2.1.) XML authoring tools (e.g. Oxygen XML Author or the already mentioned Microsoft Word Article Authoring Add-In).

(2.2.) XML guessers (e.g. Lemon8) that produce an XML representation of the article automatically. The output of these tools is then checked and corrected (if necessary) using the tools mentioned in 2.1.

(3) Outsourcing the whole process to a 3rd party provider.

My question(s) are now:

(a) Are there other approaches to this task?

(b) What are your approaches to the task of producing JATS-XML, and what are your experiences regarding the quality and cost of the results as well as the time the approach takes?

Hoping for an interesting exchange.

Best regards, Matthias Einbrodt
