Re: [jats-list] How/When do you produce a JATS-XML version of you publication within your publication workflow

From: Wendell Piez <wapiez@xxxxxxxxxxxxxxx>
Date: Mon, 22 Oct 2012 09:52:46 -0400
Dear Matthias,

I think you've fairly well covered the range of options available to
you for getting good tagged data. There are only two things I would
add:

1. The difficulty of producing the XML (by whatever means) and the
quality of the results are both quite sensitive to how ambitious you
are in the design of your tagging profile (the set of tags you decide
to use and the rules you use them by). Aim for a lower level and you
will get better results, at the cost of functionality and potential
application -- so you need to strike the right balance here. Keep in
mind when deciding on this design that it's not really about which tag
set you use (JATS, TEI or other) but about how you use it. The most
elaborately capable tagging scheme is not very useful if the data is
bad.

2. The question of scale is critical. Five articles per month is very
different from fifty. I know this is obvious, but it bears
repeating, and you can't really apply the experience of someone at one
end of this scale usefully to that of someone at the other.
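
As a rough illustration of point 1, a tagging profile can be treated as
an explicit list of permitted elements, and incoming files can be
checked against it. The sketch below is only that: the element list in
ALLOWED, the SAMPLE document and the function name are hypothetical
choices for the example, not a recommended JATS subset, and it uses
Python's standard library rather than any JATS-specific tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical profile: the only elements this (imaginary) shop has decided to use.
ALLOWED = {
    "article", "front", "article-meta", "title-group", "article-title",
    "body", "sec", "title", "p", "italic", "bold",
}

def elements_outside_profile(xml_text, allowed=ALLOWED):
    """Count every element whose name is not part of the profile."""
    root = ET.fromstring(xml_text)
    counts = {}
    for el in root.iter():  # visits the root and all of its descendants
        if el.tag not in allowed:
            counts[el.tag] = counts.get(el.tag, 0) + 1
    return counts

# Made-up sample: two elements fall outside the profile above.
SAMPLE = (
    "<article><body><sec><title>Intro</title>"
    "<p>Some <italic>text</italic> and <named-content>a term</named-content>.</p>"
    "<disp-quote><p>Quoted.</p></disp-quote>"
    "</sec></body></article>"
)

if __name__ == "__main__":
    for tag, n in sorted(elements_outside_profile(SAMPLE).items()):
        print(f"{tag}: {n} occurrence(s) outside the profile")
```

A report like this says nothing about whether the permitted tags are
used well, only whether the profile is being kept; the narrower the
profile, the easier it is to keep.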

So my first bit of free advice is to seek out others who are already
doing this at a similar level of scale and ask them what they are
doing. If their field is close to yours and their data similar, so
much the better. (If you write me off list I might be able to be more
specific about whom or where you could ask. Of course this list is a
great place to start.)

I agree that having authors code their XML isn't yet viable and
probably won't be until tools are considerably more advanced. (Work in
this direction is starting to accelerate, however, and this is the
kind of thing that can change fast.) So for a small shop,
semi-automated conversions may be the way to go. (For a medium to
large shop, outsourcing may be more attractive.) More and more of this
is going on, and there are more choices in tools to help with it. When
doing this, don't be shy of using any and all tools available, in
combination.

And yes, this does mean you'll need to cultivate a significant level
of XML expertise in house, covering CSS and XPath/XSLT at least. (The
latter is also an open door to Schematron and XQuery.)

In the meantime, I think it's coming to be a truism that however you
do it, QA is essential and that semi-automated approaches can help
with this too. (These include Schematron validation and what we call
"false color proofs", essentially semantically-formatted galleys --
XSLT again.)
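
To give an idea of what a false color proof amounts to, here is a
minimal sketch. It is written in Python with the standard library
purely for illustration (in practice this would be an XSLT stylesheet,
as noted above). The function names and the COLORS mapping are
assumptions made up for the example. Each XML element becomes an HTML
span labelled with the element name, and a small stylesheet colors the
elements a proofreader wants to see at a glance.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical color assignments; choose whichever elements need checking.
COLORS = {"article-title": "#ffd", "italic": "#dfd", "xref": "#fdd"}

def false_color(el):
    """Render one element and its descendants as nested, class-labelled spans."""
    name = html.escape(el.tag)
    parts = [f'<span class="{name}" title="{name}">']
    parts.append(html.escape(el.text or ""))
    for child in el:
        parts.append(false_color(child))
        # Text that follows the child but still belongs to this element.
        parts.append(html.escape(child.tail or ""))
    parts.append("</span>")
    return "".join(parts)

def proof_page(xml_text, colors=COLORS):
    """Wrap the colored rendering in a minimal HTML page with one CSS rule per tag."""
    root = ET.fromstring(xml_text)
    css = "\n".join(
        f"span.{tag} {{ background: {color}; }}" for tag, color in colors.items()
    )
    return (
        f"<html><head><style>\n{css}\n</style></head>"
        f"<body>{false_color(root)}</body></html>"
    )

if __name__ == "__main__":
    print(proof_page("<p>See <xref>Fig. 1</xref> for <italic>details</italic>.</p>"))
```

Opened in a browser, the page shows cross-references, italics and so on
in different colors, so that mis-tagged or untagged text stands out
without anyone having to read angle brackets.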

I hope we hear from others on this important topic, especially from
others who can speak more specifically.

Cheers,
Wendell

On Sun, Oct 21, 2012 at 7:04 AM, Einbrodt Matthias
<Matthias.Einbrodt@xxxxxxxx> wrote:
> Hello,
>
> my name is Matthias Einbrodt and I work for a relatively small University
> Library in Bozen-Bolzano. We are currently planning to support a research
> institution in our region to set up and publish an OA online journal. We will
> use Open Journal Systems for this task.
>
> What emerged so far from our first analysis is that we will be most probably
> in charge of producing and laying out an (X)HTML and PDF version of each article.
> Our idea is that we want to start from an XML-based representation (using JATS
> or TEI) of the article and apply XSLT / XSL-FO transformations to get the
> (X)HTML and PDF representations.
>
> The most interesting question to us, however, is how to produce the XML
> itself. We came up with the following possibilities:
>
> (1) We burden the author with the task. Although it seems as if there has
> been some development regarding user-friendly WYSIWYG approaches to
> XML editing (e.g. Microsoft Word Article Authoring Add-In), we assume that the
> learning curve is still too high for the authors to accept such an approach.
>
> (2) Completely re-create an XML representation of the article based upon the
> manuscript the authors send to us using:
>
> (2.1.) XML authoring tools (e.g. Oxygen XML Author or the already mentioned
> Microsoft Word Article Authoring Add-In).
>
> (2.2.) XML guessers (e.g. Lemon8) that produce an XML representation of the
> article automatically. The output of these tools is then checked and corrected
> (if necessary) using the tools mentioned in 2.1.
>
> (3) Outsourcing the whole process to a 3rd party provider.
>
> My question(s) are now:
>
> (a) Are there other approaches to this task?
>
> (b) What are your approaches to the task of producing JATS-XML and what are
> your experiences regarding the quality and cost of the results, as
> well as the time the approach takes?
>
> Hoping for an interesting exchange.
>
> Best regards, Matthias Einbrodt

--
Wendell Piez | http://www.wendellpiez.com
XML | XSLT | electronic publishing
Eat Your Vegetables
_____oo_________o_o___ooooo____ooooooo_^
