Re: [jats-list] NLM Book DTD 2.1 Book to JATS/BITS 1.0

Subject: Re: [jats-list] NLM Book DTD 2.1 Book to JATS/BITS 1.0
From: "Wendell Piez wapiez@xxxxxxxxxxxxxxx" <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 3 Feb 2015 20:14:59 -0000
Hi Derrick,

Yes, it is possible to automate the conversion of NLM DTD 2.1 into
JATS 1.0. There may be issues around the edges but these can generally
be discovered and managed. As Bruce remarks, these are mostly in
metadata and structuring of high-level wrappers ... with some other
miscellaneous stuff here and there, turning old 'citation' into
'mixed-citation' and that sort of thing.
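
To make the 'citation' to 'mixed-citation' case concrete, here is a
minimal sketch of a rename pass. It is in Python with the standard
library only, purely for illustration; it is not the XSLT a production
conversion would likely use. The RENAMES table, the rename_elements
helper and the sample fragment are all assumptions invented for this
example, and a real mapping would have to come out of the analysis of
the two DTDs and of your data.

```python
import xml.etree.ElementTree as ET

# Hypothetical rename table: only the citation -> mixed-citation entry is
# taken from the discussion above; a real table would be much longer.
RENAMES = {"citation": "mixed-citation"}


def rename_elements(root, renames=RENAMES):
    """Rename matching elements in place; return how many were changed."""
    changed = 0
    for el in root.iter():  # walks the root element and all descendants
        new_tag = renames.get(el.tag)
        if new_tag is not None:
            el.tag = new_tag
            changed += 1
    return changed


# Made-up sample fragment, not taken from any real data set.
sample = (
    "<ref-list><ref id='r1'>"
    "<citation>Smith J. <source>A Book</source>. 2001.</citation>"
    "</ref></ref-list>"
)
root = ET.fromstring(sample)
count = rename_elements(root)
result = ET.tostring(root, encoding="unicode")
print(count, result)
```

Content and child elements are left untouched; only the element name
changes. Anything beyond a one-to-one rename (attributes, reordering,
wrappers) needs its own rule.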

Basically there are two approaches to doing this kind of conversion:
working from the respective DTDs themselves (establishing a mapping at
all points of deviation), and working from (over) a data set. Although
reference to the DTDs is usually necessary, at least to an extent, and
is always good "due diligence", ordinarily a developer will prefer to
rely on a data-driven approach for all kinds of reasons. (For one, it
helps to give nice edges around the problem domain.) Of course, this
requires that your data is already well-controlled. (Such a conversion
has a way of exposing whatever issues of data quality you have -
automating this conversion is not the same thing as automating QA! :-)
And it means that the requirements for the conversion will be
considered as very local (even when they may also serve as suitable
"generic" mappings for anyone else going between these precise schema
variants).
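
One way to get started on the data-driven side is to take an inventory
of which elements actually occur, and under which parents, across the
whole data set, so the mapping only needs to cover what is really
there. The sketch below is a rough illustration under stated
assumptions: element_inventory is a hypothetical helper, the two
documents are made-up fragments, and real files that rely on
DTD-declared entities would probably need a parser set up to resolve
them.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def element_inventory(documents):
    """Count every parent/child element pairing seen in the data set."""
    pairs = Counter()
    for xml_text in documents:
        root = ET.fromstring(xml_text)
        pairs[("(root)", root.tag)] += 1
        for parent in root.iter():
            for child in parent:  # direct children only
                pairs[(parent.tag, child.tag)] += 1
    return pairs


# Made-up fragments standing in for a real corpus of book files.
docs = [
    "<book><book-meta><book-title>A</book-title></book-meta>"
    "<body><sec><p>x</p></sec></body></book>",
    "<book><book-meta><book-title>B</book-title></book-meta>"
    "<body><sec><p>y</p><p>z</p></sec></body></book>",
]
inventory = element_inventory(docs)
for (parent, child), n in sorted(inventory.items()):
    print(f"{parent}/{child}: {n}")
```

Pairings that occur in the data but have no counterpart in the target
DTD are the edge cases to look at first.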

As for why people haven't done this ... well actually, they have. But
more commonly, this conversion will be part of a production pipeline
(for a vendor or partner who accepts JATS), i.e. not with the overt
purpose of converting one archiving format into another. (Instead the
resulting JATS is simply exposed as an output.) This necessitates
something of a different approach to the edge cases.

Whether this is "easy" or "hard" -- well, that's a subjective or
rather highly context-dependent judgement. I would say the analysis is
much the harder part and the XSLT will be for the most part
straightforward. But for whatever reason, some developers find even
"straightforward" XSLT to be a lot to get their head around. :-(

But -- if any other technology is remotely well suited to this I'd love to
know!

Cheers, Wendell


On Tue, Jan 27, 2015 at 3:59 PM, Philips, Derrick philips@xxxxxxxx
<jats-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> Good afternoon JATS brethren. I hope those of you along the Eastern seaboard
> are enjoying a relaxing snow day today!
>
>
>
> For many years, much of our publication data was exported to and stored in
> NLM Book DTD 2.1 (including journal articles).  In the past 6 months, with
> the help of some great vendors (Hi Bruce and Chandi!) and with a few
> committed staff members, we have begun producing all of our book content in
> BITS 1.0.  The long-term goal is to have all of our data standardized in one
> format, whether it be BITS or JATS.  From my own research, I haven't seen
> any evidence that someone has written, or even tried to write, a conversion
> script from NLM Book DTD 2.1 to JATS.  I have seen NLM Journal DTD 2.3 to
> JATS 1.0 scripts, but nothing for a version older than 2.3.
>
>
>
> Is anyone aware of anyone doing anything like that?  Is it even possible to
> automate the conversion of XML from NLM DTD v. 2.1 to JATS 1.0?
>
>
>
> Any help at all would be greatly appreciated.
>
>
>
> Thanks,
>
>
>
> Derrick
>
>
>
> Derrick Philips
>
> Senior IT Business Analyst
>
> American Academy of Orthopaedic Surgeons
>
> Direct: 847.384.4150
>
> Email: philips@xxxxxxxx
>
> AAOS.Org | OrthoPortal.Org | OrthoInfo.Org
>
>
>



--
Wendell Piez | http://www.wendellpiez.com
XML | XSLT | electronic publishing
Eat Your Vegetables
_____oo_________o_o___ooooo____ooooooo_^
