Re: [jats-list] Does Blue need a Lite version, to counter its creeping aquafication?

Subject: Re: [jats-list] Does Blue need a Lite version, to counter its creeping aquafication?
From: "Beck, Jeff (NIH/NLM/NCBI) [E] beck@xxxxxxxxxxxxxxxx" <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 22 Feb 2021 19:42:24 -0000

Hi Nina and Gerrit,

This looks like really interesting work.

You may have investigated this already, but we have a number of JATS and NLM
XML files available for text mining use from the PMC corpus. You can grab them
by FTP and do what you like with them. You will probably want the Open Access
Subset:

https://www.ncbi.nlm.nih.gov/pmc/tools/textmining/#oasubset

But you can supplement that with XML from the NIH Author Manuscript collection
if you need a few more articles.

JATS-Con will be on April 27 and 28 this year. We will be having an Open
Session on Wednesday. I hope you can give everyone who has been following
along on the JATS List an update of your progress.

Also, this work might be particularly interesting to the greater markup
community. Balisage just posted its Call for Participation for the meeting in
early August. http://www.balisage.net/Call4Participation.html

I know others there would be interested in hearing about this.

Good luck!

Jeff

________________________________
From: Imsieke, Gerrit, le-tex gerrit.imsieke@xxxxxxxxx
<jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Tuesday, February 16, 2021 12:12 PM
To: jats-list@xxxxxxxxxxxxxxxxxxxxxx <jats-list@xxxxxxxxxxxxxxxxxxxxxx>
Cc: nina_linn.reinhardt@xxxxxxxxxxxxxxxxxxxx
<nina_linn.reinhardt@xxxxxxxxxxxxxxxxxxxx>
Subject: [jats-list] Does Blue need a Lite version, to counter its creeping
aquafication?

Dear JATS Community,

As announced in a previous message to this list [1], Nina Reinhardt is
currently working on her master's thesis in which she tries to find a
consensus customization for the (estimated) 90% of JATS users that only
need about half of Blue's available elements and attributes.

My role in this is that I am co-supervising the thesis and that I came
up with the idea after another discussion on this list last year, in
which Tommie suggested that "a dozen different people (or small groups)
each craft[ed] a 'JATS Lite' and we compare[d] them" [2].

This was our first idea: To provide a form with a list of available
elements and attributes, and people would be able to put together their
favorite Lite customization interactively.

But then we thought that we should also offer a way for people to upload
representative JATS content from their production or repositories and
treat these collections as expressions of tagging preferences, or as
"de-facto customizations". Nina then skipped the interactive form
part and focused entirely on analyzing these collections and which
metrics are applicable to them in order to identify consensus
customizations.
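
To illustrate the idea of reading collections as "de-facto customizations",
here is a minimal Python sketch. It is not the metric used in the thesis; it
only assumes that each collection has been reduced to a set of element or
attribute names, and it keeps the names used by at least a given share of the
collections. The function name `consensus_names` and the default threshold
are hypothetical choices made for this example.

```python
from collections import Counter


def consensus_names(collections, threshold=0.9):
    """Return the names that occur in at least `threshold` of the collections.

    `collections` maps a collection label to an iterable of element or
    attribute names. The threshold is an illustrative assumption.
    """
    if not collections:
        return set()
    usage = Counter()
    for names in collections.values():
        # Count each name once per collection, however often it occurs there.
        usage.update(set(names))
    needed = threshold * len(collections)
    return {name for name, count in usage.items() if count >= needed}
```

With three collections and a threshold of 0.6, a name has to appear in at
least two of them to be kept.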

Nina has written a paper in which she describes her approach and what is
needed to find this lean consensus customization (your data!):
https://docs.google.com/document/d/1jYDT0TkYP9Tg31Ldd9gFmdwSiu98Q2mg_qOuhgnxpRc/

You may skip most technical discussions for the time being and navigate
right to the last section called "Data Collection". It is a call to
action that asks you to donate some of your valuable JATS files to
research. Or you can use some XSLT [3] in order to extract
element/attribute name lists from the JATS files yourselves so you need
not send potentially proprietary data to someone else.
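
For readers who prefer not to run XSLT, the same kind of name list can be
sketched in Python. This is not the stylesheet from the repository [3], only
an approximation of the idea using the standard library; the function names
are hypothetical. Namespace prefixes are dropped, so `xlink:href` is counted
as `href`. Files that rely on entities declared in an external DTD may fail
to parse with this parser.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip the '{namespace}' part that ElementTree puts on qualified names."""
    return tag.rsplit("}", 1)[-1]


def count_names(root):
    """Count element names and 'element/@attribute' pairs below `root`."""
    elements, attributes = Counter(), Counter()
    for node in root.iter():
        if not isinstance(node.tag, str):
            continue  # skip comments and processing instructions, if present
        name = local_name(node.tag)
        elements[name] += 1
        for attr in node.attrib:
            attributes[f"{name}/@{local_name(attr)}"] += 1
    return elements, attributes


def count_files(paths):
    """Sum the counts over several XML files given by path."""
    elements, attributes = Counter(), Counter()
    for path in paths:
        file_elements, file_attributes = count_names(ET.parse(path).getroot())
        elements.update(file_elements)
        attributes.update(file_attributes)
    return elements, attributes
```

Calling `count_files` with a list of file paths returns two counters whose
keys are the name lists; only those names, not the content, would need to be
shared.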

Please donate generously, and if possible do it by March 1st. Nina's
thesis needs to be completed by June.

You are allowed to add comments and suggestions to the Google doc, you
may of course file issues and pull requests in the GitHub repo, and you
can contact Nina and/or me via this list or direct email messages if you
have questions or suggestions.

On behalf of Nina (and myself),

Gerrit

[1] https://www.biglist.com/lists/lists.mulberrytech.com/jats-list/archives/202009/msg00019.html
[2] https://www.biglist.com/lists/lists.mulberrytech.com/jats-list/archives/202004/msg00030.html
[3] https://github.com/nreinhar/JATS_Customizing_Analysis/

--
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
gerrit.imsieke@xxxxxxxxx, http://www.le-tex.de

Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930

Geschäftsführer / Managing Directors:
Gerrit Imsieke, Svea Jelonek, Thomas Schmidt
