Re: [jats-list] Why is archiving JATS with a DOI not common?

Subject: Re: [jats-list] Why is archiving JATS with a DOI not common?
From: "David Haber dhaber@xxxxxxxxxx" <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sun, 1 May 2022 03:34:51 -0000
Hi Castedo,

From a publisher's perspective, the unit of measure in STM is a published
version-of-record (VOR). A DOI is made of two components: the prefix (the
series of numbers before the slash) and the suffix. The suffix is usually some
generated string of characters (publishers have their own rules about this).
The prefix is assigned by Crossref and is linked to a publisher. So in your
example below, 10.1016 is the prefix for Elsevier content. This means that
every article published by Elsevier will have this prefix, and since this
prefix is part of the DOI you mentioned, Elsevier controls the version of
record of this article (which is why access costs 25 bucks).
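
As a rough illustration of the prefix/suffix split described above, here is a
minimal Python sketch. The function name split_doi is made up for this example
and is not part of any DOI or Crossref library; it only assumes that the
prefix is everything before the first slash and starts with "10.".

```python
def split_doi(doi: str) -> tuple[str, str]:
    """Split a DOI into (prefix, suffix) at the first slash.

    Hypothetical helper for illustration only.
    """
    prefix, sep, suffix = doi.partition("/")
    # A DOI needs a "10."-style prefix, a slash, and a non-empty suffix.
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI: {doi!r}")
    return prefix, suffix


prefix, suffix = split_doi("10.1016/j.tpb.2018.03.006")
print(prefix)  # 10.1016 (the publisher-linked part)
print(suffix)  # j.tpb.2018.03.006 (chosen by the publisher)
```

Splitting at the first slash only matters because a suffix may itself contain
further slashes.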

Now when you look at PMC and Zotero you are seeing the "preprint/author
manuscript" (or however it was defined when the author signed his copyright
transfer agreement). In most cases, the author likely still controls the
rights to this version (although this depends on the licensing agreements
signed with the publisher). This is still registered under the Elsevier DOI,
because that DOI represents the version of record. What you are probably
accessing is some version of the article made free to meet various funder or
institution open access requirements. The reason why these versions do not
have a unique DOI is that originally Crossref only minted DOIs for the version
of record. So these free versions used the DOI of the version of record.

This has shifted slightly, however, with the advent of preprint servers like
bioRxiv. Crossref has assigned these organizations a prefix similar to a
publisher's prefix so that they can mint DOIs, but part of the deal is that
these preprint DOIs will eventually point to the VOR.

Maybe we need to think of this in a slightly different way. A published
article is considered the VOR. An author's manuscript is not considered
published. It is just hosted online. That is why you can have two versions or
states of content (a manuscript and a VOR) online that both use the same DOI
for access purposes.
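
One way to picture this is a single DOI record with several hosted copies, only
one of which counts as published. The sketch below is purely illustrative: the
class and field names (Article, HostedCopy, state) are invented here and do not
reflect how Crossref or any publisher actually stores this.

```python
from dataclasses import dataclass, field


@dataclass
class HostedCopy:
    host: str   # where the copy lives
    state: str  # "VOR" or "author manuscript" (labels assumed for this sketch)


@dataclass
class Article:
    doi: str
    copies: list[HostedCopy] = field(default_factory=list)

    def published_copies(self) -> list[HostedCopy]:
        # Only the version of record is treated as "published".
        return [c for c in self.copies if c.state == "VOR"]


# Both copies share one DOI; only the publisher's copy is the VOR.
article = Article(
    doi="10.1016/j.tpb.2018.03.006",
    copies=[
        HostedCopy(host="Elsevier", state="VOR"),
        HostedCopy(host="PubMed Central", state="author manuscript"),
    ],
)
print([c.host for c in article.published_copies()])
```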

I grant this is problematic if one defines a DOI strictly as a digital object
identifier, but in STM publishing the definition of a DOI also includes the
concept of publication and all the steps of that process (peer review,
editorial decision making, production and composition) that a publisher
defines to create its version of record.

DH

-----Original Message-----
From: Castedo Ellerman <castedo@xxxxxxxxxxx>
Sent: Saturday, April 30, 2022 5:06 PM
To: jats-list@xxxxxxxxxxxxxxxxxxxxxx
Cc: David Haber <dhaber@xxxxxxxxxx>
Subject: Re: [jats-list] Why is archiving JATS with a DOI not common?

On 4/29/22 07:26, David Haber wrote:
> After reading your question a few times more, are you asking why the
> specific XML component or format of a given article does not have its
> own unique DOI?

More or less, yes, that is what I was wondering, thank you.

> So, perhaps a publisher HTML version would have a DOI, maybe the PDF
> would have a DOI, perhaps an ePub would have a DOI, and maybe the XML?
> And all these dois would be unique?
>
> If that is your question, then the reason is that the article is the
> unit of measure in scholarly publishing, and those other versions are
> just that, versions or different formats. The content is not unique to
> the format so therefore would not get a separate doi. It is true that
> different formats may display a piece of an article differently (or
> maybe not at all) but that does not make the format unique because the
> DOI represents the entire published object and all its formats because
> that is the unique piece we as publishers are shepherding to the world.

I have some clarifications to ask on a few of the terms you've used. I ask
specifically about the DOI 10.1016/j.tpb.2018.03.006. Here are three ways I
can resolve that DOI to three different digital objects:

1) Via doi.org I am sent to a web page where Elsevier requests $25 to view a
PDF file.

2) In Zotero I can enter the DOI and I get a free PDF (which is labeled Author
manuscript).

3) I can enter the DOI on PubMed Central and freely see an HTML page (also
labeled Author manuscript).

I assume 1) resolves to different content than 2) and 3) because Elsevier
wants $25.

So we have one DOI which is representing two different sets of content here?
Or does the DOI represent only the $25 article and not the author manuscript?

What is the unit of measure in scholarly publishing in this case?

Is the Author manuscript provided by PubMed Central and Zotero part of the
entire published object or not part?

Is the PubMed Central web page content here not a published object?

Thank you,
Castedo
