Subject: Re: [jats-list] aff in- or outside of contrib
From: "Charles O'Connor coconnor@xxxxxxxxxxxx" <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 30 Oct 2020 21:28:22 -0000
Howdy,

I'm reminded that I haven't looked at the JATS list during the pandemic, because this is an issue I had to make a call on when subsetting the DTD for Aries workflows.

I went with #2 for much the same reason as Pieter, separating different contributor types. This structure is especially useful in content that is likely to have long lists of non-byline authors/affiliations that may not actually be rendered, and if rendered, not at the beginning of the article.

--Charles

From: Pieter Lamers pieter.lamers@xxxxxxxxxxxx <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Wednesday, May 27, 2020 7:01 AM
To: jats-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [jats-list] aff in- or outside of contrib

Hi all,

Thanks for your thoughts and pennies. I have a few remarks:

As for Debbie's #1, I can imagine authoring being used in an authoring system, but even there, one may be writing an article with co-authors from the same institute, so I still feel pumpkin should not prohibit contrib/xref.

I have a potential use case for #2: we have a couple of translation sites (e.g. https://benjamins.com/online/hts), where original articles are being translated into various languages. The translators of those articles are currently added to the contrib-group with their @contrib-type eq 'translator'. The documentation suggests using a separate contrib-group with @content-type 'translators' for such contributors. When these translators also have their own affiliations, and have them presented in their own list, it might make sense to make aff/aff-alternatives children of contrib-group rather than article-meta.

Apart from this consideration I also lean towards moving it all to #3.

Best,
Pieter

On 26/05/2020 18:41, Wendell Piez mailto:wapiez@xxxxxxxxxxxxxxx wrote:

Hi,

Adding my $0.02 to Debbie's analysis and useful breakdown.
It is all about how much you wish to (pre) normalize the data, and for which sorts of operations; the form of the normalization would presumably depend on that.

Given this, I think Debbie's analysis of the tradeoffs is correct. For an archival subsistence form given most real-world requirements, for its clarity and parsimony I would prefer option (3).

However, I can also imagine a simple 'merge affiliations' transformation that would render either of the other forms into form (3), making it possible to use forms 1 or 2 at earlier stages. Making form (1) from form (3) is also a fairly trivial operation in principle (i.e. subject to considerations of defining 'identity', etc.). Even moving from forms (3) or (1) into form (2) is also possible if planned ahead for. Like Debbie, however, I think form (2) is probably optimized for the wrong thing (most of the time).

I am also not writing as a participant in JATS4R. Mainly I'm pitching in to remind readers that transformations can ease the either/or problems with this sort of thing, assuming data quality (sometimes a big assumption I know).

Cheers,
Wendell

On Tue, May 26, 2020 at 5:58 AM Gareth Oakes mailto:goakes@xxxxxxx <mailto:jats-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

I think if it was completely greenfields then Debbie's option #3 is the way to go. Most of the JATS data we come across works that way, and it's not hugely more difficult from an XML processing perspective. I think the point about being consistent of doing it one way or the other in your backfile is a very laudable idea.

Clearly there is no one-size-fits-all across publishers. I feel like a nice approach for a publisher would be to have a Schematron acting as an overlay on a base JATS schema. The overlay would impose the organization- and/or product-specific validation rules such as how you are meant to tag up <aff>s. I think that's a reasonably commonly used approach?
Better than maintaining a customized schema

// Gareth Oakes
// Chief Architect, GPSL
// http://www.gpsl.co

From: "Melissa Harrison mailto:m.harrison@xxxxxxxxxxxxxxxxx" <mailto:jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Reply to: "mailto:jats-list@xxxxxxxxxxxxxxxxxxxxxx" <mailto:jats-list@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tuesday, 26 May 2020 at 18:26
To: "mailto:jats-list@xxxxxxxxxxxxxxxxxxxxxx" <mailto:jats-list@xxxxxxxxxxxxxxxxxxxxxx>
Subject: Re: [jats-list] aff in- or outside of contrib

Hi there

On behalf of JATS4R

This working group thought very long and hard and had many discussions/heated debates about this - people have different reasons for following the different options and if they went for 1 option only in the recommendation, this would alienate the people using the other option. Therefore, they had to come up with a more flexible model to ensure JATS4R can help standardise the standard as much as possible while making it accessible to everyone to implement.

Not helpful, I appreciate, when you are willing to change your data model!

Cheers
Melissa

Melissa Harrison
Head of Production Operations
Tel: +44 1223 855340
http://elifesciences.org/

On Mon, May 25, 2020 at 11:44 PM Debbie Lapeyre mailto:dalapeyre@xxxxxxxxxxxxxxxx <mailto:jats-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

1) contrib/aff
IMO: JATS allows <aff> as a child of <contrib> precisely for the single-author case (which is what Authoring was also designed for, although almost nobody uses Authoring.) In the modern STEM world, it is not unusual to have 100+ authors. I do not favor contrib/aff, as it can lead to a lot of redundant data. If, as is also common, a single author has 4 or 5 institutional affiliations, the data proliferation gets even worse.

2) contrib-group/aff
IMO: This one was allowed, so that publishers could group authors by institution, and only need to input the <aff> once, for the whole group. Rare nowadays, I hope.
3) <aff>s all together AFTER last <contrib-group>
IMO: This is the cleanest. Each <aff> is only present once, eliminating redundant data. Yes, you need to use an <xref> on each author pointing to each applicable <aff>. But it is easy for one author to have 5 affiliations and for 100 authors to have only 6 between them. I think this is cleanest for querying as well, as you can write an XPath to find all the <aff>s with the characteristic you want (all from one country or all NIH or whatever) and then get the contributors who have the @rid on their <xref> that matches the @id you found on the <aff> or <aff>s you wanted.

This is a Lapeyre not a Mulberry opinion. I do not work for or with JATS4R (good folks though).

--Debbie

> On May 25, 2020, at 5:38 PM, Pieter Lamers mailto:pieter.lamers@xxxxxxxxxxxx <mailto:jats-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hi All,
>
> We are looking into refactoring our article-meta structure with regards to affiliations. We now have two practices:
>
> 1. <aff> is a child of <contrib>, no xref linking needed.
> 2. <aff> is a child of <article-meta> (or <contrib-group>), xref linking needed between <contrib> and <aff>
>
> We are a bit in doubt as to what the preferred format should be.
>
> A quick check on http://jats4r.org (https://jats4r.org/authors-and-affiliations) tells me that there is no preferred format for the choice we are facing: "It is the content-provider's choice which to use".
>
> Sometimes it is suggested to follow the strictest variant of JATS where possible so we took a look at pumpkin (article-authoring). It appears that (2) is not possible, as <aff> cannot be a child of <contrib-group> or <article-meta>, even though the notes tell us that
>
> "The linkage from a contributor to an affiliation should be made using the ID/IDREF mechanism. The @id attribute of an <aff> element will be pointed to from one or more <contrib> elements."
> (https://jats.nlm.nih.gov/articleauthoring/tag-library/1.3d1/element/aff.html)
>
> This means that moving away from pattern (1) is making the document less compatible with pumpkin. Not that this is a compelling argument I guess. What I am thinking is:
>
> a. having <aff> separate means less redundancy in the file (argument for choosing (2) )
> b. having <aff> inside <contrib> is closer to the semantics as I perceive them: affiliation is primarily a property of the author, not of the article (argument in favor of (1) ).
>
> The demand for statistics of any kind is growing. The other day we were asked to report numbers of articles with a first author affiliated with some affiliation in a list of German institutes. I could report this from an SQL copy of the data, but would like to see the JATS files with the flexible nature of XML as the place to ask, so we are going to add ROR if we can find it, and maybe other identifiers. This would save me from keeping all these data in sync with SQL. But in such a case it would be nice to have a single structure pattern to query and not multiple.
>
> Any thoughts, anyone?
>
> Best
> Pieter
>
> --
> Pieter Lamers
> John Benjamins Publishing Company
> Postal Address: P.O. Box 36224, 1020 ME AMSTERDAM, The Netherlands
> Visiting Address: Klaprozenweg 75G, 1033 NN AMSTERDAM, The Netherlands
> Warehouse: Kelvinstraat 11-13, 1446 TK PURMEREND, The Netherlands
> tel: +31 20 630 4747
> web: http://www.benjamins.com

================================================================
Deborah A Lapeyre              mailto:dalapeyre@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.    http://www.mulberrytech.com
17 West Jefferson Street       Phone: 301-315-9631 (USA)
Suite 207                      Fax:   301-315-8385
Rockville, MD 20850
----------------------------------------------------------------
Mulberry Technologies: Consultancy for XML, XSLT, and Schematron
================================================================

https://elifesciences.org
eLife Sciences Publications, Ltd is a limited liability non-profit non-stock corporation incorporated in the State of Delaware, USA, with company number 5030732, and is registered in the UK with company number FC030576 and branch number BR015634 at the address Westbrook Centre, Milton Road, Cambridge, CB4 1YG.

http://www.mulberrytech.com/JATS/JATS-List/
http://lists.mulberrytech.com/unsub/jats-list/2708257 (by email)

--
...Wendell Piez...
...wendell -at- nist -dot- gov...
...wendellpiez.com...
...pellucidliterature.org...
...pausepress.org...
...http://github.com/wendellpiez...
...gitlab.coko.foundation/wendell...
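Debbie's two-step query for option (3), first finding the <aff>s with the characteristic you want and then the contributors whose <xref> @rid matches the @id found, can be sketched in Python with the standard-library ElementTree module. This is only a minimal sketch under stated assumptions: the sample XML, the contributor and institution names, and the helper contributors_by_country are all hypothetical, and matching on a literal <country> string is an assumption made for brevity (real data would more likely match on an identifier such as ROR, as Pieter suggests).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the option (3) shape: each <aff> appears once,
# after the <contrib-group>, and contributors point to it via <xref rid="...">.
SAMPLE = """
<article-meta>
  <contrib-group>
    <contrib contrib-type="author">
      <name><surname>Alpha</surname><given-names>A.</given-names></name>
      <xref ref-type="aff" rid="aff1"/>
      <xref ref-type="aff" rid="aff2"/>
    </contrib>
    <contrib contrib-type="author">
      <name><surname>Beta</surname><given-names>B.</given-names></name>
      <xref ref-type="aff" rid="aff2"/>
    </contrib>
    <contrib contrib-type="author">
      <name><surname>Gamma</surname><given-names>C.</given-names></name>
      <xref ref-type="aff" rid="aff3"/>
    </contrib>
  </contrib-group>
  <aff id="aff1"><institution>Example University</institution><country>Germany</country></aff>
  <aff id="aff2"><institution>Sample Institute</institution><country>Netherlands</country></aff>
  <aff id="aff3"><institution>Demo Labs</institution><country>Germany</country></aff>
</article-meta>
"""


def contributors_by_country(article_meta, country):
    """Return surnames of contributors linked to an <aff> in `country`.

    Hypothetical helper; it assumes affiliations carry a <country> child.
    """
    # Step 1: collect the @id of every <aff> with the wanted characteristic.
    wanted = {
        aff.get("id")
        for aff in article_meta.iter("aff")
        if aff.findtext("country") == country
    }
    # Step 2: keep contributors whose affiliation <xref> points at one of them.
    surnames = []
    for contrib in article_meta.iter("contrib"):
        rids = set()
        for xref in contrib.findall("xref"):
            if xref.get("ref-type") == "aff":
                # @rid may carry several space-separated IDs.
                rids.update(xref.get("rid", "").split())
        if rids & wanted:
            surnames.append(contrib.findtext("name/surname"))
    return surnames


meta = ET.fromstring(SAMPLE)
print(contributors_by_country(meta, "Germany"))  # ['Alpha', 'Gamma']
```

Because every <aff> sits in one place, the same two steps work for any article in this shape; files using contrib/aff would need a different traversal, which is the "single structure pattern to query" concern raised at the start of the thread.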