Subject: Re: [jats-list] aff in- or outside of contrib
From: "Pieter Lamers pieter.lamers@xxxxxxxxxxxx" <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 27 May 2020 11:01:11 -0000

Hi all,

Thanks for your thoughts and pennies. I have a few remarks:

As for Debbie's #1, I can imagine authoring being used in an authoring system, but even there, one may be writing an article with co-authors from the same institute, so I still feel pumpkin should not prohibit contrib/xref.

I have a potential use case for #2: we have a couple of translation sites (e.g. https://benjamins.com/online/hts), where original articles are being translated into various languages. The translators of those articles are currently added to the contrib-group with their @contrib-type eq 'translator'. The documentation suggests using a separate contrib-group with @content-type 'translators' for such contributors. When these translators also have their own affiliations, and have them presented in their own list, it might make sense to make aff/aff-alternatives children of contrib-group rather than article-meta.

Apart from this consideration I also lean towards moving it all to #3.

Best,
Pieter

On 26/05/2020 18:41, Wendell Piez wapiez@xxxxxxxxxxxxxxx wrote:
> Hi,
>
> Adding my $0.02 to Debbie's analysis and useful breakdown.
>
> It is all about how much you wish to (pre) normalize the data, and for
> which sorts of operations; the form of the normalization would
> presumably depend on that.
>
> Given this, I think Debbie's analysis of the tradeoffs is correct. For
> an archival subsistence form given most real-world requirements, for
> its clarity and parsimony I would prefer option (3).
>
> However, I can also imagine a simple 'merge affiliations'
> transformation that would render either of the other forms into form
> (3), making it possible to use forms 1 or 2 at earlier stages. Making
> form (1) from form (3) is also a fairly trivial operation in principle
> (i.e. subject to considerations of defining 'identity', etc.). Even
> moving from forms (3) or (1) into form (2) is also possible if planned
> ahead for.
> Like Debbie, however, I think form (2) is probably
> optimized for the wrong thing (most of the time).
>
> I am also not writing as a participant in JATS4R. Mainly I'm pitching
> in to remind readers that transformations can ease the either/or
> problems with this sort of thing, assuming data quality (sometimes a
> big assumption I know).
>
> Cheers, Wendell
>
> On Tue, May 26, 2020 at 5:58 AM Gareth Oakes goakes@xxxxxxx
> <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> I think if it was completely greenfields then Debbie's option #3
> is the way to go. Most of the JATS data we come across works that
> way, and it's not hugely more difficult from an XML processing
> perspective. I think the point about being consistent of doing it
> one way or the other in your backfile is a very laudable idea.
>
> Clearly there is no one-size-fits-all across publishers. I feel
> like a nice approach for a publisher would be to have a Schematron
> acting as an overlay on a base JATS schema. The overlay would
> impose the organization- and/or product-specific validation rules
> such as how you are meant to tag up <aff>s. I think that's a
> reasonably commonly used approach?
> Better than maintaining a customized schema.
>
> // Gareth Oakes
> // Chief Architect, GPSL
> // www.gpsl.co
>
> From: "Melissa Harrison m.harrison@xxxxxxxxxxxxxxxxx" <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx>
> Reply to: "jats-list@xxxxxxxxxxxxxxxxxxxxxx" <jats-list@xxxxxxxxxxxxxxxxxxxxxx>
> Date: Tuesday, 26 May 2020 at 18:26
> To: "jats-list@xxxxxxxxxxxxxxxxxxxxxx" <jats-list@xxxxxxxxxxxxxxxxxxxxxx>
> Subject: Re: [jats-list] aff in- or outside of contrib
>
> Hi there
>
> *On behalf of JATS4R*
>
> This working group thought very long and hard and had many
> discussions/heated debates about this - people have different
> reasons for following the different options and if they went for 1
> option only in the recommendation, this would alienate the people
> using the other option. Therefore, they had to come up with a more
> flexible model to ensure JATS4R can help standardise the standard
> as much as possible while making it accessible to everyone to
> implement.
>
> Not helpful, I appreciate, when you are willing to change your
> data model!
>
> Cheers
>
> Melissa
>
> Melissa Harrison
> Head of Production Operations
> Tel: +44 1223 855340
> http://elifesciences.org
>
> On Mon, May 25, 2020 at 11:44 PM Debbie Lapeyre
> dalapeyre@xxxxxxxxxxxxxxxx <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> 1) contrib/aff
>
> IMO: JATS allows <aff> as a child of <contrib> precisely for the
> single-author case (which is what Authoring was also designed for,
> although almost nobody uses Authoring.)
>
> In the modern STEM world, it is not unusual to have 100+ authors.
> I do not favor contrib/aff, as it can lead to a lot of
> redundant data.
>
> If, as is also common, a single author has 4 or 5 institutional
> affiliations, the data proliferation gets even worse.
>
> 2) contrib-group/aff
>
> IMO: This one was allowed, so that publishers could group authors
> by institution, and only need to input the <aff> once, for the
> whole group. Rare nowadays, I hope.
>
> 3) <aff>s all together AFTER last <contrib-group>
>
> IMO: This is the cleanest. Each <aff> is only present once,
> eliminating redundant data. Yes, you need to use an <xref> on
> each author pointing to each applicable <aff>. But it is easy
> for one author to have 5 affiliations and for 100 authors to
> have only 6 between them.
>
> I think this is cleanest for querying as well, as you can write
> an XPath to find all the <aff>s with the characteristic you want
> (all from one country or all NIH or whatever) and then get the
> contributors who have the @rid on their <xref> that matches
> the @id you found on the <aff> or <aff>s you wanted.
>
> This is a Lapeyre not a Mulberry opinion.
> I do not work for or with JATS4R (good folks though).
>
> --Debbie
>
> > On May 25, 2020, at 5:38 PM, Pieter Lamers
> > pieter.lamers@xxxxxxxxxxxx <jats-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > Hi All,
> >
> > We are looking into refactoring our article-meta structure
> > with regards to affiliations. We now have two practices:
> >
> > 1. <aff> is a child of <contrib>, no xref linking needed.
> > 2. <aff> is a child of <article-meta> (or <contrib-group>),
> >    xref linking needed between <contrib> and <aff>
> >
> > We are a bit in doubt as to what the preferred format should be.
> >
> > A quick check on jats4r.org
> > (https://jats4r.org/authors-and-affiliations) tells me that
> > there is no preferred format for the choice we are facing: "It
> > is the content-provider's choice which to use".
> >
> > Sometimes it is suggested to follow the strictest variant of
> > JATS where possible so we took a look at pumpkin
> > (article-authoring). It appears that (2) is not possible, as
> > <aff> cannot be a child of <contrib-group> or <article-meta>,
> > even though the notes tell us that
> >
> > "The linkage from a contributor to an affiliation should be
> > made using the ID/IDREF mechanism. The @id attribute of an
> > <aff> element will be pointed to from one or more <contrib>
> > elements."
> > (https://jats.nlm.nih.gov/articleauthoring/tag-library/1.3d1/element/aff.html)
> >
> > This means that moving away from pattern (1) is making the
> > document less compatible with pumpkin. Not that this is a
> > compelling argument I guess. What I am thinking is:
> >
> > a. having <aff> separate means less redundancy in the file
> >    (argument for choosing (2) )
> > b. having <aff> inside <contrib> is closer to the semantics
> >    as I perceive them: affiliation is primarily a property of the
> >    author, not of the article (argument in favor of (1) ).
> >
> > The demand for statistics of any kind is growing. The other
> > day we were asked to report numbers of articles with a first
> > author affiliated with some affiliation in a list of German
> > institutes. I could report this from an SQL copy of the data,
> > but would like to see the JATS files with the flexible nature
> > of XML as the place to ask, so we are going to add ROR if we
> > can find it, and maybe other identifiers. This would save me
> > from keeping all these data in sync with SQL. But in such a
> > case it would be nice to have a single structure pattern to
> > query and not multiple.
> >
> > Any thoughts, anyone?
> >
> > Best
> > Pieter
> >
> > --
> > Pieter Lamers
> > John Benjamins Publishing Company
> > Postal Address: P.O. Box 36224, 1020 ME AMSTERDAM, The Netherlands
> > Visiting Address: Klaprozenweg 75G, 1033 NN AMSTERDAM, The Netherlands
> > Warehouse: Kelvinstraat 11-13, 1446 TK PURMEREND, The Netherlands
> > tel: +31 20 630 4747
> > web: www.benjamins.com
>
> ================================================================
> Deborah A Lapeyre             mailto:dalapeyre@xxxxxxxxxxxxxxxx
> Mulberry Technologies, Inc.   http://www.mulberrytech.com
> 17 West Jefferson Street      Phone: 301-315-9631 (USA)
> Suite 207                     Fax: 301-315-8385
> Rockville, MD 20850
> ----------------------------------------------------------------
> Mulberry Technologies: Consultancy for XML, XSLT, and Schematron
> ================================================================
>
> elifesciences.org
>
> eLife Sciences Publications, Ltd is a limited liability non-profit
> non-stock corporation incorporated in the State of Delaware, USA,
> with company number 5030732, and is registered in the UK with
> company number FC030576 and branch number BR015634 at the address
> Westbrook Centre, Milton Road, Cambridge, CB4 1YG.
>
> --
> ...Wendell Piez... ...wendell -at- nist -dot- gov...
> ...wendellpiez.com... ...pellucidliterature.org... ...pausepress.org...
> ...github.com/wendellpiez... ...gitlab.coko.foundation/wendell...

--
Pieter Lamers
John Benjamins Publishing Company
Postal Address: P.O. Box 36224, 1020 ME AMSTERDAM, The Netherlands
Visiting Address: Klaprozenweg 75G, 1033 NN AMSTERDAM, The Netherlands
Warehouse: Kelvinstraat 11-13, 1446 TK PURMEREND, The Netherlands
tel: +31 20 630 4747
web: www.benjamins.com
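
The two-step query Debbie describes for option (3) (first find the <aff> elements with the characteristic you want, then find the contributors whose <xref> @rid matches their @id) could look roughly like the sketch below. It assumes Python's standard xml.etree.ElementTree rather than a full XPath engine; the sample record, the names, the ids, and the helper contributors_by_country are invented for illustration and do not come from anyone's actual data or tooling in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical option-(3) record: every <aff> appears once, after the
# <contrib-group>, and contributors point at them through <xref rid="...">.
SAMPLE = """<article-meta>
  <contrib-group>
    <contrib contrib-type="author">
      <name><surname>Alpha</surname><given-names>A.</given-names></name>
      <xref ref-type="aff" rid="aff1 aff2"/>
    </contrib>
    <contrib contrib-type="author">
      <name><surname>Beta</surname><given-names>B.</given-names></name>
      <xref ref-type="aff" rid="aff2"/>
    </contrib>
  </contrib-group>
  <aff id="aff1"><institution>Example University</institution>
    <country country="DE">Germany</country></aff>
  <aff id="aff2"><institution>Sample Institute</institution>
    <country country="NL">Netherlands</country></aff>
</article-meta>"""


def contributors_by_country(article_meta, country_code):
    """Return surnames of contributors linked to an <aff> in country_code."""
    # Step 1: collect the @id of every <aff> with the wanted characteristic.
    wanted = {
        aff.get("id")
        for aff in article_meta.iter("aff")
        if any(c.get("country") == country_code for c in aff.iter("country"))
    }
    # Step 2: keep contributors whose aff <xref> @rid matches one of those ids.
    surnames = []
    for contrib in article_meta.iter("contrib"):
        rids = set()
        for xref in contrib.iter("xref"):
            if xref.get("ref-type") == "aff":
                # @rid may hold several space-separated ids.
                rids.update((xref.get("rid") or "").split())
        if rids & wanted:
            surnames.append(contrib.findtext("name/surname"))
    return surnames


root = ET.fromstring(SAMPLE)
print(contributors_by_country(root, "DE"))
```

Because each <aff> exists only once in this pattern, the same two steps work whether an author has one affiliation or five; with <aff> nested inside <contrib> the lookup would have to be written against a second structure as well.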