Subject: Re: [jats-list] citation "year" with suffix
From: Alf Eaton <eaton.alf@xxxxxxxxx>
Date: Wed, 27 Feb 2013 11:45:03 +0000

I think it depends on whether the contents of <*-citation> are being treated as a string to which some semantics are added (mixed-citation), or as a set of data with no pre-defined rendering (element-citation). The former being more useful if you're marking up legacy content, and the latter being more desirable if you're creating reference lists from scratch.

I'd like to be able to choose how the citations are rendered at display time, if at all possible, which basically makes element-citation a requirement for our newly-generated content; if using mixed-citation, any non-marked-up information would be lost.

If it does turn out that lots of citations are having to be entered as plain text in a <comment>, though, then maybe mixed-citation will turn out to be more appropriate for those cases...

Alf

On 27 February 2013 10:41, Nikos Markantonatos <nikos@xxxxxxxxxx> wrote:
> Hi Kaveh,
>
>> I agree that for subjects with a lot of legacy citations which cannot
>> easily be structured, mixed citation might be better, but for STM
>> research where most refs are structured, I think that it would
>> encourage more structure.
>
> As others have argued already, <element-citation> would simply encourage
> more tag abuse and hidden encoding problems. <mixed-citation> only encodes
> whatever metadata is known for a citation and leaves everything else,
> including spacing and punctuation, unmarked. Clearly, we encourage XML
> creators to supply citations as fully marked up as possible, but there are
> practical reasons where this may not be possible.
>
>> My preference is that at least for normal journal and book citation
>> which has structure, the data should be pure, and devoid of textual
>> embellishments like en-dashes and semi-colons. These can be put in by
>> any intelligent renderer on the fly, and even produce the most
>> beautiful typesetting.
>
> Textual embellishments may appear to be less crucial than the citation
> metadata, yet they form an important part of an archival quality XML.
> Renderers, no matter how intelligent they are, will never be able to
> reinstate lost information. And this is because lost information is not
> consistent and there are always good reasons behind that. Over the years and
> over the course of dozens of legacy content migration projects, we have
> found monstrous implicit rules hidden behind the idea of maintaining "pure
> data". We had to apply reverse engineering techniques and inquire with
> people who had originally encoded the information.
>
> Bottom line is that if you employed an intelligent renderer to display that
> content, it would be a multi-week effort for each case, custom built for
> each particular subset of content. I cannot believe that this is the idea
> behind an archival quality XML. Not unless you are willing to associate each
> complex renderer and store it along with the XML it corresponds to. But
> clearly, this goes against the original idea of storing all the information
> pertaining to an article in a single XML file.
>
> Tags like <mixed-citation>, <string-name> and <string-date> offer the power
> to those who care to keep both the article structure and the associated
> display information in a single XML file. This, in turn, helps keep
> renderers simple and reusable across a vast variety of content. And I
> consider this capability to be one of the strongest assets of the NLM/JATS
> family of DTDs.
>
> Best regards,
> Nikos Markantonatos
> Atypon
>
> On 02/26/2013 05:24 PM, Kaveh Bazargan wrote:
>>
>> Hi Nikos
>>
>> I bow to the volume of data that you have in your organization, and my
>> exposure to the variety of xml does not compare to yours.
>> ;-)
>>
>> I agree that for subjects with a lot of legacy citations which cannot
>> easily be structured, mixed citation might be better, but for STM
>> research where most refs are structured, I think that it would
>> encourage more structure. I would like us to try hard to make
>> something structured and only use <comment> if there is no other
>> option.
>>
>> Also I personally think that punctuation does not belong in data
>> (unless there is no way of structuring). I believe that the Atypon DTD
>> uses the x tag which makes it easy at least to remove the punctuation,
>> but for me, putting the punctuation in verbatim in the data goes
>> against the spirit of structure.
>>
>> My preference is that at least for normal journal and book citation
>> which has structure, the data should be pure, and devoid of textual
>> embellishments like en-dashes and semi-colons. These can be put in by
>> any intelligent renderer on the fly, and even produce the most
>> beautiful typesetting. But know there is plenty of "unintelligent"
>> rendering engines out there that rely on a helping hand from
>> typographic niceties peppered in the XML. If this is any part of the
>> reason for leaving punctuation, then it worries me greatly.
>>
>> And just an example of how ridiculous mixed citation can get, here is
>> an example from a recently published paper:
>>
>> <mixed-citation>
>> US CDC 1990. International notes earthquake disaster: Luzon,
>> Philippines. Mortality and Morbidity Weekly Report 39(34): 573-577.
>> </mixed-citation>
>>
>> This has arguably _less_ structure than a printed reference. The
>> latter at least has bold and italic which hint at what each item might
>> represent. ;-)
>>
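
For readers following the thread, the two positions can be illustrated with a minimal Python sketch. It is not taken from any of the posters. The tagging of the CDC reference (collab, year, article-title, source, volume, issue, fpage, lpage) is one plausible JATS encoding chosen for illustration, and the function names and the punctuation rules in render_element are hypothetical. With mixed-citation, the renderer only has to strip the tags, because punctuation and spacing are stored in the XML. With element-citation, the renderer has to supply every separator itself.

```python
import xml.etree.ElementTree as ET

# Assumed tagging of the reference quoted in the thread. In the mixed form,
# punctuation and spacing live in the text between the elements.
MIXED = (
    '<mixed-citation publication-type="journal">'
    '<collab>US CDC</collab> <year>1990</year>. '
    '<article-title>International notes earthquake disaster: Luzon, '
    'Philippines</article-title>. '
    '<source>Mortality and Morbidity Weekly Report</source> '
    '<volume>39</volume>(<issue>34</issue>): '
    '<fpage>573</fpage>-<lpage>577</lpage>.'
    '</mixed-citation>'
)

# The same reference as pure data: no punctuation, no spacing.
ELEMENT = (
    '<element-citation publication-type="journal">'
    '<collab>US CDC</collab><year>1990</year>'
    '<article-title>International notes earthquake disaster: Luzon, '
    'Philippines</article-title>'
    '<source>Mortality and Morbidity Weekly Report</source>'
    '<volume>39</volume><issue>34</issue>'
    '<fpage>573</fpage><lpage>577</lpage>'
    '</element-citation>'
)


def render_mixed(xml_text):
    """Drop the tags and keep all text, including stored punctuation."""
    return "".join(ET.fromstring(xml_text).itertext())


def render_element(xml_text):
    """Hypothetical house style: the renderer adds every separator."""
    cite = ET.fromstring(xml_text)
    f = {child.tag: (child.text or "") for child in cite}
    return (
        f"{f['collab']} {f['year']}. {f['article-title']}. "
        f"{f['source']} {f['volume']}({f['issue']}): "
        f"{f['fpage']}-{f['lpage']}."
    )


if __name__ == "__main__":
    print(render_mixed(MIXED))
    print(render_element(ELEMENT))
```

Both calls produce the same string here only because render_element was written to match this one style. A different house style means editing render_element, while render_mixed can only reproduce what was stored, which is the trade-off the thread is arguing about.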