Re: [jats-list] Element for wrapping a group of xref elements

Subject: Re: [jats-list] Element for wrapping a group of xref elements
From: Rajagopal CV <cvr3@xxxxxxxxxxxxxxxx>
Date: Fri, 1 Feb 2013 17:40:03 +0530
On Fri, Feb 1, 2013 at 2:01 PM, GNU XML <gnu.xml@xxxxxxxxx> wrote:
> On Thu, Jan 31, 2013 at 11:41 PM, Kaveh Bazargan <kaveh@xxxxxxxxxxxxxxxx> wrote:
> On 31 January 2013 17:09, Alf Eaton <eaton.alf@xxxxxxxxx> wrote:
>> On 24 January 2013 16:26, Kaveh Bazargan <kaveh@xxxxxxxxxxxxxxxx> wrote:
>>
>> > So in a very generic terminology, a citation to a reference should be:
>> >
>> > <reference>ref1, ref2, ref3, ref19, ref22, ref23, ref24</reference>
>> >
>> > with no regard for what the rendered output should look like.
>>
>> That's essentially the same as <xref rid="ref1 ref2 ref3 ref19 ref22
>> ref23 ref24"/> - which would be ideal, but the problem is that most of
>> the time we're marking up cross-references in text that already
>> contains punctuation, e.g.
>> (<xref rid="ref1">Smith, 1999</xref>; <xref rid="ref2">Jones, 2003</xref>)
>>
>> To be able to reformat those references using numeric reference labels
>> instead of author-year, there has to be some way of knowing where the
>> punctuation starts and ends, so either:
>
> [...]
>
>>
>> Not sure I quite understand, but my argument is that we don't really
>> need "labels" at all. If references are correctly structured,
>> identifying the author, year, etc, then we choose at "run-time"
>> whether to have numeric or author-year labels in the text. We can of
>> course put a default label in, but in principle one output would
>> automatically give:
>>
>> (Smith, 1999; Jones, 2003)
>>
>> and another:
>>
>> [18, 25]
>>
>> and of course any "contraction" can be automatically generated too, e.g.
>>
>> [18, 22–25, 27, 33–37]
>
>
> Real-life situations are not as simple as the ones listed above. See
> instances like:
>
> (Smith, 1999)   => author, year -- parenthetical citation
> Smith (1999)   => author (year) -- textual
> Smith  => author name alone cited
> 1999  => year alone cited
>
> (Smith, 1999a,b) => two citations of the author in the same year
>
> Smith, Jones, Jefferson, Edison, Newton (1999) => all author names
> Smith et al (1999) => another variant of the same citation
>
> Smith [3] => In a numeric scheme, author names are also cited, and this
> is quite common
>
> In cross-references to equations, theorems and theorem-like
> environments, there are myriad instances that evade automatic and
> straightforward generation of labels in body text. I won't say it is
> impossible, but the resources spent on it will be disproportionate and
> often not fun.
>
> More importantly, we are forgetting the vital aspect of freedom of
> author to communicate in the way he wants it to happen. XML is to
> assist the author and not that authors should play to the conveniences
> of XML or technologies.

Let me paste some typical <ce:cross-refs> elements that appear in XML files
tagged to the Elsevier DTD, in various document environments.

<ce:cross-refs refid="b06.0005 b06.0010 b06.0015 b06.0020
b06.0025">1–5</ce:cross-refs>
<ce:cross-refs refid="b18.0110 b18.0205">22,41</ce:cross-refs>

<ce:cross-refs refid="b18.0055 b18.0060 b18.0065 b18.0080 b18.0085
	b18.0090 b18.0095 b18.0100 b18.0105">11–13,16–21</ce:cross-refs>

<ce:cross-refs refid="b18.0020 b18.0105 b18.0125 b18.0130
	b18.0135 b18.0140 b18.0145 b18.0150 b18.0155 b18.0160
	b18.0165 b18.0170">4,21,25–34</ce:cross-refs>
<ce:cross-refs refid="b18.0020 b18.0145 b18.0150 b18.0155
	b18.0160 b18.0165 b18.0170">4,29–34</ce:cross-refs>

<ce:cross-refs refid="b18.0050 b18.0160 b18.0165
b18.0170">10,32–34</ce:cross-refs>
<ce:cross-refs refid="b18.0020 b18.0105 b18.0160 b18.0165
b18.0170">4,21,32–34</ce:cross-refs>
<ce:cross-refs refid="b18.0020 b18.0115">4,23</ce:cross-refs>
<ce:cross-refs refid="b18.0020 b18.0050 b18.0115 b18.0120 b18.0125
	b18.0130 b18.0175 b18.0180 b18.0185 b18.0190 b18.0195 b18.0200
	b18.0205">4,10,23–26,35–41</ce:cross-refs>

<ce:cross-refs refid="b02.0115 b02.0120 b02.0125 b02.0130
	b02.0135 b02.0140 b02.0145 b02.0150 b02.0155 b02.0160 b02.0165
	b02.0170 b02.0175 b02.0180 b02.0185">23–37</ce:cross-refs>

<ce:cross-refs refid="fd07.0045 fd07.0050">(3.5a) and (3.5b)</ce:cross-refs>
<ce:cross-refs refid="tbl6.2 tbl6.3">Tables 6.2 and 6.3</ce:cross-refs>
<ce:cross-refs refid="fd15.0010 fd15.0015">(6.1a) and (6.1b)</ce:cross-refs>
<ce:cross-refs refid="fd15.0060 fd15.0070">(6.5) and (6.7)</ce:cross-refs>

<ce:cross-refs refid="fd03.0095 fd03.0100 fd03.0110">(7.8b), (7.9) and
(7.10)</ce:cross-refs>

<ce:cross-refs refid="fd03.0415 fd03.0495">(7.38) and (7.46)</ce:cross-refs>

<ce:cross-refs refid="fd03.0510 fd03.0515 fd03.0520">(7.47a) through
(7.47c)</ce:cross-refs>
<ce:cross-refs refid="fd03.0800 fd03.0805">(7.72a) and
(7.72b)</ce:cross-refs>
<ce:cross-refs refid="s20.0005 s20.0010 s20.0015 s20.0030">Secs. 8.1
through 8.4</ce:cross-refs>
<ce:cross-refs refid="fd20.0425 fd20.0405">(8.26c) and (8.25)</ce:cross-refs>
<ce:cross-refs refid="fd08.0195 fd08.0200">Equations (9.14a) and
(9.14b)</ce:cross-refs>
<ce:cross-refs refid="s21.0005 s21.0010 s21.0030 s21.0035
s21.0040">Secs. 10.1 through 10.5</ce:cross-refs>
<ce:cross-refs refid="fd21.0270 fd21.0275">(10.22c) and
(10.23)</ce:cross-refs>
<ce:cross-refs refid="fd21.0555 fd21.0575">(10A.5) and
(10A.6c)</ce:cross-refs>
<ce:cross-refs refid="fd05.0005 fdA.5">(A.1) and (A.5)</ce:cross-refs>

<ce:cross-refs refid="tblD.1 tblD.2">Tables D.1 and D.2</ce:cross-refs>

<ce:cross-refs refid="tbl2.2 tbl2.3 tbl2.4 tbl2.5 tbl2.6 tbl2.7
tbl2.8">Tables 2.2–2.8</ce:cross-refs>

<ce:cross-refs refid="fig03.0085 fig03.0090">Figs. 6.17 and
6.18</ce:cross-refs>

<ce:cross-refs refid="fig03.0130 fig03.0135 fig03.0140 fig03.0145
fig03.0150">Figs. 6.26–6.30</ce:cross-refs>

<ce:cross-refs refid="b06.0180 b06.0185 b06.0190">36–38</ce:cross-refs>


It is often challenging to create accurate HTML from these cross-refs.
To create the proper links in the filtered HTML, you either have to parse
the label to find the chunks of text that each id should be linked from,
or you have to generate the labels automatically from the refids. Either
way, it is going to be a bit difficult.
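
To illustrate the two options above, here is a rough Python sketch, not
anything from an actual production filter. It assumes purely numeric
labels (comma-separated integers and ranges written with an en dash or a
hyphen) and refids listed in the same order as the label. The function
names split_label and compress are hypothetical. Labels such as
"(7.47a) through (7.47c)" are deliberately out of scope, which is
exactly where the difficulty lies.

```python
import re

def split_label(label, refids):
    """Pair each chunk of a compressed numeric label with the refids it covers.

    Assumption: chunks are integers or integer ranges, and the refids
    appear in the same order as the numbers in the label.
    """
    ids = refids.split()
    pairs, pos = [], 0
    for chunk in re.split(r"\s*,\s*", label.strip()):
        # A range such as 23–26 (en dash) or 23-26 (hyphen) covers several ids.
        m = re.fullmatch(r"(\d+)\s*[\u2013-]\s*(\d+)", chunk)
        count = int(m.group(2)) - int(m.group(1)) + 1 if m else 1
        pairs.append((chunk, ids[pos:pos + count]))
        pos += count
    if pos != len(ids):
        raise ValueError("label and refid list do not line up")
    return pairs

def compress(numbers):
    """Build a compressed label from a sorted list of reference numbers."""
    parts, i = [], 0
    while i < len(numbers):
        j = i
        while j + 1 < len(numbers) and numbers[j + 1] == numbers[j] + 1:
            j += 1
        if j - i >= 2:
            # Runs of three or more consecutive numbers become a range.
            parts.append(f"{numbers[i]}\u2013{numbers[j]}")
        else:
            parts.extend(str(n) for n in numbers[i:j + 1])
        i = j + 1
    return ",".join(parts)
```

Calling split_label on a label and its refid string gives one
(chunk, ids) pair per linkable piece of text. compress goes the other
way, but it needs the reference numbers, so a real filter would first
have to look up each refid in the reference list.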

So my suggestion is to go for simple <xref> patterns.

For example, instead of the following tagging

    <xref rid="b18.0020 b18.0050 b18.0115
			b18.0120 b18.0125 b18.0130 b18.0175
			b18.0180 b18.0185 b18.0190
			b18.0195 b18.0200 b18.0205">4,10,23–26 and 35–41</xref>

I would like it to be tagged as

    <xref rid="b18.0020">4</xref>, <xref rid="b18.0050">10</xref>,
		<xref rid="b18.0115 b18.0120 b18.0125 b18.0130">23-26</xref> and
		<xref rid="b18.0175 b18.0180 b18.0185 b18.0190
			b18.0195 b18.0200 b18.0205">35–41</xref>

This will make my translation process much easier. I also feel that the
idea of <xref-group> suggested by Alf sounds good for the compressed
labels so that we can tag either as

<xref rid="t5 t6 t7 t8">5-8</xref>
or
<xref-group>
	<xref rid="t5">5-</xref>
	<xref rid="t6"/>
	<xref rid="t7"/>
	<xref rid="t8">8</xref>
</xref-group>

The latter, though it looks less geeky, is always easier for transformation
filters to handle, and the authors keep the freedom to use the text "5-8" or
"5 through 8" etc.

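As a rough illustration of why the <xref-group> form is easy on a
filter, here is a minimal Python sketch using the standard
xml.etree.ElementTree module. <xref-group> is only the element proposed
in this thread, and the function name render_group and the choice to
link each visible label to its first rid are my assumptions.

```python
import xml.etree.ElementTree as ET
from html import escape

def render_group(xml_text):
    """Turn an <xref-group> into HTML links plus the full list of targets.

    Assumption: every <xref> with text becomes one link to its first rid;
    empty <xref/> elements contribute targets only. Text between the
    <xref> elements is ignored in this sketch.
    """
    group = ET.fromstring(xml_text)
    links, targets = [], []
    for xref in group.iter("xref"):
        rids = xref.get("rid", "").split()
        targets.extend(rids)
        text = (xref.text or "").strip()
        if text and rids:
            links.append(f'<a href="#{escape(rids[0])}">{escape(text)}</a>')
    return "".join(links), targets

sample = (
    '<xref-group>'
    '<xref rid="t5">5-</xref><xref rid="t6"/>'
    '<xref rid="t7"/><xref rid="t8">8</xref>'
    '</xref-group>'
)
html_links, all_targets = render_group(sample)
```

No label parsing is needed here: the visible text and the complete set
of targets both come straight from the markup.
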
I would prefer to follow Einstein's maxim: "Make it simple, not simpler".
:-)

--
Rajagopal CV
