[xsl] Double records when grouping and sorting with Muenchian method

Subject: [xsl] Double records when grouping and sorting with Muenchian method
From: "Jean-Pierre Lamon" <jpl@xxxxxxxxxx>
Date: Sat, 8 Jun 2013 12:24:51 +0200
Hi,

First of all, sorry, I'm not very up to date with XSL; I'm still learning.
I have a problem with duplicate entries when grouping and sorting (Muenchian
grouping). I'll try to be as clear as possible.
Processor: XSLT 1.0 with MSXML. I want to build a bibliography sorted and
grouped by author and title. The same book can appear in several places in
the XML (the bibliography is organized by subject). The result will be an
index in a PDF produced with FOP.

The XML structure is a fairly standard MARCXML structure for bibliographic
records:

<collection>
  <level_1>
    <record>
      <recno>4</recno> <!-- sequential record number -->
      <indexsort>LEBENSFORMEN IM SPAETMITTELALTER 1200 1350</indexsort>
      <datafield tag="100" ind1="1" ind2=" ">
        <subfield code="a">Descœudres, Georges</subfield>
      </datafield>
      <datafield tag="245" ind1="1" ind2="0">
        <subfield code="a">Lebensformen im Spätmittelalter 1200-1350 /</subfield>
        <subfield code="c">Georges Descoeudres</subfield>
      </datafield>
      ....
    </record>
  </level_1>
</collection>

The indexes

Author index :

<xsl:key name="idxaut"
match="datafield[@tag=100]|datafield[@tag=700]|datafield[@tag=710]"
use="."/>

Title index

<xsl:key name="idxtit" match="indexsort" use="normalize-space(.)"/>

indexsort is the title of the book without its leading article, converted to
upper case.
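
For reference, here is a minimal, self-contained sketch of single-level Muenchian grouping on the title key alone. It is not taken from the stylesheet in this post; it only assumes the `collection/level_1/record/indexsort` structure shown above. Note that the lookup value is passed through `normalize-space()` so that it matches the key's `use` expression, whereas the lookups later in this post pass `../indexsort` unnormalized.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Same key as in the post: records indexed by normalized sort title -->
  <xsl:key name="idxtit" match="indexsort" use="normalize-space(.)"/>

  <xsl:template match="/collection">
    <!-- Keep only the first indexsort node of each distinct title -->
    <xsl:for-each select="level_1/record/indexsort[
        generate-id() = generate-id(key('idxtit', normalize-space(.))[1])]">
      <xsl:sort select="normalize-space(.)"/>
      <xsl:value-of select="normalize-space(.)"/>
      <xsl:text>: </xsl:text>
      <!-- Number of records sharing this title anywhere in the document -->
      <xsl:value-of select="count(key('idxtit', normalize-space(.)))"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Run against the bibliography, this should print each distinct sort title once, followed by the number of records that carry it. It can serve as a quick check that the title key on its own groups as expected.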


Selecting the records by author (datafield tag="100", "700" or "710"):

<xsl:for-each
select="level_1/record/datafield[@tag=100]|level_1/record/datafield[@tag=700]|level_1/record/datafield[@tag=710]">

--> sort by author

      <xsl:sort select="normalize-space(.)" lang="de-CH"/>

-> find the author in the index and display the first occurrence (this works)

	<xsl:for-each select="current()[count(. | key('idxaut', .)[1]) = 1]">

-> display the first occurrence of the author

		<xsl:value-of select="."/>

-> find the book titles for this author

		<xsl:for-each select="key('idxaut', .)">

-> sort by title (indexsort)

		<xsl:sort select="../indexsort"/>
			<xsl:variable name="title">
			  <xsl:value-of select=" datafield[@tag=245]"/>
			</xsl:variable>

-> the same book may be referenced several times in the XML, each time under
a different level_1 node. I want to group these occurrences: look up the book
title (indexed by indexsort) and display only the first one

  <xsl:for-each select="../indexsort[count(. | key('idxtit',../indexsort)[1]) = 1]">
			<xsl:value-of select="$title"/>

-> for every occurrence of the same book (same title) in the XML, display only
the record number, separated by |

<xsl:for-each select="key('idxtit', ../indexsort)">
	<xsl:value-of select="concat(' | ',../no_rec)"/>
</xsl:for-each>

	</xsl:for-each>
  </xsl:for-each>
</xsl:for-each>
....

But the result is :-(

Descoeudres Georges
--- Auf Biegen und Brechen : physikalische Grenzen des Blockbaus | 1 | 4
Descœudres, Georges
--- Frühes und hohes Mittelalter | 3 | 5 | 7
--- Lebensformen im Spätmittelalter 1200-1350 | 2 | 6 | 8
--- Lebensformen im Spätmittelalter 1200-1350 | 2 | 6 | 8 <-- I don't want you
Fuchs, Karin
--- Frühes und hohes Mittelalter | 3 | 5 | 7

I don't understand why one title appears twice. I'm sure one of you will see
at first glance what's wrong with my code :-) If more data or explanations
are needed, no problem, just let me know.
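
To make the intended result concrete, here is a hedged sketch of one common pattern for this kind of two-level grouping: a second, composite key on the author/title pair, so that titles are deduplicated per author rather than across the whole document. It is not the stylesheet from this post and has not been run against the real data. The key name `idxauttit` and the `|` separator are hypothetical, and it reads the record number from `recno` as in the XML sample above.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Author fields keyed by normalized author name -->
  <xsl:key name="idxaut"
           match="datafield[@tag=100 or @tag=700 or @tag=710]"
           use="normalize-space(.)"/>

  <!-- Hypothetical composite key: author name + '|' + sort title -->
  <xsl:key name="idxauttit"
           match="datafield[@tag=100 or @tag=700 or @tag=710]"
           use="concat(normalize-space(.), '|', normalize-space(../indexsort))"/>

  <xsl:template match="/collection">
    <!-- Level 1: first author field of each distinct author -->
    <xsl:for-each select="level_1/record/datafield[@tag=100 or @tag=700 or @tag=710][
        generate-id() = generate-id(key('idxaut', normalize-space(.))[1])]">
      <xsl:sort select="normalize-space(.)"/>
      <xsl:variable name="author" select="normalize-space(.)"/>
      <xsl:value-of select="$author"/>
      <xsl:text>&#10;</xsl:text>

      <!-- Level 2: first author field of each distinct (author, title) pair -->
      <xsl:for-each select="key('idxaut', $author)[
          generate-id() = generate-id(key('idxauttit',
              concat($author, '|', normalize-space(../indexsort)))[1])]">
        <xsl:sort select="normalize-space(../indexsort)"/>
        <xsl:text>--- </xsl:text>
        <xsl:value-of select="normalize-space(../datafield[@tag=245]/subfield[@code='a'])"/>

        <!-- Record numbers of every entry in this (author, title) group -->
        <xsl:for-each select="key('idxauttit',
            concat($author, '|', normalize-space(../indexsort)))">
          <xsl:value-of select="concat(' | ', ../recno)"/>
        </xsl:for-each>
        <xsl:text>&#10;</xsl:text>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With this arrangement each title line is emitted once per author, even when the first record carrying that title belongs to a group opened by another author. One caveat: if the same author name occurs in two fields of a single record (for example in both 100 and 700), that record's number would still be listed twice, so a further check would be needed for that case.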

Regards
JP
