[xsl] n-value of preceding lb-element

Subject: [xsl] n-value of preceding lb-element
From: Ingo Mittendorf <eikm2@xxxxxxxxxxxxxxxx>
Date: Mon, 13 Aug 2001 17:00:38 +0100 (BST)
I've more or less just started using XSL stylesheets and I'm still in the
phase where I don't get quite the results that I want. I hope someone can
help me with the following little problem.

What I basically want to do is generate an HTML table from one of my XML
documents. Building the table itself is not the problem; the difficulty is
getting the data into it, especially the data for one of the columns.


I'm working with XML text documents that are marked up according to the TEI
DTD. (No knowledge of TEI is needed to follow the problem, though.)

TEI has an element <corr> which marks corrected text. The erroneous
original text can be recorded in this element's sic attribute, e.g.:

	<corr sic="milions">millions</corr>

In my document I have also marked up sentence-like segments (element <s>)
and line breaks (empty element <lb/>). Both the segment and the line-break
elements carry an n (number) attribute.


Below I'm appending an XML document that emulates the relevant fragments
from my text (this document itself is not TEI!). I also include a
DTD for this, generated with XML Spy (and subsequently slightly modified).
Both documents are well-formed, and the XML text is valid against the DTD.


What I want to generate from the example text below is this table (no
more, no less):

LINE  ERROR      CORRECTION
1     sayed      said
2     tearfull   tearful
2     four       for
5     milions    millions

I've been trying for some time now to get there, with little success. The
LINE column is especially tricky. My instruction to the XSL processor would
be something like this in plain English: 'Starting from the current <corr>
element, go to the nearest preceding <lb/> element and print its n-value.'
Please note that <lb/> may occur both within and outside the <s> segment
element. (See example text below for details.)

I haven't got a clue what the correct location path is for the n-value of
the <lb/> that precedes a given <corr> (except that it probably ends in
"lb/@n" and that the preceding axis will turn up somewhere in the path);
I've tried a lot of variants. (I usually get "line 1" in each and every
table row, or nothing at all. I also tend to get the table above, minus
the line numbers, three times in a row, although I only need it once.)
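
One candidate worth trying is sketched here as an XSLT 1.0 stylesheet. It
is a minimal sketch, not a tested solution; it assumes that the nearest <lb/>
before a <corr> in document order is always the one wanted, and the
stylesheet layout (one template for the root, one for <corr>) is purely
illustrative. In XPath 1.0 the preceding axis is a reverse axis, so the
predicate [1] in "preceding::lb[1]" picks the closest earlier <lb/>, whether
it is a child of <p> or of <s>. Without the predicate, xsl:value-of uses
the first <lb/> in document order, which would explain "line 1" in every row.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Build the table exactly once, from the document root, instead of
       once per <example>. -->
  <xsl:template match="/">
    <table>
      <tr>
        <th>LINE</th>
        <th>ERROR</th>
        <th>CORRECTION</th>
      </tr>
      <!-- One row per <corr>, in document order. -->
      <xsl:apply-templates select="//corr"/>
    </table>
  </xsl:template>

  <xsl:template match="corr">
    <tr>
      <!-- preceding::lb[1] = the nearest <lb/> before this <corr>. -->
      <td><xsl:value-of select="preceding::lb[1]/@n"/></td>
      <!-- The erroneous original text. -->
      <td><xsl:value-of select="@sic"/></td>
      <!-- The corrected text. -->
      <td><xsl:value-of select="."/></td>
    </tr>
  </xsl:template>

</xsl:stylesheet>
```

Driving everything from a single template that matches "/" is one way to
avoid the repeated table: only that template writes the <table> element, and
the <corr> template contributes rows only.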

I would be very grateful for any help (and would be very much relieved if
I could get this working).


Best regards,


Ingo Mittendorf

University of Cambridge, Department of Linguistics


XML document and assigned DTD following:

=============================================================================
[lbreak2.xml]

[NOTE: the first <example> gives a correct text, the second <example> the
same text with errors, the last <example> shows the text with
complete markup]
-----------------------------------------------------------------------------

<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- edited with XML Spy v3.5 (http://www.xmlspy.com) by Ingo Mittendorf
(University of Cambridge, Department of Linguistics) -->
<!DOCTYPE linebreak-experiment SYSTEM "lbreak2.dtd">

<linebreak-experiment>

  <example n="1a" source="The Mirror, 21.07.2001, p. 1, adapted">

    <note>CORRECT TEXT (structural markup only)</note>

    <p>
      <lb n="1"/><s n="1">Earlier the couple had said a
      <lb n="2"/>tearful farewell as they waited for
      <lb n="3"/>the result. </s><s n="2">It was a passion
      <lb n="4" rend="hyphen"/>ate finale for the TV romance that has
      <lb n="5"/>gripped millions.</s>
    </p>

  </example>


  <example n="1b">

    <note>TEXT WITH ERRORS (structural markup only; all errors mine, I.M.)</note>

    <p>
      <lb n="1"/><s n="1">Earlier the couple had sayed a
      <lb n="2"/>tearfull farewell as they waited four
      <lb n="3"/>the result. </s><s n="2">It was a passion
      <lb n="4" rend="hyphen"/>ate finale for the TV romance that has
      <lb n="5"/>gripped milions.</s>
    </p>

  </example>


  <example n="1c">

    <note>TEXT WITH COMPLETE MARKUP</note>

    <p>
      <lb n="1"/><s n="1">Earlier the couple had <corr sic="sayed">said</corr> a
      <lb n="2"/><corr sic="tearfull">tearful</corr> farewell as they waited <corr sic="four">for</corr>
      <lb n="3"/>the result. </s><s n="2">It was a passion
      <lb n="4" rend="hyphen"/>ate finale for the TV romance that has
      <lb n="5"/>gripped <corr sic="milions">millions</corr>.</s>
    </p>

  </example>

</linebreak-experiment>

==============================================================================
[lbreak2.dtd]
------------------------------------------------------------------------------

<?xml version="1.0" encoding="UTF-8"?>
<!-- edited with XML Spy v3.5 (http://www.xmlspy.com) by Ingo Mittendorf
(University of Cambridge, Department of Linguistics) -->
<!--DTD generated by XML Spy v3.5 (http://www.xmlspy.com)-->
<!-- ... and slightly modified, I.M. -->


<!ELEMENT corr (#PCDATA)>
<!ATTLIST corr
          sic CDATA #REQUIRED>


<!ELEMENT example (note, p+)>
<!ATTLIST example
          n CDATA #REQUIRED
          source CDATA #IMPLIED>


<!-- lb = line-break: -->

<!ELEMENT lb EMPTY>
<!ATTLIST lb
          n CDATA #REQUIRED
          rend (no-hyphen | hyphen) "no-hyphen">


<!ELEMENT linebreak-experiment (example+)>


<!ELEMENT note (#PCDATA)>


<!-- p = paragraph: -->

<!ELEMENT p (lb, s+)>


<!-- s = sentence: -->

<!ELEMENT s (#PCDATA | lb | corr)*>
<!ATTLIST s
          n CDATA #REQUIRED>


<!-- end of DTD -->
==============================================================================


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

