[xsl] Grouping text nodes

Subject: [xsl] Grouping text nodes
From: James Cummings <cummings.james@xxxxxxxxx>
Date: Wed, 3 Aug 2005 10:49:08 +0100

Hi there,

I have some XHTML that I'm trying to transform to add more structure
to it. It is a copy of the Latin Vulgate Bible. Currently the XHTML
looks something like this:
-----
<div class="chapter">
<span class="chapter-num">1</span>
        <div class="poetrystartchapter">
                    <span class="vn" id="x1_1">1</span>&nbsp;Beatus vir qui
                    non abiit in consilio impiorum,<br/> et in via peccatorum
                    non stetit,<br/> et in cathedra pestilenti&aelig;
                    non sedit&nbsp;;<br/>
                    <span class="vn" id="x1_2">2</span>&nbsp;sed in lege
                    Domini voluntas ejus,<br/> et in lege ejus meditabitur die
                    ac nocte.<br/>
                    <span class="vn" id="x1_3">3</span>&nbsp;Et erit tamquam
                    lignum quod plantatum est secus decursus aquarum,<br/> quod
                    fructum suum dabit in tempore suo&nbsp;:<br/> et folium
                    ejus non defluet&nbsp;;<br/> et omnia
                    qu&aelig;cumque faciet prosperabuntur.<br/>
...</div>...</div>
-----
What I want to get is something like:
-----
<div type="chapter" n="1">
             <milestone type="poetrystartchapter"/>
             <lg xml:id="x1_1" n="1">
                    <l xml:id="x1_1-1">Beatus vir qui
                    non abiit in consilio impiorum,</l>
                   <l xml:id="x1_1-2">et in via peccatorum non stetit,</l>
                    <l xml:id="x1_1-3">et in cathedra
pestilenti&aelig; non sedit </l>
              </lg>
              <lg xml:id="x1_2" n="2">
                    <l xml:id="x1_2-1"> sed in lege Domini voluntas ejus,</l>
                    <l xml:id="x1_2-2">et in lege ejus meditabitur die
ac nocte.</l>
               </lg>
                <lg xml:id="x1_3" n="3">
                     <l xml:id="x1_3-1"> Et erit tamquam
                    lignum quod plantatum est secus decursus aquarum,</l>
                    <l xml:id="x1_3-2"> quod fructum suum dabit in
tempore suo :</l>
                    <l xml:id="x1_3-3"> et folium ejus non defluet;</l>
                    <l xml:id="x1_3-4"> et omnia qu&aelig;cumque
faciet prosperabuntur.</l>
                     </lg>
<milestone type="EndOfpoetrystartchapter"/>
...</div>
-----
My problem comes when I look backwards to create the @xml:id for each
line whilst grouping the text nodes into lines. Sometimes extra
existing structure gets in the way, as in this case, where the <div>
(if present at all) starts after the first line:

-----
 <div class="chapter"><span class="chapter-num">118</span>
                <span class="vn" id="x118_1">1</span>&nbsp;Alleluja.
                    <div class="poetry"><span
class="speaker">Aleph.</span> Beati
                    immaculati in via,<br/> qui ambulant in lege Domini.<br/>
                    <span class="vn" id="x118_2">2</span>&nbsp;Beati qui
                    scrutantur testimonia ejus&nbsp;;<br/> in toto corde
                    exquirunt eum.<br/>
-----
This is supposed to come out something like:
-----
 <div type="chapter" n="118">
                <lg xml:id="x118_1" n="1">
                     <l xml:id="x118_1-1">Alleluja.
                      <milestone type="poetry"/>
                    <seg type="speaker">Aleph.</seg> Beati immaculati
in via,</l>
                     <l xml:id="x118_1-2"> qui ambulant in lege Domini.</l>
                 </lg>
                  <lg xml:id="x118_2" n="2">
                     <l xml:id="x118_2-1"> Beati qui scrutantur
testimonia ejus; </l>
                     <l xml:id="x118_2-2"> in toto corde  exquirunt eum.</l>
                  </lg>
                   <milestone type="Endofpoetry"/>
... </div>
-----
At the moment, when matching text() to create the lines, I look back
(preceding:: or preceding-sibling::) to the span and grab the span/@id
to create the l/@xml:id. But in instances like Psalm 118, where
another div or span gets in the way, it tends to muck up.
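
For illustration only, the look-back described above might be sketched
roughly as follows in XSLT 1.0. This is not the stylesheet actually in
use. It assumes the XHTML elements are in no namespace and that the
&nbsp; and &aelig; entities are resolved by a DTD. The named template
"line-id" and the demo templates are hypothetical. The sketch only
computes the id string for each text node; it does not do the grouping
into <l> elements.

```xslt
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical helper: builds the l/@xml:id string for the context node. -->
  <xsl:template name="line-id">
    <!-- Nearest preceding verse-number span. Filtering on @class='vn'
         skips other spans such as class="speaker". The preceding:: axis
         also looks past an enclosing div, which preceding-sibling::
         cannot do. -->
    <xsl:variable name="verse" select="preceding::span[@class='vn'][1]"/>
    <!-- Number of <br/> elements before this node that belong to the
         same verse, i.e. whose nearest preceding verse span is $verse. -->
    <xsl:variable name="breaks"
        select="count(preceding::br[generate-id(preceding::span[@class='vn'][1])
                                    = generate-id($verse)])"/>
    <xsl:value-of select="concat($verse/@id, '-', $breaks + 1)"/>
  </xsl:template>

  <!-- Demo: print the computed id for every non-blank text node that is
       not the content of a span (verse number, chapter number, speaker). -->
  <xsl:template match="text()[normalize-space()][not(parent::span)]">
    <xsl:call-template name="line-id"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Suppress all other text nodes. -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

Under these assumptions, two text nodes that fall before the same
<br/> within one verse (such as "Alleluja." and the text following the
speaker span in Psalm 118) receive the same id, which is a hint that
they belong in the same <l>. The repeated preceding:: scans are slow on
a large document, so this is a sketch of the idea rather than a
recommended implementation.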

So I'm convinced there is probably an entirely better way to do this.
Any suggestions?

Many Thanks,
-James

--
James Cummings, Cummings dot James at GMail dot com
