[xsl] empty elements to filled without overlapping hierarchies

Subject: [xsl] empty elements to filled without overlapping hierarchies
From: James Cummings <James.Cummings@xxxxxxxxxxxxxx>
Date: Wed, 12 Nov 2003 10:15:17 +0000 (GMT)
Hi there,

I'm playing about with some old etexts. If I take one and, with a
quick regex in Emacs, change the original COCOA 'tags' into
well-formed empty elements, I get something like:
------
<body>
  <TITLE value="A NEW WAY TO PAY OLD DEBTS" />
    <SN value="ACTUS PRIMUS, SCENA PRIMA:" />
	<SSD value="WELBORNE. TAPWELL. FROTH." />
	<Q value="WELBORNE." /> NO BOUZE? NOR NO TOBACCO?
	<Q value="TAPWELL. " /> NOT A SUCKE SIR,
		NOR THE REMAINDER OF A SINGLE CANNE
		LEFT BY A DRUNKEN PORTER, ALL NIGHT PALDE TOO.
	<Q value="FROTH." /> NOT THE DROPPING OF THE TAPPE FOR YOUR MORNINGS
	DRAUGHT, SIR,
	'TIS VERITIE I ASSURE YOU.
[...]
	<Q value="WELBORNE." /> THUS YOU DOGGEBOLT,
	AND THUS.      <SSD value="BEATES, AND KICKS HIM." />
	<Q value="TAPWELL." /> CRY OUT FOR HELPE.
[...]
	<Q value="FROTH." /> ASKE MERCIE.  <SSD value="ENTER ALLWORTH," />
[...]
	<Q value="ALWORTH." /> A STRANGE HUMOR.  <SSD value="EXEUNT." />
     <SN value="ACTUS PRIMI, SCENA SECUNDA." />
	<SSD value="ORDER. AMBLE. FURNACE. WATCHALL." />
	<Q value="ORDER." /> SET ALL THINGS RIGHT, OR AS MY NAME IS ORDER,
	AND BY THIS STAFFE OF OFFICE THAT COMMANDS YOU; [...]
	<Q value="AMBLE." /> YOU ARE MERRIE
	GOOD MASTER STEWARD.
	<Q value="FURNACE." /> LET HIM;ILE BEE ANGRY.
</body>
-----

It's not really the best text, but if I can solve the problems for
this one, there are of course better ones.
What I'm wondering is what sort of strategy one could use in
XSLT (including XSLT 2.0) to change these to filled elements, e.g.:
  <Q value="WELBORNE." > NO BOUZE? NOR NO TOBACCO?</Q>
My worry is that, doing it step by step, one might run into a
problem of overlapping hierarchies, i.e. if a <Q> (a speaker) stretched
across an <SN> (a scene).  My thought was to process this in steps
from the smallest granularity out to the largest.
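
One shape this could take in XSLT 2.0, as a rough and untested sketch
only: nest two xsl:for-each-group passes using group-starting-with.
This assumes the elements are named as in the sample, carry no
namespace, and are all direct children of <body>. Note that it works
from the outside in (scenes first, then speeches inside each scene)
rather than from the smallest unit outwards. Because the inner pass only
ever sees the nodes of one scene, a <Q> cannot stretch across an <SN>.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="body">
    <body>
      <!-- Outer pass: each SN milestone starts a new scene group. -->
      <xsl:for-each-group select="node()" group-starting-with="SN">
        <xsl:choose>
          <xsl:when test="self::SN">
            <SN value="{@value}">
              <!-- Inner pass, confined to this scene: each Q starts a speech. -->
              <xsl:for-each-group select="current-group()[position() gt 1]"
                                  group-starting-with="Q">
                <xsl:choose>
                  <xsl:when test="self::Q">
                    <Q value="{@value}">
                      <!-- Everything up to the next Q, text and SSD alike. -->
                      <xsl:copy-of select="current-group()[position() gt 1]"/>
                    </Q>
                  </xsl:when>
                  <xsl:otherwise>
                    <!-- Material before the first Q of the scene. -->
                    <xsl:copy-of select="current-group()"/>
                  </xsl:otherwise>
                </xsl:choose>
              </xsl:for-each-group>
            </SN>
          </xsl:when>
          <xsl:otherwise>
            <!-- Material before the first SN, e.g. TITLE. -->
            <xsl:copy-of select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </body>
  </xsl:template>

</xsl:stylesheet>
```

As written, every <SSD> that follows a <Q> within a scene ends up inside
that speech, including a trailing one such as EXEUNT, so deciding which
stage directions really belong to a speech would need a further step.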

Of course there is an additional problem with lines like:
  <Q value="FROTH." /> ASKE MERCIE.  <SSD value="ENTER ALLWORTH," />
Sometimes a stage direction (<SSD>) is in the middle of a speech,
and sometimes it isn't.  Is checking manually the only way to tell? Or
could one look to see whether there is a text() node following it? I.e.
should that be:
  <Q value="FROTH." > ASKE MERCIE. </Q> <SSD value="ENTER ALLWORTH," />

  or

  <Q value="FROTH." > ASKE MERCIE.  <SSD value="ENTER ALLWORTH," />
 [...] </Q>

(FROTH. does have more lines here.) So I guess what I'm looking
to end up with is something like:

<body>
<TITLE value="A NEW WAY TO PAY OLD DEBTS" />
 <SN value="ACTUS PRIMUS, SCENA PRIMA:">
	<SSD value="WELBORNE. TAPWELL. FROTH." />
	<Q value="WELBORNE."> NO BOUZE? NOR NO TOBACCO?</Q>
	<Q value="TAPWELL. "> NOT A SUCKE SIR,
	NOR THE REMAINDER OF A SINGLE CANNE
	LEFT BY A DRUNKEN PORTER, ALL NIGHT PALDE TOO.</Q>
	<Q value="FROTH."> NOT THE DROPPING OF THE TAPPE FOR YOUR
MORNINGS DRAUGHT, SIR,
	'TIS VERITIE I ASSURE YOU.
	[...]</Q>
	<Q value="WELBORNE."> THUS YOU DOGGEBOLT,
	AND THUS. </Q>  <SSD value="BEATES, AND KICKS HIM." />
	<Q value="TAPWELL." > CRY OUT FOR HELPE.
	[...]</Q>
	<Q value="FROTH." > ASKE MERCIE.  <SSD value="ENTER ALLWORTH," />
	[...]</Q>
	<Q value="ALWORTH."> A STRANGE HUMOR.</Q> <SSD value="EXEUNT." />
 </SN>
 <SN value="ACTUS PRIMI, SCENA SECUNDA.">
	<SSD value="ORDER. AMBLE. FURNACE. WATCHALL." />
	<Q value="ORDER."> SET ALL THINGS RIGHT, OR AS MY NAME IS ORDER,
	AND BY THIS STAFFE OF OFFICE THAT COMMANDS YOU; [...]</Q>
	<Q value="AMBLE."> YOU ARE MERRIE
	GOOD MASTER STEWARD.</Q>
	<Q value="FURNACE." > LET HIM;ILE BEE ANGRY.
	[...]</Q>
 </SN>
</body>
-----
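
On the stage-direction question, the "is there text() following it?"
test could perhaps be written as below. This is a sketch run against the
original flat input, not a full solution. It copies everything unchanged
and adds a purely hypothetical placement attribute (not part of any real
markup scheme) to each <SSD>: "inside" if the next sibling that is not
whitespace-only is text and the nearest preceding <Q> or <SN> is a <Q>,
otherwise "outside".

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity copy for everything that is not an SSD. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="SSD">
    <!-- First following sibling that is not a whitespace-only text node. -->
    <xsl:variable name="next"
        select="following-sibling::node()
                  [not(self::text()[not(normalize-space())])][1]"/>
    <!-- Nearest preceding milestone: are we after a speaker or a scene? -->
    <xsl:variable name="prev"
        select="preceding-sibling::*[self::Q or self::SN][1]"/>
    <SSD value="{@value}">
      <!-- 'placement' is a hypothetical attribute, for illustration only. -->
      <xsl:attribute name="placement">
        <xsl:choose>
          <xsl:when test="$next[self::text()] and $prev[self::Q]">inside</xsl:when>
          <xsl:otherwise>outside</xsl:otherwise>
        </xsl:choose>
      </xsl:attribute>
    </SSD>
  </xsl:template>

</xsl:stylesheet>
```

On the sample this rule would put ENTER ALLWORTH inside FROTH's speech
and leave BEATES, AND KICKS HIM and EXEUNT outside, which agrees with
the target above. It is still only a heuristic, since an <SSD> followed
by stray text is not necessarily mid-speech.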

I haven't even considered adding linebreak milestones yet.

Any suggestions?

-James

---
Dr James Cummings, Oxford Text Archive, James.Cummings@xxxxxxxxxxxxxx

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

