RE: [xsl] Extracting text between nodes

Subject: RE: [xsl] Extracting text between nodes
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Wed, 13 Feb 2008 21:38:18 -0000
> I have html code that looks like this;
> 
> <td><font><b>Title</b><br/>Some Text<i> and some more italic 
> text</i><b> maybe even some more</b><a 
> href="http://whatever.com">And an anchor</a></font></td>
> 
> I want to extract only the text between the first <br/> tag 
> and the last <a> anchor tag, without the anchor's text but 
> including all text in child elements such as the <i> and <b> 
> - "Some Text and some more italic text maybe even some more"
> 
> How can I do it with xsl?

In XSLT 2.0, assuming <font> is the context node, I think the most direct
translation of your requirement is:

.//text()[. >> current()/br[1] and . << current()/a[last()]]

Note that this gives you a sequence of text nodes; if you want to turn it
into a single string, use string-join(). Remember that inside a stylesheet
attribute such as select, each < must be escaped as &lt; (so << is written
&lt;&lt;).
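
As a rough sketch of how that might sit in a stylesheet (the match pattern
td/font, the text output method and the empty separator are my assumptions
here, not something taken from your real data):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Assumption: the <font> of interest is a child of <td>, as in the sample -->
  <xsl:template match="td/font">
    <!-- &gt;&gt; is the ">>" (follows) operator, &lt;&lt; is "<<" (precedes).
         Only "<" has to be escaped in an attribute; ">" is escaped for symmetry.
         current() is the <font> element matched by this template. -->
    <xsl:value-of select="string-join(
        .//text()[. &gt;&gt; current()/br[1] and . &lt;&lt; current()/a[last()]],
        '')"/>
  </xsl:template>

  <!-- Suppress any text that lies outside td/font -->
  <xsl:template match="text()"/>

</xsl:stylesheet>
```

The text inside the <a> is excluded because descendants of an element come
after it in document order, so they are not << the anchor. If a <font> has no
<br/> or no <a> child, the comparison against an empty sequence yields an
empty result and the predicate is false, so nothing is output for that cell.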

But I do wonder how general-purpose this is; are there lots of instances
like this, and do they vary much?

Michael Kay
http://www.saxonica.com/ 
