RE: [xsl] Another tokenize() question

From: "Michael Kay" <mhk@xxxxxxxxx>
Date: Tue, 10 Aug 2004 19:08:00 +0100
> Ok.  This *basically* works, but with a line like:
> <l>Why ha<supplied>l</supplied>dest &thorn;u were agaynes me</l>
> it turns it into:
> <l><w>Why</w> <w>ha</w><supplied>l</supplied><w>dest</w> 
> <w>&thorn;u</w>
> <w>were</w> <w>agaynes</w> <w>me</w></l>
> or if I change it to l//text()
> <l><w>Why</w> <w>ha</w><supplied><w>l</w></supplied><w>dest</w>
> <w>&thorn;u</w> <w>were</w> <w>agaynes</w> <w>me</w></l>
> When really:
> <l><w>Why</w> <w>ha<supplied>l</supplied>dest</w> <w>&thorn;u</w>
> <w>were</w> <w>agaynes</w> <w>me</w></l>
> is what is wanted.

Presumably you have confidence that if an element starts in the middle of a
word, then it ends within the same word? Otherwise you have an interleaving
problem.

You could start by replacing all the spaces with <sp/> elements, and then
process the structure along the lines:

<xsl:template match="*">
  <xsl:for-each-group select="child::node()" group-starting-with="sp">
    <xsl:choose>
      <xsl:when test="self::sp">
        <w><xsl:apply-templates select="current-group() except ."/></w>
      </xsl:when>
      <xsl:otherwise><xsl:apply-templates select="current-group()"/></xsl:otherwise>
    </xsl:choose>
  </xsl:for-each-group>
</xsl:template>
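
The first step, replacing the spaces with <sp/> elements, could be done in a
separate pass. The following is only a sketch, assuming XSLT 2.0, that the
lines are held in <l> elements as in your example, and that any run of
whitespace counts as a single word break; the identity template and the
l//text() match pattern are my assumptions about your setup, not something
taken from your stylesheet.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything through unchanged by default. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Inside <l>, turn each run of whitespace into an <sp/> marker
       and copy the remaining text as it is. -->
  <xsl:template match="l//text()">
    <xsl:analyze-string select="." regex="\s+">
      <xsl:matching-substring><sp/></xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

With this, the start of your example line should come out as
<l>Why<sp/>ha<supplied>l</supplied>dest<sp/>... so that the <supplied>
element sits between two <sp/> markers together with the rest of its word.
The grouping step can then run over that result, either as a second
stylesheet or in a separate mode.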

Michael Kay
