RE: [xsl] tokenize() for text wrap

Subject: RE: [xsl] tokenize() for text wrap
From: "Richard Lewis" <richardlewis@xxxxxxxxxxxxxx>
Date: Thu, 10 Feb 2005 17:13:14 +0000
On Thu, 10 Feb 2005 16:39:58 +0000, "Richard Lewis"
<richardlewis@xxxxxxxxxxxxxx> said:
> 
> On Thu, 10 Feb 2005 16:08:34 -0000, "Michael Kay" <mike@xxxxxxxxxxxx>
> said:
> > The regex in tokenize() is a regular expression that the separator must
> > match. There's no way of constraining the tokens, only the separator.
> > 
> > .{30,} matches any string of 30 characters or more, but .{30,}? also
> > matches a zero-length string. I'm not sure what this would achieve even
> > if it worked!
> > 
> If you test this with sed you'll need a regex like this:
> 
> echo "string..." | sed "s/\(.\{,30\}\) /\1\n/g"
> 
> But if you check the weird regex syntax for XPath:
> http://www.w3.org/TR/xpath-functions/#regex-syntax
> 
> it says "X{n,}? matches X, at least n times". Bizarre, isn't it?
> 
> > I think you can use xsl:analyze-string for this. Use a regex that matches
> > the required token together with the following separator; treat these as
> > two subgroups by parenthesizing the regex, and in the
> > xsl:matching-substring child, pick up the token value as regex-group(1).
> > 
> OK, I've tried this but I can't work out the right regex. I've tried:
> (.{30,}?\s+)(\s+)
> (.{30,}\s+)(\s+)
> .{30,}\s+
> .{30,}?\s+
> 
> and they all produce no matches.
> 
I've got this:

<xsl:variable name="regex">(.{30,}?)\s+</xsl:variable>

<xsl:analyze-string select="normalize-space($text)" regex="$regex"
flags="s">
    <xsl:matching-substring>
        <tspan dy="{...}">
            <xsl:value-of select="regex-group(1)" />
        </tspan>
    </xsl:matching-substring>
</xsl:analyze-string>

but there don't seem to be any matching substrings (or non-matching
substrings): I get no tspan elements in the result tree.
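
One thing worth checking, offered as a guess rather than a confirmed
diagnosis: the regex attribute of xsl:analyze-string is an attribute value
template, so regex="$regex" is read as the literal pattern $regex and not as
the value of the variable. That pattern cannot match anything in the text,
and with no xsl:non-matching-substring child nothing is output at all. The
sketch below writes regex="{$regex}" instead. The pattern (.{1,30})(\s+|$),
the sample text, the match="/" template and the x and dy values are my own
placeholders, chosen to mimic the sed wrap at 30 characters; none of them
come from the replies above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/2000/svg">

  <!-- Placeholder text; stands in for $text from the post. -->
  <xsl:param name="text"
      select="'The quick brown fox jumps over the lazy dog and then keeps on running.'"/>

  <!-- Assumed pattern: 1 to 30 characters, then whitespace or end of string.
       Held in a variable so the braces need no doubling inside the AVT. -->
  <xsl:variable name="regex">(.{1,30})(\s+|$)</xsl:variable>

  <xsl:template match="/">
    <text>
      <!-- The braces make this an AVT reference to the variable's value. -->
      <xsl:analyze-string select="normalize-space($text)" regex="{$regex}">
        <xsl:matching-substring>
          <!-- x and dy are illustrative values only. -->
          <tspan x="0" dy="1.2em">
            <xsl:value-of select="regex-group(1)"/>
          </tspan>
        </xsl:matching-substring>
        <!-- Emits anything the pattern skips, such as part of a word
             longer than 30 characters. -->
        <xsl:non-matching-substring>
          <tspan x="0" dy="1.2em">
            <xsl:value-of select="."/>
          </tspan>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </text>
  </xsl:template>

</xsl:stylesheet>
```

Run with an XSLT 2.0 processor against any input document, this should give
one tspan per wrapped line. Keeping the xsl:non-matching-substring branch
also makes it easier to see whether the pattern is matching at all.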

Richard.
