Subject: RE: [xsl] tokenize() for text wrap
From: "Richard Lewis" <richardlewis@xxxxxxxxxxxxxx>
Date: Thu, 10 Feb 2005 16:39:58 +0000

On Thu, 10 Feb 2005 16:08:34 -0000, "Michael Kay" <mike@xxxxxxxxxxxx>
said:
> The regex in tokenize() is a regular expression that the separator must
> match. There's no way of constraining the tokens, only the separator.
>
> .{30,} matches any string of 30 characters or more, but .{30,}? also
> matches a zero-length string. I'm not sure what this would achieve even
> if it worked!
>
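[A small sketch of the point above, that the tokenize() regex describes the separator and not the tokens. The input string and the shortened count of 3 are made up for illustration; the template name "main" and the <token> element are hypothetical.]

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical entry template, for illustration only. -->
  <xsl:template name="main">
    <!-- The regex consumes "aaa " and then "bbb " as separators, so the
         tokens before and between them are zero-length strings. Only the
         tail "ccc", which has no trailing whitespace, survives as text. -->
    <xsl:for-each select="tokenize('aaa bbb ccc', '.{3,}?\s+')">
      <token><xsl:value-of select="."/></token>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

This should yield two empty <token> elements followed by one containing "ccc", which is the same shape as the output described in the original message: the right number of tspan elements, all empty except the last.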
If you test this with sed you'll need a regex like this:

  echo "string..." | sed "s/\(.\{,30\}\) /\1\n/g"

But if you check the weird regex syntax for XPath:
http://www.w3.org/TR/xpath-functions/#regex-syntax
it says "X{n,}? matches X, at least n times". Bizarre, isn't it?

> I think you can use xsl:analyze-string for this. Use a regex that matches
> the required token together with the following separator; treat these as
> two subgroups by parenthesizing the regex, and in the
> xsl:matching-substring child, pick up the token value as regex-group(1).
>
OK, I've tried this but I can't work out the right regex. I've tried:
(.{30,}?\s+)(\s+)
(.{30,}\s+)(\s+)
.{30,}\s+
.{30,}?\s+
and they all produce no matches.
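
[One possible shape for the xsl:analyze-string approach suggested above, as a sketch rather than a tested fix. The template name "wrap-text", the "width" parameter, the SVG default namespace and the normalize-space() call are all assumptions. The 'dy' attribute from the original is left out because its value is elided there. Note that the regex attribute is an attribute value template, so a literal { or } written directly in it has to be doubled; building the regex in a variable avoids that and also lets the width vary.]

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/2000/svg">

  <!-- Hypothetical template name and parameters, for illustration only. -->
  <xsl:template name="wrap-text">
    <xsl:param name="text"/>
    <xsl:param name="width" select="30"/>

    <!-- Group 1: at least $width characters, as few as possible.
         Group 2: the whitespace that follows (the separator).
         select is not an attribute value template, so the braces
         inside this string literal are taken literally. -->
    <xsl:variable name="regex"
        select="concat('(.{', $width, ',}?)(\s+)')"/>

    <!-- normalize-space() is an assumption: it collapses newlines and
         runs of whitespace into single spaces, so "." never has to
         match a newline. -->
    <xsl:analyze-string select="normalize-space($text)" regex="{$regex}">
      <xsl:matching-substring>
        <!-- One wrapped line; add the dy attribute here as needed. -->
        <tspan><xsl:value-of select="regex-group(1)"/></tspan>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <!-- The tail of the text, which has no whitespace after its
             first $width characters and so is not matched. -->
        <tspan><xsl:value-of select="."/></tspan>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

Because every match needs at least $width characters, the regex cannot match a zero-length string, which xsl:analyze-string does not allow.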
>
> > -----Original Message-----
> > From: Richard Lewis [mailto:richardlewis@xxxxxxxxxxxxxx]
> > Sent: 10 February 2005 15:50
> > To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > Subject: [xsl] tokenize() for text wrap
> >
> > Hello XSL list,
> >
> > I'm just trying an idea for text wrapping when transforming
> > XML to SVG.
> >
> > Inside a named template to which a long string of non-marked up text
> > ($text) and some other bits and pieces is passed I have the following
> > for-each:
> >
> > <xsl:for-each select="tokenize($text, '.{30,}?\s+')">
> > <tspan dy="{...}">
> > <xsl:value-of select="." />
> > </tspan>
> > </xsl:for-each>
> >
> > The idea is that it splits the $text string up at the first space after
> > 30 characters (this number is actually a variable in the real thing) and
> > then formats each token as a tspan element. The output, however,
> > contains the correct number of tspan elements (which also have the
> > correct 'dy' attributes) but all but the last one are empty. (I think
> > the text in the last one is correct, though.)
> >
> > (I'm working with Saxon 8.2B; I've tried different combinations of
> > $flags for the tokenize() function but the result is always the same)
> >
> > Any ideas what might be wrong with it?
> >
> > Cheers,
> > Richard