Subject: Re: [xsl] java Regex call
From: Jeni Tennison <jeni@xxxxxxxxxxxxxxxx>
Date: Thu, 10 Jul 2003 13:51:53 +0100

Dave,

>> So the string is broken into two matching substrings ('ABC_PARA'
>> and '_PARA') with a non-matching substring of '_' in between. And
>> you get two lots of output because you're generating one for each
>> of the matching substrings.
>
> Ah! Getting very close to stateful here Jeni?

In what way?

> I hadn't figured the 'iteration' idea. I'd have expected (without
> thinking too closely) all the matches to have come up in the
> <xsl:matching-substring> section, then all the non matching in the
> following <xsl:non-matching-substring> section.

I can't think of many situations in which that would be what you
want. A typical example of when you might want to use
<xsl:analyze-string> is to replace all the newline characters in a
string with <br> elements. You can do this with:

  <xsl:analyze-string select="$string" regex="\n">
    <xsl:matching-substring>
      <br />
    </xsl:matching-substring>
    <xsl:non-matching-substring>
      <xsl:value-of select="." />
    </xsl:non-matching-substring>
  </xsl:analyze-string>

If you had "all matching substrings, then all non-matching
substrings" then you'd get all the <br> elements and then all the
text, which wouldn't be much good.

> http://www.w3.org/TR/xslt20/#element-analyze-string
> only hints at this 'ordering',

The spec says:

  The input string is thus partitioned into a sequence of substrings,
  some of which match the regular expression, others which do not
  match it. Each substring will contain at least one character. This
  sequence of substrings is processed using the
  xsl:matching-substring and xsl:non-matching-substring child
  instructions. A matching substring is processed using the
  xsl:matching-substring element, a non-matching substring using the
  xsl:non-matching-substring element.

As elsewhere in XPath 2.0, the term "sequence" means an ordered list
of items.

> While the xsl:matching-substring instruction is active, ... the
> regex-group parameter is sequential whilst active I guess.

I don't understand what you mean by "sequential".

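As a minimal sketch of that ordering, the stylesheet below runs the same kind of newline split on a literal string and records position() for each substring. The literal input, the "main" template name and the <lines>, <line> and <break> element names are made up for illustration, and it assumes an XSLT 2.0 processor that can be started at a named template.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical entry point: invoke with "main" as the initial template. -->
  <xsl:template name="main">
    <lines>
      <!-- &#10; is a newline, so the input is three words on three lines. -->
      <xsl:analyze-string select="'one&#10;two&#10;three'" regex="\n">
        <xsl:matching-substring>
          <!-- Instantiated once per newline, at the point where it occurs. -->
          <break pos="{position()}"/>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
          <!-- Instantiated once per run of text between newlines. -->
          <line pos="{position()}"><xsl:value-of select="."/></line>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </lines>
  </xsl:template>

</xsl:stylesheet>
```

The substrings are processed in the order they occur in the string, so the <line> elements should come out at positions 1, 3 and 5 with the <break> elements at positions 2 and 4 between them, rather than all of one kind followed by all of the other.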
> I might reasonably expect that regexp-group(4) would get hold of the
> fourth match string?

That's not what it does. regex-group() is used to get the value of
the substring matched by a subexpression in the regular expression.
For example, if you had a regular expression:

  (\d{4})-(\d{2})-(\d{2})

and matched the string "2003-07-10" then regex-group(1) would give
you "2003", regex-group(2) would give you "07" and regex-group(3)
would give you "10".

There's no easy way to get hold of the "fourth matching substring".
If you use the position() function within <xsl:matching-substring> or
<xsl:non-matching-substring> then you'll get the position of the
matching/non-matching substring amongst all the other (matching and
non-matching) substrings.

> Using:
>
> <xsl:variable name="res" as="item()*">
>   <xsl:analyze-string select="$input" regex="{$regex}">
>     <xsl:matching-substring>
>       <GROUP1><xsl:value-of select="regex-group(1)" /></GROUP1>
>       <GROUP2><xsl:value-of select="regex-group(2)" /></GROUP2>
>       <GROUP3><xsl:value-of select="regex-group(3)" /></GROUP3>
>     </xsl:matching-substring>
>     <xsl:non-matching-substring>
>       <mismatch><xsl:value-of select="."/></mismatch>
>     </xsl:non-matching-substring>
>   </xsl:analyze-string>
> </xsl:variable>
>
> <xsl:copy-of select="$res"/>
>
> gives:
>
> <GROUP1>ABC_PARA</GROUP1>
> <GROUP2>ABC</GROUP2>
> <GROUP3/>
> <mismatch>_</mismatch>
> <GROUP1>_PARA</GROUP1>
> <GROUP2/>
> <GROUP3/>
>
> Which is close to usable, though very messy.

I'm not sure what you're trying to get, so can't advise how to get it
cleanly.

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/

 XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
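
To make the difference between regex-group() and position() concrete, here is a small sketch using the date pattern from the message. The input string, the "main" template name and the <dates> and <date> names are invented for the example; it assumes an XSLT 2.0 processor. Because the regex attribute is an attribute value template, the curly braces in the quantifiers are doubled.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical entry point: invoke with "main" as the initial template. -->
  <xsl:template name="main">
    <dates>
      <!-- {{4}} and {{2}} are the escaped forms of {4} and {2}. -->
      <xsl:analyze-string select="'From 2003-07-10 to 2003-08-01'"
                          regex="(\d{{4}})-(\d{{2}})-(\d{{2}})">
        <xsl:matching-substring>
          <!-- regex-group(n) is the n-th parenthesised subexpression of the
               current match; position() counts matching and non-matching
               substrings together. -->
          <date pos="{position()}"
                year="{regex-group(1)}"
                month="{regex-group(2)}"
                day="{regex-group(3)}"/>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </dates>
  </xsl:template>

</xsl:stylesheet>
```

The string splits into "From ", "2003-07-10", " to " and "2003-08-01", so the two <date> elements should report positions 2 and 4, each with its own year, month and day taken from the groups of that one match. The non-matching substrings are discarded here because there is no <xsl:non-matching-substring>, but they still count towards position().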