Subject: [xsl] csv to xml converter bug
From: "Andrew Welch" <andrew.j.welch@xxxxxxxxx>
Date: Tue, 10 Jul 2007 11:29:18 +0100
The csv-to-xml solution here:

http://andrewjwelch.com/code/xslt/csv/csv-to-xml.html

has a bug when a quoted value sits among empty values. It can produce output like this:
<token/>
<token/>
<token/>
<token>"foo,bar"</token>
<token/>
<token/>
<token>x</token>
<token/>
<token/>
The x should be at position 5 but is at position 7, because the commas either side of the quoted value aren't being included with the value itself and are generating extra tokens in the xsl:non-matching-substring block.
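The miscount can be reproduced with a cut-down sketch. The input line and the regex below are assumptions for illustration (the regex in the linked solution may well differ), but the shape is the same: quoted values are matched, and everything else is split on commas in the non-matching block.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output indent="yes"/>

  <!-- Hypothetical input line, chosen to reproduce the symptom -->
  <xsl:variable name="line" select="',,&quot;foo,bar&quot;,,x,,'"/>

  <xsl:template name="main">
    <row>
      <!-- Simplified regex (an assumption): matches a quoted value only -->
      <xsl:analyze-string select="$line" regex='"[^"]*"'>
        <xsl:matching-substring>
          <token><xsl:value-of select="."/></token>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
          <!-- The commas next to the quoted value are split here too,
               so each one contributes an extra empty token -->
          <xsl:for-each select="tokenize(., ',')">
            <token><xsl:value-of select="."/></token>
          </xsl:for-each>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </row>
  </xsl:template>

</xsl:stylesheet>
```

Run with main as the initial template, this should give three empty tokens, the quoted value, two empty tokens, x, then two more empty tokens: nine in all, with x seventh rather than fifth.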
I've tried various ways to modify the solution to fix the bug, but always ran into problems with other strings, such as:
If you include leading or trailing commas with the quoted values then the empty value at position 2 here gets consumed. Maybe a better regex would help here, but I couldn't write one... (Or perhaps if the non-matching-substring block had access to some information about the matching-substring block...)
I had a dig around the net and found a regex[1] that could be sufficient to just use with tokenize, but it causes the error:
FORX0002: Error at character 2 in regular expression ",(?=([^\"]*\"[^\"]*\")*(?![^\"...": expected ())
<xsl:variable name="regex" as="xs:string">,(?=([^\"]*\"[^\"]*\")*(?![^\"]*\"))</xsl:variable>

<xsl:function name="fn:getTokens" as="xs:string+">
  <xsl:param name="str" as="xs:string"/>
  <xsl:sequence select='for $t in tokenize($str, $regex) return replace($t, "^,""|"",$|("")""", "$1")'/>
</xsl:function>
It's an unusual-looking regex (to my novice eye) - any explanation as to what's going on would be great.
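For what it's worth, the (?= and (?! groups are lookaheads, which the XPath 2.0 regex syntax does not include, and that would account for the error right at the start of the expression. One lookahead-free direction, as an untested sketch only: append a comma to the line so that every field, quoted or not, ends with one, then match field plus comma in a single regex. The csv prefix, its namespace URI, the function name csv:fields and the sample line are all hypothetical, and malformed lines (an unbalanced quote, say) are not handled.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:csv="urn:example:csv"
    exclude-result-prefixes="xs csv">

  <xsl:output indent="yes"/>

  <!-- A single double-quote character, kept in a variable to avoid escaping -->
  <xsl:variable name="q" as="xs:string" select="'&quot;'"/>

  <!-- Hypothetical helper: returns one string per CSV field -->
  <xsl:function name="csv:fields" as="xs:string*">
    <xsl:param name="str" as="xs:string"/>
    <!-- Group 1 is the field (quoted or unquoted). The trailing comma is
         part of every match, so the regex never matches a zero-length string -->
    <xsl:analyze-string select="concat($str, ',')"
        regex='("([^"]|"")*"|[^,"]*),'>
      <xsl:matching-substring>
        <!-- Strip the enclosing quotes, then collapse doubled quotes -->
        <xsl:variable name="bare"
            select="replace(regex-group(1), concat('^', $q, '|', $q, '$'), '')"/>
        <xsl:sequence select="replace($bare, concat($q, $q), $q)"/>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:function>

  <xsl:template name="main">
    <row>
      <!-- Hypothetical sample line -->
      <xsl:for-each select="csv:fields(',,&quot;foo,bar&quot;,,x,,')">
        <token><xsl:value-of select="."/></token>
      </xsl:for-each>
    </row>
  </xsl:template>

</xsl:stylesheet>
```

For the sample line this should yield seven tokens with x in fifth place. Because the comma is consumed together with its field, nothing is left over for a non-matching block to turn into extra tokens, and empty fields are matched like any other.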
thanks
andrew

[1] http://weblogs.asp.net/prieck/archive/2004/01/16/59457.aspx

--
http://andrewjwelch.com