Subject: RE: [xsl] Converting CSV to XML without hardcoding schema details in xsl
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Sat, 24 Jun 2006 08:41:06 +0100
> >There's a lot of potential backtracking here: it might be better to
> >replace each "(.*)," with "[^,]*" or with "(.*?),".
>
> [Pantvaidya, Vishwajit] Does "[^,]*" work the same as "(.*),"
> - I understand that ^ is start of line metachar. How does the
> former match the alphabet chars?

No, within square brackets, ^ means "not". So [^,]* matches a sequence of any characters except comma.

The problem with your expression is that (.*) matches as many characters as it can. Then it sees ",", so it backtracks to find the last comma. Then it sees the next (.*), and has to backtrack again; and so on.

> >My own instinct would be to use something like:
> >
> >([^"]*,|"[^"]*",)*
>
> [Pantvaidya, Vishwajit] Oxygen would not accept this regex as
> "it matches a zero-length string".

Perhaps then you want to change the final "*" to a "+".

> Anyway, how does this regex work - it does not seem to have
> anything that matches the alphabet chars.

See above: [^"] matches everything except quotes.

> And does the ,|" match comma or double quotes - because
> actually some field will have both.

The | separates two alternatives. The first alternative, [^"]* followed by a comma, matches any field that ends with a comma and doesn't contain a quotation mark. The second alternative, "[^"]*" followed by a comma, matches any field that begins and ends with quotes (followed by a comma), and might contain a comma between the quotes.

It's very hard to find out what the exact rules for CSV files used by a particular product are: for example, how it represents a field that contains quotation marks as well as commas. (That's one of the great advantages of XML: you can find a specification!) If you know the exact rules for your particular flavour of CSV, you can adapt the regex to match (well, you can if you study a bit more about regular expressions).

> Maybe this conversion is easier done with some Java code.

I'm sure it can be done using regular expressions, but it looks as if you need to do some learning in this area.
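As a rough illustration of the points above, here is a small sketch using Python's re module rather than the XSLT 2.0 regex functions (the constructs used here should behave the same way in both). The names LINE, FIELD and split_fields are made up for this example. FIELD is an assumed variant, not the pattern given above: it also excludes commas from unquoted fields so that each match is exactly one field, and it makes no attempt to handle quotation marks inside a quoted field.

```python
import re

line = 'a,b,c'

# Greedy (.*) grabs everything, then backtracks to the LAST comma.
greedy = re.match(r'(.*),', line).group(1)

# Reluctant (.*?) and the negated class [^,]* both stop at the FIRST comma.
lazy = re.match(r'(.*?),', line).group(1)
negated = re.match(r'[^,]*', line).group(0)

# The suggested pattern with the final "*" changed to "+",
# so that it can no longer match a zero-length string.
LINE = re.compile(r'([^"]*,|"[^"]*",)+')

# Assumed variant for pulling out individual fields: an unquoted field
# may contain neither quotes nor commas; a quoted field may contain commas.
FIELD = re.compile(r'("[^"]*"|[^",]*),')


def split_fields(record):
    """Split one CSV record into fields (hypothetical helper)."""
    # Append a comma so that every field, including the last,
    # is comma-terminated, as the patterns above expect.
    return [f.strip('"') for f in FIELD.findall(record + ',')]


fields = split_fields('a,"b,c",d')
```

With these definitions, greedy is 'a,b' while lazy and negated are both 'a', and fields is ['a', 'b,c', 'd'].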
Michael Kay http://www.saxonica.com/