Subject: Re: [xsl] library for parsing RTF
From: Kevin Grover <kevin@xxxxxxxxxxxxxxx>
Date: Tue, 29 Jun 2010 12:51:01 -0700

On Sun, Jun 27, 2010 at 16:07, Andriy Gerasika <andriy.gerasika@xxxxxxxxx> wrote:
>>
>> For a language as rich as RTF, regular expressions are not going to get
>> you all that far: they are probably only suitable for writing the
>> lexical analyzer (or tokenizer).
>>
>
> RTF syntax is not that complex for requiring BNF parser.
>
> assuming the following RTF:
> {\rtf1\ansi{\fonttbl\f0\fswiss Helvetica;}\f0\pard
> This is some {\b bold} text.\par
> }
>
> it can be easily converted w/ regular expressions to something like:
> <g><rtf>1</rtf><ansi/><g><fonttbl/><f>0</f><fswiss/>Helvetica<sc/></g><f>0</f><pard/>
> This is some <g><b/>bold</g> text.<par/>
> </g>
>
> where "g" equals to RTF's curly braces(group) and "sc" to semicolon in RTF.
>
> not sure if BNF parser will produce something better...
>

This seems about as useful as a regex C compiler that compiles

main() { printf ("Hello world!\n"); }

and _nothing_ else. Just because you can make a regex for _one instance_ of a grammar does not mean that you can (easily) use regexes to parse a generic format. RTF is generic - there are MANY valid ways to say similar things in RTF.
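
For illustration only (this is not code from the thread): a rough Python sketch of the division of labour mentioned in the first quoted remark, where a regex does the tokenizing and a small explicit stack tracks the groups. The names TOKEN, tokenize and parse are made up for this example, and it covers only control words, control symbols, groups and plain text. Things like \'hh escapes, \bin data and Unicode handling are deliberately left out, so treat it as a starting point, not a working RTF reader.

```python
import re

# Hypothetical token pattern covering a small subset of RTF:
# group delimiters, control words (optional numeric parameter, optional
# single trailing space as delimiter), control symbols, and plain text.
TOKEN = re.compile(r"""
    (?P<open>\{)
  | (?P<close>\})
  | \\(?P<word>[a-zA-Z]+)(?P<param>-?\d+)?[ ]?
  | \\(?P<symbol>[^a-zA-Z])
  | (?P<text>[^\\{}]+)
""", re.VERBOSE)


def tokenize(rtf):
    """Yield one regex match per token; fail on input the pattern cannot cover."""
    pos = 0
    while pos < len(rtf):
        m = TOKEN.match(rtf, pos)
        if m is None:
            raise ValueError(f"unexpected input at offset {pos}")
        pos = m.end()
        yield m


def parse(rtf):
    """Return nested lists: every RTF group becomes a Python list."""
    root = []
    stack = [root]  # innermost open group is stack[-1]
    for m in tokenize(rtf):
        if m.group("open"):
            group = []
            stack[-1].append(group)
            stack.append(group)
        elif m.group("close"):
            if len(stack) == 1:
                raise ValueError("unbalanced '}'")
            stack.pop()
        elif m.group("word"):
            param = m.group("param")
            value = int(param) if param else None
            stack[-1].append(("ctrl", m.group("word"), value))
        elif m.group("symbol") is not None:
            # Escaped characters such as \{ and \} end up here, not as groups.
            stack[-1].append(("sym", m.group("symbol")))
        else:
            stack[-1].append(("text", m.group("text")))
    if len(stack) != 1:
        raise ValueError("unbalanced '{'")
    return root
```

With this, parse(r"{\b bold \{not a group\}}") gives one group holding the \b control word, the text, and the two escaped braces as control symbols, while parse("{") raises ValueError because the group is never closed.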