Subject: Re: [xsl] library for parsing RTF
From: Kevin Grover <kevin@xxxxxxxxxxxxxxx>
Date: Tue, 29 Jun 2010 12:51:01 -0700
On Sun, Jun 27, 2010 at 16:07, Andriy Gerasika
<andriy.gerasika@xxxxxxxxx> wrote:
>>
>> For a language as rich as RTF, regular expressions are not going to get
>> you all that far: they are probably only suitable for writing the
>> lexical analyzer (or tokenizer).
>>
>
> RTF syntax is not that complex for requiring BNF parser.
>
> assuming the following RTF:
> {\rtf1\ansi{\fonttbl\f0\fswiss Helvetica;}\f0\pard
> This is some {\b bold} text.\par
> }
>
> it can be easily converted w/ regular expressions to something like:
>
> <g><rtf>1</rtf><ansi/><g><fonttbl/><f>0</f><fswiss/>Helvetica<sc/></g><f>0</f><pard/>
> This is some <g><b/>bold</g> text.<par/>
> </g>
>
> where "g" equals to RTF's curly braces(group) and "sc" to semicolon in RTF.
>
> not sure if BNF parser will produce something better...
>
This seems about as useful as a regex C compiler that compiles
main() { printf ("Hello world!\n"); }
and _nothing_ else.
Just because you can write a regex for _one instance_ of a grammar
does not mean that you can (easily) use regexes to parse a generic
format. RTF is generic: there are MANY valid ways to say similar
things in RTF.
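
To make the split concrete, here is a minimal sketch (in Python, purely for illustration) of the division of labor mentioned at the top of the thread: a regex does the tokenizing, and a small stack-based parser handles the arbitrarily nested groups. The names TOKEN, parse and SAMPLE are made up for this example, and the token set is a deliberate simplification. It knows nothing about \'hh hex escapes, \binN binary data, \uN Unicode characters, or destination-specific rules, so it is nowhere near a real RTF reader.

```python
import re

# Simplified token pattern (an assumption for this sketch, not the RTF spec):
# group braces, control words with an optional numeric parameter and one
# optional delimiting space, control symbols, and runs of plain text.
TOKEN = re.compile(r"""
    (?P<open>\{)
  | (?P<close>\})
  | \\(?P<word>[a-zA-Z]+)(?P<param>-?\d+)?[ ]?
  | \\(?P<symbol>[^a-zA-Z])
  | (?P<text>[^\\{}]+)
""", re.VERBOSE)


def parse(rtf):
    """Return a nested list: sub-lists are groups, tuples are tokens."""
    root = []
    stack = [root]  # stack[-1] is the group currently being filled
    for m in TOKEN.finditer(rtf):
        if m.group("open"):
            group = []
            stack[-1].append(group)
            stack.append(group)
        elif m.group("close"):
            if len(stack) == 1:
                raise ValueError("unbalanced '}'")
            stack.pop()
        elif m.group("word"):
            param = m.group("param")
            value = int(param) if param is not None else None
            stack[-1].append(("ctrl", m.group("word"), value))
        elif m.group("symbol"):
            stack[-1].append(("sym", m.group("symbol")))
        elif m.group("text"):
            stack[-1].append(("text", m.group("text")))
    if len(stack) != 1:
        raise ValueError("unclosed '{'")
    return root


# The RTF fragment quoted above.
SAMPLE = (
    "{\\rtf1\\ansi{\\fonttbl\\f0\\fswiss Helvetica;}\\f0\\pard\n"
    "This is some {\\b bold} text.\\par\n"
    "}"
)

tree = parse(SAMPLE)
```

Calling parse(SAMPLE) yields one outer group containing the font table and the bold run as nested lists. The regex only decides where one token ends and the next begins; matching braces to any depth is done by the stack, which is the part a single regex substitution cannot do in general.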