Re: [xsl] library for parsing RTF

Subject: Re: [xsl] library for parsing RTF
From: Andriy Gerasika <andriy.gerasika@xxxxxxxxx>
Date: Mon, 28 Jun 2010 02:07:26 +0300

> For a language as rich as RTF, regular expressions are not going to get
> you all that far: they are probably only suitable for writing the
> lexical analyzer (or tokenizer).


RTF syntax is not complex enough to require a BNF parser.


Assuming the following RTF:
{\rtf1\ansi{\fonttbl\f0\fswiss Helvetica;}\f0\pard
This is some {\b bold} text.\par
}

it can easily be converted with regular expressions to something like:
<g><rtf>1</rtf><ansi/><g><fonttbl/><f>0</f><fswiss/>Helvetica<sc/></g><f>0</f><pard/>
This is some <g><b/>bold</g> text.<par/>
</g>

where "g" corresponds to RTF's curly braces (a group) and "sc" to a semicolon in RTF.

Not sure whether a BNF parser would produce anything better...
