Subject: Re: HTML to DocBook translation
From: Marcus Carr <mrc@xxxxxxxxxxxxxx>
Date: Fri, 06 Feb 1998 09:12:35 +1100

Thomas G. Lockhart wrote:

> > I want to translate some HTML documents into the DocBook format. I've
> > started with the following dsl:
> > Has someone a working (it must not be perfect) solution?
>
> I'm embarrassed to say that I started my one-time translation by writing a little perl script to do brute force pattern
> substitution, then hand-edited from there.
>
> Let me know if you come up with your much more elegant solution, or if you want me to send my pitifully inadequate one :)

I'll preface this by pointing out that I'm not intimately familiar with DSSSL, so I make no claims about whether this approach is more or less appropriate than any other, just that we used it and the circumstances and scope of the project seem similar.

We converted a large amount of legislation from HTML to a proprietary DTD recently. We used OmniMark to validate the HTML, used about four incremental stages (also OmniMark) to get almost to where we wanted to be, then finished it by hand.

These types of conversions are never pretty, but the option of using pattern matching and rules based on element context minimises the hassles.

--
Regards

Marcus Carr                      email:  mrc@xxxxxxxxxxxxxx
_______________________________________________________________
Allette Systems (Australia)      email:  info@xxxxxxxxxxxxxx
Level 10, 91 York Street         www:    http://www.allette.com.au
Sydney 2000 NSW Australia        phone:  +61 2 9262 4777
                                 fax:    +61 2 9262 4774
_______________________________________________________________

DSSSList info and archive: http://www.mulberrytech.com/dsssl/dssslist
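
For readers who want a concrete picture of the staged approach described in the message (a pattern-matching pass followed by rules that depend on element context), here is a minimal sketch in Python. It is not the OmniMark code used in the project, which is not shown in the post. The tag mapping, the class name HtmlToDocBook, and the functions strip_font_tags and run_pipeline are all hypothetical, and the sketch assumes the input has already been validated so that every start tag has a matching end tag.

```python
import re
from html.parser import HTMLParser
from xml.sax.saxutils import escape

# Hypothetical one-to-one mappings; a real conversion needs many more rules.
SIMPLE = {"p": "para", "ul": "itemizedlist", "li": "listitem",
          "pre": "programlisting"}


def strip_font_tags(html):
    """Pattern-matching stage: drop presentational <font> tags."""
    return re.sub(r"</?font\b[^>]*>", "", html, flags=re.IGNORECASE)


class HtmlToDocBook(HTMLParser):
    """Context stage: choose the output element from the open-element stack."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.stack = []    # currently open HTML elements (the context)
        self.emitted = []  # DocBook name written for each, or None if dropped
        self.out = []

    def rule(self, tag):
        # Context-dependent rule: <code> inside <pre> is redundant and is
        # dropped; anywhere else it becomes <literal>.
        if tag == "code":
            return None if "pre" in self.stack else "literal"
        return SIMPLE.get(tag)  # unknown tags are dropped, their text is kept

    def handle_starttag(self, tag, attrs):
        target = self.rule(tag)
        self.stack.append(tag)
        self.emitted.append(target)
        if target:
            self.out.append(f"<{target}>")

    def handle_endtag(self, tag):
        # Assumes validated input; unmatched end tags are ignored.
        if not self.stack or self.stack[-1] != tag:
            return
        self.stack.pop()
        target = self.emitted.pop()
        if target:
            self.out.append(f"</{target}>")

    def handle_data(self, data):
        self.out.append(escape(data))


def convert(html):
    parser = HtmlToDocBook()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


def run_pipeline(html):
    """Apply the stages in order; hand-editing would follow the last one."""
    for stage in (strip_font_tags, convert):
        html = stage(html)
    return html
```

Keeping each stage as a separate function means the output of any stage can be inspected or re-validated before the next one runs, which is the main appeal of doing the conversion incrementally.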