Re: HTML->DocBook? (Re: HTML -> RTF)

Subject: Re: HTML->DocBook? (Re: HTML -> RTF)
From: Oisin McGuinness <oisin@xxxxxxxx>
Date: Mon, 22 May 2000 17:16:41 -0400
Gary Lawrence Murphy <garym@xxxxxxxxxx> asked for an HTML to
DocBook stylesheet...

A posting to comp.text.sgml on Jun 11 1999
 From: cbbrowne@xxxxxxxxxxxx (Christopher Browne)
 Newsgroups: comp.text.sgml
 Subject: Re: Search for Holy Grail: {html,ps,text}2sgml

contained quite a nice little stylesheet to do this. Search a newsgroup
archive for it, or I could send it to you if you can't find it.

The author said in the posting:
I use the DSSSL listed below to turn HTML that uses a small subset of
the available HTML tags into something that's pretty easy to integrate
into DocBook.

It's useful enough for expressing the very limited structuring that HTML
provides, essentially being aware of:
a) Headings <H1>, <H2>, ...
b) Paragraphs
c) Some modifiers (<TT>, <B>)
d) Itemized lists
e) URLs

That's a tiny subset of HTML, and is mapped onto a tiny subset of what
DocBook offers.  It happens to be enough to be fairly useful.  But I'd
not call it a complete "conversion."
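
To give a feel for the kind of mapping described above, here is a rough
sketch in Python. It is not the DSSSL from the posting, only an
illustration of the same idea. The choice of DocBook elements (section,
para, literal, emphasis, itemizedlist, ulink) and the names
TinyHtmlToDocBook and convert are my own assumptions, and it expects
tidy HTML with explicit closing tags and list items that hold only
inline content.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

HEADINGS = ("h1", "h2", "h3", "h4", "h5", "h6")

# Assumed HTML -> DocBook mapping (open markup, close markup).
# These pairings are illustrative, not taken from the posted stylesheet.
SIMPLE = {
    "p": ("<para>", "</para>"),
    "tt": ("<literal>", "</literal>"),
    "b": ('<emphasis role="bold">', "</emphasis>"),
    "ul": ("<itemizedlist>", "</itemizedlist>"),
    "li": ("<listitem><para>", "</para></listitem>"),
}


class TinyHtmlToDocBook(HTMLParser):
    """Hypothetical converter for a tiny HTML subset; other tags are dropped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_sections = []  # heading levels of the sections still open

    def _close_sections(self, level):
        # Close every open section at this heading level or deeper.
        while self.open_sections and self.open_sections[-1] >= level:
            self.open_sections.pop()
            self.out.append("</section>")

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        if tag in HEADINGS:
            level = int(tag[1])
            self._close_sections(level)
            self.open_sections.append(level)
            self.out.append("<section><title>")
        elif tag == "a":
            href = dict(attrs).get("href") or ""
            self.out.append("<ulink url=%s>" % quoteattr(href))
        elif tag in SIMPLE:
            self.out.append(SIMPLE[tag][0])

    def handle_endtag(self, tag):
        if tag in HEADINGS:
            self.out.append("</title>")
        elif tag == "a":
            self.out.append("</ulink>")
        elif tag in SIMPLE:
            self.out.append(SIMPLE[tag][1])

    def handle_data(self, data):
        # Re-escape &, < and > so the output stays well-formed.
        self.out.append(escape(data))

    def result(self):
        self._close_sections(1)  # close whatever sections remain open
        return "".join(self.out)


def convert(html):
    parser = TinyHtmlToDocBook()
    parser.feed(html)
    parser.close()
    return parser.result()


if __name__ == "__main__":
    print(convert("<H1>Intro</H1><P>Run <TT>jade</TT> on it.</P>"))
```

As with the stylesheet in the posting, anything outside this small
subset is simply ignored, so the output is a starting point for hand
editing rather than a finished DocBook document.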

