Subject: Re: html to xml
From: Joe English <jenglish@xxxxxxxxxxxxx>
Date: Fri, 27 Oct 2000 11:08:57 -0700
Sebastian Rahtz wrote:

> hmm. having been fighting this tidy-then-transform system for the last
> day or two, can anyone tell me how they solve two (related) problems?
>
> a) as we know, authors scatter <h1>, <h3> etc across their document
> like pointers. my target DTD needs structured divisions. who has some
> good XSLT code to sort it out?

[...]

> b) HTML allows PCDATA practically anywhere, so far as I can see. so
> I get
>   <h3>Hello</h3>
>   I am the walrus
> where my target DTD wants something more like
>   <h3>Hello</h3>
>   <p>I am the walrus
>
> How do others deal with this?

One technique that I've found very useful for this sort of
transformation is to use a SAX-like filter that attempts to validate
the source document against the target document type, inserting
start/end-tags or renaming elements as needed when it encounters
an error.

I usually don't use XSLT for this sort of thing, although I suppose
it would be possible with creative use of moded templates.

--Joe English

  jenglish@xxxxxxxxxxxxx

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
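For illustration, here is a rough sketch of the kind of SAX-like filter described above, written in Python with the standard xml.sax module. It is not the filter used in practice and it does not read a DTD: the target model is hard-coded as an assumption (body contains p or div; div contains head followed by p or div; text only inside head or p). The names StructureFilter, repair, div and head are hypothetical. It assumes the input is already well-formed (for example, tidied) and consists of a single <body> element.

```python
import io
import re
import xml.sax
from xml.sax.saxutils import XMLGenerator

HEADING = re.compile(r"h([1-6])")


class StructureFilter(xml.sax.ContentHandler):
    """Forward SAX events to `out`, repairing them to fit a toy target model."""

    def __init__(self, out):
        super().__init__()
        self.out = out
        self.levels = []        # heading levels of the <div>s currently open
        self.implied_p = False  # True while a <p> we inserted is open
        self.in_block = False   # True inside a source heading or <p>

    def _close_implied_p(self):
        if self.implied_p:
            self.out.endElement("p")
            self.implied_p = False

    def _close_divs(self, level):
        # Close every open division at this heading level or deeper.
        while self.levels and self.levels[-1] >= level:
            self.out.endElement("div")
            self.levels.pop()

    def _ensure_p(self):
        # Content is only allowed in head/p, so insert a <p> start-tag.
        if not (self.in_block or self.implied_p):
            self.out.startElement("p", {})
            self.implied_p = True

    def startElement(self, name, attrs):
        m = HEADING.fullmatch(name)
        if m:
            level = int(m.group(1))
            self._close_implied_p()
            self._close_divs(level)
            self.out.startElement("div", {})
            self.out.startElement("head", {})  # rename hN to head
            self.levels.append(level)
            self.in_block = True
        elif name == "p":
            self._close_implied_p()
            self.out.startElement("p", {})
            self.in_block = True
        elif name == "body":
            self.out.startElement("body", {})
        else:
            # Treat anything else as inline content.
            self._ensure_p()
            self.out.startElement(name, attrs)

    def endElement(self, name):
        if HEADING.fullmatch(name):
            self.out.endElement("head")
            self.in_block = False
        elif name == "p":
            self.out.endElement("p")
            self.in_block = False
        elif name == "body":
            self._close_implied_p()
            self._close_divs(0)
            self.out.endElement("body")
        else:
            self.out.endElement(name)

    def characters(self, content):
        if self.in_block or self.implied_p:
            self.out.characters(content)
        elif content.strip():
            self._ensure_p()
            self.out.characters(content)
        # Whitespace-only text between blocks is dropped.


def repair(xml_text):
    """Run the filter over a well-formed <body> fragment; return the result."""
    buf = io.StringIO()
    handler = StructureFilter(XMLGenerator(buf, encoding="utf-8"))
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return buf.getvalue()
```

Under those assumptions, repair("<body><h3>Hello</h3>I am the walrus</body>") wraps the heading in a division and the stray text in a paragraph. A filter driven by the real target DTD would replace the hard-coded rules with a lookup of what the content model allows at each point.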