Re: html to xml

Subject: Re: html to xml
From: Joe English <jenglish@xxxxxxxxxxxxx>
Date: Fri, 27 Oct 2000 11:08:57 -0700

Sebastian Rahtz wrote:

> hmm. having been fighting this tidy-then-transform system for the last
> day or two, can anyone tell me how they solve two (related) problems?
>
>  a) as we know, authors scatter <h1>, <h3> etc across their document
>  like pointers. my target DTD needs structured divisions. who has some
>  good XSLT code to sort it out? [...]
>  b) HTML allows PCDATA practically anywhere, so far as I can see. so
>  I get
>    <h3>Hello</h3>
>    I am the walrus
>  where my target DTD wants something more like
>   <h3>Hello</h3>
>   <p>I am the walrus
>
>   How do others deal with this?


One technique that I've found very useful for this sort
of transformation is to use a SAX-like filter that
attempts to validate the source document against the
target document type, inserting start/end-tags or renaming 
elements as needed when it encounters an error.
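
As a rough illustration of the idea, here is a minimal sketch in
Python using the standard xml.sax filter classes. It is not the
filter described above: it only handles problem (b), and the "target
document type" is reduced to two hand-written sets. The names
ImplicitParagraphFilter, BLOCK_ONLY, BLOCKS and repair are all
hypothetical, and the input is assumed to be well-formed XML already
(e.g. the output of tidy).

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator
from xml.sax.xmlreader import AttributesImpl

# Assumption: a drastically simplified stand-in for the target DTD.
BLOCK_ONLY = {"body", "div"}             # may not contain bare text
BLOCKS = {"h1", "h2", "h3", "p", "div"}  # block-level elements


class ImplicitParagraphFilter(XMLFilterBase):
    """Wrap text found directly inside a block-only element in <p>."""

    def __init__(self, parent=None):
        super().__init__(parent)
        self._stack = []            # names of currently open elements
        self._implied_depth = None  # stack depth of the inserted <p>

    def _open_implied(self):
        super().startElement("p", AttributesImpl({}))
        self._stack.append("p")
        self._implied_depth = len(self._stack)

    def _close_implied(self):
        if self._implied_depth is not None:
            super().endElement("p")
            self._stack.pop()
            self._implied_depth = None

    def startElement(self, name, attrs):
        if name in BLOCKS:
            # A block element ends any paragraph we opened ourselves.
            self._close_implied()
        elif self._stack and self._stack[-1] in BLOCK_ONLY:
            # An inline element where only blocks are allowed.
            self._open_implied()
        self._stack.append(name)
        super().startElement(name, attrs)

    def endElement(self, name):
        # If the inserted <p> is the innermost open element, its
        # parent is being closed, so close the <p> first.
        if self._implied_depth == len(self._stack):
            self._close_implied()
        self._stack.pop()
        super().endElement(name)

    def characters(self, content):
        if (content.strip() and self._stack
                and self._stack[-1] in BLOCK_ONLY):
            self._open_implied()
        super().characters(content)


def repair(xml_text):
    """Run xml_text through the filter and return the serialized result."""
    out = io.StringIO()
    filt = ImplicitParagraphFilter(xml.sax.make_parser())
    filt.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    filt.parse(io.StringIO(xml_text))
    return out.getvalue()


if __name__ == "__main__":
    print(repair("<body><h3>Hello</h3>I am the walrus</body>"))
```

With that input, the text after the heading should come out wrapped
as <p>I am the walrus</p>. A real filter would take its content
models from the target DTD rather than from fixed sets, and would
also cover renaming elements and nesting the headings into divisions.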

I usually don't use XSLT for this sort of thing, although
I suppose it would be possible with creative use of moded
templates.


--Joe English

  jenglish@xxxxxxxxxxxxx


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

