RE: [xsl] editing HTML inside <![CDATA[

Subject: RE: [xsl] editing HTML inside <![CDATA[
From: "Michael Kay" <mhk@xxxxxxxxx>
Date: Wed, 19 May 2004 15:34:39 +0100
> I have the following html code inside a xml file
> 
> 
> <SummaryHTML>
> <![CDATA[<html>
> <html>
> <body>
> .....
> .....
> </body>
> </html>
> ]]>
> 
> 
> but using this I can only get the raw html code into
> another html file but not able to access any nodes
> inside the html content.
> 

There are no nodes inside the CDATA section; there is only text. CDATA tells
the parser to treat the contents as text, not as markup.
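
To make that concrete, here is a minimal sketch in Python (standard library
only). The sample document is a made-up, well-formed stand-in for the file in
the question, not the poster's actual data. It shows that the element holding
the CDATA section has no child elements, just a string:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for the poster's file.
SOURCE = (
    "<SummaryHTML><![CDATA[<html><body><p>Hello</p></body></html>]]>"
    "</SummaryHTML>"
)

root = ET.fromstring(SOURCE)

# The CDATA content arrives as character data, so there are no child nodes.
child_count = len(list(root))
text = root.text

print(child_count)  # 0
print(text)         # <html><body><p>Hello</p></body></html>
```

Any XML parser behaves the same way here, which is why an XPath such as
SummaryHTML/html/body selects nothing.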

Your choices are:

(a) avoid wrapping the HTML in CDATA in the first place (in which case you
will have to make sure it is well-formed XML, for example by putting it
through the tidy utility)

(b) extract the HTML from the CDATA as text, and parse it to turn it into a
tree of nodes. Saxon offers an extension saxon:parse to do this (but this
again relies on it being well-formed XML).
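
The two options can be sketched outside XSLT as follows. This is an
illustration only: it uses Python's standard library parser as a stand-in for
saxon:parse, the sample documents are invented, and the embedded HTML is
assumed to be well-formed XML already.

```python
import xml.etree.ElementTree as ET

# Hypothetical samples; the embedded HTML is assumed to be well-formed XML.
WRAPPED = (
    "<SummaryHTML><![CDATA[<html><body><p>Hello</p></body></html>]]>"
    "</SummaryHTML>"
)
UNWRAPPED = "<SummaryHTML><html><body><p>Hello</p></body></html></SummaryHTML>"

# Option (a): no CDATA, so the HTML elements are ordinary nodes in the tree.
direct = ET.fromstring(UNWRAPPED).find("html/body/p").text

# Option (b): take the CDATA content as a string, then parse it a second time.
embedded = ET.fromstring(WRAPPED).text
inner = ET.fromstring(embedded)  # raises ParseError if not well-formed
paragraphs = [p.text for p in inner.iter("p")]

print(direct)      # Hello
print(paragraphs)  # ['Hello']
```

In both cases the second parse (or the absence of CDATA) is what produces
nodes; if the HTML is not well-formed, the parse step fails and the markup has
to be cleaned up first.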

Michael Kay
