Re: [xsl] html inside xml

Subject: Re: [xsl] html inside xml
From: Jeni Tennison <jeni@xxxxxxxxxxxxxxxx>
Date: Fri, 12 Oct 2001 09:58:30 +0100

Hi Henry,

> I have a question which stuck me for a while, I use Microsoft.XMLDOM
> to generate xml files, and display over the server. The html
> information for the product is stored in the SQL server. While I
> extract the html content from the database and append to an xml
> node, the Microsoft.XMLDOM automatically convert the "<" and ">"
> tags into "& lt;" and "& gt;" ( I put a space between & and "lt;
> gt;" otherwise the browser will display "<" and ">" automatically),
> which make my xml file looks as this:

If the HTML information is well-formed in XML terms (e.g. IMG elements
use the XML empty-element syntax rather than the HTML empty-element
syntax, and all the HTML elements have end tags), then I recommend
that you change the way in which you construct the DOM so that the
HTML is included in your XML file as elements rather than as a string.

For example, if the string is held in the HTMLString variable, then
you could add a start tag and an end tag to it:

  HTMLString = '<snippet>' + HTMLString + '</snippet>';

to make it a well-formed XML document, then read that into an XMLDOM
object:

  HTMLDOM.loadXML(HTMLString);
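
The two steps above can be combined with a well-formedness check. This
is only a sketch: loadSnippet is a hypothetical helper name, and it
assumes that dom is an MSXML document whose loadXML method returns
false on a parse failure and whose parseError.reason describes the
problem.

```javascript
// Sketch only: wrap an HTML fragment in a single root element and parse it.
// `dom` is assumed to be an MSXML document, created for example with
//   var dom = new ActiveXObject("Microsoft.XMLDOM");
// `loadSnippet` is a hypothetical helper name, not part of MSXML.
function loadSnippet(dom, htmlString) {
  var wrapped = '<snippet>' + htmlString + '</snippet>';
  // loadXML returns false when the string is not well-formed XML
  if (!dom.loadXML(wrapped)) {
    throw new Error('HTML is not well-formed XML: ' + dom.parseError.reason);
  }
  return dom.documentElement; // the <snippet> wrapper element
}
```

If the call throws, the HTML needs cleaning up before it can be
embedded as elements.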

Then you can copy the nodes in the HTMLDOM over to the XMLDOM that
you're working on, e.g. using Chris Bayes' domtodom.js
(http://www.bayes.co.uk/xml).
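
If you would rather see what such a copy involves, here is a minimal
sketch of a recursive copy written against the basic DOM methods
(createElement, createTextNode, setAttribute, appendChild). The name
copyInto is hypothetical and this is not the domtodom.js code; it
only handles element and text nodes and silently skips everything
else (comments, processing instructions and so on).

```javascript
// Sketch only: recursively rebuild `node` (from one document) inside `targetDoc`.
// `copyInto` is a hypothetical helper, not the API of domtodom.js.
// Element (nodeType 1) and text (nodeType 3) nodes are copied; others are skipped.
function copyInto(node, targetDoc) {
  var i, attr, copy, child;
  if (node.nodeType === 3) {
    return targetDoc.createTextNode(node.nodeValue);
  }
  if (node.nodeType !== 1) {
    return null; // unsupported node type in this sketch
  }
  copy = targetDoc.createElement(node.nodeName);
  // copy each attribute by name and value
  for (i = 0; i < node.attributes.length; i++) {
    attr = node.attributes.item(i);
    copy.setAttribute(attr.nodeName, attr.nodeValue);
  }
  // copy the children in document order
  for (i = 0; i < node.childNodes.length; i++) {
    child = copyInto(node.childNodes.item(i), targetDoc);
    if (child) {
      copy.appendChild(child);
    }
  }
  return copy;
}
```

You would call it once for each child of the snippet element in
HTMLDOM and append each result to the element in your own document
that should hold the HTML.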

That way, the XML document will have the HTML embedded in it properly,
and you'll be able to just copy it into the output (for example with
xsl:copy-of).

If the HTML information is *not* well-formed in XML terms, then
ideally you should run it through HTML Tidy
(http://www.w3.org/People/Raggett/tidy/) to make it well-formed, and
then follow the process above.

If that's not an option for you, then rather than copying the content
of the value element in your XML, you should get its string value
(using xsl:value-of) and disable output escaping so that the XSLT
processor outputs < rather than &lt; when it serialises the output. So
something like:

  <xsl:value-of select="value" disable-output-escaping="yes" />

This will generate HTML that's not well-formed in XML terms, so it's
useless if you're trying to generate XHTML. I'm also not certain how
well MSXML works with disable-output-escaping, and in general it's
something to avoid, which is why I talked through the more complicated
(and more reliable and more long-term) solutions first.
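
To make the difference concrete, the following sketch roughly mimics
what a serialiser does to the string value of a text node, next to
what you get when escaping is disabled. The escapeText function and
the sample value are made up for illustration; they are not part of
MSXML or XSLT.

```javascript
// Illustration only: roughly mimics how a serialiser escapes text-node content.
// `escapeText` is a hypothetical helper, not an MSXML or XSLT function.
function escapeText(s) {
  // escape & first so the entities added below are not escaped again
  return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

var value = '<b>New</b> & improved<br>';  // made-up sample string value
var escaped = escapeText(value);   // what normal serialisation writes
var unescaped = value;             // what disable-output-escaping="yes" writes
```

Here escaped is '&lt;b&gt;New&lt;/b&gt; &amp; improved&lt;br&gt;',
while unescaped still contains the bare & and the unclosed br
element, which is why the result cannot be relied on as XHTML.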

I hope that helps,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list