Re: [xsl] XHTML gives an error

Subject: Re: [xsl] XHTML gives an error
From: Badrul Anuar <askbard@xxxxxxxxx>
Date: Thu, 02 Jul 2009 22:26:43 +0100
Michael Kay wrote:
It's best not to call it XHTML when it isn't, you'll only confuse people. If
it were XHTML, you wouldn't have a problem.
Thank you for your comment. It was my mistake: when I first checked the source file, I saw this:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">

So I assumed the file was XHTML.

John Cowan's TagSoup handles this document just fine. Just put it on your
classpath and use the -x:org.ccil.cowan.tagsoup.Parser option when calling
Saxon.
Thank you so much for the suggestion.
Regards,

Michael Kay
http://www.saxonica.com/
http://twitter.com/michaelhkay


-----Original Message-----
From: Badrul Anuar [mailto:askbard@xxxxxxxxx]
Sent: 02 July 2009 22:02
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [xsl] XHTML gives an error


Michael Kay wrote:
Usually, if you want help understanding an error message, it helps to
tell people what the error message is.

Does anybody know how to clean an XHTML file or convert it into XML?
If I have the XML, then it would be easier for me to use XSL.
You're confused. If it's XHTML then it already is XML, and doesn't need converting. If the error message is telling you that it's not XML, this means that it's not XHTML either.
Sorry for not giving the exact error. When I first tried to run the XSL, I received an error.
The error was "The 'meta' start tag on line 11 does not match the end tag of 'head'. Line 14, position 3."
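
The error above is a well-formedness failure: in XML every start tag needs a matching end tag, so an HTML-style `<meta ...>` with no `/>` is still open when the parser reaches `</head>`. A minimal sketch of the same failure, assuming Python's standard-library parser as a stand-in for the parser the XSLT processor uses; the helper name `is_well_formed` and both sample strings are hypothetical and not taken from the Cisco page:

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the string parses as well-formed XML."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# HTML-style: <meta> is never closed, so </head> does not match it.
html_style = '<html><head><meta name="a" content="b"></head><body/></html>'

# XML-style: the empty element is self-closed with "/>".
xml_style = '<html><head><meta name="a" content="b"/></head><body/></html>'

print(is_well_formed(html_style))
print(is_well_formed(xml_style))
```

Only the second string parses, which is why an XHTML doctype on a page is no guarantee that the page is XML.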


So I tried to clean the XHTML using Tidy and received another error: "Reference to undeclared entity 'nbsp'. Line 293, position 16."
The command I used to clean the XHTML was: tidy -m filename.htm


Since I received another error, I replaced every &nbsp; with an empty string, and finally I could run my XSL file.
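
Deleting `&nbsp;` also deletes the spaces it stood for. One alternative is to replace it with the numeric character reference `&#160;`, which needs no DTD declaration. A minimal sketch, again assuming Python's standard-library parser as a stand-in; the function name `replace_nbsp` and the sample paragraph are hypothetical:

```python
import xml.etree.ElementTree as ET


def replace_nbsp(markup: str) -> str:
    """Swap the named entity for its numeric form (U+00A0, no-break space)."""
    return markup.replace("&nbsp;", "&#160;")


sample = "<p>Alert&nbsp;18366</p>"

# Without a DTD that declares it, the named entity is rejected.
try:
    ET.fromstring(sample)
    parsed_raw = True
except ET.ParseError:
    parsed_raw = False

# The numeric reference parses and keeps the no-break space in the text.
fixed = ET.fromstring(replace_nbsp(sample))

print(parsed_raw)
print(repr(fixed.text))
```

The plain string replacement is only a sketch: it would also rewrite `&nbsp;` inside comments or CDATA sections, which may or may not matter for a given page.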

Since I have to run Tidy and remove the &nbsp; entities, I am thinking about converting all the XHTML to XML.
Sorry for the misunderstanding.


If you have any suggestions on how to minimize the work of cleaning the XHTML and removing &nbsp;, I would appreciate them.
The URL for the XHTML is
http://tools.cisco.com/security/center/viewAlert.x?alertId=18366


Thank you for your explanation.

Regards
Badrul

Regards,

Michael Kay
http://www.saxonica.com/
http://twitter.com/michaelhkay
