Re: [xsl] Procesing XHTML files with DOCTYPE statements

Subject: Re: [xsl] Procesing XHTML files with DOCTYPE statements
From: "Mukul Gandhi" <gandhi.mukul@xxxxxxxxx>
Date: Tue, 11 Jul 2006 21:26:04 +0530
I am not completely sure, but I think you can write a small utility Java
class that implements a custom EntityResolver and register it with the
SAX parser, so that the parser asks your class to resolve the DTD
reference. Implement the resolveEntity() method so that it returns an
InputSource pointing to an empty DTD.

This way the parser never tries to fetch the DTD. Then feed the document
(once it has passed the entity resolution phase) to JAXP for the
transformation.
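
The idea above can be sketched roughly as follows. This is only a minimal
sketch, not tested against your Saxon setup: the class names EmptyDtdDemo
and EmptyDtdResolver are made up for illustration, and the JAXP identity
transformer stands in for your real stylesheet.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical demo class; the names are illustrative only.
class EmptyDtdDemo {

    // Answers every external entity request (including the DOCTYPE's
    // external DTD) with empty content, so nothing is fetched from the network.
    static class EmptyDtdResolver implements EntityResolver {
        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            return new InputSource(new StringReader(""));
        }
    }

    // Parses the XML with the resolver installed, then hands the SAX
    // events to a JAXP Transformer (an identity transform here).
    static String transform(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // XSLT processors need namespace-aware input
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setEntityResolver(new EmptyDtdResolver());

        SAXSource source = new SAXSource(reader, new InputSource(new StringReader(xml)));
        // Assumption: replace newTransformer() with newTransformer(stylesheetSource)
        // to apply a real stylesheet instead of the identity transform.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(source, new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xhtml =
            "<!DOCTYPE html PUBLIC \"-//W3//DTD XHTML 1.0 Transitional//EN\" "
            + "\"http://www.w3.org/tr/xhtml1/DTD/xhtml1-transitional.dtd\">"
            + "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body><p>hi</p></body></html>";
        System.out.println(transform(xhtml));
    }
}
```

One caveat worth checking: XHTML files that use named entities such as
&nbsp; depend on the DTD to declare them, so with an empty DTD the parser
will probably reject such files. In that case the resolver could return a
local copy of the DTD instead of empty content.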

Regards,
Mukul

On 7/11/06, dvint@xxxxxxxxx <dvint@xxxxxxxxx> wrote:
This is the first time I've had to process XHTML files with XSLT. I'm
using Saxon and getting an error saying that it can't find the DTD
referenced in the file that I'm processing. The file has:

<!DOCTYPE html
 PUBLIC "-//W3//DTD XHTML 1.0 Transitional//EN"
         "http://www.w3.org/tr/xhtml1/DTD/xhtml1-transitional.dtd">

Result is:

Error on line 4 column 107 of file:/C:/dev/LanguageDetection/RM0000010ZQ000X.html:
  Error reported by XML parser: Cannot read from http://www.w3.org/tr/xhtml1/DTD/xhtml1-transitional.dtd
Transformation failed: Run-time errors were reported

This problem goes away as soon as I delete the DOCTYPE declaration, but I
don't want to (and can't) do this for every file. Is there some way around
this error? Note that the DTD does exist at the URL provided, but the
default setup in Saxon doesn't seem to find it.

This stylesheet is basically an identity transformation with one change:
it inserts a new comment in the body element. Here is the stylesheet in
case there might be a way to work around this problem:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
 version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns="http://www.w3.org/1999/xhtml"
 xmlns:t3="http://tms.toyota.com/t3"
>

<xsl:param name="language" select="'en'" />

<xsl:variable name="commentText">
<xsl:choose>
       <xsl:when test="$language='en'">
               text 1 goes here        </xsl:when>
       <xsl:when test="$language='fr'">
               text 2 goes here
       </xsl:when>
       <xsl:when test="$language='sp'">
               Text 3 goes here        </xsl:when>
       <xsl:otherwise>UNRECOGNIZED LANGUAGE SPECIFIED</xsl:otherwise>
</xsl:choose>
</xsl:variable>

<xsl:output method="html"
       omit-xml-declaration="no"
       doctype-public="-//W3//DTD XHTML 1.0 Transitional//EN"
       doctype-system="http://www.w3.org/tr/xhtml1/DTD/xhtml1-transitional.dtd"
       indent="no"/>


<xsl:template match="*">
  <xsl:choose>
    <xsl:when test="name(.)='body'">
      <xsl:element name="{name(.)}">
        <xsl:for-each select="@*">
          <xsl:attribute name="{name(.)}" namespace="{namespace-uri(.)}"><xsl:value-of select="."/></xsl:attribute>
        </xsl:for-each>
        <xsl:comment> <xsl:value-of select="$commentText"/> </xsl:comment>
        <xsl:apply-templates/>
      </xsl:element>
    </xsl:when>
    <xsl:otherwise>
      <xsl:element name="{name(.)}">
        <xsl:for-each select="@*">
          <xsl:attribute name="{name(.)}" namespace="{namespace-uri(.)}"><xsl:value-of select="."/></xsl:attribute>
        </xsl:for-each>
        <xsl:apply-templates/>
      </xsl:element>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

<xsl:template match="comment()">
       <xsl:comment><xsl:value-of select="."/></xsl:comment>
</xsl:template>

</xsl:stylesheet>
