Re: [xsl] How to disable/turn off the inclusion of DTD in html/xhtml to xml

Subject: Re: [xsl] How to disable/turn off the inclusion of DTD in html/xhtml to xml
From: Jack Bush <netbeansfan@xxxxxxxxxxxx>
Date: Thu, 22 Jul 2010 07:10:12 -0700 (PDT)
Hi David,

Thank you to Jirka, and possibly Michael Kay, for responding to a similar
thread on Saxon Help; I had not received those replies until now.

I have made the following changes based on your suggestion, but I am not sure
whether they are correct since the issue still remains:

<?xml version="1.0"?>
<!DOCTYPE catalog
  PUBLIC "-//OASIS/DTD Entity Resolution XML Catalog V1.0//EN"
  "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd">
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
    <public
        publicId="-//W3C//DTD XHTML 1.0 Transitional//EN"
        uri="///E:/xhtml1-transitional.dtd"/>
    <public
        publicId="-//W3C//ENTITIES Latin 1 for XHTML//EN"
        uri="///E:/xhtml-lat1.ent"/>
    <public
        publicId="-//W3C//ENTITIES Symbols for XHTML//EN"
        uri="///E:/xhtml-symbol.ent"/>
    <public
        publicId="-//W3C//ENTITIES Special for XHTML//EN"
        uri="///E:/xhtml-special.ent"/>
    <system
        systemId="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
        uri="///E:/xhtml1-transitional.dtd"/>
    <system
        systemId="http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent"
        uri="///E:/xhtml-lat1.ent"/>
    <system
        systemId="http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent"
        uri="///E:/xhtml-symbol.ent"/>
    <system
        systemId="http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent"
        uri="///E:/xhtml-special.ent"/>
</catalog>

Below is a new error that I haven't seen in the past:

Parse catalog: ///e:/tmp/catalog.xml
Loading catalog: ///e:/tmp/catalog.xml
Default BASE: file:/e:/tmp/catalog.xml
public: -//W3C//DTD XHTML 1.0 Transitional//EN
        ///E:/Tmp/xhtml1-transitional.dtd
PUBLIC: -//W3C//DTD XHTML 1.0 Transitional//EN
        file:/E:/Tmp/xhtml1-transitional.dtd
public: -//W3C//ENTITIES Latin 1 for XHTML//EN
        ///E:/Tmp/xhtml-lat1.ent
PUBLIC: -//W3C//ENTITIES Latin 1 for XHTML//EN
        file:/E:/Tmp/xhtml-lat1.ent
public: -//W3C//ENTITIES Symbols for XHTML//EN
        ///E:/Tmp/xhtml-symbol.ent
PUBLIC: -//W3C//ENTITIES Symbols for XHTML//EN
        file:/E:/Tmp/xhtml-symbol.ent
public: -//W3C//ENTITIES Special for XHTML//EN
        ///E:/Tmp/xhtml-special.ent
PUBLIC: -//W3C//ENTITIES Special for XHTML//EN
        file:/E:/Tmp/xhtml-special.ent
system: http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
        ///E:/Tmp/xhtml1-transitional.dtd
SYSTEM: http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
        file:/E:/Tmp/xhtml1-transitional.dtd
system: http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent
        ///E:/xhtml-lat1.ent
SYSTEM: http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent
        file:/E:/xhtml-lat1.ent
system: http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent
        ///E:/xhtml-symbol.ent
SYSTEM: http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent
        file:/E:/xhtml-symbol.ent
system: http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent
        ///E:/xhtml-special.ent
SYSTEM: http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent
        file:/E:/xhtml-special.ent
resolveSystem(http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd)
Resolved system: http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
        file:/E:/xhtml1-transitional.dtd
resolveSystem(file:/E:/xhtml-lat1.ent)
resolvePublic(-//W3C//ENTITIES Latin 1 for XHTML//EN,file:/E:/xhtml-lat1.ent)
Resolved public: -//W3C//ENTITIES Latin 1 for XHTML//EN
        file:/E:/xhtml-lat1.ent
resolveSystem(file:/E:/xhtml-symbol.ent)
resolvePublic(-//W3C//ENTITIES Symbols for XHTML//EN,file:/E:/xhtml-symbol.ent)
Resolved public: -//W3C//ENTITIES Symbols for XHTML//EN
        file:/E:/xhtml-symbol.ent
resolveSystem(file:/E:/xhtml-special.ent)
resolvePublic(-//W3C//ENTITIES Special for XHTML//EN,file:/E:/xhtml-special.ent)
Resolved public: -//W3C//ENTITIES Special for XHTML//EN
        file:/E:/xhtml-special.ent
<?xml version="1.0" encoding="UTF-8"?>
<!-- polist.xml --><!DOCTYPE html
  PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<report>
  <title>Selected Purchase Orders</title>
  <po filename="///C:/temp/XSLT-2E-examples/Chapter8/po38292.xml" />
  <po filename="///C:/temp/XSLT-2E-examples/Chapter8/po38293.xml" />
  <po filename="///C:/temp/XSLT-2E-examples/Chapter8/po38294.xml" />
  <po filename="///C:/temp/XSLT-2E-examples/Chapter8/po38295.xml" />
</report>
Recoverable error on line 17
  FODC0002: java.io.FileNotFoundException:
  http://www.w3.org/C:/temp/XSLT-2E-examples/Chapter8/po38292.xml
Recoverable error on line 17
  FODC0002: java.io.FileNotFoundException:
  http://www.w3.org/C:/temp/XSLT-2E-examples/Chapter8/po38293.xml
Recoverable error on line 17
  FODC0002: java.io.FileNotFoundException:
  http://www.w3.org/C:/temp/XSLT-2E-examples/Chapter8/po38294.xml
Recoverable error on line 17
  FODC0002: java.io.FileNotFoundException:
  http://www.w3.org/C:/temp/XSLT-2E-examples/Chapter8/po38295.xml
<?xml version="1.0" encoding="UTF-8"?>
<html>
  <head>
    <title>Selected Purchase Orders</title>
  </head>
  <body style="font-family: sans-serif;">
    <h1>Selected Purchase Orders - Unsorted</h1>
  </body>
</html>

The main source is as follows. I have deliberately added the DTD entry to
polist.xml:

<?xml version="1.0"?>
<!-- polist.xml -->
<!DOCTYPE html PUBLIC
  "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<report>
  <title>Selected Purchase Orders</title>
  <po filename="///C:/temp/XSLT-2E-examples/Chapter8/po38292.xml"/>
  <po filename="///C:/temp/XSLT-2E-examples/Chapter8/po38293.xml"/>
  <po filename="///C:/temp/XSLT-2E-examples/Chapter8/po38294.xml"/>
  <po filename="///C:/temp/XSLT-2E-examples/Chapter8/po38295.xml"/>
</report>

None of the secondary sources have a DTD entry in them.

The stylesheet.xsl looks like this:

  <xsl:template match="/">
    <html>
      <head>
        <title><xsl:value-of select="/report/title"/></title>
      </head>
      <body style="font-family: sans-serif;">
        <h1>Selected Purchase Orders - Unsorted</h1>
        <xsl:for-each select="/report/po">
          <xsl:apply-templates
            select="document(@filename)/purchase-order"/>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
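
One way to see what document(@filename) is working with, as a diagnostic sketch only: in XSLT 2.0 a relative value passed as a node is resolved against the base URI of that node, so printing base-uri() for each po element shows what the filenames are being combined with. This assumes an XSLT 2.0 processor (the FODC0002 code suggests one is in use); the message text is made up for illustration.

```xml
<xsl:for-each select="/report/po">
  <!-- Diagnostic only: show the base URI a relative @filename is resolved against -->
  <xsl:message>
    <xsl:text>base URI of @filename: </xsl:text>
    <xsl:value-of select="base-uri(@filename)"/>
  </xsl:message>
  <xsl:apply-templates
    select="document(@filename)/purchase-order"/>
</xsl:for-each>
```

If the message shows an http://www.w3.org/ address rather than the location of polist.xml, the base URI of the main source is the thing to chase.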

This is a different example but on the same issue.

The error appears to be the result of http://www.w3.org being merged with the
full path of each secondary source, with no reference to the local DTDs?
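
A sketch for testing that reading: a value such as ///C:/temp/... carries no scheme, so it is a relative reference and gets resolved against whatever base URI is in effect, and the FODC0002 messages suggest that base was an http://www.w3.org/ address. Spelling out the file: scheme makes each filename an absolute URI, which leaves nothing to resolve. Whether this cures the run above is an assumption; the log does not confirm it.

```xml
<report>
  <title>Selected Purchase Orders</title>
  <!-- file: scheme given explicitly, so each value is an absolute URI -->
  <po filename="file:///C:/temp/XSLT-2E-examples/Chapter8/po38292.xml"/>
  <po filename="file:///C:/temp/XSLT-2E-examples/Chapter8/po38293.xml"/>
  <po filename="file:///C:/temp/XSLT-2E-examples/Chapter8/po38294.xml"/>
  <po filename="file:///C:/temp/XSLT-2E-examples/Chapter8/po38295.xml"/>
</report>
```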

Thanks to all,

Jack

----- Original Message ----
From: David Carlisle <davidc@xxxxxxxxx>
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Cc: Jack Bush <netbeansfan@xxxxxxxxxxxx>
Sent: Thu, 22 July, 2010 8:03:02 PM
Subject: Re: [xsl] How to disable/turn off the inclusion of DTD in html/xhtml to xml

On 22/07/2010 09:40, Jack Bush wrote:
> Hi David,
>
> First of all, thank you very much for responding to this thread.
>
>> Are you supplying a local dtd? I can see from the log you posted that you are
>> supplying local entity files but I only see the w3c URI for the DTD (and you
>> will get a 503 if you try to reference that more than a couple of times a
>> day)
>
> Yes, I do supply the local dtd.

But what I meant was: do you reference it in the catalog?


> Why does it start to pop up if I try to reference it more than a couple of
> times a day?

Because the W3C lets you download it, but if you try again their server
automatically decides that you are going to hit them with 10000000
requests a second, so it avoids being swamped by banning your IP address
and you get 503 (Service Unavailable) errors instead.


>
>> When you asked this on the saxon list (where you posted the catalog) Jirka
>> pointed out some errors in your catalog, the log you posted here looks the
>> same though. So I assume the error is the same and your catalog needs to
>> reference a local copy of the dtd.
>
> Yes, I did post the same question on Saxon list but didn't get a response from
> Jirka on the saxon list who pointed out some errors in the catalog.xml.

Jirka replied to the list (I got the reply, for example).

>
> Below is the catalog.xml for your reference:
>
> <?xml version="1.0"?>
> <!DOCTYPE catalog
> PUBLIC "-//OASIS/DTD Entity Resolution XML Catalog V1.0//EN"
> "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd">
> <catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
> <group prefer="system" xml:base="file:///E://">
> <public
> publicId="-//W3C//DTD XHTML 1.0 Transitional//EN"
> uri="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"/>

This is wrong, it should point at your local copy. As it is, it says
"if you see the public id for xhtml, fetch a dtd from w3.org", which is
exactly what you want to avoid.
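
A minimal sketch of what the corrected entry might look like, assuming a local copy named xhtml1-transitional.dtd that the relative uri can reach (the file name and its location are assumptions taken from the rest of this catalog, not something verified here):

```xml
<public
    publicId="-//W3C//DTD XHTML 1.0 Transitional//EN"
    uri="xhtml1-transitional.dtd"/>
```

The point is only that uri names a local file, so a match on the public identifier never sends the parser to w3.org.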

> <system
> systemId="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
> uri="xhtml1-transitional.dtd"/>

This one is correct as long as xhtml1-transitional.dtd is in the same
directory as your catalog.

> <system
> systemId="xhtml1-transitional.dtd"
> uri="xhtml1-transitional.dtd"/>
> <uri
> name="corporateStyleSheet.xsl"
> uri="corporateStyleSheet.xsl"/>


> <uri
> </group>
> </catalog>
>
> Could you resend the suggestion from Jirka again?

The saxon list is archived:
http://sourceforge.net/mailarchive/message.php?msg_name=4C45C40B.5080005%40kosek.cz


>
> Thanks a lot,
>
> Jack
>
>

David
