Re: [xsl] XML source with DOCTYPE declaration

Subject: Re: [xsl] XML source with DOCTYPE declaration
From: Jeni Tennison <mail@xxxxxxxxxxxxxxxx>
Date: Thu, 19 Apr 2001 18:12:59 +0100
Hi Zeljko,

> I couldn't find any statement like this. What I found is the following:
> <!ENTITY % NS.prefixed "IGNORE">
> <!ENTITY % XHTML.prefixed "%NS.prefixed;">
> <!ENTITY % XHTML.xmlns "http://www.w3.org/1999/xhtml">
> <!ENTITY % XHTML.prefix "">
> <!ENTITY % XHTML.xmlns.attrib "xmlns   %URI.datatype;  #FIXED
> '%XHTML.xmlns;'          %XLINK.xmlns.attrib;">

Somewhere else in the DTD you'll probably find something that looks
approximately like:

<!ATTLIST html

The <!ENTITY declarations you've shown above set up a number of
entities that can be substituted into the DTD. They make a DTD very
difficult to read, but very easy to maintain. The last entity
declaration you show sets up an entity that defines an xmlns
attribute (a default namespace declaration) with a fixed value of
'http://www.w3.org/1999/xhtml'. That's the namespace for XHTML.

Basically, this means that whenever you use the XHTML DTD, you pull in
those definitions.  The fixed value for the xmlns attribute is used,
so when you have in your input:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.0//EN"

It's *just* as if you had:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.0//EN"
<html xmlns="http://www.w3.org/1999/xhtml">

As far as a validating parser is concerned (and as far as an XSLT
processor is concerned) it's as if you had included the xmlns
attribute (the default namespace declaration) in your XHTML file.
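
You can see the same effect without the full XHTML DTD. Here is a
minimal sketch using an internal DTD subset. The two declarations are
my own cut-down stand-ins for what the parameter entities expand to,
not a copy of the real DTD, and the content model is made up for
illustration:

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE html [
  <!-- Hypothetical cut-down declarations standing in for the XHTML DTD -->
  <!ELEMENT html (#PCDATA)>
  <!-- The parser supplies this attribute whenever it is not written out -->
  <!ATTLIST html xmlns CDATA #FIXED "http://www.w3.org/1999/xhtml">
]>
<!-- No xmlns attribute is written here, yet the parser reports one -->
<html>Some Content !</html>
```

A parser that applies the attribute default reports the html element
as being in the XHTML namespace, so a pattern such as "/html" in a
stylesheet will not match it.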

Now, having that default namespace declaration means that in fact all
the elements in your XHTML document are considered to be in the XHTML
namespace.  The XPaths that you use in the stylesheet have to select
or match the elements *in that namespace*, rather than in the null
namespace.

So, in your stylesheet, you need to declare the XHTML namespace *with
a prefix*.  Usually you do that on the xsl:stylesheet element.  Then,
whenever you refer to one of those XHTML elements, you need to use
that prefix with the name of the element.  So your stylesheet should
look something like:

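<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:html="http://www.w3.org/1999/xhtml">

        <xsl:output method="html" version="1.0" encoding="UTF-8" indent="yes"/>

        <xsl:template match="/">
                <xsl:text>ROOT element found !</xsl:text>
                <!-- Carry on to the children so the template below can fire -->
                <xsl:apply-templates/>
        </xsl:template>

        <!-- Note - html element is in the XHTML namespace, so you
             need to use the 'html' prefix -->
        <xsl:template match="/html:html">
                <xsl:text>HTML tag found !!!</xsl:text>
        </xsl:template>

</xsl:stylesheet>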

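> <html xmlns="http://www.w3.org/1999/xhtml"
>       version="-// W3C//DTD XHTML Basic 1.0//EN">
>         <head xmlns="http://www.w3.org/1999/xhtml" profile="">
>                 <title xmlns="http://www.w3.org/1999/xhtml"/>
>         </head>
>         <body title="article" xmlns="http://www.w3.org/1999/xhtml">
>                 <p xmlns="http://www.w3.org/1999/xhtml">Some Content !</p>
>         </body>
> </html>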
> What I'm surprised of is that every tag got an 'xmlns' attribute
> added, although only a simple <xsl:copy-of/> was executed.

When you use xsl:copy-of, then it does a deep copy of the node you
select, including all the namespace nodes on the elements that it
finds.  You'd hope that an XSLT processor would recognise the fact
that if the parent element has an equivalent namespace declaration,
then it doesn't need to add the namespace node, but obviously not with
the processor that you're using.

> Hope somebody can help me here and tell me how to get the upper
> sample running. If possible the stylesheet should use the default
> namespace, meaning if possible the expression matching patterns
> should not use prefixes.

Just a note on your last 'If possible...'.  In XPath, if you don't
give a prefix for an element or attribute name then the XPath
processor will assume that you want the element or attribute to be in
the *null* namespace.  This is 'a good thing' in that otherwise there
wouldn't be a way to select or match nodes in the null namespace, but
a bad thing in that it's a right pain to have to specify namespace
prefixes for all the namespaces that you use, and it's something that
often trips people up (and stops Norm Walsh from assigning a namespace
to DocBook ;).
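
To make the difference concrete, here is a small sketch of a
stylesheet that has one template for each case. The output text is
mine, purely for illustration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:html="http://www.w3.org/1999/xhtml">

  <xsl:output method="text"/>

  <!-- Unprefixed name: matches an html element in the null namespace
       only, so it never fires for an XHTML document -->
  <xsl:template match="/html">
    <xsl:text>html in no namespace</xsl:text>
  </xsl:template>

  <!-- Prefixed name: matches the html element in the XHTML namespace -->
  <xsl:template match="/html:html">
    <xsl:text>html in the XHTML namespace</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Run against an XHTML document, only the second template should apply.
Note that declaring the XHTML namespace as the *default* namespace on
the xsl:stylesheet element would not change this in XSLT 1.0: the
default namespace declaration affects literal result elements, not the
names inside XPaths and match patterns.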

In XSLT 2.0, there will be support for you to specify that the default
namespace should be used in the interpretation of XPaths... but not
yet.
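
For reference, here is a sketch of how that looks with the
xpath-default-namespace attribute, which is the name used in the XSLT
2.0 Recommendation as it was eventually published. It needs an XSLT
2.0 processor and is not something an XSLT 1.0 processor will honour:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xpath-default-namespace="http://www.w3.org/1999/xhtml">

  <xsl:output method="text"/>

  <!-- Unprefixed element names in patterns and XPaths are now looked up
       in the XHTML namespace, so no prefix is needed -->
  <xsl:template match="/html">
    <xsl:text>HTML tag found !!!</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

The attribute applies to unprefixed element names only; unprefixed
attribute names are still taken to be in the null namespace.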

I hope that helps,


Jeni Tennison
