Re: [xsl] namespace prefix weirdness

From: Jeni Tennison <jeni@xxxxxxxxxxxxxxxx>
Date: Tue, 17 Aug 2004 15:19:30 +0100

Hi Bruce,

> I'm finding, however, some bizarre behavior with namespace prefixes.
> Why might I end up with something like the "_0" prefix on that mods
> element below?

Element and attribute names are qualified names, and what each one
stands for is a namespace URI plus a local part. In an XML document, a prefix
is used in place of the namespace URI (to save you from having to type
the namespace URI repeatedly). To any namespace-aware processor, the
prefix that gets used doesn't matter at all (but of course to humans,
it can matter a lot). The namespace prefix is associated with a
namespace URI via a namespace declaration. The default namespace
(declared with an xmlns attribute) is the namespace URI that's
associated with any element that doesn't have a prefix in its name.
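
To see that the prefix really is irrelevant to a namespace-aware
processor, here is a small sketch in Python rather than XSLT, assuming
the standard library's xml.etree.ElementTree as the parser. The sample
strings are made up for illustration; only the MODS namespace URI comes
from your output.

```python
import xml.etree.ElementTree as ET

MODS = "http://www.loc.gov/mods/v3"

# Three spellings of the same element name: default namespace,
# a "mods" prefix, and the "_0" prefix from the puzzling output.
docs = [
    f'<mods xmlns="{MODS}"/>',
    f'<mods:mods xmlns:mods="{MODS}"/>',
    f'<_0:mods xmlns:_0="{MODS}"/>',
]

# ElementTree reports each name as "{namespace-uri}local-part";
# the prefix used in the source text is not part of the name.
tags = [ET.fromstring(doc).tag for doc in docs]
print(tags)

# With no default namespace declaration, an unprefixed element
# is in no namespace at all.
no_namespace = ET.fromstring("<mods/>").tag
print(no_namespace)
```

All three entries in tags come out as
{http://www.loc.gov/mods/v3}mods, while the last element, which has no
namespace declaration in scope, is plain mods.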

When you generate an element using XSLT, the XSLT processor needs to
decide which prefix to use with the element. The only real constraint
on the prefix is that the element must end up in the correct
namespace; other than that, the XSLT processor is free to use whatever
prefix it wants for that namespace, although most try to guess an
appropriate prefix based on that used in either the source or
stylesheet. In addition to namespace declarations added to support the
element and attribute namespaces, the XSLT processor will also add
namespace declarations for any namespace node in the result tree.
These might be added due to in-scope namespace declarations on literal
result elements, due to explicitly created namespace nodes (XSLT 2.0
only), due to xs:QName values (in XSLT 2.0 only), or due to copied
namespace nodes from the source document.
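
The same freedom over prefixes shows up outside XSLT. As a rough
sketch, assuming Python's xml.etree.ElementTree as a stand-in for the
serializer (it is not Saxon, and the prefix it picks will differ from
Saxon's), an element built purely from a namespace URI and a local name
still has to be written out with some prefix:

```python
import xml.etree.ElementTree as ET

MODS = "http://www.loc.gov/mods/v3"

# Build the elements from expanded names only; no prefix is ever mentioned.
mods = ET.Element(f"{{{MODS}}}mods", {"ID": "Mitchell1996b"})
ET.SubElement(mods, f"{{{MODS}}}titleInfo")

# The serializer has to choose a prefix by itself to write the names out.
serialized = ET.tostring(mods, encoding="unicode")
print(serialized)

# Reading the text back gives the same expanded names,
# whatever prefix the serializer happened to pick.
reparsed = ET.fromstring(serialized)
print(reparsed.tag == mods.tag)
```

ElementTree invents a prefix of its own here, much as Saxon came up
with _0, and the names are unchanged after the round trip.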

So far, so good, I hope. Let's look at the output that you're getting:

>           <modsCollection xmlns="http://www.loc.gov/mods/v3">
>              <key xmlns="">test</key>
>              <_0:mods xmlns="" xmlns:_0="http://www.loc.gov/mods/v3"
>                       ID="Mitchell1996b">
>                 <titleInfo xmlns="http://www.loc.gov/mods/v3">
>                    <title>Introduction</title>
>                    <subTitle>Public Space and the City</subTitle>
>                 </titleInfo>

Here, the <modsCollection> element is in the
'http://www.loc.gov/mods/v3' namespace, which is the default namespace
for the <modsCollection> element (and its descendants, unless it's
overridden by another default namespace declaration). The <key>
element is in no namespace -- the default namespace is overridden and
reset to "". The <_0:mods> element is in the
'http://www.loc.gov/mods/v3' namespace, as is the <titleInfo> element
and its children.
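
If you want to check that reading for yourself, one option is to run
the output through any namespace-aware parser. A minimal sketch,
assuming Python's xml.etree.ElementTree: the closing tags for <_0:mods>
and <modsCollection> are my additions, since the quoted output was cut
short, and split_name is just a helper made up for this example.

```python
import xml.etree.ElementTree as ET

# The quoted output, with closing tags added so that it parses.
output = """
<modsCollection xmlns="http://www.loc.gov/mods/v3">
  <key xmlns="">test</key>
  <_0:mods xmlns="" xmlns:_0="http://www.loc.gov/mods/v3" ID="Mitchell1996b">
    <titleInfo xmlns="http://www.loc.gov/mods/v3">
      <title>Introduction</title>
      <subTitle>Public Space and the City</subTitle>
    </titleInfo>
  </_0:mods>
</modsCollection>
""".strip()


def split_name(tag):
    """Split ElementTree's "{uri}local" form into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag  # no braces means the element is in no namespace


# iter() walks the tree in document order, starting at the root.
names = [split_name(el.tag) for el in ET.fromstring(output).iter()]
for uri, local in names:
    print(f"{local:15} {uri or '(no namespace)'}")
```

Only <key> is reported as being in no namespace; every other element,
including the one written as <_0:mods>, is in the MODS namespace.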

I'm guessing here (never a good idea, but Mike will doubtless correct
me if I'm wrong): I suspect that the reason
Saxon is using a _0 prefix on the <_0:mods> element is that it's
trying to honor the namespace declarations in your stylesheet and
therefore doesn't want to override the default namespace declaration
that's in-scope when the <_0:mods> element is generated. (There are
good reasons for Saxon to try to honor the namespace declarations that
you use, since namespace prefixes are sometimes used in content as
well as in element names; it's impossible for Saxon to identify the
places where that may be the case, so, possibly, it errs on the side
of caution.)

The <_0:mods> element is getting generated by the <xsl:copy-of>
instruction in:

> <xsl:template match="mods:mods" mode="enhanced-bib">
>      <key>test</key>
>      <xsl:copy-of select="."/>
> </xsl:template>

You don't show the <xsl:stylesheet> element of your stylesheet, but
from your output it's clear that there's no default namespace
declaration, or if there is it looks like xmlns="". There's similarly
no default namespace declaration on the <xsl:template> element, so
there's no default namespace in-scope on the <xsl:copy-of>
instruction. Try doing:

  <xsl:copy-of select="." xmlns="http://www.loc.gov/mods/v3" />

and see if that makes any difference.

You're in a bit of a tricky spot here because you want the default
namespace to be different in different parts of your output. The
default namespace needs to be 'http://docbook.org/docbook-ng' in some
places and 'http://www.loc.gov/mods/v3' in others. (In "sane"
documents -- see
http://lists.xml.org/archives/xml-dev/200204/msg00170.html -- you
would only have one default namespace throughout, but your documents
are neurotic.)

You can manage it in the way that you're doing, by putting explicit
default namespace declarations wherever you generate an element, but
it gets quite tedious.

What I suggest is that you use separate stylesheet modules for the
output that's in the DocBook namespace and the output that's in the
MODS namespace, and include one into the other. That way, you can
declare the default namespace once for the entire stylesheet module,
and rest assured that whenever you generate an element without a
prefix, that element will appear in the correct namespace. Something
like:

<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:db="http://docbook.org/docbook-ng"
                xmlns:mods="http://www.loc.gov/mods/v3"
                xmlns="http://docbook.org/docbook-ng"
                exclude-result-prefixes="mods">

<xsl:include href="mods.xsl" />

<xsl:template match="/">
  <xsl:variable name="temp">
    <xsl:apply-templates mode="enhanced-bib"/>
  </xsl:variable>
  <xsl:apply-templates select="$temp" mode="modified"/>
</xsl:template>

<xsl:template match="db:article" mode="enhanced-bib">
  <article>
    <xsl:apply-templates mode="enhanced-bib"/>
  </article>
</xsl:template>

<xsl:template match="db:info | db:section" mode="enhanced-bib">
  <xsl:copy-of select="."/>
</xsl:template>

<xsl:template match="db:bibliography" mode="enhanced-bib">
  <bibliography>
    <xsl:apply-templates select="mods:modsCollection" mode="enhanced-bib"/>
  </bibliography>
</xsl:template>

<xsl:template match="/" mode="modified">
  <xsl:copy-of select="*"/>
</xsl:template>

</xsl:stylesheet>


--- mods.xsl ---
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:mods="http://www.loc.gov/mods/v3"
                xmlns="http://www.loc.gov/mods/v3">

<xsl:template match="mods:modsCollection" mode="enhanced-bib">
  <modsCollection>
    <xsl:apply-templates select="mods:mods"  mode="enhanced-bib"/>
  </modsCollection>
</xsl:template>

<xsl:template match="mods:mods" mode="enhanced-bib">
  <key>test</key>
  <xsl:copy-of select="."/>
</xsl:template>

</xsl:stylesheet>

Note that the <key> element will be generated in the MODS namespace;
if you really want it to be in no namespace then do:

  <key xmlns="">test</key>

or:

  <xsl:element name="key" namespace="">test</xsl:element>

I also modified your code where you were creating double copies of
things; I think this is a cut-down version of the original stylesheet,
so you aren't really using this code anyway.
