Re: [xsl] When are <!DOCTYPE> and svg namespace references material?
Subject: Re: [xsl] When are <!DOCTYPE> and svg namespace references material?
From: "C. M. Sperberg-McQueen" <cmsmcq@xxxxxxxxxxxxxxxxx>
Date: Wed, 3 Feb 2010 12:01:47 -0700
On 3 Feb 2010, at 10:57 , Ylvisaker, Steve wrote:
There is a concern that our SVG graphics implementation may be
introducing external reference dependencies outside our local
network. An example graphic is:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg version="1.1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"
x="0px" y="0px" width="357.553px" height="216.893px" viewBox="0 0
357.553 216.893" enable-background="new 0 0 357.553 216.893"
Our graphics are, for the most part, generated by Adobe Illustrator
CS3 but we are running xslt transformations against them with Saxon
and viewing the graphics with a variety of tools: Firefox, InkScape,
Ai CS3, Antenna House formatter and Saxon-PE 188.8.131.52
I have isolated my work station (no corporate network or internet)
and all of these applications work fine. But I don't know if they
are trying to make an external reference, failing and driving on, or
if the <!DOCTYPE> and W3 name space references are little more than
The only way to be certain would be to use some system utility which
monitors network activity and reports attempts to open network ports.
The short answer is that none of the relevant specs themselves require
without qualification that such network resources be read, but they also
don't forbid it.
The longer answer has several parts.
(1) The presence of a DOCTYPE declaration does not, in principle, mean
that the external DTD file must be dereferenced, though that is often
the effect in practice.
The URI "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd" given as the
system identifier for the DTD must be consulted by any processor performing
DTD-based validation on the data. The presence of a DOCTYPE declaration
does not constitute an instruction to validate the document, and in principle
it would be good if processors like Firefox allowed you to specify whether
you want validation performed or not. But in practice, many programs don't
provide that kind of user control; instead they assume that if a DOCTYPE
declaration is present, they must or should validate the document. For such
programs, a request that they read a particular document amounts in effect to
an instruction that they should validate it, too, if a DOCTYPE declaration
is present.
Note that a program validating the document may or may not actually touch the
network: the authoritative source for the document is the server at www.w3.org,
but if your system has a caching proxy and the DTD is in the cache, there will
not necessarily be any network traffic. And software built to work with documents
of a particular kind may have and consult a locally cached copy of the DTD
instead of retrieving it from the network. In the case of DTDs served from
W3C servers, the DTDs change very infrequently and the expiration times are
set to encourage local caching; experience on those servers shows that
surprising numbers of programs and packages are willing to request the same
resource thousands of times in the same minute, whether the requests succeed
or fail. When this happens frequently, it can place a bit of a strain on
the server involved, so well behaved software should arrange for some
for a more complete account of some relevant issues.
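One way to see whether a given parser asks for the external DTD, without letting it reach the network, is to hand it an entity resolver that records each request and answers with an empty local stream. The following is a minimal sketch using Python's standard xml.sax module; it says nothing about how Firefox, Inkscape, Illustrator, Antenna House or Saxon behave, and the names SVG, RecordingResolver and external_requests are made up for the illustration. The sample document is a cut-down stand-in for the graphic quoted above.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, EntityResolver, feature_external_ges
from xml.sax.xmlreader import InputSource

# Cut-down stand-in for the Illustrator graphic (hypothetical sample data).
SVG = b"""<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg version="1.1" xmlns="http://www.w3.org/2000/svg" width="10" height="10"/>
"""


class RecordingResolver(EntityResolver):
    """Record every external entity the parser asks for; never fetch it."""

    def __init__(self):
        self.requests = []

    def resolveEntity(self, publicId, systemId):
        self.requests.append((publicId, systemId))
        # Answer with an empty in-memory stream so no URL is opened.
        empty = InputSource(systemId)
        empty.setByteStream(io.BytesIO(b""))
        return empty


def external_requests(document):
    """Parse `document` and return the (publicId, systemId) pairs requested."""
    resolver = RecordingResolver()
    parser = xml.sax.make_parser()
    # Off by default since Python 3.7.1; turned on here so that the parser
    # behaves like one that does try to load the external DTD.
    parser.setFeature(feature_external_ges, True)
    parser.setEntityResolver(resolver)
    parser.setContentHandler(ContentHandler())
    parser.parse(io.BytesIO(document))
    return resolver.requests


if __name__ == "__main__":
    for public_id, system_id in external_requests(SVG):
        print("parser asked for:", public_id, system_id)
```

With the feature switched on, the resolver is asked for the SVG 1.1 DTD by its public and system identifiers; with the feature left at its default, the list stays empty. The same hook is where a real application would return a locally stored copy of the DTD.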
(2) Many programs will fail gracefully (or relatively gracefully) if
they can't get to the DTD.
Many programs which attempt validation whenever they see a DOCTYPE
declaration will shrug their shoulders and proceed without validation if
they don't succeed in retrieving the required external resources (such as
the DTD). The logic of this behavior is not completely clear (if you think
validation is required, why would you proceed anyway if you can't perform
it?), but it's not uncommon.
(3) Namespace names serve purposes of uniqueness and documentation.
They seldom need to be dereferenced.
The URIs "http://www.w3.org/2000/svg" and "http://www.w3.org/1999/xlink"
in your sample graphic identify certain constructs in the XML as being
in the SVG or the XLink namespaces, respectively. The crucial effect of
this is to ensure that when the same local name is used in two different
namespaces, markup can reliably be assigned to one or to the other. There
is no need to dereference the namespace URI in order for software to
make that distinction.
Any software responsible for processing a particular vocabulary will
know, given an element named (for example) "desc", whether it's the
element they know about (e.g. the SVG desc), or some other "desc"
(any desc in any other namespace). That also does not require that the
URI be dereferenced; software built to process SVG, for example, will
certainly have the SVG URI hard-coded into it somewhere.
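A small sketch of that point, using Python's standard xml.etree.ElementTree: the namespace name is compared as a plain string and nothing is fetched. The second namespace URI (http://example.org/other), the sample document and the helper name svg_descs are invented for the example.

```python
import xml.etree.ElementTree as ET

# The SVG namespace name, hard-coded the way an SVG-aware tool would have it.
SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical document with two "desc" elements in different namespaces.
DOC = """<svg xmlns="http://www.w3.org/2000/svg" xmlns:x="http://example.org/other">
  <desc>an SVG description</desc>
  <x:desc>some other desc</x:desc>
</svg>"""


def svg_descs(text):
    """Return the text of every desc element in the SVG namespace."""
    root = ET.fromstring(text)
    # ElementTree spells a qualified name as "{namespace-uri}local-name",
    # so telling the two desc elements apart is a string comparison.
    return [el.text for el in root.iter("{%s}desc" % SVG_NS)]


if __name__ == "__main__":
    print(svg_descs(DOC))
```

Only the first desc is returned; the x:desc element shares the local name but not the namespace, and no URI was dereferenced to decide that.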
On the other hand, namespace documents are occasionally used to provide
links (e.g. via a RDDL document) to relevant resources, e.g. schemas
in various schema languages. And so software may occasionally dereference
a namespace URI to see if it can find relevant resources there.
And of course if a human is trying to understand what this SVG stuff is,
then they might do worse than dereference the URI to see if it provides
any useful human-readable information, or pointers to such information.
(The SVG and XLink URIs do in fact do this.)
Three of the applications, Firefox, InkScape and Adobe CS3 care
about the name of the xmlns URL.
They should: they include special code to process SVG, and that code
should work on SVG elements and attributes but not on random markup in
other namespaces.
Something other than www.w3.org trips them up. Antenna House and
Saxon don't seem to care.
I don't know why Antenna House behaves as it does.
Saxon, not being an SVG processor, will almost certainly not care what
namespace URI is used. But if the namespace URI in the input document
and the one in the stylesheet don't match, you are unlikely to be getting
the transformation you had in mind.
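The effect of a mismatch can be sketched outside XSLT as well. The example below uses the limited XPath support in Python's xml.etree.ElementTree as a stand-in for a stylesheet's match patterns; it is not Saxon, and the "internal" namespace URI, the sample document and the helper name count_rects are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: one rect in the real SVG namespace.
DOC = '<svg xmlns="http://www.w3.org/2000/svg"><rect width="1" height="1"/></svg>'


def count_rects(text, namespace_uri):
    """Count child rect elements whose namespace name equals namespace_uri."""
    root = ET.fromstring(text)
    # The prefix "s" is local to this query; only the URI it maps to matters.
    return len(root.findall("s:rect", {"s": namespace_uri}))


if __name__ == "__main__":
    print(count_rects(DOC, "http://www.w3.org/2000/svg"))
    print(count_rects(DOC, "http://internal.example/svg"))
```

The query bound to the SVG namespace name finds the rect; the one bound to the invented internal URI finds nothing and raises no error, which is the same quiet failure a stylesheet shows when its namespace declaration differs from the one in the input.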
With the <!DOCTYPE> declaration I can reference www.w3.org as above,
or reference an internal network URL or drop the declaration all
together and none of the applications perform differently. All of
this is, of course, anecdotal data at best. It would be great to
know for sure what is going on.
It sure would :)
My question: Is there ever an attempt to make an external reference
to www.w3.org from either the <!DOCTYPE> declaration or the xmlns
I hope the details above help a bit, even though the answer is
a rather disappointing "it depends on the program". Most XML specs
work very hard to provide a declarative semantics for what they
define, and the result is that conforming software has a fair bit
of leeway as to what it does in particular cases.
If your organization is worried about things not working if the
network goes down, I think your experiments show that that worry is
not well founded. I think you would be best advised not to try
to strip out the references to external resources.
* C. M. Sperberg-McQueen, Black Mesa Technologies LLC