Re: [xsl] "Illegal character in authority" error running java Saxon under Windows

From: Owen Rees <owen.rees@xxxxxx>
Date: Wed, 20 Feb 2008 11:13:38 +0000
--On Wednesday, February 20, 2008 10:04:36 AM +0000 Andrew Welch wrote:

Spaces aren't allowed in URIs, but the spec does say:

"System identifiers (and other XML strings meant to be used as URI
references) may contain characters that, according to [IETF RFC 3986],
must be escaped before a URI can be used to retrieve the referenced
resource. The characters to be escaped are the control characters #x0
to #x1F and #x7F (most of which cannot appear in XML), space #x20, the
delimiters '<' #x3C, '>' #x3E and '"' #x22, the unwise characters '{'
#x7B, '}' #x7D, '|' #x7C, '\' #x5C, '^' #x5E and '`' #x60, as well as
all characters above #x7F. "

http://www.w3.org/TR/REC-xml/#dt-sysid

So it should be fine to have spaces in system identifiers.  The next
step should be to try a newer version of Xerces (or whichever parser
you're using) and go from there...

The XML spec may say that spaces in a system ID are converted to %20 before it is handed off to whatever deals with the URI, but that leads into one of the less clear areas of URI syntax when the encoded space falls in the authority part of the URI. There is also the issue that a URI reference starting with '/' is a relative reference, so it must be resolved against some base URI before you can know what is valid for the particular scheme.
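
To illustrate the point about relative references, here is a minimal sketch using java.net.URI. The class name, the example.org base and the paths are made up for the illustration, and the XML parser or XSLT processor in question may well resolve references through different code.

```java
import java.net.URI;

public class RelativeRefDemo {
    // True when the reference has no scheme, i.e. it is a relative reference.
    static boolean isRelative(String ref) {
        return !URI.create(ref).isAbsolute();
    }

    // Resolve a reference against a base URI; only the result carries a
    // scheme, so only the result can be checked against scheme-specific rules.
    static String resolve(String base, String ref) {
        return URI.create(base).resolve(ref).toString();
    }

    public static void main(String[] args) {
        // Hypothetical base and reference, chosen only for the illustration.
        System.out.println(isRelative("/c/d.xml"));
        System.out.println(resolve("http://example.org/a/b.xsl", "/c/d.xml"));
    }
}
```

Until the base is known, "/c/d.xml" on its own says nothing about which scheme's rules apply to it.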


The part of RFC 3986 that permits percent-encoded octets in the authority part of a hierarchical URI does so in order to allow non-ASCII characters, and it implies that they should not be used otherwise, although the wording is not particularly clear. I think this suggests that a space is not a valid character in the authority part of a URI. Syntax conformance of that part of a URI is delegated to the O/S anyway, so it is not surprising that implementations differ once you go beyond what the RFC recommends (names that conform to the DNS syntax).
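
As a rough illustration of where the error in the subject line can come from, the sketch below feeds a few strings to java.net.URI and reports the parser's reason for rejecting each one. The host and path names are invented, and I am assuming that the failing code path goes through java.net.URI; other URI implementations may draw the line elsewhere.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class AuthorityDemo {
    // Returns the parser's reason for rejecting the string, or null if it parses.
    static String rejectionReason(String s) {
        try {
            new URI(s);
            return null;
        } catch (URISyntaxException e) {
            return e.getReason();
        }
    }

    public static void main(String[] args) {
        // Hypothetical inputs: a raw space in the authority, an escaped
        // space in the authority, and an escaped space in the path only.
        String[] samples = {
            "file://my host/doc.xml",
            "file://my%20host/doc.xml",
            "file:///C:/my%20dir/doc.xml"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + rejectionReason(s));
        }
    }
}
```

A raw space between the "//" and the next "/" is rejected outright. The escaped form gets through this particular parser, but what a given resolver then does with such an authority is exactly the unclear area described above.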

--
Owen Rees
========================================================
Hewlett-Packard Limited.   Registered No: 690597 England
Registered Office:  Cain Road, Bracknell, Berks RG12 1HN
