Re: Was: [xsl] mode and moved to Namespaces

Subject: Re: Was: [xsl] mode and moved to Namespaces
From: ac <ac@xxxxxxxxxxxxx>
Date: Mon, 18 Apr 2011 14:41:37 -0400
Hi Wendell,

Thank you for your interesting response.

First, I may try to go even further than you have and say that natural languages are not designed to carry meaning (semantics) but rather to let emitters move the context focus of the receiver, helping the receiver to correlate concepts (e.g. neural patterns) in his context (e.g. knowledge), typically to generate more views/concepts/neural patterns. The most useful words are "thing", "have", "do", "I", "you", "they", which are all pretty meaningless without stakeholder context. In other words, I more than agree that the "natural" languages, based on human convention and tradition, lack clear definition.

Nevertheless, these human languages are known to have vocabularies, however twisted those may be, and vocabularies are sets of names (e.g. symbols) for things, actions, and qualities, to name just a few. As such, not in XML terms, but in logical terms, or conceptually, the English language is a name space.

Second, let's just remember that this thread started as I tried to show various potential namespace use cases. The dictionary is just one, and not the main one for me. Also, this is in no way against attributes and/or elements.

Third, there are many cases, as I also tried to point out in one of the previous messages of this thread, where the transformation programmer does not have control or knowledge of the element and attribute names, or even of whether a given item is going to be an attribute or an element. I offered a simple example of GML data that needs to be parsed. Documents may use GML data in many different ways and under very different names (e.g. source, destination, position, center, etc). Still, they all need to be parsed under the same rules. By using a namespace, the issue is nicely generalized: authors only have to use the right namespace with whatever significant names they require, and the programmer simply writes a template that matches the namespace.
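
A minimal sketch of such a template, assuming a hypothetical namespace URI (http://example.org/ns/geo, bound here to the prefix geo) and a hypothetical "point" output element; neither comes from GML itself:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:geo="http://example.org/ns/geo"
    exclude-result-prefixes="geo">

  <!-- Matches every element in the (hypothetical) geo namespace,
       whatever its local name: geo:source, geo:destination,
       geo:position, geo:center, and so on. -->
  <xsl:template match="geo:*">
    <!-- "point" is an illustrative output element, not part of any schema. -->
    <point role="{local-name()}">
      <xsl:value-of select="normalize-space(.)"/>
    </point>
  </xsl:template>

</xsl:stylesheet>
```

With this in place, a document author can introduce a new name such as geo:waypoint and the stylesheet needs no change, since the match depends only on the namespace.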

This logic is also easily extended to the dictionary use case.

The example you provide:

<word en-m="Mr" en-f="Mrs" fr-m="M." fr-f="Mme" ... />

can indeed work. But if the dictionary is to be better designed (and here I am sorry for oversimplifying my previous example), what you have could look like this (mind you, this is still a simplification):
<word en:title="Mr" en-f:title="Mrs" fr:title="M." fr-f:title="Mme" ... />
<word en:verb="do" fr:verb="faire" it:... />
<word en:noun="chair" fr:noun="chaise" ... />


or alternately:
<word en:title="Mr" en-f:title="Mrs" fr:title="M." fr-f:title="Mme" ... />
<word en:verb="do" fr:verbe="faire" it:... />
<word en:noun="chair" fr:nom="chaise" ... />

These could not be expressed as you proposed, and expressing them otherwise would be comparatively quite painful.
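
As a rough sketch of how a stylesheet might read such a dictionary, assuming a hypothetical namespace URI for French (http://example.org/ns/lang/fr), a hypothetical "entry" output element, and at most one French attribute per word:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fr="http://example.org/ns/lang/fr"
    exclude-result-prefixes="fr">

  <!-- For each word, pick whichever attribute sits in the (hypothetical)
       French namespace, regardless of its local name (title, verbe, nom).
       Assumes at most one such attribute per word; in XSLT 1.0 only the
       first node of the selected set is used. -->
  <xsl:template match="word">
    <entry lang="fr" kind="{local-name(@fr:*)}">
      <xsl:value-of select="@fr:*"/>
    </entry>
  </xsl:template>

</xsl:stylesheet>
```

The same template covers both variants above, since it never names "verb" or "verbe" explicitly.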

Of course, the dictionary should also manage singular and plural instances, adding more namespace requirements, and one could reach 4 namespaces per supported language. But just think of what the other options imply and how node count and complexity would increase.

My propositions for improvement would be to first ensure that namespaces are searched efficiently and then to consider hierarchical namespaces. Why should they be limited to a single flat level? Logically, name spaces can clearly be subsets of others, yet XML does not support that.

As for starting up with XML and XSLT, I would not recommend starting with hierarchical namespaces, any more than with higher order functions.

Thank you.

Regards,
ac



Dear Andre,

On 4/17/2011 7:30 PM, ac wrote:
I am surprised that, with all these XML and XSLT gurus around the
table, using more than 8 namespaces in a stylesheet or application,
seems like such a strange, "out of bounds", thing. Am I crazy, or
missing something big, or is everyone sleeping at the wheel? What are
namespaces for? How can this be an issue? I am puzzled.

It's not an issue. It's just that in the normal use case for XSLT (admitting that there may really be no such thing), we just don't need so many of them.

Don't natural languages at least each have their own "natural"
namespace? If an application supports i18n and localization, should
it use fewer namespaces than the number of locales it supports? What
about translation dictionaries? Should we reinvent a different
mechanism to implement natural namespaces, rather than simply use XML
namespaces?

I'm not sure where to begin with this.


The deployment of namespaces and the management of the scoping issues
related to them is a non-trivial problem in XML applications, and an
interesting one. But until you actually offer a specific analogy or
comparison with natural languages, it's hard to say whether any would
make sense.

As others have remarked, XML offers myriad ways of distinguishing
semantics not only with element and attribute types but also their
combination, what we ordinarily call "context". While namespaces can
conceivably be used for this purpose, I can't think offhand of any
that have caught on widely ... not because they don't work well
(although they might not), but because they are less "normal" given
XML's ordinary document-centric conceptual model. Part of this may be
simply force of habit, but it also has to do with tools,
implementation and maintainability, especially for markup languages
and applications intended to be used by dozens or thousands of people
without much expert support.

What about the 22 "basic" namespaces that I quoted, as well as other
 basic XML namespaces? Should one not use RDF when using StratML, or
XSD, or Atom?

Should names like "position" be in the same namespace whether it is
referring to time, or space, or both? Should one not manage time if
one is managing space or vice versa?

That depends. Natural languages handle such problems situationally, through what you might call "operational context" -- combined with feedback for disambiguation. (That happens on this list routinely, as when you say X and I say "did you mean XY" and you say "no, I meant XX".) Chomsky notwithstanding, not many of the rules of natural languages seem to be hard wired: the brain seems to be more like a flash (writable) ROM. Linguists and psychologists debate over which rules and features of natural languages might have a genetic and physiological basis.

Your question seems to imply that natural languages each have their
own namespace, as in "English", "French" and so forth. But linguists
will certainly dispute this: the boundaries between languages are
actually much more fluid than this, and are constantly being
negotiated (see "feedback" above). Not only are the boundaries
themselves not stable over time, but even the clarity of a boundary
may change. And many (most, or all) people are multilingual, even if
the only languages any single person speaks are "English" or "French".

Should artificial languages be any different? On the other hand, as I
said I'm not sure what bearing this has on the question. (It is
interesting though.)  Most newcomers to XML find namespaces to be
probably the least intuitive, most confusing and most
"un-language-like" aspect of XML (possibly since they're no longer
about labeling things, but about labeling the labels). That tells you
something, I think.

Yet even the example you offer doesn't really use namespaces in the
way you suggest they should work:

<word en:instance="Mr" en-f:instance="Mrs" fr:instance="M."
fr-f:instance="Mme" ... />

In XML, the namespace qualifies the name, not the content, of the
thing being named (element, attribute, PI, variable, etc.). (The
content may be qualified by all kinds of things, not only the name of
the node it appears on, but the names of ancestors, neighbors, etc.)
But here you are not qualifying the name "instance" four different
times, but rather the content of the namespace-qualified attribute.

This is as if I had

<en:title>Moby Dick; or, The White Whale</en:title>
<fr:title>Moby Dick; ou, La Baleine Blanche</fr:title>

(With apologies to those who know the whale in question is a very
masculine whale, even if the noun that identifies him isn't.)

The more normal way to do this would be

<title xml:lang="en">Moby Dick; or, The White Whale</title>
<title xml:lang="fr">Moby Dick; ou, La Baleine Blanche</title>

And (to bring this back on topic) part of the reason for this is that
you generally want both titles to match the same template in XSLT.
(And when you don't, there are easy ways to distinguish them.)
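
A minimal sketch of what that looks like in a stylesheet, with the h1 output and the class attribute as purely illustrative assumptions:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- One template handles every title, whatever its language. -->
  <xsl:template match="title">
    <h1><xsl:apply-templates/></h1>
  </xsl:template>

  <!-- When a language does need special handling, lang() tests the
       xml:lang value in scope. The predicate gives this pattern a higher
       default priority (0.5) than the plain name test (0), so it wins
       for French titles. -->
  <xsl:template match="title[lang('fr')]">
    <h1 class="fr"><xsl:apply-templates/></h1>
  </xsl:template>

</xsl:stylesheet>
```

Had the titles been en:title and fr:title instead, each namespace would have needed its own match pattern (or a union of them) just to get the common behavior.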

In effect, I think the technique you offer here is trying to get
around the limitation in XML that attributes cannot be qualified by
attributes; hence "en:instance" and "en-f:instance" etc. But
namespaces can only go so far with that -- as is indicated here by the
fact that you are (sometimes) overloading the namespace with
information respecting both language and morphology (here, the gender
of the noun). Really what you have here is a graph structure, in which
data points are being discriminated with multiple qualifiers.

Plus, there's no particular gain here from the fact that all the
attributes have the same local name. Semantically (at least as given),
it's identical to having

<word en="Mr" en-f="Mrs" fr="M." fr-f="Mme" ... />

What kind of XML data should stylesheets transform, and to what XML
data should they transform it, so that stylesheets do not use more
than 8 namespaces?

If you don't need them, then why have them? Hundreds of examples of XSLT are available on the net to consider with only a namespace or two (or none but the XSLT namespace itself).

Why get all the power of XSLT3, functional programming, and higher
order functions, for XML, if the subject and object XML is limited to
8 namespaces, including a whole set of predefined ones reserved for
the language itself?

Who says it's limited?


My stylesheets rarely need more than three or four namespaces declared
explicitly. But I couldn't begin to count the number of different
namespaces I've ever used or seen used -- much less the ones I've used
without ever knowing it.

Cheers, Wendell
