Re: [xsl] XML tags as map keys and impact on XSLT/XPath

Subject: Re: [xsl] XML tags as map keys and impact on XSLT/XPath
From: Michael Kay <mike@xxxxxxxxxxxx>
Date: Fri, 18 Jun 2010 11:14:42 +0100
On 18/06/2010 07:28, Wolfgang Laun wrote:
> Every now and then, people (not me) want to represent a Map<K,V> in XML by using
> something like
>     <map>
>       <k1>v1</k1>
>       <k2>v2</k2>
>       ...
>     </map>
> with ki from K and vi from V. Apart from the obvious limitation for K's values,
> I feel that this is somehow violating the spirit of XML. But this is not a list
> for XML, and I don't want to risk a red or yellow card.
>
> So, more specifically: Doesn't such a "structure" complicate the writing
> of XSLT constructs? Aren't there any statements or expressions that
> won't be usable at all? (I don't need an exhaustive list of what isn't
> possible - I'm more interested in a general judgment.)


I agree: in general it's a poor way of using XML, and it makes the data more difficult to process with XSLT.
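
To make the difficulty concrete, here is a minimal sketch that looks up one value in each representation. It assumes an alternative form <map><entry key="k1">v1</entry>...</map>; the names "entry", "key" and the parameter "wanted" are invented for illustration and do not come from the original question.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical parameter: the map key to look up -->
  <xsl:param name="wanted" select="'k2'"/>

  <!-- Keys as element names: the name test has to be a wildcard,
       with the actual name compared at run time -->
  <xsl:template match="map[not(entry)]">
    <xsl:value-of select="*[local-name() = $wanted]"/>
  </xsl:template>

  <!-- Keys as attribute values (assumed alternative form):
       an ordinary name test plus a predicate on the key -->
  <xsl:template match="map[entry]">
    <xsl:value-of select="entry[@key = $wanted]"/>
  </xsl:template>
</xsl:stylesheet>
```

Both lookups work, so nothing becomes strictly impossible. The cost is that the first form cannot name the elements it processes: every match pattern and path expression falls back on * and a comparison against local-name().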


The exception is when the set (k1, k2, k3, ...) is very predictable and unlikely to change. There's room for debate about this. A schema used in the XMark benchmark uses the names of continents as element names (<Asia>, <Europe>, etc.). That feels wrong to me, just as it would feel wrong to use the names of continents as Java variable names. There are many borderline cases: should one use <home-phone>, <work-phone>, and <mobile-phone>, or should one use <phone role="home"> and so on? Probably the latter, because it makes it easier to change the set of roles and easier to process all phone numbers in a generic way. But in the end, drawing the line between data and metadata is subjective.
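
As a rough illustration of the phone example, the sketch below handles both designs in one stylesheet. The output format (role, colon, number) and the <fax-phone> element mentioned in the comment are assumptions made up for the example.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Role as an attribute: one rule covers every current and future role -->
  <xsl:template match="phone">
    <xsl:value-of select="concat(@role, ': ', ., '&#10;')"/>
  </xsl:template>

  <!-- Role in the element name: the rule has to enumerate the set, so adding
       a hypothetical <fax-phone> later means editing the stylesheet -->
  <xsl:template match="home-phone | work-phone | mobile-phone">
    <xsl:value-of select="concat(substring-before(local-name(), '-phone'),
                                 ': ', ., '&#10;')"/>
  </xsl:template>

  <!-- Suppress text that is not inside a phone element -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

The second rule also has to recover the role by taking the element name apart, which is the kind of processing that suggests the role was data all along.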

The objection about XSLT processing can in principle be overcome if the elements in the set are declared in a schema either as having a common type or as members of a substitution group; doing this gives you a handle in a schema-aware stylesheet to define match patterns that match all elements in the set, or path expressions that select them all. But fixing the set of values in a schema in some ways compounds the error, because it makes it even more difficult to change the value set over time as requirements evolve.
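
A schema-aware XSLT 2.0 stylesheet along these lines might look like the sketch below. Everything specific in it is an assumption: the namespace, the file name contacts.xsd, the head element c:phone, the type c:phoneType and the container c:contact are hypothetical. It also needs a schema-aware processor and input that has been validated against the schema, since the tests rely on type annotations.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:c="http://example.com/contacts">
  <xsl:output method="text"/>

  <!-- Hypothetical schema: declares c:phone as the head of a substitution
       group containing c:home-phone, c:work-phone and c:mobile-phone,
       all of type c:phoneType -->
  <xsl:import-schema namespace="http://example.com/contacts"
                     schema-location="contacts.xsd"/>

  <!-- Path expression: selects every child whose type annotation is
       c:phoneType or a type derived from it, whatever its name -->
  <xsl:template match="c:contact">
    <xsl:value-of select="count(element(*, c:phoneType))"/>
    <xsl:text> phone number(s)&#10;</xsl:text>
    <xsl:apply-templates select="element(*, c:phoneType)"/>
  </xsl:template>

  <!-- Match pattern: matches c:phone and every member of its
       substitution group -->
  <xsl:template match="schema-element(c:phone)">
    <xsl:value-of select="local-name(), ': ', ., '&#10;'" separator=""/>
  </xsl:template>
</xsl:stylesheet>
```

The stylesheet no longer lists the element names, but the list has only moved into the schema, which now has to be changed whenever a new name joins the set.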

SGML old-timers at this point will start reminiscing about architectural forms. I mention this only to point out that it's an issue that has been around for a long time.

Michael Kay
Saxonica
