[xsl] Matching data types

Subject: [xsl] Matching data types
From: Jeni Tennison <jeni@xxxxxxxxxxxxxxxx>
Date: Sun, 13 Jan 2002 12:55:31 +0000
David wrote:
> well, some of it's in some bits of omnimark, but the processing model
> is so different there (you have access to start and end tags as
> strings, if you want, and variables have this strange property of
> being updatable :-)

Ooh, weird! It'll never catch on, you know.

> hmm, of course a schema language that types its elements with xpath
> patterns rather than schema complex types might look more
> schematron-y than xml schema-like, but I know what you mean..

I think it's about what you view the match pattern in xsl:template as
actually doing. I think that it does two things:

  - it tests whether a node is an instance of a particular data type,
    so the processor knows whether to use it to process a node or not
  - it asserts that the nodes that are processed by the template are
    of a particular type (a bit like the type declared for the single
    parameter of a one-parameter function)

But of course you're right that you can do things with patterns that
you can't do with data types, most importantly test the location of
the instance of the element or attribute within the source document.

Also, the pattern syntax focuses on matching elements based on their
*ancestors*. Data typing, on the other hand, focuses on checking the
content of an element.

It seems to me that in a schema-aware context there are three things
that you might want to test/assert:

  - the location of the element in the instance document
  - the identity of the element declaration for the element
  - the identity of the type definition for the element

In some cases, the element declaration that's used for the element
dictates the location of the element in the document, but this isn't
always the case (for example, when the element declaration is global,
or when you have recursively nested structures in the markup language).

Likewise, in some cases you can identify the type definition for an
element from its element declaration, but the xsi:type attribute can
assign a different (derived) type to a particular instance of the
element.
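
To make the xsi:type point concrete, here is a rough Python sketch of
how the effective type of an element instance might be worked out. The
element and type names (order, address, Address, UKAddress) and the
DECLARED_TYPES table are made up for illustration; a real processor
would get the declared type from the schema and would also check that
the override really is derived from it.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical table standing in for the schema: element name -> declared type.
DECLARED_TYPES = {"address": "Address"}

def effective_type(element):
    """Return the xsi:type override if present, else the declared type."""
    override = element.get(f"{{{XSI}}}type")
    if override is not None:
        # No check here that the override is derived from the declared type.
        return override
    return DECLARED_TYPES.get(element.tag)

doc = ET.fromstring(
    '<order xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
    '<address>...</address>'
    '<address xsi:type="UKAddress">...</address>'
    '</order>'
)

types = [effective_type(child) for child in doc]
print(types)  # ['Address', 'UKAddress']
```

Both address elements share one element declaration, but only the
first has the type you would infer from that declaration.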

Getting hold of the element declaration for an element tells you what
substitution group the element belongs to, but you might want to jump
straight to that test.

Similarly, getting hold of the type definition for an element tells
you exactly where the type of the element is positioned in the type
hierarchy, but you might want to test within that hierarchy
immediately.

So the kind of sentences that you need to put together to test/assert
element types look like:

  ElementType := ElementInstancePattern
                 ("declared" "as" ElementDeclaration)?
                 ("in" "substitution" "group" ElementDeclaration)?
                 ("of" "type" ("derived" "from")? TypeDefinition)?

Where ElementInstancePattern is a path pattern without predicates,
where the last step uses the child:: axis and a name test.
ElementDeclaration and TypeDefinition are discussed below.
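
As a sanity check on the grammar, here is a minimal Python sketch of a
recogniser for ElementType sentences. It assumes, purely for
simplicity, that the instance pattern and each declaration or type
path is a single whitespace-free token; the function name and the
shape of the returned dictionary are my own invention, not part of any
proposal.

```python
def parse_element_type(sentence):
    """Parse an ElementType sentence into its optional clauses."""
    tokens = sentence.split()
    if not tokens:
        raise ValueError("empty sentence")
    if "[" in tokens[0]:
        raise ValueError("instance pattern must not contain predicates")
    result = {
        "pattern": tokens[0],
        "declared_as": None,
        "substitution_group": None,
        "type": None,
        "derived": False,
    }
    pos = 1

    def accept(*keywords):
        # Consume the keywords if they appear next, in order.
        nonlocal pos
        if tuple(tokens[pos:pos + len(keywords)]) == keywords:
            pos += len(keywords)
            return True
        return False

    def operand():
        # Consume one declaration or type path.
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("expected a schema component path")
        pos += 1
        return tokens[pos - 1]

    # The clauses are optional but must appear in grammar order.
    if accept("declared", "as"):
        result["declared_as"] = operand()
    if accept("in", "substitution", "group"):
        result["substitution_group"] = operand()
    if accept("of", "type"):
        result["derived"] = accept("derived", "from")
        result["type"] = operand()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result

parsed = parse_element_type(
    "person of type derived from /element::person/type::*"
)
print(parsed["type"], parsed["derived"])
```

The sketch doesn't check that the last step of the instance pattern
uses the child:: axis and a name test; it only rejects predicates.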
                 
The sentences for attributes look similar, except you don't have to
worry about substitution groups:

  AttributeType := AttributeInstancePattern
                   ("declared" "as" AttributeDeclaration)?
                   ("of" "type" ("derived" "from")? TypeDefinition)?

Where AttributeInstancePattern is a path pattern without predicates,
where the last step uses the attribute:: axis.
                   
Simple typed values don't have a location in the instance document and
don't have declarations, just type definitions. However, you can't
just have a keyword to give their type because this would conflict
with ElementInstancePatterns (this kind of thing is a problem in the
current XPath 2.0 WD definition of DataType, because "item" could mean
either 'any item' or 'a simple typed value of type item'). I think
something that looks like a node test would work:

  SimpleType    := "data" "(" ")"
                   ("of" "type" ("derived" "from")? TypeDefinition)?

Finally, if you don't care whether it's a node or a simple typed
value, you could use item():

  ItemType      := "item" "(" ")"


One of the big problems in testing/asserting the declaration or type
definition for a node (something that's glossed over in the current
XPath 2.0 WD) is that because XML Schema allows local declarations
(and anonymous type definitions), the 'name' of an element declaration
is always qualified by a path through the schema that leads to the
element declaration, not a simple single name.

For example, assuming a schema with no target namespace, if you had:

  <xs:element name="name" type="xs:string" />

  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="personName" />
        ...
      </xs:sequence>
    </xs:complexType>
  </xs:element>

Then you have two element declarations for the 'name' element - a
global one and a local one. These have to be addressed through a kind
of path - to adapt the path described in the XML Schema Formal
Description WD
(http://www.w3.org/TR/xmlschema-formal/#section-structures-names) the
two elements would be addressed with:

  /element::name
  /element::person/type::*/element::name
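
To show that paths like these are enough to tell the two declarations
apart, here is a rough Python sketch that resolves them against the
schema fragment above. The resolve() and local_elements() functions
are hypothetical helpers of mine, they only understand the element::
axis and type::*, and the elided "..." content is simply left out.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="name" type="xs:string" />
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="personName" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

def local_elements(type_def):
    """Yield element declarations in a content model, not entering nested types."""
    for child in type_def:
        if child.tag == XS + "element":
            yield child
        elif child.tag in (XS + "sequence", XS + "choice", XS + "all"):
            yield from local_elements(child)

def resolve(schema, path):
    """Return the schema components addressed by a path of axis::name steps."""
    current = [schema]
    for step in path.strip("/").split("/"):
        axis, _, name = step.partition("::")
        found = []
        for node in current:
            if axis == "element":
                if node.tag == XS + "schema":
                    candidates = node.findall(XS + "element")  # global declarations
                else:
                    candidates = local_elements(node)          # local declarations
                found += [c for c in candidates if name in ("*", c.get("name"))]
            elif axis == "type" and name == "*":
                # Anonymous type definition nested in an element declaration.
                found += node.findall(XS + "complexType")
                found += node.findall(XS + "simpleType")
            else:
                raise ValueError(f"unsupported step {step!r}")
        current = found
    return current

schema = ET.fromstring(SCHEMA)
global_name = resolve(schema, "/element::name")
local_name = resolve(schema, "/element::person/type::*/element::name")
print([e.get("type") for e in global_name])  # ['xs:string']
print([e.get("type") for e in local_name])   # ['personName']
```

Each path picks out exactly one declaration, which a bare 'name' could
not do.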

[There's a temptation here to cut out the anonymous complex type
 definition step, to make a shorthand /element::person/element::name.
 I think that would be OK (and better than using //, since // would
 imply 'descendant'). On the other hand, making it /person/name would
 be misleading, because it makes you think it's referring to elements
 in the source document rather than declarations in the schema.
 Perhaps /!person/!name (or some other character to indicate
 declarations).]

You rarely have to address type definitions through particularly long
paths because the only type definitions that can have names are global
type definitions, but you can imagine situations where you might want
to. For example, to assert that a particular variable holds a sequence
of nodes that validates against the anonymous complex type within the
person element declaration, you'd need to use:

  /element::person/type::*


If you used sentences of this form to test/assert item types, you
could say that the match attribute of xsl:template can take two forms:

  - a pattern that includes predicates in the path pattern and
    matches solely against the instance document
  - a pattern that doesn't include predicates in the path pattern, and
    can include assertions about the type of the matched item

The latter, I would have thought, would enable the processor to
statically check at least some of the paths and function calls used
within the template, and would enable the processor to narrow down
matching templates quite quickly.
  

Anyway, these are just some ideas. From what I gather from exchanges
on XML-Dev, the WGs are working on this area at the moment and have
some ideas of their own about how to proceed with it.

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

