Subject: Re: [xsl] Content constructors and sequences
From: Jeni Tennison <jeni@xxxxxxxxxxxxxxxx>
Date: Thu, 10 Jan 2002 11:57:16 +0000

Hi Kevin,

> I like the ideas in this, but (isn't there always one?) as I think
> other people have said the tree or sequence representation is
> difficult.

You mean managing which is used when a sequence is assigned to a variable? That is, does it create a new tree (as in XSLT 1.0) with the variable assigned to the document node of that tree, or is the variable simply assigned to the sequence that is created from its content?

> Anyway, I am probably missing something important but can't you just
> create a tree using the new semantics of copy-of on a sequence of
> simple values. The only disadvantage I can see is if you want to
> really have to a sequence of simple values you will need to convert
> it back from a tree but that doesn't sound like it would be
> difficult. Particularly if processors are smart enough to retain the
> data in its native format as it has been suggest they might.

There are a few things that make it not as straightforward as you might think - which is why XSLT as it stands is far from being adequate for manipulating sequences (and why we're getting all this functionality from XSLT being added to XPath).

First, what does it mean to copy a simple typed value into a tree? (By tree I'm assuming you specifically mean node trees as in the XPath data model, not a more general 'tree' data type, since XPath doesn't have that concept.)

If it means creating a new type of node, called simple typed value nodes, then it could possibly work, but this would mean extending the XML Infoset.

If it means creating text nodes from each of those simple typed values, then you lose the separation between the simple typed values because (as you know) adjacent text nodes get combined together when a document is created. You might be able to separate out the values in some cases (assuming that simple typed values get whitespace added around them when they are converted to text nodes), but not if some of the simple typed values were strings with spaces in them.

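To make the merging problem concrete, here is a minimal sketch in Python rather than XSLT, using the standard library's xml.dom.minidom as a stand-in for a node tree. The function name copy_values_as_text and its separator parameter are invented purely for illustration, and normalize() only approximates what happens to adjacent text nodes when a document is constructed.

```python
from xml.dom.minidom import Document


def copy_values_as_text(values, separator=""):
    """Copy each value into its own text node, then merge adjacent text nodes.

    Hypothetical helper: it models "copying simple typed values into a tree"
    by turning every value into a text node under a single <values> element.
    Returns (text nodes before merging, text nodes after merging, merged text).
    """
    doc = Document()
    root = doc.createElement("values")
    doc.appendChild(root)
    for value in values:
        root.appendChild(doc.createTextNode(str(value) + separator))
    before = len(root.childNodes)
    root.normalize()  # joins adjacent text nodes into one
    after = len(root.childNodes)
    return before, after, root.firstChild.data


# Three values go in, one text node comes out.
before, after, merged = copy_values_as_text(["red", "light blue", 3])

# Adding whitespace around each value does not rescue the boundaries,
# because one of the values already contains a space.
_, _, spaced = copy_values_as_text(["red", "light blue", 3], separator=" ")
recovered = spaced.split()
```

Here before is 3 and after is 1, and splitting the whitespace-separated text gives back four tokens rather than the original three values, which is the failure case described above.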
Stopping text nodes from being merged would be one possibility, so that each simple typed value existed in its own text node.

The second problem is more significant. One of the features of node sets (and node sequences), and in fact what enables us to work with them, is that they do not contain *copies* of the nodes, but the nodes themselves. This means that you can take a node from a sequence and find out its ancestors and siblings in the document that it came from - those don't change when you put it in a sequence. It is hard to see how this behaviour could be replicated if there were only trees (documents) in XSLT.

One method that I think the WG considered (given that it was in an earlier draft of the data model) was introducing a new type of node - a reference node - which could stand in for the node itself. I'm not exactly certain why it was dropped; again, it would be an extension to the XML Infoset.

It would certainly be interesting to see a proposal that made these two extensions to the data model - adding simple typed value nodes and reference nodes. This would enable documents to hold simple typed values and references to nodes from other documents, which would mean XSLT could do the same "sequence manipulation" as XPath has to now.

[I suspect that there would be strong resistance from XQuery, since these new types of nodes would be added at a fundamental level, to a shared data model, despite the fact that they're only required in XSLT.]

> I would like to see sequences in XSLT, but I don't think putting
> them in along side trees is the natural approach. More as an
> integral part of the tree structure.

We're used, in XSLT, to seeing documents (trees) as the basic structure in which information is held.

Just to give a quick review of the data model in XPath/XSLT 1.0:

  - the first class objects are strings, numbers, booleans, node sets
    and result tree fragments
  - node sets contain nodes (which are not first-class objects)
  - nodes have various properties, including children, which form a
    node set (the order of the children can be worked out from the
    nodes' document order)
  - there are conceptually two kinds of node sets:
    - node sets containing new nodes (result trees), which can only
      be generated using XSLT
    - node sets containing existing nodes, which can only be
      generated using XPath

There are several problems with this data model:

  - there's an enforced division between the two types of node set:
    because you can't use XSLT to create a node set containing
    existing nodes, you can only construct those node sets using the
    relatively limited functionality of XPath
  - there's no way of natively representing a list of strings or
    numbers
  - there's a very restricted set of data types (no dates, for
    example)

The new data model tries to address those problems and (I think) in the process rationalises some of the weirdness of the old data model:

  - the first class object type is a sequence (look, like LISP!)
  - sequences contain items of two types: simple typed values or
    nodes
  - simple typed values can be of various kinds, namely the XML
    Schema datatypes
  - nodes have various properties, including children (a sequence of
    nodes)

As currently designed, the old division between node sets containing new nodes and node sets containing existing nodes is being imported into XSLT: there's a division between sequences that contain new nodes, which can only be generated using XSLT, and sequences that contain existing nodes, which can only be generated using XPath.

What's more, because the sequence is the primary type in the new data model, generating sequences of simple typed values is going to be much more important in XPath/XSLT 2.0.

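The difference between a sequence of existing nodes and a tree of new nodes can be sketched outside XSLT as well. The following is only an illustration in Python, using the standard library's xml.dom.minidom; the sample document and the function names select_paras and copy_into_new_tree are made up for the purpose, with a plain Python list standing in for a sequence.

```python
from xml.dom.minidom import parseString

# Invented sample document, just for this sketch.
SOURCE = "<book><chapter id='c1'><para>one</para><para>two</para></chapter></book>"


def select_paras(doc):
    """Return a list (standing in for a sequence) of the para nodes themselves."""
    return list(doc.getElementsByTagName("para"))


def copy_into_new_tree(nodes):
    """Build a fresh document whose root element holds deep copies of the nodes."""
    new_doc = parseString("<copied/>")
    for node in nodes:
        new_doc.documentElement.appendChild(new_doc.importNode(node, True))
    return new_doc


doc = parseString(SOURCE)
sequence = select_paras(doc)           # existing nodes: identity is preserved
copied = copy_into_new_tree(sequence)  # new nodes: copies in a new tree

# A node taken from the sequence still knows where it came from.
original_parent_id = sequence[0].parentNode.getAttribute("id")

# Its copy only knows about the new tree it was placed in.
first_copy = copied.documentElement.firstChild
copied_parent_name = first_copy.parentNode.tagName
```

The node in the list still reports the chapter with id 'c1' as its parent, while the copy's parent is the new 'copied' element: the copy has lost its original ancestors and siblings, which is what separates the two kinds of sequence.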
One way of helping is to provide the facility for people to write XPath functions in XSLT, which you can now do with xsl:function. However, this leads to code being spread out amongst lots of functions (since basically the only thing from XSLT that functions give you access to is variable assignment), and is overkill for the simple things.

So we need something else to give powerful sequence manipulation without ducking out to functions. There are two options (as I see it):

  - add more programming constructs to XPath to make it more capable
    so that it has sufficient constructs for manipulating node
    sequences containing existing nodes (and other items!) more
    easily
  - enable XSLT to produce sequences containing things other than new
    nodes

To go back to your comment:

> I would like to see sequences in XSLT, but I don't think putting
> them in along side trees is the natural approach. More as an
> integral part of the tree structure.

Hopefully what I've described above about the way the data model has changed shows that actually sequences are the basic building block of XSLT now, not tree structures. If you start seeing XSLT as building sequences of new nodes, then the step to building sequences containing other items as well isn't very far.

The biggest stumbling block, as we've been discussing, is variables, because variables implicitly construct a document node whose children are set to the new nodes generated by the content of the variable, rather than just being set to the sequence of new nodes. If that can be resolved, I think everything fits together rather neatly.

[I also think that it gives a neat parallel between XQuery and XSLT - XQuery constructs sequences without using XML syntax; XSLT constructs sequences with XML syntax. You can view XQuery as the non-XML version of XSLT and XSLT as the XML version of XQuery (putting aside the fundamental differences in approach, demonstrated by comparing xsl:for-each and FLWR expressions).]

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list