Subject: Re: XSLT V 1.1
From: Jeni Tennison <mail@xxxxxxxxxxxxxxxx>
Date: Sat, 16 Sep 2000 20:25:03 +0100

Paul,

>----- Original Message -----
>From: Jeni Tennison <mail@xxxxxxxxxxxxxxxx>
>
>> This solution wouldn't handle situations where there are multiple documents
>> being accessed through the same document() call:
>>
>> <class-files>
>>   <file href="class1.xml" />
>>   <file href="class2.xml" />
>>   <file href="class3.xml" />
>> </class-files>
>>
>> <xsl:for-each select="document(class-files/file/@href)/classes/class">
>>   <xsl:sort select="@name" />
>>   ...
>> </xsl:for-each>
>
>What is this? This should not work in current XSLT.

Sorry, Paul, both should and does. When the first argument of document() is a node set, it is treated as if it were multiple calls to document(), one on each of the string values of the nodes, with the results unioned together. So the above document() call translates to:

  document('class1.xml') | document('class2.xml') | document('class3.xml')

From the Rec (Section 12.1):

"When the document function has two arguments and the first argument is a node-set, then the result is the union, for each node in the argument node-set, of the result of calling the document function with the first argument being the string-value of the node, and with the second argument being the second argument passed to the document function."

>Could you please write some accurate usecase? And
>then I'll try to show how to do it without document() 'magic' .

This is a bit of functionality that I have needed in the past to achieve the things I needed to achieve, in particular to create extensible solutions where I couldn't predict in advance how many files the stylesheet would need to access. I don't know how more 'accurate' a use case can be than being a case that has been used.

>> If document() accepted only single strings (or nodes), sorting a collection
>> of classes drawn from several files would, I think (?), only be possible by
>> going through an intermediary result tree fragment.
>
>And what's wrong with usage of intermediate variable ?

I wasn't saying there was anything 'wrong' with using an intermediate variable. However, there are four issues involved in using one:

Firstly, you can't do it in XSLT 1.0 without using extension functions and thus reducing the portability of your code. I realise that that's probably not an issue in this thread as you are specifically talking about XSLT 1.1, but it is an indication of why document() *with* this functionality may have been included in XSLT 1.0. Possibly if variables had generated node sets in the first place, then the powers that be would not have defined it in this way.

Secondly, it's more verbose. The alternative code (given implicit rtf->node-set conversion) would be:

  <xsl:variable name="docs">
    <xsl:for-each select="class-files/file">
      <xsl:copy-of select="document(@href)" />
    </xsl:for-each>
  </xsl:variable>
  <xsl:for-each select="$docs/classes/class">
    <xsl:sort select="@name" />
    ...
  </xsl:for-each>

Of course the more verbose code may be regarded as a Good Thing. There's always a balance to be drawn between readability and the size (and hence storage space and parse/processing time) of the stylesheet; different projects will have different priorities.

Thirdly, there are the issues of the base URI to be used for retrieving further information about the class files. Let's say for the sake of argument that the class files themselves are in different directories and each have further references out - perhaps they point to a module definition - and that those references are relative to the class files themselves.

  <module href="modules/module1.xml" />

In the initial code, the class nodes that are iterated over are within the initial document itself, and it is therefore possible to identify the base URI for resolving these references. In the above variable declaration, on the other hand, a new RTF is generated - it's not *pointing to* the nodes, it's making a new copy of them.
It is therefore harder (if not impossible, depending on the XML schema) to tell what base URI should be used to resolve these references.

Finally, creating a copy of each of the documents involved means that, in a naive implementation at least, not only are the documents themselves stored, but so is a copy of each of them, which would presumably have an adverse effect on the memory consumption of the XSLT processor.

>If good old xsl:for-each is so 'bad' for aggregation that it should
>live *inside* document() why not place xsl:sort into document() ?
>Just kidding. I want to take for-each *out* of document(). Not
>to place *more* 'handy things' into document().

[and snip explanation for bias against 'handiness']

I can quite see your point and can imagine how frustrating it must be to be presented with impenetrable code day after day. As usual, the goal has to be to 'make the simple things easy and the complex things possible', for both the stylesheet author and the stylesheet maintainer.

Within XSLT 1.0 there is no way to convert a result tree fragment into a node set. This means that certain 'complex things' (and even some 'simple things') would be impossible if it weren't for 'handy' functions and operators.

document() is not the only place where an implicit for-each takes place. The '=' operator, for instance, performs an implicit for-each whenever a node set is used on either side. For example:

  <xsl:for-each select="element[not(. = preceding-sibling::element)]">
    <xsl:value-of select="position()" />. <xsl:value-of select="@name" />
  </xsl:for-each>

gives a numbered list of the names of those elements that do not have a preceding element with the same content. To do this without the implicit for-each behaviour and without rtf to node-set conversion would be (I think) impossible (the numbering's the tricky bit). The implicit for-each with '=' was presumably designed to make this kind of thing possible.
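
As a minimal, self-contained sketch of the numbered-list example above: the <list> wrapper, the sample names and values, and the text output method are all assumptions made for illustration, not something from the original question.

```xml
<?xml version="1.0"?>
<!-- Hypothetical input document (illustrative only):
     <list>
       <element name="a">red</element>
       <element name="b">blue</element>
       <element name="c">red</element>
       <element name="d">green</element>
     </list>
-->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" />

  <xsl:template match="/list">
    <!-- '. = node-set' is true if ANY preceding sibling has the same
         string value, so not(...) keeps only first occurrences. -->
    <xsl:for-each select="element[not(. = preceding-sibling::element)]">
      <!-- position() counts within the filtered node list,
           so the numbering has no gaps. -->
      <xsl:value-of select="position()" />
      <xsl:text>. </xsl:text>
      <xsl:value-of select="@name" />
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With that input, an XSLT 1.0 processor should list a, b and d, numbered 1 to 3: the element named c is dropped because its content matches an earlier sibling.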

With implicit rtf to node-set conversion the same numbered list would be possible even without the implicit for-each:

  <xsl:variable name="unique-elements">
    <xsl:for-each select="element">
      <xsl:if test="not(preceding-sibling::element[. = current()])">
        <xsl:copy-of select="." />
      </xsl:if>
    </xsl:for-each>
  </xsl:variable>
  <xsl:for-each select="$unique-elements/element">
    <xsl:value-of select="position()" />. <xsl:value-of select="@name" />
  </xsl:for-each>

However, I imagine that this would be a lot less efficient as well as being more verbose.

In fact there have been a couple of questions here in short order saying "how can I select unique nodes *case-insensitively*?" By analogy with the above, the solution is:

  element[not(translate(., $upper, $lower) =
              translate(preceding-sibling::element, $upper, $lower))]

However, this does not work because the translate() function has no implicit for-each: it converts the node set to a string by taking the string value of the first node, and operates only on that. I hesitate to suggest it for fear of raising your ire yet more, Paul, but perhaps there's an argument for having these string functions perform with implicit for-eaches to permit the above.

Another use case (which kind of brings me full circle) would be where I have a collection of nodes that identify, say, data sets:

  <dataset>
    <data number="1" />
    <data number="2" />
    <data number="3" />
  </dataset>

and I want to access all of the documents that are of the form 'dataN.xml' where N is the number as indicated in the XML above. This isn't possible (as far as I can tell) in XSLT 1.0 (but will be within XSLT 1.1). I would dearly love to be able to do:

  document(concat('data', dataset/data/@number, '.xml'))

to be able to retrieve them all at once. But perhaps this just marks me out as a 'good perl hacker' despite my relative ignorance of Perl ;)

As David Carlisle has pointed out, the above functionality will be made possible when(/if?) it becomes possible to define XSLT functions for use in XPath expressions.
Then I could do something like:

  my:document(dataset/data/@number)

with:

  <xsl:function name="my:document">
    <xsl:param name="numbers" />
    <xsl:variable name="first-doc"
                  select="document(concat('data', $numbers[1], '.xml'))" />
    <xsl:choose>
      <xsl:when test="count($numbers) > 1">
        <xsl:return select="$first-doc | my:document($numbers[position() > 1])" />
      </xsl:when>
      <xsl:otherwise>
        <xsl:return select="$first-doc" />
      </xsl:otherwise>
    </xsl:choose>
  </xsl:function>

to perform both those implicit for-eaches explicitly (and actually recursively) for me. I'm not sure whether user-defined functions like this are more or less transparent to the person who has to maintain the code?

All in all, I think the important thing as we move on to the next stage in XSLT evolution is that those design patterns that we find ourselves using time and time again (like selecting unique nodes) should be made easier through the introduction of functions (and XSLT elements) that *both* decrease the verbosity of the code *and* enhance its readability.

Introducing user-defined functions will help this a great deal. For example, instead of the above XPath to select unique nodes, why not something like:

  elements[my:unique(., ../elements)]

Allowing authors to create and share their own functions, and to use them in the way they want to use them, will quickly identify those that are useful and those that are not, how many arguments they should take, what type they should be and how they should be used.

Cheers,

Jeni

Jeni Tennison
http://www.jenitennison.com/

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list