RE: [xsl] questions regarding array implementation

Subject: RE: [xsl] questions regarding array implementation
From: "Michael Kay" <mhk@xxxxxxxxx>
Date: Wed, 25 Aug 2004 08:32:21 +0100
>     I've used RTF's to implement arrays in XSLT. The implementaion is
> shown below. With respect to that, I have a couple of questions that I
> hope someone will answer :
> 
> 1.  To my mind, the most significant difference between arrays and
> lists are that the access times for any two elements in the array are
> the same. 

Yes, if you define the terms "array" and "list" in terms of the way they are
implemented rather than the interface they offer. (In Java, "List" is an
interface that has a number of different implementations with radically
different performance characteristics).

It's hard to say anything about performance in the abstract, without
knowledge of the implementation, because optimizers have unlimited scope.
However, in the absense of a very smart optimizer (or lazy evaluator, etc),

(a) your addElement template will have O(n^2) performance, because each time
you add an element, you copy all the existing elements.

(b) it's impossible to know whether array:item[$index] has linear
performance [O(n)] or constant performance [O(1)] without either knowing the
implementation details, or taking measurements. In Saxon it would have
linear performance; but if you used a variable, so it becomes $x[$index], it
would take linear time to initialize the variable and constant time to index
into it. I'd suggest writing it as [number($index)] so the processor knows
at compile time that it's a numeric predicate and not a boolean one. (In
2.0, you can declare the type of the parameter.)

In XSLT 2.0 you can do all of this using sequences. This makes life much
simpler, though interestingly it doesn't necessarily improve performance
that much. In the new edition of my XSLT Programmer's Reference (just out) I
include a comparison between three implementations of the knight's tour
stylesheet: the first one (1.0)used structured character strings to hold the
working data, the second (1.1) used temporary trees, and the third (2.0)
used sequences. The third one was by far the shortest and most readable, but
the fastest implementation was the first, largely because copying a
character string is much faster than copying a sequence of 64 integers.
Until, of course, the optimizer recognizes a better way of doing it.

Michael Kay


When viewed from the XSL level, my implementation seems to
> satisfy that condition. However, since RTF's are being used in the
> implementation, the question would become : "when using XPath
> expressions to select from within the fragment, will the access time
> for any two elements at the same level of nesting be the same?". Would
> the answer to this be dependant on the XSL processor chosen, or is
> there something fundamental in the language which causes the answer to
> my question to be the same regardless of the processor?
> 
> 2.  Is there a better / more efficient way of implementing 
> this under XSL 2.0?
> 
> Thanks in advance,
> Kenneth
> 
> -----------------------  Implementation  ----------------------------
> 
> <?xml version="1.0"?>
> <xsl:stylesheet version="1.0"
> 	xmlns:xsl="http://www.w3.org/1999/XSL/Transform";
> 	xmlns:xalan="http://xml.apache.org/xalan";
> 	xmlns:array="urn:internal:dummy:namespace"
> 	exclude-result-prefixes="array">
> 
> 
> 	<xsl:template name="addElement">
> 		<xsl:param name="array" />	<!-- will not 
> be present in the
> invocation when adding first element -->
> 		<xsl:param name="element" />
> 		
> 		<xsl:variable name="arrayNodes" 
> select="xalan:nodeset($array)" />
> 		<xsl:element name="root" 
> namespace="urn:internal:dummy:namespace">
> 			<xsl:if 
> test="count($arrayNodes/array:root/*) != 0">
> 				<xsl:for-each 
> select="$arrayNodes/array:root/array:item[position()
> &gt;= 1 and
> 						position() 
> &lt;= count($arrayNodes/array:root/array:item)]">
> 					<xsl:copy-of select="." />
> 				</xsl:for-each>
> 			</xsl:if>
> 			<xsl:element name="item" 
> namespace="urn:internal:dummy:namespace">
> 				<xsl:copy-of select="$element" />
> 			</xsl:element>
> 		</xsl:element>
> 	</xsl:template>
> 
> 	<xsl:template name="getElementAt">
> 		<xsl:param name="array" />
> 		<xsl:param name="index" />
> 		
> 		<xsl:variable name="itemFragment"
> select="xalan:nodeset($array)/array:root/array:item[$index]" />
> 		<xsl:choose>
> 			<xsl:when test="$itemFragment/*">
> 				<xsl:copy-of select="$itemFragment/*" />
> 			</xsl:when>
> 			<xsl:otherwise>
> 				<xsl:value-of select="$itemFragment" />
> 			</xsl:otherwise>
> 		</xsl:choose>
> 	</xsl:template>
> 		
> 	<xsl:template match="/">
> 		<xsl:variable name="arrayVar">
> 			<xsl:call-template name="addElement"> 
> <!-- add first element -->
> 				<xsl:with-param name="element" 
> select="1" />
> 			</xsl:call-template>
> 		</xsl:variable>
> 		<xsl:variable name="arrayWithThreeItemsInIt">
> 			<xsl:call-template name="addElement"> 
> <!-- add third element -->
> 				<xsl:with-param name="array">
> 					<xsl:call-template 
> name="addElement"> <!-- add second element -->
> 						<xsl:with-param 
> name="array" select="$arrayVar" />
> 						<xsl:with-param 
> name="element" select="2" />
> 					</xsl:call-template>
> 				</xsl:with-param>
> 				<xsl:with-param name="element" 
> select="$arrayVar" />
> 			</xsl:call-template>
> 		</xsl:variable>
> 		<xsl:call-template name="getElementAt">
> 			<xsl:with-param name="array" 
> select="$arrayWithThreeItemsInIt" />
> 			<xsl:with-param name="index" select="2" />
> 		</xsl:call-template>
> 	</xsl:template>
> 	
> </xsl:stylesheet>

Current Thread