Re: [xsl] String conversion problem when string is large

From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Wed, 21 Mar 2012 13:16:31 -0400

Dear Mike,

On 3/20/2012 7:50 PM, Michael Kay wrote:
>> I have much to learn. Those two things look identical to me.
>
> Sadly, it's a very common mistake. Any variable-like element with
> content (xsl:variable, xsl:param, xsl:with-param) creates a temporary
> tree; in this case a tree containing a document node and a single text
> node child. That's a much more heavyweight data structure than a simple
> string, as your memory problems demonstrate. It behaves like a string in
> most circumstances, but not all - for example, conversion to a boolean
> behaves differently. That makes it quite hard to optimize. Saxon does
> its best by using a special tree model that can only hold a document
> node with a single text node child. I happened to notice in your stack
> trace that this wasn't being used here, it was using a standard TinyTree
> - I'm not sure why the optimization isn't kicking in, there can be any
> number of reasons. But using a string is so much better, it reduces the
> size of your code and makes all this optimization effort (which is
> really just amelioration of bad code) completely unnecessary.
>
> Even better, declare it as a string so the optimizer knows what to expect!
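
The boolean difference mentioned above can be made concrete with a minimal sketch. The variable names "tree" and "str" are invented for illustration, and the stylesheet assumes an XSLT 2.0 processor run against any input document. A temporary tree is a document node, so its effective boolean value is true even when it holds no text, whereas a zero-length string is false.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Content, no "as": a temporary tree. The zero-length text node is
       dropped, leaving a document node with no children. -->
  <xsl:variable name="tree"><xsl:value-of select="''"/></xsl:variable>

  <!-- select: a plain zero-length string. -->
  <xsl:variable name="str" select="''"/>

  <xsl:template match="/">
    <!-- A node sequence whose first item is a node is true;
         a zero-length string is false. -->
    <xsl:value-of select="concat('tree as boolean: ', boolean($tree), '&#10;')"/>
    <xsl:value-of select="concat('str as boolean:  ', boolean($str), '&#10;')"/>
  </xsl:template>

</xsl:stylesheet>
```

So a test such as test="$tree" succeeds where test="$str" fails, even though both variables have the same string value.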

Just to confirm, do I read correctly that


<xsl:param name="string" as="xs:string">string</xsl:param>

will be passed and processed as a string rather than a tree?

I have been working in the belief (also in analogous cases) that it does, but as long as we're on the topic I wonder if you can speak to it.

Also, I am wondering whether I can rely on this happening in any conformant XSLT 2.0 processor, not only Saxon.

The main reason I need this settled in my mind is that, as you say, this comes up very commonly. One of the most mystifying pieces of advice we give to beginners is to prefer @select, when possible, to literal content in a parameter or variable declaration, even though the latter is legal and works, under one definition of "works". If the workaround is really this easy (and if it even suggests the nature of the underlying problem, as this appears to), that might help.

In other words, if

<xsl:param name="string" as="xs:string">string</xsl:param>

is longhand for

<xsl:param name="string" select="'string'"/>

we should see similarly that

<xsl:param name="string">string</xsl:param>

is another way of saying

<xsl:param name="string" as="document-node()">
  <xsl:document>string</xsl:document>
</xsl:param>

Is this correct?
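
One way to probe the question empirically is a small stylesheet that declares all four forms and asks the processor what it actually holds. This is only a sketch: the parameter names are invented for illustration, it assumes an XSLT 2.0 processor with the xs prefix bound to the XML Schema namespace, and it can be run against any input document. It reports what one processor does, which is not the same as what the specification guarantees.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- The four declarations under discussion (names are hypothetical). -->
  <xsl:param name="typed" as="xs:string">string</xsl:param>
  <xsl:param name="selected" select="'string'"/>
  <xsl:param name="untyped">string</xsl:param>
  <xsl:param name="explicit" as="document-node()">
    <xsl:document>string</xsl:document>
  </xsl:param>

  <xsl:template match="/">
    <!-- If the first pair are equivalent, both are xs:string values. -->
    <xsl:value-of select="concat('typed is xs:string:           ',
                                 $typed instance of xs:string, '&#10;')"/>
    <xsl:value-of select="concat('selected is xs:string:        ',
                                 $selected instance of xs:string, '&#10;')"/>
    <!-- If the second pair are equivalent, both are document nodes
         with the same content. -->
    <xsl:value-of select="concat('untyped is document-node():   ',
                                 $untyped instance of document-node(), '&#10;')"/>
    <xsl:value-of select="concat('explicit is document-node():  ',
                                 $explicit instance of document-node(), '&#10;')"/>
    <xsl:value-of select="concat('untyped deep-equals explicit: ',
                                 deep-equal($untyped, $explicit), '&#10;')"/>
  </xsl:template>

</xsl:stylesheet>
```

If every line reports true, the processor at hand treats the declarations as described above.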

Thanks as always,
Wendell

======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
