RE: [xsl] What is the Maximum number of recurssions

Subject: RE: [xsl] What is the Maximum number of recurssions
From: Scott Trenda <Scott.Trenda@xxxxxxxx>
Date: Tue, 06 Mar 2012 12:15:22 -0500
I don't know if anyone noticed or picked up my last response here, but I was stress-testing this template and comparing the results against a string-replace from another language, and came across an error. Turns out that doing a direct replacement in the $mid-string section, instead of another recursive call, ends up not only simpler but more accurate as well, since we don't have to worry about potentially chopping up another instance of the search string with the $mid-string substring. Here's the updated template.



<!-- Replaces all instances of $search in $string with $replace. -->
<xsl:template name="str:replace">
	<xsl:param name="string" select="''" />
	<xsl:param name="search" select="''" />
	<xsl:param name="replace" select="''" />
	<xsl:variable name="half" select="floor(string-length($string) div 2)" />
	<xsl:variable name="midpoint1" select="$half - (string-length($search) - 1)" />
	<xsl:variable name="midpoint2" select="$half + (string-length($search) - 1)" />
	<xsl:variable name="mid-string" select="substring($string, $midpoint1 + 1, $midpoint2 - $midpoint1)" />
	<xsl:choose>
		<xsl:when test="not(string($string) and string($search) and contains($string, $search))">
			<xsl:value-of select="$string" />
		</xsl:when>
		<xsl:when test="string-length($string) &lt; string-length($search) * 2">
			<xsl:value-of select="concat(substring-before($string, $search), $replace, substring-after($string, $search))" />
		</xsl:when>
		<xsl:when test="$mid-string and contains($mid-string, $search)">
			<xsl:call-template name="str:replace">
				<xsl:with-param name="string" select="substring($string, 1, $midpoint1 + string-length(substring-before($mid-string, $search)))" />
				<xsl:with-param name="search" select="$search" />
				<xsl:with-param name="replace" select="$replace" />
			</xsl:call-template>
			<xsl:value-of select="$replace" />
			<xsl:call-template name="str:replace">
				<xsl:with-param name="string" select="substring($string, $midpoint2 - string-length(substring-after($mid-string, $search)) + 1)" />
				<xsl:with-param name="search" select="$search" />
				<xsl:with-param name="replace" select="$replace" />
			</xsl:call-template>
		</xsl:when>
		<xsl:otherwise>
			<xsl:call-template name="str:replace">
				<xsl:with-param name="string" select="substring($string, 1, $half)" />
				<xsl:with-param name="search" select="$search" />
				<xsl:with-param name="replace" select="$replace" />
			</xsl:call-template>
			<xsl:call-template name="str:replace">
				<xsl:with-param name="string" select="substring($string, $half + 1)" />
				<xsl:with-param name="search" select="$search" />
				<xsl:with-param name="replace" select="$replace" />
			</xsl:call-template>
		</xsl:otherwise>
	</xsl:choose>
</xsl:template>


~ Scott





-----Original Message-----
From: Scott Trenda [mailto:Scott.Trenda@xxxxxxxx] 
Sent: Friday, March 02, 2012 9:26 AM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: RE: [xsl] What is the Maximum number of recurssions

Well, scratch that "wouldn't want to" part after all. I guess I'm a sucker for interesting problems with unconventional solutions. :)

Turns out MSXML doesn't optimize the classic str:replace pattern for tail recursion after all, and it bombs out with a stack overflow after ~550 replacements (1 iteration per replacement), in my experiment. The following template worked for 400000+ replacements by the time I stopped the test:

<!-- Replaces all instances of $search in $string with $replace. --> <xsl:template name="str:replace">
	<xsl:param name="string" select="''" />
	<xsl:param name="search" select="''" />
	<xsl:param name="replace" select="''" />
	<xsl:variable name="half" select="floor(string-length($string) div 2)" />
	<xsl:variable name="midpoint1" select="$half - (string-length($search) - 1)" />
	<xsl:variable name="midpoint2" select="$half + (string-length($search) - 1)" />
	<xsl:variable name="mid-string" select="substring($string, $midpoint1 + 1, $midpoint2 - $midpoint1)" />
	<xsl:choose>
		<xsl:when test="not(string($string) and string($search) and contains($string, $search))">
			<xsl:value-of select="$string" />
		</xsl:when>
		<xsl:when test="string-length($string) &lt; string-length($search) * 2">
			<xsl:value-of select="concat(substring-before($string, $search), $replace, substring-after($string, $search))" />
		</xsl:when>
		<xsl:when test="$mid-string and contains($mid-string, $search)">
			<xsl:call-template name="str:replace">
				<xsl:with-param name="string" select="substring($string, 1, $midpoint1)" />
				<xsl:with-param name="search" select="$search" />
				<xsl:with-param name="replace" select="$replace" />
			</xsl:call-template>
			<xsl:call-template name="str:replace">
				<xsl:with-param name="string" select="$mid-string" />
				<xsl:with-param name="search" select="$search" />
				<xsl:with-param name="replace" select="$replace" />
			</xsl:call-template>
			<xsl:call-template name="str:replace">
				<xsl:with-param name="string" select="substring($string, $midpoint2 + 1)" />
				<xsl:with-param name="search" select="$search" />
				<xsl:with-param name="replace" select="$replace" />
			</xsl:call-template>
		</xsl:when>
		<xsl:otherwise>
			<xsl:call-template name="str:replace">
				<xsl:with-param name="string" select="substring($string, 1, $half)" />
				<xsl:with-param name="search" select="$search" />
				<xsl:with-param name="replace" select="$replace" />
			</xsl:call-template>
			<xsl:call-template name="str:replace">
				<xsl:with-param name="string" select="substring($string, $half + 1)" />
				<xsl:with-param name="search" select="$search" />
				<xsl:with-param name="replace" select="$replace" />
			</xsl:call-template>
		</xsl:otherwise>
	</xsl:choose>
</xsl:template>

Try it out, I didn't get a chance to try it in larger and complex scenarios. I'd welcome any bug reports, tweaks, comments, suggestions, etc. :)

~ Scott


-----Original Message-----
From: Scott Trenda [mailto:Scott.Trenda@xxxxxxxx]
Sent: Friday, March 02, 2012 7:36 AM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: RE: [xsl] What is the Maximum number of recurssions

I encountered this recently, when I had to retool a str:rtrim template to use bisection because the previous step-through-the-whitespace method was bombing out on large data sets.

Out of curiosity, has anyone thought about a bisection approach for str:replace (XSLT 1.0)? I was able to use bisection for my str:rtrim template, but only because the character I was looking for (' ') was only one character long. Is it possible to use bisection when searching for a dynamic-length string in the subject string? The basic problem theoretically is that you don't want to bisect the subject string at a point that's in the middle of an occurrence of the search string.

I suppose it's possible if you tweak the basic bisection algorithm - one idea I had was to check for contains(substring($subject, string-length($subject) / 2 - string-length($search), string-length($subject) / 2 - string-length($search))) before splitting the string at string-length($subject) / 2; my assumption is that you'll never split an occurrence of the search string if you check a sufficiently wide string surrounding the search area. I also suppose that you'd be able to optimize the conditions for where you'd stop searching, based on the length of the search string.

Basically, it sounds like there are some potential improvements to be made, but I wouldn't want to do them myself if someone else has already done them. :) I'm also guessing that the fundamental pattern in str:replace is one that would be optimized for tail recursion pretty quickly, so I don't know if there's even any potential gain to be had here.

Any thoughts?

~ Scott


-----Original Message-----
From: Dimitre Novatchev [mailto:dnovatchev@xxxxxxxxx]
Sent: Thursday, March 01, 2012 3:23 PM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [xsl] What is the Maximum number of recurssions

On Thu, Mar 1, 2012 at 12:36 PM, Karl Stubsjoen <kstubs@xxxxxxxxx> wrote:
> I'm just curious, what is the maximum number of times a template can 
> recurse?

I guess you men the number of nested recursive calls (nestedness).
There is no fixed maximum and in some cases a correctly written template/function can have nestedness measured in the millions.

Such are the cases with tail-recursive functions/templates -- and this requires that the XSLT processor can recognize and optimize tail recursion.

For example, this transformation run with Saxon 6.5.4 successfully prints all numbers from 1 to one million:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform";>
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
  <xsl:call-template name="printToN"/>
 </xsl:template>

 <xsl:template name="printToN">
  <xsl:param name="pUpTo" select="1000000"/>
  <xsl:param name="pDoneWith" select="0"/>

  <xsl:if test="$pUpTo > $pDoneWith">
   <xsl:value-of select="$pDoneWith+1"/>
   <xsl:text>&#xA;</xsl:text>

   <xsl:call-template name="printToN">
    <xsl:with-param name="pUpTo" select="$pUpTo"/>
    <xsl:with-param name="pDoneWith" select="$pDoneWith+1"/>
   </xsl:call-template>
  </xsl:if>
 </xsl:template>
</xsl:stylesheet>

However, other XSLT processors (AltovaXML) produce this error: "XSLT instruction stack overflow"

A DVC (Divide and Conquer) style recursion is also a practical solution and processing a sequence of N items requires a maximum nestedness of only log(N). This recursive style is remarkable in that it doesn't require (as required in the case of tail recursion) any special abilities from the XSLT processor and works correctly with any XSLT processor. Thus the maximum callstack depth when processing a sequence of one million items with DVC-style recursion, is only 19.

> I would imagine it would be different from engine to engine.
>  Is there any other engine intelligence at play here, for example the 
> engine has decided that you've written and endless looping recursion, 
> instead of one that knows you'll eventually reach the end of your 
> source xml?


This is generally impossible to do, because whether or not an algorithm finishes often depends on the specific data that is fed to it -- typical examples are many numerical methods that converge only if the data is limited in a specific interval or by any other condition.

Cheers,
Dimitre.

>
> Thanks,
> Karl..
>
> --
> Karl Stubsjoen
> MeetScoresOnline.com
> (602) 845-0006
>



--
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
Never fight an inanimate object
-------------------------------------
To avoid situations in which you might make mistakes may be the biggest mistake of all
------------------------------------
Quality means doing it right when no one is looking.
-------------------------------------
You've achieved success in your field when you don't know whether what you're doing is work or play
-------------------------------------
Facts do not cease to exist because they are ignored.
-------------------------------------
I finally figured out the only reason to be alive is to enjoy it.

Current Thread