[xsl] Divide and Conquer implementations of str-foldl and str-map (Was: Re: big recursive function)

Subject: [xsl] Divide and Conquer implementations of str-foldl and str-map (Was: Re: big recursive function)
From: Dimitre Novatchev <dnovatchev@xxxxxxxxx>
Date: Wed, 28 Nov 2001 08:23:13 -0800 (PST)
> >   I haven't seen yet different way to process characters inside a node
> > text (except making an extension function but it's cheating :-). It looks
> > like the right way to do things from an XSLT point of view.
> 
> using a divide and conquer method (see posts of dimitre's) should change
> the recursion depth from being O(n) to O(log n) (at the cost of giving
> up tail recursion) so on a system that doesn't implement tail recursion
> elimination this should be better (and may well be faster anyway).

Here's what I think of as a generic way to solve the majority of problems with deep
recursion when processing long lists of characters (strings).

In my previous posts I already introduced a very general list-folding function
foldl, and its string counterpart -- str-foldl.

I mentioned (I think) how the map() function can be defined using the foldl
function:

map :: (a -> b) -> [a] -> [b]
map f = foldl ((:).f) []

The map function takes a function and a list as arguments and returns a new list,
having the same number of elements as the list-argument, every element of which is
the result of applying the function f() on the corresponding element of the
list-argument.

Another, more understandable definition of map is:

map :: (a -> b) -> [a] -> [b]
map _ [] = []
map f (x:xs) = (f(x) : map f xs)

For example,

map (+1) [1,2,3] = [2,3,4]

Using foldl and map we may be able to solve 99% of all problems we have with lists.
Using str-foldl and str-map we may be able to solve 99% of all problems we have with
strings.

The only problem could be that when processing very big lists/strings on XSLT
processors that do not optimize tail-recursion with iteration, there would be an
exhaustive increase of the call stack leading to crash.

The solution in such cases is to use ***divide and conquer*** (DVC) implementations
of foldl and map or, respectively, of str-foldl and str-map.

Here's this implementation:

dvc-str-foldl.xsl:
-----------------
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform";>
    <xsl:template name="dvc-str-foldl">
      <xsl:param name="pFunc" select="/.."/>
      <xsl:param name="pA0"/>
      <xsl:param name="pStr"/>

      <xsl:choose>
         <xsl:when test="not($pStr)">
            <xsl:copy-of select="$pA0"/>
         </xsl:when>
         <xsl:otherwise>
            <xsl:variable name="vcntList" select="string-length($pStr)"/>
            <xsl:choose>
              <xsl:when test="$vcntList = 1">
                  <xsl:apply-templates select="$pFunc[1]">
                    <xsl:with-param name="arg0" select="$pFunc[position() > 1]"/>
                    <xsl:with-param name="arg1" select="$pA0"/>
                    <xsl:with-param name="arg2" select="substring($pStr,1,1)"/>
                  </xsl:apply-templates>
              </xsl:when>
              <xsl:otherwise>
                <xsl:variable name="vHalfLen"
                              select="floor($vcntList div 2)"/>
                <xsl:variable name="vFunResult1">
                  <xsl:call-template name="dvc-str-foldl">
                    <xsl:with-param name="pFunc" select="$pFunc"/>
                    <xsl:with-param name="pA0" select="$pA0"/>
                    <xsl:with-param name="pStr" select="substring($pStr,
                                                                  1,
                                                                  $vHalfLen
                                                                  )"/>
                  </xsl:call-template>
                </xsl:variable>

                <xsl:call-template name="dvc-str-foldl">
    		        <xsl:with-param name="pFunc" select="$pFunc"/>
    		        <xsl:with-param name="pStr"
                                    select="substring($pStr,$vHalfLen+1)"
                                    />
    		        <xsl:with-param name="pA0" select="$vFunResult1"/>

                </xsl:call-template>
              </xsl:otherwise>
            </xsl:choose>
         </xsl:otherwise>
      </xsl:choose>

    </xsl:template>

</xsl:stylesheet>

str-dvc-map.xsl:
---------------
<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform";
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
xmlns:map-foldl-func="map-foldl-func"
exclude-result-prefixes="xsl map-foldl-func"
>
   <xsl:import href="dvc-str-foldl.xsl"/>

   <map-foldl-func:map-foldl-func/>
   
     <xsl:template name="str-dvc-map">
      <xsl:param name="pFun" select="/.."/>
      <xsl:param name="pStr" select="/.."/>

       <xsl:variable name="vFoldlFun" select="document('')/*/map-foldl-func:*[1]"/>
       
	   <xsl:variable name="vFuncComposition">
	     <xsl:copy-of select="$vFoldlFun"/>
	     <xsl:copy-of select="$pFun"/>
	   </xsl:variable>
	   
	   <xsl:variable name="vFComposition" 
	                 select="msxsl:node-set($vFuncComposition)/*"/>

      <xsl:call-template name="dvc-str-foldl">
        <xsl:with-param name="pFunc" select="$vFComposition"/>
        <xsl:with-param name="pStr" select="$pStr"/>
        <xsl:with-param name="pA0" select="/.."/>
      </xsl:call-template>
    </xsl:template>


    <xsl:template name="mapL" match="*[namespace-uri() = 'map-foldl-func']">
         <xsl:param name="arg0" select="/.."/>
         <xsl:param name="arg1" select="/.."/>
         <xsl:param name="arg2" select="/.."/>
         
         <!-- $arg1 must be A0 -->
         <xsl:copy-of select="$arg1"/>
         
           <xsl:apply-templates select="$arg0[1]">
             <xsl:with-param name="arg1" select="substring($arg2,1,1)"/>
           </xsl:apply-templates>
    </xsl:template>
    

</xsl:stylesheet>

Now, to solve the problem of indenting every new line, we simply map every character
to itself, except for the NL character, which we map to NL + "'    '" (new line plus
4 spaces):

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform";
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
xmlns:testmap="testmap"
exclude-result-prefixes="xsl testmap"
>
   <xsl:import href="str-dvc-map.xsl"/>
   
   <testmap:testmap/>

   <xsl:output omit-xml-declaration="yes" indent="yes"/>
   
   <xsl:template match="/">
     <xsl:variable name="vTestMap" select="document('')/*/testmap:*[1]"/>
     <xsl:call-template name="str-dvc-map">
       <xsl:with-param name="pFun" select="$vTestMap"/>
       <xsl:with-param name="pStr" select="string(/*)"/>
     </xsl:call-template>
   </xsl:template>

    <xsl:template name="indentNL" match="*[namespace-uri() = 'testmap']">
      <xsl:param name="arg1"/>
      
      <xsl:value-of select="$arg1"/>
      <xsl:if test="$arg1='&#10;'">
        <xsl:value-of select="'    '"/>
      </xsl:if>
    </xsl:template>

</xsl:stylesheet>

When the above transformation is applied on the following xml source:

<t>
line0
line1
line2
line3
line4
line5
line6
line7
line8
line9
line10
</t>

it produced this output:


    line0
    line1
    line2
    line3
    line4
    line5
    line6
    line7
    line8
    line9
    line10
    
I also tried it on a 10000 line input and it worked OK (but a little bit slow)
without crashing.

Of course, a more efficient solution will be a special DVC algorithm, using the XSLT
functions contains(), substring-before() and substring-after().

The advantages of foldl and map is that they've been already implemented and can be
readily used to solve multitude of problems, without requiring difficult and
error-prone additional programming.

Cheers,
Dimitre Novatchev.
P.S. The best solution to the original problem is just to enclose the text as a
child of a "blockquote" element... :o)





__________________________________________________
Do You Yahoo!?
Yahoo! GeoCities - quick and easy web site hosting, just $8.95/month.
http://geocities.yahoo.com/ps/info1

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


Current Thread