## [xsl] Re: Divide and Conquer implementations of str-foldl and str-map (Was: Re: big recursive function)

 Subject: [xsl] Re: Divide and Conquer implementations of str-foldl and str-map (Was: Re: big recursive function) From: Dimitre Novatchev Date: Fri, 30 Nov 2001 12:42:05 -0800 (PST)
```Tom Myers <tommy at cs dot colgate dot edu> wrote:

> I'm way behind, as always, but still interested....
> On 28 Nov 2001 08:23:13 -0800 (PST), Dimitre Novatchev <dnovatchev@xxxxxxxxx>
>
> > > > I haven't seen yet different way to process characters inside a node
> > > > text (except making an extension function but it's cheating :-). It looks
> > > > like the right way to do things from an XSLT point of view.
> > >
> > > using a divide and conquer method (see posts of dimitre's) should change
> > > the recursion depth from being O(n) to O(log n) (at the cost of giving
> > > up tail recursion) so on a system that doesn't implement tail recursion
> > > elimination this should be better (and may well be faster anyway).
>
> with
> >Here's what I think of as a generic way to solve the majority of problems with
> > deep
> >recursion when processing long lists of characters (strings). ... ....
> >Another, more understandable definition of map is:
> >map :: (a -> b) -> [a] -> [b]
> >map _ [] = []
> >map f (x:xs) = (f(x) : map f xs)
> >For example,
> >map (+1) [1,2,3] = [2,3,4]
>
> I kinda like array-notation variations like
>    map f S [i] = f(S[i])
> myself, but I admit it's not a big deal, except that it makes clear that the
> length of
> the result is always and necessarily the same as the length of the original, which
> is
> relevant to this case. At least, I think so.
> ....
>
> >Now, to solve the problem of indenting every new line, we simply map every
> > character
> >to itself, except for the NL character, which we map to NL + "' '" (new line plus

> >4 spaces):
>
> Notice when you do that, that there is an implicit accumulation of results: we
take
> a string S of length N, and we produce N strings (each of length 1 or length 5)
and
> then we produce a string of length >= N as a result of cumulative concatenation.
> Okay,
> in template constructions everything gets cumulatively concatenated, and all
> algebraic
> needs are satisfied for parallelism (concat is associative, identity "") so it's
> still
> not a big deal, but when we define map :: (a -> b) -> [a] -> [b], it might be
> important; here your indentNL template is char -> string, i.e. char -> [char],
> so the map is (char -> [char]) -> [char] -> [[char]], unless I'm abusing the
> notation again,  and in the end we do need to get back from [[char]] to [char], in
> effect with a fold1 of concat which is folded into the XSLT implementation. Yes?

No, but your observations are correct... :o)  The reason we don't have to worry
about the final type is that... in XSLT a string is a scalar -- not a list at all.
Therefore we do not have "string of strings" -- it is just... "string".

>
> >... .... ...
> >Of course, a more efficient solution will be a special DVC algorithm, using the
> > XSLT
> >functions contains(), substring-before() and substring-after().
> >The advantages of foldl and map is that they've been already implemented and can
> > be
> >readily used to solve multitude of problems, without requiring difficult and
>
> I'm wondering about two possibilities.

I don't understand the rest of your message -- maybe because it's Friday evening?

> One would be versions of fold1 and map that
> would be used in pre-processing, in effect as macros, so that the XSLT programmer
> could take something like your suggested
> >     <xsl:template name="indentNL">
> >       <xsl:param name="arg1"/>
> >
> >       <xsl:value-of select="\$arg1"/>
> >       <xsl:if test="\$arg1='&#10;'">
> >         <xsl:value-of select="'    '"/>
> >       </xsl:if>
> >     </xsl:template>
>
> as one very small file, apply a "str-dcmap.xsl" transformation to it, and get a
> special
> DVC algorithm out of it which could then be manually improved. In effect, you'd
> have a
> wouldn't
> be asking map or fold1 to be more efficient. It should be easy to run within emacs
> or vi
> or even pfe on windows, in effect building a piece of a program development
> environment.
> But then again, maybe it's like generating assembler output from C code and
> editting that,
> which I haven't done in years and never really learned to love. Thoughts?
>
> Another possibility is that of introducing hierarchies automatically; it's not
hard
> to do DVC templates manually on data that's already hierarchical. I'm thinking of
a
> <xsl:template name="splitStr">
>    <xsl:param name="data">
>    <xsl:param name="break">
> ...
> </xsl:template>
> such that given the multiline string of this thread for "data" and "'&#10;'" for
> "break", we'd get a tree of depth log(N) where N would be the number of break-
> chars.

See my "Functional tokenizer"

http://sources.redhat.com/ml/xsl-list/2001-11/msg00901.html

Cheers,
Dimitre.
P.S. It's good that you returned back, so that I have someone to talk to about FP in
XSLT. :o)

__________________________________________________
Do You Yahoo!?
Yahoo! GeoCities - quick and easy web site hosting, just \$8.95/month.
http://geocities.yahoo.com/ps/info1

XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

```