Proper tail recursion

Subject: Proper tail recursion
From: "Frank A. Christoph" <christo@xxxxxxxxxxxxxxxxxx>
Date: Wed, 28 Jul 1999 17:56:44 +0900
There appears to be some confusion about what tail recursion means. Here is
a
longish explanation. A full treatment of the subject is really quite
involved; I
invite you to consult a textbook like Abelson & Sussman.

In functional programs, we need looping behavior, but we don't want to
introduce a while loop construct just for this purpose when we have
recursive
functions instead. Many kinds of looping programs are expressed more clearly
and elegantly using recursion than while-, for-... constructs. The problem
is
that a naive implementation of recursion will always leave an active frame
on
the stack when a function calls itself. However, many recursive functions
can be implemented with a jump that just reuses the active frame, rather
than
pushing a new one. Such functions are said to be iterative.

Intuitively, a function is iterative if it has no work left over to do after
it returns from a recursive call. Consider the following definition.

  (define (factorial n)
    (if (= n 0)
        1
        (* n (factorial (- n 1)))))

The factorial function is obviously iterative, but that fact is not readily
gleaned from the definition above. When factorial returns from its recursive
call with a value x, it still has a calculation left to do, namely to
compute
(* n x) and then return the result of that to the caller.

However, we can rewrite factorial so that it is obviously iterative, by
adding
an argument that functions as an accumulator, and then wrapping the new
function with another that provides an initial value:

  (define (factorial-helper n m)
     (if (= n 0)
         m
         (factorial-helper (- n 1) (* n m))))

  (define (factorial n) (factorial-helper n 1))

Now it is clear that factorial-helper is iterative, since after a recursive
call it has nothing left to do but return the result to the previous
caller. Consequently, we can eliminate _all_ the intermediate return calls,
and just implement the recursion with a jump.

The compiler needs a way to distinguish between functions that are iterative
and ones that are not. Furthermore, the criterion needs to by statically
decideable, i.e., detectable without executing the whole program itself.

Tail recursive form is the criterion that Scheme uses. Scheme requires that
recursive calls that appear in certain contexts called tail contexts will be
compiled to yield iterative control behavior. As an example, the second
definition of factorial has its recursive call in a tail context, while the
first form does not; in DSSSL and Scheme, then, the tail-recursive variant
will execute with a bounded amount of stack space.

Here is what the Scheme R5RS standard defines to be a tail context (I will
edit the parts that refer to non-DSSSL constructs):

>   * The last expression in the body of a lambda expression, shown
>     as <tail expression> below, occurs in a tail context.
>
>     (lambda <formals>
>       <definition>* <tail expression>)
>
>   * If one of the following expressions is in a tail context, then the
>     subexpressions shown as <tail expression> are in a tail context.
>     These were derived from rules in the grammar given in chapter
>     *Note Formal syntax and semantics:: by replacing some occurrences
>     of <expression> with <tail expression>.  Only those rules that
>     contain tail contexts are shown here.
>
>     (if <expression> <tail expression> <tail expression>)
>
>     (cond <cond clause>+)
>     (cond <cond clause>* (else <tail expression>))
>
>     (case <expression>
>       <case clause>+)
>     (case <expression>
>       <case clause>*
>       (else <tail expression>))
>
>     (and <expression>* <tail expression>)
>     (or <expression>* <tail expression>)
>
>     (let (<binding spec>*) <tail body>)
>     (let <variable> (<binding spec>*) <tail body>)
>     (let* (<binding spec>*) <tail body>)
>     (letrec (<binding spec>*) <tail body>)
>
>     where
>
>     <cond clause> --> (<test> <tail expression>)
>     <case clause> --> ((<datum>*) <tail expression>)
>
>     <tail body> --> <definition>* <tail expression>
>
>   * If a `cond' expression is in a tail context, and has a clause of
>     the form `(<expression1> => <expression2>)' then the (implied)
>     call to the procedure that results from the evaluation of
>     <expression2> is in a tail context.  <expression2> itself is not
>     in a tail context.

This criterion, tail position, is sufficient, but not necessary, to identify
iterative functions; in other words, the analysis will miss some functions
that could be compiled into iterative form. There are other, more sensitive
analyses which can find more iterative functions, but the problem is
undecideable in general. Here is a counterexample to decideability:

  (define (f x)
    (if (incredibly-complex-predicate-that-is-always-true x)
        (iterative-function x)
        (non-iterative-function x)))

If your function is not tail-recursive, you can either accept that your
function will probably not exhibit iterative behavior (which is often
acceptable and even faster than recursive behavior when the number of
recursive calls is small) or you can transform the function by hand into
tail-recursive form, so that your Scheme-compliant compiler recognizes it
and
is guaranteed to perform the required optimization. There is a well-known,
general method of doing the latter called the CPS (continuation-passing
style)
transform (which is also useful for many other things), but I won't go into
it
here. It is like the transformation I did for factorial above, but it makes
essential use of higher-order functions.

--FC


 DSSSList info and archive:  http://www.mulberrytech.com/dsssl/dssslist


Current Thread