RE: DSSSL side effect-freeness

Subject: RE: DSSSL side effect-freeness
From: "Frank A. Christoph" <christo@xxxxxxxxxxxxxxxxxx>
Date: Wed, 28 Jan 1998 14:09:36 +0900
>I'm not a language theorist, but I think you've more or less hit the
>nail on the head. Call by value and call by reference are identical in
>the absence of mutability. I thought that maybe the question was meant
>to be about lazy vs. eager processing (evaluation order). I thought this
>because laziness and side-effect-freeness often go hand in hand.

Sorry, I was being obtuse.  The question _is_ about lazy vs. eager
evaluation.  Please bear with me while I elucidate.  I'll come back to my
original point at the end, by which time you will hopefully be in a better
position to appreciate the reason for my question.

In a functional language, the question of argument-passing conventions comes
down to a choice of evaluation strategies.  There are essentially three
options.  Call-by-value (CBV), call-by-name (CBN) and call-by-need (Need).
The first is eager, the second two are lazy.

In CBV, a term is reduced to a data value (i.e., normalized) before it is
passed as an argument to a function.  This has the benefit that if a
argument is used twice or more in the function body, we avoid having to
normalize it twice or more.  On the other hand, if an argument is discarded,
then we did unnecessary work in normalizing it.  SML and Scheme both operate
on CBV.

In a lazy regime, a term is not normalized unless the value is required, for
example as the result of the program.  This has the benefit that if a
function argument is discarded, we did not perform the normalization in
vain, as we would have in CBV.  On the other hand, in CBN, if an argument is
used twice or more, then we end up normalizing it twice or more.  Need is
like CBN, except for a relatively small modification: when an argument's
value is demanded, we normalize it the first time, store the value away
somewhere, and if it is demanded again we use the normalized term which we
had saved instead of normalizing it again.  The net effect of Need, then, is
to minimize the amount of work done, at least locally.  Haskell and Miranda
use this last policy.

Functional languages with side-effects usually adopt an eager policy (i.e.,
CBV), and it is true that side-effects do not mix well with Need because the
order in which functions are evaluated can't in general be known statically,
so side-effects wouldn't be deterministic in such a regime.  But there is no
reason why a functional language cannot adopt CBN (and, in fact, John
Reynold's language Idealized Algol does); so there is no inherent dichotomy
between laziness and side-effects.

>The evaluation order does not matter *except* for the fact that some
>programs that will not terminate when evaluated eagerly WILL when
>evaluated lazily. In other words, my understanding is that if the
>language spec were lazy, this would not loop forever:
>
>(define (fact num)
>   (let ((b (* (fact (- num 1)) num)))
>      (if (= num 0)
> b
> 1)))


This program would loop in any reasonable language unless you switched the
branches of the if.  :)  Once you do that, then, yes, this function will
terminate in a lazy regime.  Really, it depends though on what the semantics
of 'let' are; even in a lazy language, you can have strict (eager)
constructs.  For example, people are contemplating making 'let' strict in
Standard Haskell.

This issue of termination is relevant to my original question.

In the lambda-calculus, upon which Scheme is loosely based, reduction can be
interpreted as equality.  If a term (f x) reduces to y, then (f x) = y.
(This property is called soundness.)  The choice of evaluation order affects
whether the converse is true, i.e., whether (f x) = y implies that (f x)
reduces to y.  When this is true, we say that that strategy is complete
w.r.t. the equational theory.  CBV is not complete, and the function that
Paul included above is one of those terms that proves this.  In CBV, (fact
1) will not terminate, even though we have that (fact 1)=0 in the equational
theory: there is no y for it to reduce to.  It turns out, though, that a
lazy evaluation order is complete, and so there are more provable equalities
between programs, so more programs terminate.

When the evaluation order is complete, programs are a) more concise and b)
more flexible because 1) they don't need to encode as much behavior, and 2)
they behave well in more contexts.  This is what makes lazy evaluation an
attractive choice for a functional language.
Since CBN is generally considered less efficient than Need, 'lazy' has come
to be confused with Need.

So, recalling that laziness does not mix well with side effects, you can see
that there are usually two extremes for a functional language: either it
uses CBV and has side-effects, or it uses Need and has no side-effects.
Clearly both alternatives have their advantages.

But now consider the DSSSL expression language.  In the world I have just
drawn, it is a very strange beast indeed: it isn't lazy and it hasn't got
any side-effects.  This puts it smack in the middle of two Good Things, yet
possessing neither, like a child at Christmas that can't make up its mind
about which present to open first.

Now, my question was, is this state of affairs _logically_ necessary or not?
That is, was this choice motivated by secondary issues like "Scheme is CBV
and DSSSL syntax is Scheme syntax so DSSSL will be less familiar if it isn't
also CBV" or "CBV is easier to implement"?  Or was it necessitated by
something else in DSSSL that interacts with the expression language? In
particular, I wanted to know if the way groves interact with the expression
language, namely via context-dependent procedures like 'current-node',
invalidates both our options, i.e., groves mix well neither with lazy
evaluation nor side-effects.

> Think about a grove that is 5GB large and distributed across many
> machines. Now you want to format a single paragraph. Under the DSSSL
> model, you must ask the other machines about the various parts of the
> grove, but you do *not* have to apply the entire stylesheet to the
> entire grove.
>
> At most you must apply the construction rules for the element and those
> of its ancestors. So you must build the whole grove (typically), but you
> need not apply the whole stylesheet to it. In a WYSIWYG system where you
> might want to look at a single page of a multi-gig document, that might
> make a big difference.

So this is an argument for side effect-freeness?  OK, I see your point about
being able to apply only part of the style spec.  I don't see what this has
to do with the grove being distributed, though -- and I can't imagine
distributed groves being a typical case anyway.

So, assuming we've ruled out side effects, what is the argument against
laziness?

--FC



 DSSSList info and archive:  http://www.mulberrytech.com/dsssl/dssslist


Current Thread
  • Re: DSSSL side effect-freeness, (continued)
    • G. Ken Holman - from mail1.ability.netby web4.ability.net (8.8.5/8.6.12) with ESMTP id JAA00629Tue, 27 Jan 1998 09:44:17 -0500 (EST)
      • Harald Hanche-Olsen - from mail1.ability.netby web4.ability.net (8.8.5/8.6.12) with ESMTP id QAA03547Tue, 27 Jan 1998 16:26:24 -0500 (EST)
        • Paul Prescod - from mail1.ability.netby web4.ability.net (8.8.5/8.6.12) with ESMTP id TAA10715Tue, 27 Jan 1998 19:42:59 -0500 (EST)
    • Paul Prescod - from mail1.ability.netby web4.ability.net (8.8.5/8.6.12) with ESMTP id TAA10659Tue, 27 Jan 1998 19:42:27 -0500 (EST)
    • Frank A. Christoph - from mail1.ability.netby web4.ability.net (8.8.5/8.6.12) with ESMTP id AAA12305Wed, 28 Jan 1998 00:06:25 -0500 (EST) <=
      • Pierre Mai - from mail1.ability.netby web4.ability.net (8.8.5/8.6.12) with ESMTP id OAA22416Wed, 28 Jan 1998 14:15:53 -0500 (EST)
      • Paul Prescod - from mail1.ability.netby web4.ability.net (8.8.5/8.6.12) with ESMTP id RAA24174Wed, 28 Jan 1998 17:19:24 -0500 (EST)
    • W. Eliot Kimber - from mail1.ability.netby web4.ability.net (8.8.5/8.6.12) with ESMTP id OAA23003Wed, 28 Jan 1998 14:56:26 -0500 (EST)