In this section we will implement a normal-order language that is
the same as
SchemeJavaScript
except that compound
proceduresfunctions
are non-strict in each argument. Primitive
proceduresfunctions
will still be strict. It is not difficult to modify the evaluator of
section 4.1.1 so that the language it
interprets behaves this way. Almost all the required changes center around
procedurefunction
application.
The basic idea is that, when applying a
procedure,function,
the interpreter must determine which arguments are to be evaluated and which
are to be delayed. The delayed arguments are not evaluated; instead, they
are transformed into objects called
thunks.[1]
The thunk must contain the information required to produce the value
of the argument when it is needed, as if it had been evaluated at
the time of the application. Thus, the thunk must contain the
argument expression and the environment in
which the
procedurefunction
application is being evaluated.
The process of evaluating the expression in a thunk is called
forcing.[2]
In general, a thunk will be forced only when its value is needed:
when it is passed to a primitive
procedurefunction
that will use the value of the thunk; when it is the value of a predicate of
a conditional; and when it is the value of
an operatora function expression
that is about to be
applied as a
procedure.function.
One design choice we have available is whether or not to
memoize thunks, similar to the optimization for streams in
section 3.5.1. With memoization, the first
time a thunk is forced, it stores the value that is computed. Subsequent
forcings simply return the stored value without repeating the computation.
We'll make our interpreter memoize, because this is more efficient for
many applications. There are tricky considerations here,
however.[3]
Modifying the evaluator
The main difference between the lazy evaluator and the one in
section 4.1 is in the handling of
procedurefunction
applications in
evalevaluate
and
apply.
The application? clause of
eval becomes
The is_application
clause of
evaluate becomes
This is almost the same as the
application?is_application
clause of
evalevaluate
in section 4.1.1. For lazy evaluation,
however, we call apply with the
operandargument
expressions, rather than the arguments produced by evaluating them. Since
we will need the environment to construct thunks if the arguments are to be
delayed, we must pass this as well. We still evaluate the
operator,function expression,
because
apply needs the actual
procedurefunction
to be applied in order to dispatch on its type (primitive versus compound)
and apply it.
Whenever we need the actual value of an expression, we use
function actual_value(exp, env) {
return force_it(evaluate(exp, env));
}
instead of just
eval,
evaluate,
so that if the expression's value is a thunk, it will be forced.
Our new version of apply is also almost the
same as the version in section 4.1.1.
The difference is that
evalevaluate
has passed in unevaluated
operand
argument
expressions: For primitive
proceduresfunctions
(which are strict), we evaluate all the arguments before applying the
primitive; for compound
proceduresfunctions
(which are non-strict) we delay all the
arguments before applying the
procedure.function.
function apply(fun, args, env) {
if (is_primitive_function(fun)) {
return apply_primitive_function(
fun,
list_of_arg_values(args, env)); // changed
} else if (is_compound_function(fun)) {
const result = evaluate(
function_body(fun),
extend_environment(
function_parameters(fun),
list_of_delayed_args(args, env), // changed
function_environment(fun)));
return is_return_value(result)
? return_value_content(result)
: undefined;
} else {
error(fun, "unknown function type -- apply");
}
}
The
proceduresfunctions
that process the arguments are just like
list-of-valueslist_of_values
from section 4.1.1,
except that
list-of-delayed-argslist_of_delayed_args
delays the arguments instead of evaluating them, and
list-of-arg-valueslist_of_arg_values
uses
actual-valueactual_value
instead of
eval:
evaluate:
The other place we must change the evaluator is in the handling of
if,
conditionals,
where we must use
actual-valueactual_value
instead of
evalevaluate
to get the value of the predicate
expression before testing whether it is true or false:
Finally, we must change the
driver-loopdriver_loopprocedurefunction
(from section 4.1.4) to use
actual-valueactual_value
instead of
eval,
evaluate,
so that if a delayed value is propagated back to the
read-eval-print loop,
read-evaluate-print loop,
it will be forced before being printed.
We also change the prompts to indicate that
this is the lazy evaluator:
With these changes made, we can start the evaluator and test it. The
successful evaluation of the
trytry_me
expression
discussed in section 4.2.1 indicates
that the interpreter is performing lazy evaluation:
Our evaluator must arrange to create thunks when
proceduresfunctions
are applied to arguments and to force these thunks later. A thunk must
package an expression together with the environment, so that the argument
can be produced later. To force the thunk, we simply extract the expression
and environment from the thunk and evaluate the expression in the
environment. We use
actual-valueactual_value
rather than
evalevaluate
so that in case the value of the expression is itself a thunk, we will force
that, and so on, until we reach something that is not a thunk:
One easy way to package an expression with an environment is to make a list
containing the expression and the environment. Thus, we create a thunk as
follows:
function delay_it(exp, env) {
return list("thunk", exp, env);
}
function is_thunk(obj) {
return is_tagged_list(obj, "thunk");
}
function thunk_exp(thunk) { return head(tail(thunk)); }
function thunk_env(thunk) { return head(tail(tail(thunk))); }
Actually, what we want for our interpreter is not quite this, but
rather thunks that have been memoized.
When a thunk is forced, we will turn it into an evaluated thunk by replacing
the stored expression with its value and changing the
thunk tag so that it can be recognized as
already evaluated.[4]
There is currently no solution available for this exercise. This textbook adaptation is a community effort. Do consider contributing by providing a solution for this exercise, using a Pull Request in Github.
Exercise 4.37 Eval
The function
evaluate
uses
actual-valueactual_value
rather than
evalevaluate
to evaluate the
operatorfunction expression
before passing it to
apply, in order to force the value of the
operator.function expression.
Give an example that demonstrates the need for this forcing.
There is currently no solution available for this exercise. This textbook adaptation is a community effort. Do consider contributing by providing a solution for this exercise, using a Pull Request in Github.
Exercise 4.38
Exhibit a program that you would expect to run much more slowly without
memoization than with memoization. Also, consider the following
interaction, where the idprocedurefunction
is defined as in exercise 4.36 and
count starts at 0:
L-evaluate input:
square(id(10));
L-evaluate value:
$\langle{}$response$\rangle$
L-evaluate input:
count;
L-evaluate value:
$\langle{}$response$\rangle$
Give the responses both when the evaluator memoizes and when it does not.
There is currently no solution available for this exercise. This textbook adaptation is a community effort. Do consider contributing by providing a solution for this exercise, using a Pull Request in Github.
Cy D. Fect, a reformed C programmer, is worried that some side effects
may never take place, because the lazy evaluator doesn't force the
expressions in a sequence.
Since the value of an expression in a sequence
other than the last one is not used (the expression is there only for
its effect, such as assigning to a variable or printing), there can be
no subsequent use of this value (e.g., as an argument to a primitive
procedure) that will cause it to be forced. Cy thus thinks that when
evaluating sequences, we must force all expressions in the sequence
except the final one. He proposes to modify
eval-sequence
from section 4.1.1 to use
actual-value
rather than eval:
Cy D. Fect, a reformed C programmer, is worried that some side effects
may never take place, because the lazy evaluator doesn't force the
statements in a sequence.
Since the value of a statement in a sequence
may not be used (the statement may be there only for
its effect, such as assigning to a variable or printing), there may be
no subsequent use of this value (e.g., as an argument to a primitive
function) that will cause it to be forced.
Cy thus thinks that when
evaluating sequences, we must force all statements in the sequence.
He proposes to modify
evaluate_sequence
from section 4.1.1 to use
actual_value
rather than
evaluate:
function eval_sequence(stmts, env) {
if (is_empty_sequence(stmts)) {
return undefined;
} else if (is_last_statement(stmts)) {
return actual_value(first_statement(stmts), env);
} else {
const first_stmt_value =
actual_value(first_statement(stmts), env);
if (is_return_value(first_stmt_value)) {
return first_stmt_value;
} else {
return eval_sequence(rest_statements(stmts), env);
}
}
}
Ben Bitdiddle thinks Cy is wrong. He shows Cy the
for-eachfor_eachprocedurefunction
described in exercise 2.24, which gives an
important example of a sequence with side effects:
Explain why Ben is right about the behavior of
for-each.
for_each.
Cy agrees that Ben is right about the
for-eachfor_each
example, but says that that's not the kind of program he
was thinking about when he proposed his change to
eval-sequence.
eval_sequence.
He
definesdeclares
the following two
proceduresfunctions
in the lazy evaluator:
Original
JavaScript
(define (p1 x)
(set! x (cons x '(2)))
x)
(define (p2 x)
(define (p e)
e
x)
(p (set! x (cons x '(2)))))
function f1(x) {
x = pair(x, list(2));
return x;
}
function f2(x) {
function f(e) {
e;
return x;
}
return f(x = pair(x, list(2)));
}
What are the values of
(p1 1)f1(1)
and
(p2 1)f2(1)
with the original
eval-sequence?
eval_sequence?
What would the values be with Cy's proposed change to
eval-sequence?
eval_sequence?
Cy also points out that changing
eval-sequenceeval_sequence
as he proposes does not affect the behavior of the example in part a.
Explain why this is true.
How do you think sequences
ought to be treated in the lazy evaluator?
Do you like Cy's approach, the approach in the text, or some other
approach?
There is currently no solution available for this exercise. This textbook adaptation is a community effort. Do consider contributing by providing a solution for this exercise, using a Pull Request in Github.
Exercise 4.40
The approach taken in this section is somewhat unpleasant, because it
makes an incompatible change to
Scheme.JavaScript.
It might be nicer to implement lazy evaluation as an
upward-compatible extension, that is, so that ordinary
SchemeJavaScript
programs will work as before. We can do this by
extending the syntax of procedureintroducing optional parameter declaration as a new
syntactic form inside function
declarations to let the user control whether or not arguments are to be
delayed. While we're at it, we may as well also give the user the
choice between delaying with and without memoization. For example, the
definition
declaration
would define f to be a
procedurefunction
of four arguments, where the first and third arguments are evaluated when the
procedurefunction
is called, the second argument is delayed, and the fourth argument is both
delayed and memoized.
Original
JavaScript
Thus, ordinary procedure definitions will produce the same behavior as
ordinary Scheme,
while adding the
lazy-memo
declaration to each parameter of every compound procedure
will produce the behavior of the lazy evaluator defined in this section.
Design and implement the changes required to produce such an extension to
Scheme. You will have to implement new syntax procedures
to handle the new syntax for define.
You can assume that the parameter declaration is always
the first statement in the body of a function declaration,
and if it is omitted, all parameters are strict.
Thus, ordinary function declaration
will produce the same behavior as ordinary JavaScript,
while adding the
"lazy_memo"
declaration to each parameter of every compound function
will produce the behavior of the lazy evaluator defined in this section.
Design and implement the changes required to produce such an extension to
JavaScript. The parse function will
treat parameter declarations as function applications, so you need to
modify apply to dispatch to your
implementation of the new syntactic form.
You must also arrange for
evalevaluate
or apply to determine when arguments are to be
delayed, and to force or delay arguments accordingly, and you must arrange
for forcing to memoize or not, as appropriate.
There is currently no solution available for this exercise. This textbook adaptation is a community effort. Do consider contributing by providing a solution for this exercise, using a Pull Request in Github.
[1]
The word thunk was invented by an informal
working group that was discussing the implementation of call-by-name
in Algol 60. They observed that most of the analysis of (thinking
about) the expression could be done at compile time; thus, at run
time, the expression would already have been thunk about
(Ingerman et al. 1960).
[2]
This is analogous to the
use of
force
on
forcing of
the delayed objects that were introduced in chapter 3 to
represent streams. The critical difference between what we are
doing here and what we did in chapter 3 is that we are building
delaying and forcing into the evaluator, and thus making this uniform
and automatic throughout the language.
[3]
Lazy evaluation combined with memoization is sometimes
referred to as
call-by-need argument passing, in contrast to
call-by-name argument passing.
(Call-by-name, introduced in
Algol 60, is similar to non-memoized lazy
evaluation.) As language designers, we can build our evaluator to memoize,
not to memoize, or leave this an option for programmers
(exercise 4.40). As you might
expect from chapter 3, these choices raise issues that become both
subtle and confusing in the presence of assignments. (See
exercises 4.36
and 4.38.)
An excellent article by
Clinger (1982) attempts to clarify the
multiple dimensions of confusion that arise here.
[4]
Notice that we also erase the
env from the thunk once the expression's
value has been computed. This makes no difference in the values returned by
the interpreter. It does help save space, however, because removing the
reference from the thunk to the env once it is
no longer needed allows this structure to be
garbage-collected and its space
recycled, as we will discuss in
section 5.3.
Similarly, we could have allowed unneeded environments in the memoized
delayed objects of section 3.5.1
to be garbage-collected, by having
memo-procmemo
do something like
(set! proc '())fun = null;
to discard the
procedure
procfunction
fun
(which includes the environment in which the
delay
lambda expression
that makes up the tail of the stream
was evaluated) after storing its
value.
[5]
This exercise demonstrates that the interaction between
lazy evaluation and side effects can be very confusing. This is just what you
might expect from the discussion in chapter 3.