4.2.2 An Interpreter with Lazy Evaluation

Permalink copied!

In this section we will implement a normal-order language that is the same as Scheme JavaScript except that compound procedures functions are non-strict in each argument. Primitive procedures functions will still be strict. It is not difficult to modify the evaluator of section 4.1.1 so that the language it interprets behaves this way. Almost all the required changes center around procedure function application.

The basic idea is that, when applying a procedure, function, the interpreter must determine which arguments are to be evaluated and which are to be delayed. The delayed arguments are not evaluated; instead, they are transformed into objects called thunks.[1] The thunk must contain the information required to produce the value of the argument when it is needed, as if it had been evaluated at the time of the application. Thus, the thunk must contain the argument expression and the environment in which the procedure function application is being evaluated.

The process of evaluating the expression in a thunk is called forcing.[2] In general, a thunk will be forced only when its value is needed: when it is passed to a primitive procedure function that will use the value of the thunk; when it is the value of a predicate of a conditional; and when it is the value of an operator a function expression that is about to be applied as a procedure. function. One design choice we have available is whether or not to memoize thunks, similar to the optimization for streams in section 3.5.1. With memoization, the first time a thunk is forced, it stores the value that is computed. Subsequent forcings simply return the stored value without repeating the computation. We'll make our interpreter memoize, because this is more efficient for many applications. There are tricky considerations here, however.[3]

Modifying the evaluator

The main difference between the lazy evaluator and the one in section 4.1 is in the handling of procedure function applications in eval evaluate and apply.

The application? clause of eval becomes The is_application clause of evaluate becomes

Original	JavaScript
`((application? exp) (apply (actual-value (operator exp) env) (operands exp) env))`	`: is_application(component) ? apply(actual_value(function_expression(component), env), arg_expressions(component), env)`

This is almost the same as the application? is_application clause of eval evaluate in section 4.1.1. For lazy evaluation, however, we call apply with the operand argument expressions, rather than the arguments produced by evaluating them. Since we will need the environment to construct thunks if the arguments are to be delayed, we must pass this as well. We still evaluate the operator, function expression, because apply needs the actual procedure function to be applied in order to dispatch on its type (primitive versus compound) and apply it.

Whenever we need the actual value of an expression, we use

Original	JavaScript
`(define (actual-value exp env) (force-it (eval exp env)))`	`function actual_value(exp, env) { return force_it(evaluate(exp, env)); }`

instead of just eval, evaluate, so that if the expression's value is a thunk, it will be forced.

Our new version of apply is also almost the same as the version in section 4.1.1. The difference is that eval evaluate has passed in unevaluated operand argument expressions: For primitive procedures functions (which are strict), we evaluate all the arguments before applying the primitive; for compound procedures functions (which are non-strict) we delay all the arguments before applying the procedure. function.

Original

JavaScript

(define (apply procedure arguments env) (cond ((primitive-procedure? procedure) (apply-primitive-procedure procedure (list-of-arg-values arguments env))) ; changed ((compound-procedure? procedure) (eval-sequence (procedure-body procedure) (extend-environment (procedure-parameters procedure) (list-of-delayed-args arguments env) ; changed (procedure-environment procedure)))) (else (error "Unknown procedure type - - APPLY" procedure))))

function apply(fun, args, env) { if (is_primitive_function(fun)) { return apply_primitive_function( fun, list_of_arg_values(args, env)); // changed } else if (is_compound_function(fun)) { const result = evaluate( function_body(fun), extend_environment( function_parameters(fun), list_of_delayed_args(args, env), // changed function_environment(fun))); return is_return_value(result) ? return_value_content(result) : undefined; } else { error(fun, "unknown function type -- apply"); } }

The procedures functions that process the arguments are just like list-of-values list_of_values from section 4.1.1, except that list-of-delayed-args list_of_delayed_args delays the arguments instead of evaluating them, and list-of-arg-values list_of_arg_values uses actual-value actual_value instead of eval: evaluate:

Original	JavaScript
`(define (list-of-arg-values exps env) (if (no-operands? exps) '() (cons (actual-value (first-operand exps) env) (list-of-arg-values (rest-operands exps) env)))) (define (list-of-delayed-args exps env) (if (no-operands? exps) '() (cons (delay-it (first-operand exps) env) (list-of-delayed-args (rest-operands exps) env))))`	`function list_of_arg_values(exps, env) { return map(exp => actual_value(exp, env), exps); } function list_of_delayed_args(exps, env) { return map(exp => delay_it(exp, env), exps); }`

The other place we must change the evaluator is in the handling of if, conditionals, where we must use actual-value actual_value instead of eval evaluate to get the value of the predicate expression before testing whether it is true or false:

Original	JavaScript
`(define (eval-if exp env) (if (true? (actual-value (if-predicate exp) env)) (eval (if-consequent exp) env) (eval (if-alternative exp) env)))`	`function eval_conditional(component, env) { return is_truthy(actual_value(conditional_predicate(component), env)) ? evaluate(conditional_consequent(component), env) : evaluate(conditional_alternative(component), env); }`

Finally, we must change the driver-loop driver_loop procedure function (from section 4.1.4) to use actual-value actual_value instead of eval, evaluate, so that if a delayed value is propagated back to the read-eval-print loop, read-evaluate-print loop, it will be forced before being printed. We also change the prompts to indicate that this is the lazy evaluator:

Original

JavaScript

(define input-prompt ";;; L-Eval input:") (define output-prompt ";;; L-Eval value:") (define (driver-loop) (prompt-for-input input-prompt) (let ((input (read))) (let ((output (actual-value input the-global-environment))) (announce-output output-prompt) (user-print output))) (driver-loop))

With these changes made, we can start the evaluator and test it. The successful evaluation of the try try_me expression discussed in section 4.2.1 indicates that the interpreter is performing lazy evaluation:

Original	JavaScript
`(define the-global-environment (setup-environment)) (driver-loop)`	`const the_global_environment = setup_environment(); driver_loop(the_global_environment);`

Original	JavaScript
`;;; L-Eval input: (define (try a b) (if (= a 0) 1 b)) ;;; L-Eval value: ok`	`L-evaluate input:function try_me(a, b) { return a === 0 ? 1 : b; }L-evaluate value: undefined`

Original	JavaScript
`;;; L-Eval input: (try 0 (/ 1 0)) ;;; L-Eval value: 1`	`L-evaluate input:try_me(0, head(null));L-evaluate value: 1`

Representing thunks

Our evaluator must arrange to create thunks when procedures functions are applied to arguments and to force these thunks later. A thunk must package an expression together with the environment, so that the argument can be produced later. To force the thunk, we simply extract the expression and environment from the thunk and evaluate the expression in the environment. We use actual-value actual_value rather than eval evaluate so that in case the value of the expression is itself a thunk, we will force that, and so on, until we reach something that is not a thunk:

Original	JavaScript
`(define (force-it obj) (if (thunk? obj) (actual-value (thunk-exp obj) (thunk-env obj)) obj))`	`function force_it(obj) { return is_thunk(obj) ? actual_value(thunk_exp(obj), thunk_env(obj)) : obj; }`

One easy way to package an expression with an environment is to make a list containing the expression and the environment. Thus, we create a thunk as follows:

Original	JavaScript
`(define (delay-it exp env) (list 'thunk exp env)) (define (thunk? obj) (tagged-list? obj 'thunk)) (define (thunk-exp thunk) (cadr thunk)) (define (thunk-env thunk) (caddr thunk))`	`function delay_it(exp, env) { return list("thunk", exp, env); } function is_thunk(obj) { return is_tagged_list(obj, "thunk"); } function thunk_exp(thunk) { return head(tail(thunk)); } function thunk_env(thunk) { return head(tail(tail(thunk))); }`

Actually, what we want for our interpreter is not quite this, but rather thunks that have been memoized. When a thunk is forced, we will turn it into an evaluated thunk by replacing the stored expression with its value and changing the thunk tag so that it can be recognized as already evaluated.[4]

Original

JavaScript

(define (evaluated-thunk? obj) (tagged-list? obj 'evaluated-thunk)) (define (thunk-value evaluated-thunk) (cadr evaluated-thunk)) (define (force-it obj) (cond ((thunk? obj) (let ((result (actual-value (thunk-exp obj) (thunk-env obj)))) (set-car! obj 'evaluated-thunk) (set-car! (cdr obj) result) ; replace exp with its value (set-cdr! (cdr obj) '()) ; forget unneeded env result)) ((evaluated-thunk? obj) (thunk-value obj)) (else obj)))

Notice that the same delay-it delay_it procedure function works both with and without memoization.

Exercise 4.36 Suppose we type in the following definitions declarations to the lazy evaluator:

Original	JavaScript
`(define count 0) (define (id x) (set! count (+ count 1)) x)`	`let count = 0; function id(x) { count = count + 1; return x; }`

Give the missing values in the following sequence of interactions, and explain your answers.[5]

Original	JavaScript
`(define w (id (id 10)))`	`const w = id(id(10));`

Original	JavaScript
`;;; L-Eval input: count ;;; L-Eval value: response`	`L-evaluate input:count;L-evaluate value: $\langle{}$response$\rangle$`

Original	JavaScript
`;;; L-Eval input: w ;;; L-Eval value: response`	`L-evaluate input:w;L-evaluate value: $\langle{}$response$\rangle$`

Original	JavaScript
`;;; L-Eval input: count ;;; L-Eval value: response`	`L-evaluate input:count;L-evaluate value: $\langle{}$response$\rangle$`

There is currently no solution available for this exercise. This textbook adaptation is a community effort. Do consider contributing by providing a solution for this exercise, using a Pull Request in Github.

Exercise 4.37 Eval The function evaluate uses actual-value actual_value rather than eval evaluate to evaluate the operator function expression before passing it to apply, in order to force the value of the operator. function expression. Give an example that demonstrates the need for this forcing.

Exercise 4.38 Exhibit a program that you would expect to run much more slowly without memoization than with memoization. Also, consider the following interaction, where the id procedure function is defined as in exercise 4.36 and count starts at 0:

Original	JavaScript
`(define (square x) (* x x));;; L-Eval input: (square (id 10)) ;;; L-Eval value: response ;;; L-Eval input: count ;;; L-Eval value: response`	`function square(x) { return x * x; }`

L-evaluate input:square(id(10));L-evaluate value: $\langle{}$response$\rangle$ L-evaluate input:count;L-evaluate value: $\langle{}$response$\rangle$ Give the responses both when the evaluator memoizes and when it does not.

Exercise 4.39

Original		JavaScript
Cy D. Fect, a reformed C programmer, is worried that some side effects may never take place, because the lazy evaluator doesn't force the expressions in a sequence. Since the value of an expression in a sequence other than the last one is not used (the expression is there only for its effect, such as assigning to a variable or printing), there can be no subsequent use of this value (e.g., as an argument to a primitive procedure) that will cause it to be forced. Cy thus thinks that when evaluating sequences, we must force all expressions in the sequence except the final one. He proposes to modify `eval-sequence` from section 4.1.1 to use `actual-value` rather than `eval`:		Cy D. Fect, a reformed C programmer, is worried that some side effects may never take place, because the lazy evaluator doesn't force the statements in a sequence. Since the value of a statement in a sequence may not be used (the statement may be there only for its effect, such as assigning to a variable or printing), there may be no subsequent use of this value (e.g., as an argument to a primitive function) that will cause it to be forced. Cy thus thinks that when evaluating sequences, we must force all statements in the sequence. He proposes to modify `evaluate_sequence` from section 4.1.1 to use `actual_value` rather than `evaluate`:

Original	JavaScript
`(define (eval-sequence exps env) (cond ((last-exp? exps) (actual-value (first-exp exps) env)) (else (actual-value (first-exp exps) env) (eval-sequence (rest-exps exps) env))))`	`function eval_sequence(stmts, env) { if (is_empty_sequence(stmts)) { return undefined; } else if (is_last_statement(stmts)) { return actual_value(first_statement(stmts), env); } else { const first_stmt_value = actual_value(first_statement(stmts), env); if (is_return_value(first_stmt_value)) { return first_stmt_value; } else { return eval_sequence(rest_statements(stmts), env); } } }`

Ben Bitdiddle thinks Cy is wrong. He shows Cy the for-each for_each procedure function described in exercise 2.24, which gives an important example of a sequence with side effects:

Original	JavaScript
`(define (for-each proc items) (if (null? items) 'done (begin (proc (car items)) (for-each proc (cdr items)))))`	`function for_each(fun, items) { if (is_null(items)){ return "done"; } else { fun(head(items)); for_each(fun, tail(items)); } }`

He claims that the evaluator in the text (with the original eval-sequence) eval_sequence) handles this correctly:

Original	JavaScript
`;;; L-Eval input:(for-each (lambda (x) (newline) (display x)) (list 57 321 88))57 321 88 ;;; L-Eval value: done`	`L-evaluate input:for_each(display, list(57, 321, 88));57 321 88 L-evaluate value: "done"`

Explain why Ben is right about the behavior of for-each. for_each.

Cy agrees that Ben is right about the for-each for_each example, but says that that's not the kind of program he was thinking about when he proposed his change to eval-sequence. eval_sequence. He defines declares the following two procedures functions in the lazy evaluator:

Original	JavaScript
`(define (p1 x) (set! x (cons x '(2))) x) (define (p2 x) (define (p e) e x) (p (set! x (cons x '(2)))))`	`function f1(x) { x = pair(x, list(2)); return x; } function f2(x) { function f(e) { e; return x; } return f(x = pair(x, list(2))); }`

What are the values of (p1 1) f1(1) and (p2 1) f2(1) with the original eval-sequence? eval_sequence? What would the values be with Cy's proposed change to eval-sequence? eval_sequence?

Cy also points out that changing eval-sequence eval_sequence as he proposes does not affect the behavior of the example in part a. Explain why this is true.
How do you think sequences ought to be treated in the lazy evaluator? Do you like Cy's approach, the approach in the text, or some other approach?

Exercise 4.40 The approach taken in this section is somewhat unpleasant, because it makes an incompatible change to Scheme. JavaScript. It might be nicer to implement lazy evaluation as an upward-compatible extension, that is, so that ordinary Scheme JavaScript programs will work as before. We can do this by extending the syntax of procedure introducing optional parameter declaration as a new syntactic form inside function declarations to let the user control whether or not arguments are to be delayed. While we're at it, we may as well also give the user the choice between delaying with and without memoization. For example, the definition declaration

Original	JavaScript
`(define (f a (b lazy) c (d lazy-memo)) $\ldots$)`	`function f(a, b, c, d) { parameters("strict", "lazy", "strict", "lazy_memo"); $\ldots$ }`

would define f to be a procedure function of four arguments, where the first and third arguments are evaluated when the procedure function is called, the second argument is delayed, and the fourth argument is both delayed and memoized.

Original		JavaScript
Thus, ordinary procedure definitions will produce the same behavior as ordinary Scheme, while adding the `lazy-memo` declaration to each parameter of every compound procedure will produce the behavior of the lazy evaluator defined in this section. Design and implement the changes required to produce such an extension to Scheme. You will have to implement new syntax procedures to handle the new syntax for `define`.		You can assume that the parameter declaration is always the first statement in the body of a function declaration, and if it is omitted, all parameters are strict. Thus, ordinary function declaration will produce the same behavior as ordinary JavaScript, while adding the `"lazy_memo"` declaration to each parameter of every compound function will produce the behavior of the lazy evaluator defined in this section. Design and implement the changes required to produce such an extension to JavaScript. The `parse` function will treat parameter declarations as function applications, so you need to modify `apply` to dispatch to your implementation of the new syntactic form.

You must also arrange for eval evaluate or apply to determine when arguments are to be delayed, and to force or delay arguments accordingly, and you must arrange for forcing to memoize or not, as appropriate.

[1] The word thunk was invented by an informal working group that was discussing the implementation of call-by-name in Algol 60. They observed that most of the analysis of (thinking about) the expression could be done at compile time; thus, at run time, the expression would already have been thunk about (Ingerman et al. 1960).

[2] This is analogous to the use of force on forcing of the delayed objects that were introduced in chapter 3 to represent streams. The critical difference between what we are doing here and what we did in chapter 3 is that we are building delaying and forcing into the evaluator, and thus making this uniform and automatic throughout the language.

[3] Lazy evaluation combined with memoization is sometimes referred to as call-by-need argument passing, in contrast to call-by-name argument passing. (Call-by-name, introduced in Algol 60, is similar to non-memoized lazy evaluation.) As language designers, we can build our evaluator to memoize, not to memoize, or leave this an option for programmers (exercise 4.40). As you might expect from chapter 3, these choices raise issues that become both subtle and confusing in the presence of assignments. (See exercises 4.36 and 4.38.) An excellent article by Clinger (1982) attempts to clarify the multiple dimensions of confusion that arise here.

[4] Notice that we also erase the env from the thunk once the expression's value has been computed. This makes no difference in the values returned by the interpreter. It does help save space, however, because removing the reference from the thunk to the env once it is no longer needed allows this structure to be garbage-collected and its space recycled, as we will discuss in section 5.3.

Similarly, we could have allowed unneeded environments in the memoized delayed objects of section 3.5.1 to be garbage-collected, by having memo-proc memo do something like (set! proc '()) fun = null; to discard the procedure proc function fun (which includes the environment in which the delay lambda expression that makes up the tail of the stream was evaluated) after storing its value.

[5] This exercise demonstrates that the interaction between lazy evaluation and side effects can be very confusing. This is just what you might expect from the discussion in chapter 3.

< Previous

Next >

4.2.2 An Interpreter with Lazy Evaluation