We have not yet explained how to load compiled code into the evaluator
machine or how to run it. We will assume that the explicit-control-evaluator
machine has been defined as in
section 5.4.4, with the additional
operations specified in footnote 5 (section 5.5.2).
We will implement a function compile_and_go that compiles a
JavaScript program,
loads the resulting object code into the evaluator machine,
and causes the machine to run the code in the
evaluator global environment, print the result, and
enter the evaluator's driver loop. We will also modify the evaluator
so that interpreted
components
can call compiled
functions
as well as interpreted ones. We can then put a compiled
function
into the machine and use the
evaluator to call it:
compile_and_go(parse(`
function factorial(n) {
    return n === 1
           ? 1
           : factorial(n - 1) * n;
}
`));

EC-evaluate value:
undefined
To allow the evaluator to handle compiled
functions
(for example,
to evaluate the call to factorial above),
we need to change the code at
apply_dispatch
(section 5.4.1) so that it
recognizes compiled
functions
(as distinct from compound or primitive
functions)
and transfers control directly to the entry point of the
compiled code:[1]
At
compiled_apply, as at
compound_apply, we push a marker to the stack
so that a return statement in the compiled function
can revert the stack to this state.
Note that there is no save of
continue at
compiled_apply before
the marking of the stack, because the evaluator was
arranged so that at
apply_dispatch, the
continuation would be at the top of the stack.
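Concretely, the new dispatch case might read as follows. This is a sketch, not the book's verbatim controller text; the operation names is_compiled_function and compiled_function_entry are assumed by analogy with the other function operations:

```javascript
"apply_dispatch",
    test(list(op("is_primitive_function"), reg("fun"))),
    branch(label("primitive_apply")),
    test(list(op("is_compound_function"), reg("fun"))),
    branch(label("compound_apply")),
    test(list(op("is_compiled_function"), reg("fun"))),
    branch(label("compiled_apply")),
    go_to(label("unknown_function_type")),

"compiled_apply",
    push_marker_to_stack(),  // no save of continue: it is already on the stack
    assign("val", list(op("compiled_function_entry"), reg("fun"))),
    go_to(reg("val")),
```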
To enable us to run some compiled code when we start the evaluator
machine, we add a branch instruction at
the beginning of the evaluator machine, which causes the machine to
go to a new entry point if the flag register
is set.[2]
branch(label("external_entry")), // branches if flag is set
"read_evaluate_print_loop",
perform(list(op("initialize_stack"))),
...
The code at external_entry
assumes that the machine is started with val
containing the location of an instruction sequence that puts a result into
val and ends with
go_to(reg("continue")).
Starting at this entry point jumps to the location designated
by val, but first assigns
continue so that execution will return to
print_result,
which prints the value in val and then goes to
the beginning of the evaluator's
read-evaluate-print
loop.[3]
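In outline, the code at external_entry might look like this sketch (print_result and the stack and environment operations are assumed from the evaluator of section 5.4):

```javascript
"external_entry",
    perform(list(op("initialize_stack"))),
    assign("env", list(op("get_global_environment"))),
    assign("continue", label("print_result")),
    go_to(reg("val")),
```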
Now we can use the following
function
to compile a
function declaration,
execute the compiled code, and run the
read-evaluate-print
loop so
we can try the
function.
Because we want the compiled code to
proceed
to the location in
continue with its result in
val, we compile the
program
with a target of val and a
linkage of
"return".
In order to transform the
object code produced by the compiler into executable instructions
for the evaluator register machine, we use the
function assemble from the
register-machine simulator
(section 5.2.2).
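The function compile_and_go might be sketched as follows. It relies on compile and assemble from earlier sections and the evaluator machine eceval; the helpers scan_out_declarations and list_of_unassigned follow section 4.1's conventions, and details may differ from the full implementation:

```javascript
function compile_and_go(program) {
    // compile with target val and linkage "return", then assemble
    const instructions =
        assemble(statements(
                     compile(program, "val", "return")),
                 eceval);
    // bind the program's top-level names to "*unassigned*";
    // the compiled code will assign them the correct values
    const names = scan_out_declarations(program);
    const unassigneds = list_of_unassigned(names);
    the_global_environment = extend_environment(
                                 names, unassigneds,
                                 the_global_environment);
    // hand the object code to the machine and enter at external_entry
    set_register_contents(eceval, "val", instructions);
    set_register_contents(eceval, "flag", true);
    return start(eceval);
}
```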
For the interpreted program to refer to the names that
are declared at top level in the compiled program, we
scan out the top-level names and
extend the global environment by binding these names to
"*unassigned*",
knowing that the compiled code will assign them
the correct values.
We then initialize
the val register to point to the list
of instructions, set the
flag so that the evaluator will go to
external_entry,
and start the evaluator.
EC-evaluate input:
factorial(5);
total pushes = 36
maximum depth = 14
EC-evaluate value:
120
Compare
this example with the evaluation of
factorial(5)
using the interpreted version of the same
function,
shown at the end of section 5.4.4.
The interpreted version required 151 pushes and a maximum stack depth of 28.
This illustrates the optimization that results from our compilation strategy.
Interpretation and compilation
With the programs in this section, we can now experiment with the
alternative execution strategies of interpretation and
compilation.[4]
An interpreter raises the machine to the level of the user program; a
compiler lowers the user program to the level of the machine language.
We can regard the
JavaScript
language (or any programming language) as a
coherent family of abstractions erected on the machine language.
Interpreters are good for interactive program development and
debugging because the steps of program execution are organized in
terms of these abstractions, and are therefore more intelligible
to the programmer.
Compiled code can execute faster, because the steps of program execution
are organized in terms of the machine language, and the compiler is free
to make optimizations that cut across the higher-level
abstractions.[5]
The alternatives of interpretation and compilation also lead to
different strategies for
porting languages to new computers. Suppose
that we wish to implement
JavaScript
for a new machine. One strategy is
to begin with the explicit-control evaluator of
section 5.4
and translate its instructions to instructions for the
new machine. A different strategy is to begin with the compiler and
change the code generators so that they generate code for the new
machine. The second strategy allows us to run any
JavaScript
program on the new machine by first compiling it with the compiler running
on our
original JavaScript system, and linking it with a compiled version of the runtime
library.[6] Better yet, we can compile the compiler itself, and run
this on the new machine to compile other
JavaScript programs.[7] Or we can compile one of the interpreters of
section 4.1 to produce an interpreter that
runs on the new machine.
Exercise 5.51
By
comparing the stack operations used by compiled code to the stack
operations used by the evaluator for the same computation, we can
determine the extent to which the compiler optimizes use of the stack,
both in speed (reducing the total number of stack operations) and in
space (reducing the maximum stack depth). Comparing this optimized
stack use to the performance of a special-purpose machine for the same
computation gives some indication of the quality of the compiler.
Exercise 5.30 asked you to determine, as a
function of $n$, the number of pushes and
the maximum stack depth needed by the evaluator to compute
$n!$ using the recursive factorial
function
given above. Exercise 5.14 asked you
to do the same measurements for the special-purpose factorial machine
shown in figure 5.13. Now perform the
same analysis using the compiled factorial function.
Take the ratio of the number of pushes in the compiled version to the
number of pushes in the interpreted version, and do the same for the
maximum stack depth. Since the number of operations and the stack
depth used to compute $n!$ are linear in
$n$, these ratios should
approach constants as $n$ becomes large.
What are these constants? Similarly, find the ratios of the stack usage
in the special-purpose machine to the usage in the interpreted version.
Compare the ratios for special-purpose versus interpreted code to the
ratios for compiled versus interpreted code. You should find that the
special-purpose machine is much more efficient than the compiled code, since
the hand-tailored controller code should be much better than what is
produced by our rudimentary general-purpose compiler.
Can you suggest improvements to the compiler that would help it
generate code that would come closer in performance to the
hand-tailored version?
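As a sanity check on the limiting behavior: the quotient of two linear functions of n tends to the ratio of their leading coefficients. A tiny sketch with placeholder coefficients (these are not measured push counts; the exercise asks you to determine the real ones):

```javascript
// Quotient of two linear cost models tends to the ratio of the
// leading coefficients. The coefficients below are placeholders,
// not measured values from the evaluator or compiler.
function linear(a, b) {
    return n => a * n + b;      // cost(n) = a * n + b
}

const interpreted_pushes = linear(32, 10);  // assumed model
const compiled_pushes = linear(6, 2);       // assumed model

function push_ratio(n) {
    return compiled_pushes(n) / interpreted_pushes(n);
}

// push_ratio(n) approaches 6/32 = 0.1875 as n grows
console.log(push_ratio(10));
console.log(push_ratio(1000000));
```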
There is currently no solution available for this exercise. This textbook adaptation is a community effort. Do consider contributing by providing a solution for this exercise, using a Pull Request in Github.
Exercise 5.52
Carry out an analysis like the one in
exercise 5.51 to determine the
effectiveness of compiling the tree-recursive
Fibonacci
function
function fib(n) {
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}
compared to the effectiveness of using the special-purpose Fibonacci machine
of figure 5.15. (For measurement of the
interpreted performance, see exercise 5.31.)
For Fibonacci, the time resource used is not linear in
$n$; hence the ratios of stack operations will not
approach a limiting value that is independent of
$n$.
Exercise 5.53
This section described how to modify the explicit-control evaluator so
that interpreted code can call compiled
functions.
Show how to modify the compiler so that compiled
functions
can call not only primitive
functions
and compiled
functions,
but interpreted
functions
as well. This requires modifying
compile_function_call
to handle the case of compound (interpreted)
functions.
Be sure to handle all the same target and
linkage combinations
as in
compile_fun_appl.
To do the actual
function
application,
the code needs to jump to the evaluator's
compound_apply
entry point. This label cannot be directly referenced in object code
(since the assembler requires that all labels referenced by the
code it is assembling be defined there), so we will add a register
called compapp to the evaluator machine to
hold this entry point, and add an instruction to initialize it:
assign("compapp", label("compound_apply")),
branch(label("external_entry")), // branches if flag is set
"read_evaluate_print_loop",
...
To test your code, start by
declaring
a
function
f that calls a
function
g. Use
compile_and_go
to compile the
declaration
of f
and start the evaluator. Now, typing at the evaluator,
declare
g and try to call
f.
Exercise 5.54
The
compile_and_go
interface implemented in this section is
awkward, since the compiler can be called only once (when the
evaluator machine is started). Augment the compiler–interpreter
interface by providing a
compile_and_run
primitive that can be called from within the explicit-control evaluator
as follows:
EC-evaluate input:
compile_and_run(parse(`
function factorial(n) {
    return n === 1
           ? 1
           : factorial(n - 1) * n;
}
`));

EC-evaluate value:
undefined
Exercise 5.55
As an alternative to using the explicit-control evaluator's
read-evaluate-print
loop, design a register machine that performs a
read-compile-execute-print loop. That is, the machine should run a
loop that reads
an expression,
a program,
compiles it, assembles and
executes the resulting code, and prints the result. This is easy to
run in our simulated setup, since we can arrange to call the
functions compile and
assemble as register-machine
operations.
Exercise 5.56
Use the compiler to compile the
metacircular evaluator of
section 4.1 and run this program using the
register-machine simulator.
Because the parser takes a string as input, you will need to
convert the program into a string. The simplest way to do this is
to use the back quotes (`),
as we have done for the example inputs to
compile_and_go and
compile_and_run.
The resulting interpreter will run very slowly because of the multiple
levels of interpretation, but getting all the details to work is an
instructive exercise.
Exercise 5.57
Develop a rudimentary implementation of
JavaScript
in
C (or some other low-level language of your choice) by translating the
explicit-control evaluator of section 5.4
into C. In order to run this code you will need to also
provide appropriate storage-allocation routines and other runtime
support.
Exercise 5.58
As a counterpoint to exercise 5.57, modify
the compiler so that it compiles
JavaScript functions
into sequences of
C instructions. Compile the metacircular evaluator of
section 4.1 to produce a
JavaScript
interpreter written in C.
[1]
Of course, compiled
functions
as well as interpreted
functions
are compound (nonprimitive). For compatibility with the terminology used
in the explicit-control evaluator, in this section we will use
compound to mean interpreted (as opposed to
compiled).
[2]
Now that the evaluator machine starts
with a branch, we must always initialize the
flag register before starting the evaluator
machine. To start the machine at its ordinary
read-evaluate-print
loop, we
could use
function start_eceval() {
    set_register_contents(eceval, "flag", false);
    return start(eceval);
}
[3]
Since
a compiled
function
is an object that the system may try to print, we also modify the system
print operation
user_print
(from section 4.1.4) so that it will not
attempt to print the components of a compiled
function:
function user_print(string, object) {
    function prepare(object) {
        return is_compound_function(object)
               ? "< compound function >"
               : is_primitive_function(object)
               ? "< primitive function >"
               : is_compiled_function(object)
               ? "< compiled function >"
               : is_pair(object)
               ? pair(prepare(head(object)),
                      prepare(tail(object)))
               : object;
    }
    display(string + " " + stringify(prepare(object)));
}
[4]
We can do even better by extending the compiler
to allow compiled code to call interpreted
functions.
See exercise 5.53.
[5]
Independent of the strategy of execution, we
incur significant overhead if we insist that
errors encountered in
execution of a user program be detected and signaled, rather than being
allowed to kill the system or produce wrong answers. For example, an
out-of-bounds array reference can be detected by checking the validity
of the reference before performing it. The overhead of checking,
however, can be many times the cost of the array reference itself, and
a programmer should weigh speed against safety in determining whether
such a check is desirable. A good compiler should be able to produce
code with such checks, should avoid redundant checks, and should allow
programmers to control the extent and type of error checking in the
compiled code.
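The array-reference check described above might be sketched like this in JavaScript (checked_ref is a hypothetical helper, not part of the book's code):

```javascript
// A bounds check of the kind a safe compiler could emit around
// every array reference. The function name is hypothetical.
function checked_ref(array, index) {
    if (!Number.isInteger(index) || index < 0 || index >= array.length) {
        throw new Error("array index out of bounds: " + index);
    }
    return array[index];  // the reference itself is cheap;
                          // the comparisons above are the overhead
}
```

Every call pays for an integer test and two comparisons before the single memory access, which is the overhead the passage describes; a good compiler would elide the check where the index is provably in range.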
Compilers for popular languages, such as
C and C++,
put hardly any error-checking operations into
running code, so as to make things run as fast as possible. As a
result, it falls to programmers to explicitly provide error checking.
Unfortunately, people often neglect to do this, even in
critical applications where speed is not a constraint. Their programs
lead fast and dangerous lives. For example, the notorious
Worm
that paralyzed the Internet in 1988 exploited the
UNIX$^{\textrm{TM}}$
operating system's failure to check whether the input buffer has
overflowed in the finger daemon. (See
Spafford 1989.)
[6]
Of course, with either the interpretation or the
compilation strategy we must also implement for the new machine storage
allocation, input and output, and all the various operations that we took
as primitive in our discussion of
the evaluator and compiler. One strategy for minimizing work here is
to write as many of these operations as possible in
JavaScript
and then compile them for the new machine. Ultimately, everything reduces
to a small kernel (such as garbage collection and the mechanism for
applying actual machine primitives) that is hand-coded for the new
machine.
[7]
This strategy leads to amusing tests of correctness of
the compiler, such as checking
whether the compilation of a program on the new machine, using the
compiled compiler, is identical with the
compilation of the program on the original
JavaScript
system. Tracking down the source of differences is fun but often
frustrating, because the results are extremely sensitive to minuscule
details.