We have not yet explained how to load compiled code into the evaluator
machine or how to run it. We will assume that the explicit-control-evaluator
machine has been defined as in
section 5.4.4, with the additional
operations specified in footnote 5 (section 5.5.2).
We will implement a function compile_and_go that compiles a
JavaScript program,
loads the resulting object code into the evaluator machine,
and causes the machine to run the code in the
evaluator global environment, print the result, and
enter the evaluator's driver loop. We will also modify the evaluator
so that interpreted
components
can call compiled
functions
as well as interpreted ones. We can then put a compiled
function
into the machine and use the
evaluator to call it:
compile_and_go(parse(`
function factorial(n) {
    return n === 1
           ? 1
           : factorial(n - 1) * n;
}
`));

EC-evaluate value:
undefined
To allow the evaluator to handle compiled
functions
(for example,
to evaluate the call to factorial above),
we need to change the code at
apply_dispatch
(section 5.4.1) so that it
recognizes compiled
functions
(as distinct from compound or primitive
functions)
and transfers control directly to the entry point of the
compiled code:[1]
At
compiled_apply, as at
compound_apply, we push a marker to the stack
so that a return statement in the compiled function
can revert the stack to this state.
Note that there is no save of
continue at
compiled_apply before
the marking of the stack, because the evaluator was
arranged so that at
apply_dispatch, the
continuation would be at the top of the stack.
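Concretely, the new dispatch case might read as follows. This is a sketch, not the book's verbatim controller text; the operation names is_compiled_function and compiled_function_entry are assumed by analogy with the other function operations:

```javascript
"apply_dispatch",
    test(list(op("is_primitive_function"), reg("fun"))),
    branch(label("primitive_apply")),
    test(list(op("is_compound_function"), reg("fun"))),
    branch(label("compound_apply")),
    test(list(op("is_compiled_function"), reg("fun"))),
    branch(label("compiled_apply")),
    go_to(label("unknown_function_type")),

"compiled_apply",
    push_marker_to_stack(),  // no save of continue: it is already on the stack
    assign("val", list(op("compiled_function_entry"), reg("fun"))),
    go_to(reg("val")),
```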
To enable us to run some compiled code when we start the evaluator
machine, we add a branch instruction at
the beginning of the evaluator machine, which causes the machine to
go to a new entry point if the flag register
is set.[2]
branch(label("external_entry")), // branches if flag is set
"read_evaluate_print_loop",
perform(list(op("initialize_stack"))),
...
The code at external_entry
assumes that the machine is started with val
containing the location of an instruction sequence that puts a result into
val and ends with
go_to(reg("continue")).
Starting at this entry point jumps to the location designated
by val, but first assigns
continue so that execution will return to
print_result,
which prints the value in val and then goes to
the beginning of the evaluator's
read-evaluate-print
loop.[3]
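In outline, the code at external_entry might look like this sketch (print_result and the stack and environment operations are assumed from the evaluator of section 5.4):

```javascript
"external_entry",
    perform(list(op("initialize_stack"))),
    assign("env", list(op("get_global_environment"))),
    assign("continue", label("print_result")),
    go_to(reg("val")),
```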
Now we can use the following
function
to compile a
function declaration,
execute the compiled code, and run the
read-evaluate-print
loop so
we can try the
function.
Because we want the compiled code to
proceed
to the location in
continue with its result in
val, we compile the
program
with a target of val and a
linkage of
"return".
In order to transform the
object code produced by the compiler into executable instructions
for the evaluator register machine, we use the
function assemble from the
register-machine simulator
(section 5.2.2).
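The function compile_and_go might be sketched as follows. It relies on compile and assemble from earlier sections and the evaluator machine eceval; the helpers scan_out_declarations and list_of_unassigned follow section 4.1's conventions, and details may differ from the full implementation:

```javascript
function compile_and_go(program) {
    // compile with target val and linkage "return", then assemble
    const instructions =
        assemble(statements(
                     compile(program, "val", "return")),
                 eceval);
    // bind the program's top-level names to "*unassigned*";
    // the compiled code will assign them the correct values
    const names = scan_out_declarations(program);
    const unassigneds = list_of_unassigned(names);
    the_global_environment = extend_environment(
                                 names, unassigneds,
                                 the_global_environment);
    // hand the object code to the machine and enter at external_entry
    set_register_contents(eceval, "val", instructions);
    set_register_contents(eceval, "flag", true);
    return start(eceval);
}
```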
For the interpreted program to refer to the names that
are declared at top level in the compiled program, we
scan out the top-level names and
extend the global environment by binding these names to
"*unassigned*",
knowing that the compiled code will assign them
the correct values.
We then initialize
the val register to point to the list
of instructions, set the
flag so that the evaluator will go to
external_entry,
and start the evaluator.
EC-evaluate input:
factorial(5);
total pushes = 36
maximum depth = 14
EC-evaluate value:
120
Compare
this example with the evaluation of
factorial(5)
using the interpreted version of the same
function,
shown at the end of section 5.4.4.
The interpreted version required 151 pushes and a maximum stack depth of 28.
This illustrates the optimization that results from our compilation strategy.
Interpretation and compilation
With the programs in this section, we can now experiment with the
alternative execution strategies of interpretation and
compilation.[4]
An interpreter raises the machine to the level of the user program; a
compiler lowers the user program to the level of the machine language.
We can regard the
JavaScript
language (or any programming language) as a
coherent family of abstractions erected on the machine language.
Interpreters are good for interactive program development and
debugging because the steps of program execution are organized in
terms of these abstractions, and are therefore more intelligible
to the programmer.
Compiled code can execute faster, because the steps of program execution
are organized in terms of the machine language, and the compiler is free
to make optimizations that cut across the higher-level
abstractions.[5]
The alternatives of interpretation and compilation also lead to
different strategies for
porting languages to new computers. Suppose
that we wish to implement
JavaScript
for a new machine. One strategy is
to begin with the explicit-control evaluator of
section 5.4
and translate its instructions to instructions for the
new machine. A different strategy is to begin with the compiler and
change the code generators so that they generate code for the new
machine. The second strategy allows us to run any
JavaScript
program on the new machine by first compiling it with the compiler running
on our
original JavaScript system, and linking it with a compiled version of the runtime
library.[6] Better yet, we can compile the compiler itself, and run
this on the new machine to compile other
JavaScript programs.[7] Or we can compile one of the interpreters of
section 4.1 to produce an interpreter that
runs on the new machine.
Exercise 5.51
By
comparing the stack operations used by compiled code to the stack
operations used by the evaluator for the same computation, we can
determine the extent to which the compiler optimizes use of the stack,
both in speed (reducing the total number of stack operations) and in
space (reducing the maximum stack depth). Comparing this optimized
stack use to the performance of a special-purpose machine for the same
computation gives some indication of the quality of the compiler.
Exercise 5.30 asked you to determine, as a
function of $n$, the number of pushes and
the maximum stack depth needed by the evaluator to compute
$n!$ using the recursive factorial
function
given above. Exercise 5.14 asked you
to do the same measurements for the special-purpose factorial machine
shown in figure 5.13. Now perform the
same analysis using the compiled factorial function.
Take the ratio of the number of pushes in the compiled version to the
number of pushes in the interpreted version, and do the same for the
maximum stack depth. Since the number of operations and the stack
depth used to compute $n!$ are linear in
$n$, these ratios should
approach constants as $n$ becomes large.
What are these constants? Similarly, find the ratios of the stack usage
in the special-purpose machine to the usage in the interpreted version.
Compare the ratios for special-purpose versus interpreted code to the
ratios for compiled versus interpreted code. You should find that the
special-purpose machine is much more efficient than the compiled code, since
the hand-tailored controller code should be much better than what is
produced by our rudimentary general-purpose compiler.
Can you suggest improvements to the compiler that would help it
generate code that would come closer in performance to the
hand-tailored version?
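As a sanity check on the limiting behavior: the quotient of two linear functions of n tends to the ratio of their leading coefficients. A tiny sketch with placeholder coefficients (these are not measured push counts; the exercise asks you to determine the real ones):

```javascript
// Quotient of two linear cost models tends to the ratio of the
// leading coefficients. The coefficients below are placeholders,
// not measured values from the evaluator or compiler.
function linear(a, b) {
    return n => a * n + b;      // cost(n) = a * n + b
}

const interpreted_pushes = linear(32, 10);  // assumed model
const compiled_pushes = linear(6, 2);       // assumed model

function push_ratio(n) {
    return compiled_pushes(n) / interpreted_pushes(n);
}

// push_ratio(n) approaches 6/32 = 0.1875 as n grows
console.log(push_ratio(10));
console.log(push_ratio(1000000));
```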
There is currently no solution available for this exercise. This textbook adaptation is a community effort. Do consider contributing by providing a solution for this exercise, using a Pull Request in Github.
Exercise 5.52
Carry out an analysis like the one in
exercise 5.51 to determine the
effectiveness of compiling the tree-recursive
Fibonacci
function
function fib(n) {
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}
compared to the effectiveness of using the special-purpose Fibonacci machine
of figure 5.15. (For measurement of the
interpreted performance, see exercise 5.31.)
For Fibonacci, the time resource used is not linear in
$n$; hence the ratios of stack operations will not
approach a limiting value that is independent of
$n$.
Exercise 5.53
This section described how to modify the explicit-control evaluator so
that interpreted code can call compiled
functions.
Show how to modify the compiler so that compiled
functions
can call not only primitive
functions
and compiled
functions,
but interpreted
functions
as well. This requires modifying
compile_function_call
to handle the case of compound (interpreted)
functions.
Be sure to handle all the same target and
linkage combinations
as in
compile_fun_appl.
To do the actual
function
application,
the code needs to jump to the evaluator's
compound_apply
entry point. This label cannot be directly referenced in object code
(since the assembler requires that all labels referenced by the
code it is assembling be defined there), so we will add a register
called compapp to the evaluator machine to
hold this entry point, and add an instruction to initialize it:
assign("compapp", label("compound_apply")),
branch(label("external_entry")), // branches if flag is set
"read_evaluate_print_loop",
...
To test your code, start by
declaring
a
function
f that calls a
function
g. Use
compile_and_go
to compile the
declaration
of f
and start the evaluator. Now, typing at the evaluator,
declare
g and try to call
f.
Exercise 5.54
The
compile_and_go
interface implemented in this section is
awkward, since the compiler can be called only once (when the
evaluator machine is started). Augment the compiler–interpreter
interface by providing a
compile_and_run
primitive that can be called from within the explicit-control evaluator
as follows:
EC-evaluate input:
compile_and_run(parse(`
function factorial(n) {
    return n === 1
           ? 1
           : factorial(n - 1) * n;
}
`));

EC-evaluate value:
undefined
Exercise 5.55
As an alternative to using the explicit-control evaluator's
read-evaluate-print
loop, design a register machine that performs a
read-compile-execute-print loop. That is, the machine should run a
loop that reads
an expression,
a program,
compiles it, assembles and
executes the resulting code, and prints the result. This is easy to
run in our simulated setup, since we can arrange to call the
functions compile and
assemble as register-machine
operations.
Exercise 5.56
Use the compiler to compile the
metacircular evaluator of
section 4.1 and run this program using the
register-machine simulator.
Because the parser takes a string as input, you will need to
convert the program into a string. The simplest way to do this is
to use the back quotes (`),
as we have done for the example inputs to
compile_and_go and
compile_and_run.
The resulting interpreter will run very slowly because of the multiple
levels of interpretation, but getting all the details to work is an
instructive exercise.
Exercise 5.57
Develop a rudimentary implementation of
JavaScript
in
C (or some other low-level language of your choice) by translating the
explicit-control evaluator of section 5.4
into C. In order to run this code you will need to also
provide appropriate storage-allocation routines and other runtime
support.
Exercise 5.58
As a counterpoint to exercise 5.57, modify
the compiler so that it compiles
JavaScript functions
into sequences of
C instructions. Compile the metacircular evaluator of
section 4.1 to produce a
JavaScript
interpreter written in C.
[1]
Of course, compiled
functions
as well as interpreted
functions
are compound (nonprimitive). For compatibility with the terminology used
in the explicit-control evaluator, in this section we will use
compound to mean interpreted (as opposed to
compiled).
[2]
Now that the evaluator machine starts
with a branch, we must always initialize the
flag register before starting the evaluator
machine. To start the machine at its ordinary
read-evaluate-print
loop, we
could use
function start_eceval() {
    set_register_contents(eceval, "flag", false);
    return start(eceval);
}
[3]
Since
a compiled
function
is an object that the system may try to print, we also modify the system
print operation
user_print
(from section 4.1.4) so that it will not
attempt to print the components of a compiled
function:
function user_print(string, object) {
    function prepare(object) {
        return is_compound_function(object)
               ? "< compound function >"
               : is_primitive_function(object)
               ? "< primitive function >"
               : is_compiled_function(object)
               ? "< compiled function >"
               : is_pair(object)
               ? pair(prepare(head(object)),
                      prepare(tail(object)))
               : object;
    }
    display(string + " " + stringify(prepare(object)));
}
[4]
We can do even better by extending the compiler
to allow compiled code to call interpreted
functions.
See exercise 5.53.
[5]
Independent of the strategy of execution, we
incur significant overhead if we insist that
errors encountered in
execution of a user program be detected and signaled, rather than being
allowed to kill the system or produce wrong answers. For example, an
out-of-bounds array reference can be detected by checking the validity
of the reference before performing it. The overhead of checking,
however, can be many times the cost of the array reference itself, and
a programmer should weigh speed against safety in determining whether
such a check is desirable. A good compiler should be able to produce
code with such checks, should avoid redundant checks, and should allow
programmers to control the extent and type of error checking in the
compiled code.
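The array-reference check described above might be sketched like this in JavaScript (checked_ref is a hypothetical helper, not part of the book's code):

```javascript
// A bounds check of the kind a safe compiler could emit around
// every array reference. The function name is hypothetical.
function checked_ref(array, index) {
    if (!Number.isInteger(index) || index < 0 || index >= array.length) {
        throw new Error("array index out of bounds: " + index);
    }
    return array[index];  // the reference itself is cheap;
                          // the comparisons above are the overhead
}
```

Every call pays for an integer test and two comparisons before the single memory access, which is the overhead the passage describes; a good compiler would elide the check where the index is provably in range.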
Compilers for popular languages, such as
C and C++,
put hardly any error-checking operations into
running code, so as to make things run as fast as possible. As a
result, it falls to programmers to explicitly provide error checking.
Unfortunately, people often neglect to do this, even in
critical applications where speed is not a constraint. Their programs
lead fast and dangerous lives. For example, the notorious
Worm
that paralyzed the Internet in 1988 exploited the
UNIX$^{\textrm{TM}}$
operating system's failure to check whether the input buffer has
overflowed in the finger daemon. (See
Spafford 1989.)
[6]
Of course, with either the interpretation or the
compilation strategy we must also implement for the new machine storage
allocation, input and output, and all the various operations that we took
as primitive in our discussion of
the evaluator and compiler. One strategy for minimizing work here is
to write as many of these operations as possible in
JavaScript
and then compile them for the new machine. Ultimately, everything reduces
to a small kernel (such as garbage collection and the mechanism for
applying actual machine primitives) that is hand-coded for the new
machine.
[7]
This strategy leads to amusing tests of correctness of
the compiler, such as checking
whether the compilation of a program on the new machine, using the
compiled compiler, is identical with the
compilation of the program on the original
JavaScript
system. Tracking down the source of differences is fun but often
frustrating, because the results are extremely sensitive to minuscule
details.