In section 4.1.7 we modified our original metacircular interpreter to separate analysis from execution. We analyzed each expression component to produce an execution procedure function that took an environment as argument and performed the required operations. In our compiler, we will do essentially the same analysis. Instead of producing execution procedures, functions, however, we will generate sequences of instructions to be run by our register machine.
The procedure function compile is the top-level dispatch in the compiler. It corresponds to the evalevaluate procedure function of section 4.1.1, the analyze procedure function of section 4.1.7, and the eval-dispatch eval_dispatch entry point of the explicit-control-evaluator in section 5.4.1. The compiler, like the interpreters, uses the expression-syntax procedures component-syntax functions defined in section 4.1.2.[1] Compile The function compile performs a case analysis on the syntactic type of the expression component to be compiled. For each type of expression, component, it dispatches to a specialized code generator:
Original | JavaScript |
(define (compile exp target linkage) (cond ((self-evaluating? exp) (compile-self-evaluating exp target linkage)) ((quoted? exp) (compile-quoted exp target linkage)) ((variable? exp) (compile-variable exp target linkage)) ((assignment? exp) (compile-assignment exp target linkage)) ((definition? exp) (compile-definition exp target linkage)) ((if? exp) (compile-if exp target linkage)) ((lambda? exp) (compile-lambda exp target linkage)) ((begin? exp) (compile-sequence (begin-actions exp) target linkage)) ((cond? exp) (compile (cond->if exp) target linkage)) ((application? exp) (compile-application exp target linkage)) (else (error "Unknown expression type - - COMPILE" exp)))) | function compile(component, target, linkage) { return is_literal(component) ? compile_literal(component, target, linkage) : is_name(component) ? compile_name(component, target, linkage) : is_application(component) ? compile_application(component, target, linkage) : is_operator_combination(component) ? compile(operator_combination_to_application(component), target, linkage) : is_conditional(component) ? compile_conditional(component, target, linkage) : is_lambda_expression(component) ? compile_lambda_expression(component, target, linkage) : is_sequence(component) ? compile_sequence(sequence_statements(component), target, linkage) : is_block(component) ? compile_block(component, target, linkage) : is_return_statement(component) ? compile_return_statement(component, target, linkage) : is_function_declaration(component) ? compile(function_decl_to_constant_decl(component), target, linkage) : is_declaration(component) ? compile_declaration(component, target, linkage) : is_assignment(component) ? compile_assignment(component, target, linkage) : error(component, "unknown component type -- compile"); } |
Compile The function compile and the code generators that it calls take two arguments in addition to the expression component to compile. There is a target, which specifies the register in which the compiled code is to return the value of the expression. component. There is also a linkage descriptor, which describes how the code resulting from the compilation of the expression component should proceed when it has finished its execution. The linkage descriptor can require the code to do one of the following three things:
For example, compiling the expression literal 5 (which is self-evaluating) with a target of the val register and a linkage of next "next" should produce the instruction
Original | JavaScript |
(assign val (const 5)) | assign("val", constant(5)) |
Original | JavaScript |
(assign val (const 5)) (goto (reg continue)) | assign("val", constant(5)), go_to(reg("continue")) |
Original | JavaScript | |
Our compiler uses the "return" linkage when compiling
the return expression of a return statement.
Just as in the explicit-control evaluator, returning from a function call happens in three steps:
|
Each code generator returns an instruction sequence containing the object code it has generated for the expression. component. Code generation for a compound expression compound component is accomplished by combining the output from simpler code generators for component expressions, subcomponents, just as evaluation of a compound expression compound component is accomplished by evaluating the component expressions. subcomponents.
The simplest method for combining instruction sequences is a procedure function called append-instruction-sequences. append_instruction_sequences, It takes as arguments any number of instruction sequences which takes as arguments two instruction sequences that are to be executed sequentially; it sequentially. It appends them and returns the combined sequence. That is, if $seq_1$ and $seq_2$ are sequences of instructions, then evaluating
Original | JavaScript |
(append-instruction-sequences $seq_1$ $seq_2$) | append_instruction_sequences($seq$$_1$, $seq$$_2$) |
Original | JavaScript |
$seq_1$ $seq_2$ | $seq$$_1$ $seq$$_2$ |
Whenever registers might need to be saved, the compiler's code generators use preserving, which is a more subtle method for combining instruction sequences. Preserving The function preserving takes three arguments: a set of registers and two instruction sequences that are to be executed sequentially. It appends the sequences in such a way that the contents of each register in the set is preserved over the execution of the first sequence, if this is needed for the execution of the second sequence. That is, if the first sequence modifies the register and the second sequence actually needs the register's original contents, then preserving wraps a save and a restore of the register around the first sequence before appending the sequences. Otherwise, preserving simply returns the appended instruction sequences. Thus, for example,
Original | JavaScript |
(preserving (list $reg_1$ $reg_2$) $seq_1$ $seq_2$) | preserving(list($reg$$_1$, $reg$$_2$), $seq$$_1$, $seq$$_2$) |
Original | JavaScript | |
\[ \begin{array}{l|l|l|l} \textit{seq}_1 & \texttt{save(}\textit{reg}_1\texttt{),} & \texttt{save(}\textit{reg}_2\texttt{),} & \texttt{save(}\textit{reg}_2\texttt{),} \\ \textit{seq}_2 & \textit{seq}_1 & \textit{seq}_1 & \texttt{save(}\textit{reg}_1\texttt{),} \\ & \texttt{restore(}\textit{reg}_1\texttt{),} & \texttt{restore(}\textit{reg}_2\texttt{),} & \textit{seq}_1 \\ & \textit{seq}_2 & \textit{seq}_2 & \texttt{restore(}\textit{reg}_1\texttt{),} \\ & & & \texttt{restore(}\textit{reg}_2\texttt{),} \\ & & & \textit{seq}_2 \end{array} \] |
By using preserving to combine instruction sequences the compiler avoids unnecessary stack operations. This also isolates the details of whether or not to generate save and restore instructions within the preserving procedure, function, separating them from the concerns that arise in writing each of the individual code generators. In fact no save or restore instructions are explicitly produced by the code generators. generators, except that the code for calling a function saves continue and the code for returning from a function restores it: These corresponding save and restore instructions are explicitly generated by different calls to compile, not as a matched pair by preserving (as we will see in section 5.5.3).
In principle, we could represent an instruction sequence simply as a list of instructions. Append-instruction-sequences The function append_instruction_sequences could then combine instruction sequences by performing an ordinary list append. However, preserving would then be a complex operation, because it would have to analyze each instruction sequence to determine how the sequence uses its registers. Preserving The function preserving would be inefficient as well as complex, because it would have to analyze each of its instruction sequence arguments, even though these sequences might themselves have been constructed by calls to preserving, in which case their parts would have already been analyzed. To avoid such repetitious analysis we will associate with each instruction sequence some information about its register use. When we construct a basic instruction sequence we will provide this information explicitly, and the procedures functions that combine instruction sequences will derive register-use information for the combined sequence from the information associated with the sequences being combined.
An instruction sequence will contain three pieces of information:
Original | JavaScript |
(define (make-instruction-sequence needs modifies statements) (list needs modifies statements)) | function make_instruction_sequence(needs, modifies, instructions) { return list(needs, modifies, instructions); } |
For example, the two-instruction sequence that looks up the value of the variable x symbol "x" in the current environment, assigns the result to val, and then returns, and then proceeds to the continuation, requires registers env and continue to have been initialized, and modifies register val. This sequence would therefore be constructed as
Original | JavaScript |
(make-instruction-sequence '(env continue) '(val) '((assign val (op lookup-variable-value) (const x) (reg env)) (goto (reg continue)))) | make_instruction_sequence(list("env", "continue"), list("val"), list(assign("val", list(op("lookup_symbol_value"), constant("x"), reg("env"))), go_to(reg("continue")))); |
Original | JavaScript |
(f 'x 'y) ((f) 'x 'y) (f (g 'x) y) (f (g 'x) 'y) | f("x", "y") f()("x", "y") f(g("x"), y) f(g("x"), "y") |