5.5.3 Compiling <span style="color:green"> Combinations </span> <span style="color:blue"> Applications and Return Statements </span>

Permalink copied!

5.5.3 Compiling Combinations Applications and Return Statements

The essence of the compilation process is the compilation of procedure function applications. The code for a combination an application compiled with a given target and linkage has the form

Original	JavaScript
`$\langle compilation\ of\ operator,\ target$ proc$,\ linkage$ next$\rangle$ $\langle evaluate\ operands\ and\ construct\ argument\ list\ in$ argl$\rangle$ $\langle compilation\ of\ procedure\ call\ with\ given\ target\ and\ linkage\rangle$`	`$\langle{}$compilation of function expression, target fun, linkage "next"$\rangle$ $\langle{}$evaluate argument expressions and construct argument list in argl$\rangle$ $\langle{}$compilation of function call with given target and linkage$\rangle$`

The registers env, proc, fun, and argl may have to be saved and restored during evaluation of the operator and operands. function and argument expressions. Note that this is the only place in the compiler where a target other than val is specified.

The required code is generated by compile-application. compile_application. This recursively compiles the operator, function expression, to produce code that puts the procedure function to be applied into proc, fun, and compiles the operands, argument expressions, to produce code that evaluates the individual operands argument expressions of the application. The instruction sequences for the operands argument expressions are combined (by construct-arglist) construct_arglist) with code that constructs the list of arguments in argl, and the resulting argument-list code is combined with the procedure function code and the code that performs the procedure function call (produced by compile-procedure-call). compile_function_call). In appending the code sequences, the env register must be preserved around the evaluation of the operator function expression (since evaluating the operator function expression might modify env, which will be needed to evaluate the operands), argument expressions), and the proc fun register must be preserved around the construction of the argument list (since evaluating the operands argument expressions might modify proc, fun, which will be needed for the actual procedure function application). Continue The continue register must also be preserved throughout, since it is needed for the linkage in the procedure function call.

Original	JavaScript
`(define (compile-application exp target linkage) (let ((proc-code (compile (operator exp) 'proc 'next)) (operand-codes (map (lambda (operand) (compile operand 'val 'next)) (operands exp)))) (preserving '(env continue) proc-code (preserving '(proc continue) (construct-arglist operand-codes) (compile-procedure-call target linkage)))))`	`function compile_application(exp, target, linkage) { const fun_code = compile(function_expression(exp), "fun", "next"); const argument_codes = map(arg => compile(arg, "val", "next"), arg_expressions(exp)); return preserving(list("env", "continue"), fun_code, preserving(list("fun", "continue"), construct_arglist(argument_codes), compile_function_call(target, linkage))); }`

The code to construct the argument list will evaluate each operand argument expression into val and then cons that value onto the argument list being accumulated in argl. combine that value with the argument list being accumulated in argl using pair. Since we cons the arguments onto argl in sequence, adjoin the arguments to the front of argl in sequence, we must start with the last argument and end with the first, so that the arguments will appear in order from first to last in the resulting list. Rather than waste an instruction by initializing argl to the empty list to set up for this sequence of evaluations, we make the first code sequence construct the initial argl. The general form of the argument-list construction is thus as follows:

Original	JavaScript
`$\langle compilation\ of\ last\ operand,\ targeted\ to$ val$\rangle$ (assign argl (op list) (reg val)) $\langle compilation\ of\ next\ operand,\ targeted\ to$ val$\rangle$ (assign argl (op cons) (reg val) (reg argl)) $\ldots$ $\langle compilation\ of\ first\ operand,\ targeted\ to$ val$\rangle$ (assign argl (op cons) (reg val) (reg argl))`	`$\langle{}$compilation of last argument, targeted to val$\rangle$ assign("argl", list(op("list"), reg("val"))), $\langle{}$compilation of next argument, targeted to val$\rangle$ assign("argl", list(op("pair"), reg("val"), reg("argl"))), $\ldots$ $\langle{}$compilation of first argument, targeted to val$\rangle$ assign("argl", list(op("pair"), reg("val"), reg("argl"))),`

Argl must be preserved around each operand The argl register must be preserved around each argument evaluation except the first (so that arguments accumulated so far won't be lost), and env must be preserved around each operand evaluation except the last (for use by subsequent operand evaluations). argument evaluation except the last (for use by subsequent argument evaluations).

Compiling this argument code is a bit tricky, because of the special treatment of the first operand argument expression to be evaluated and the need to preserve argl and env in different places. The construct-arglist construct_arglist procedure function takes as arguments the code that evaluates the individual operands. argument expressions. If there are no operands argument expressions at all, it simply emits the instruction

Original	JavaScript
`(assign argl (const ()))`	`assign(argl, constant(null))`

Otherwise, construct-arglist construct_arglist creates code that initializes argl with the last argument, and appends code that evaluates the rest of the arguments and adjoins them to argl in succession. In order to process the arguments from last to first, we must reverse the list of operand argument code sequences from the order supplied by compile-application. compile_application.

Original

JavaScript

(define (construct-arglist operand-codes) (let ((operand-codes (reverse operand-codes))) (if (null? operand-codes) (make-instruction-sequence '() '(argl) '((assign argl (const ())))) (let ((code-to-get-last-arg (append-instruction-sequences (car operand-codes) (make-instruction-sequence '(val) '(argl) '((assign argl (op list) (reg val))))))) (if (null? (cdr operand-codes)) code-to-get-last-arg (preserving '(env) code-to-get-last-arg (code-to-get-rest-args (cdr operand-codes)))))))) (define (code-to-get-rest-args operand-codes) (let ((code-for-next-arg (preserving '(argl) (car operand-codes) (make-instruction-sequence '(val argl) '(argl) '((assign argl (op cons) (reg val) (reg argl))))))) (if (null? (cdr operand-codes)) code-for-next-arg (preserving '(env) code-for-next-arg (code-to-get-rest-args (cdr operand-codes))))))

Applying procedures functions

After evaluating the elements of a combination, function application, the compiled code must apply the procedure function in proc fun to the arguments in argl. The code performs essentially the same dispatch as the apply procedure function in the metacircular evaluator of section 4.1.1 or the apply-dispatch apply_dispatch entry point in the explicit-control evaluator of section 5.4.1. It checks whether the procedure function to be applied is a primitive procedure function or a compiled procedure. function. For a primitive procedure, function, it uses apply-primitive-procedure; apply_primitive_function; we will see shortly how it handles compiled procedures. functions. The procedure-application function-application code has the following form:

Original	JavaScript
`(test (op primitive-procedure?) (reg proc)) (branch (label primitive-branch)) compiled-branch $\langle code\ to\ apply\ compiled\ procedure\ with\ given\ target\ and\ appropriate\ linkage\rangle$ primitive-branch (assign $\langle target\rangle$ (op apply-primitive-procedure) (reg proc) (reg argl)) $\langle linkage \rangle$ after-call`	`$\texttt{ }\texttt{ }$test(list(op("primitive_function"), reg("fun"))), branch(label("primitive_branch")), "compiled_branch", $\langle{}$code to apply compiled function with given target and appropriate linkage$\rangle$ "primitive_branch", assign($target$, list(op("apply_primitive_function"), reg("fun"), reg("argl"))), $\langle{}$linkage$\rangle$ "after_call"`

Observe that the compiled branch must skip around the primitive branch. Therefore, if the linkage for the original procedure function call was next, "next", the compound branch must use a linkage that jumps to a label that is inserted after the primitive branch. (This is similar to the linkage used for the true branch in compile-if.) compile_conditional.)

Original

JavaScript

(define (compile-procedure-call target linkage) (let ((primitive-branch (make-label 'primitive-branch)) (compiled-branch (make-label 'compiled-branch)) (after-call (make-label 'after-call))) (let ((compiled-linkage (if (eq? linkage 'next) after-call linkage))) (append-instruction-sequences (make-instruction-sequence '(proc) '() `((test (op primitive-procedure?) (reg proc)) (branch (label ,primitive-branch)))) (parallel-instruction-sequences (append-instruction-sequences compiled-branch (compile-proc-appl target compiled-linkage)) (append-instruction-sequences primitive-branch (end-with-linkage linkage (make-instruction-sequence '(proc argl) (list target) `((assign ,target (op apply-primitive-procedure) (reg proc) (reg argl))))))) after-call))))

function compile_function_call(target, linkage) { const primitive_branch = make_label("primitive_branch"); const compiled_branch = make_label("compiled_branch"); const after_call = make_label("after_call"); const compiled_linkage = linkage === "next" ? after_call : linkage; return append_instruction_sequences( make_instruction_sequence(list("fun"), null, list(test(list(op("is_primitive_function"), reg("fun"))), branch(label(primitive_branch)))), append_instruction_sequences( parallel_instruction_sequences( append_instruction_sequences( compiled_branch, compile_fun_appl(target, compiled_linkage)), append_instruction_sequences( primitive_branch, end_with_linkage(linkage, make_instruction_sequence(list("fun", "argl"), list(target), list(assign( target, list(op("apply_primitive_function"), reg("fun"), reg("argl")))))))), after_call)); }

The primitive and compound branches, like the true and false branches in compile-if, compile_conditional, are appended using parallel-instruction-sequences parallel_instruction_sequences rather than the ordinary append-instruction-sequences, append_instruction_sequences, because they will not be executed sequentially.

Applying compiled procedures functions

The code that handles procedure The handling of function application and return is the most subtle part of the compiler, even though the instruction sequences it generates are very short. compiler. A compiled procedure function (as constructed by compile-lambda) compile_lambda_expression) has an entry point, which is a label that designates where the code for the procedure function starts. The code at this entry point computes a result in val and returns by executing the instruction (goto (reg continue)). and ends by executing the instructions from a compiled return statement.

Original		JavaScript
		The code for a compiled-function application uses the stack in the same way as the explicit-control evaluator (section 5.4.1): before jumping to the compiled function's entry point, it saves the continuation of the function call to the stack, followed by a mark that allows reverting the stack to the state right before the call with the continuation on top. `$\texttt{ }\texttt{ }$// set up for return from function save("continue"), push_marker_to_stack(), // jump to the function's entry point assign("val", list(op("compiled_function_entry"), reg("fun"))), go_to(reg("val")),` Compiling a return statement (with `compile_return_statement`) generates corresponding code for reverting the stack and restoring and jumping to `continue`. `$\texttt{ }\texttt{ }$revert_stack_to_marker(), restore("continue"), $\langle{}$evaluate the return expression and store the result in val$\rangle$ go_to(reg("continue")), // $\texttt{"return"}$-linkage code` Unless a function enters an infinite loop, it will end by executing the above return code, resulting from either a return statement in the program or one inserted by `compile_lambda_body` to return `undefined`.[1]

Original		JavaScript
Thus, we might expect the code for a compiled-procedure application (to be generated by `compile-proc-appl`) with a given target and linkage to look like this if the linkage is a label:		Straightforward code for a compiled-function application with a given target and linkage would set up `continue` to make the function return to a local label instead of to the final linkage, to copy the function value from `val` to the target register if necessary. It would look like this if the linkage is a label:

Original

JavaScript

(assign continue (label proc-return)) (assign val (op compiled-procedure-entry) (reg proc)) (goto (reg val)) proc-return (assign $target$ (reg val)) ; included if target is not $\texttt{val}$ (goto (label $\langle linkage\rangle$)) ; linkage code

$\texttt{ }\texttt{ }$assign("continue", label("fun_return")), // where function should return to save("continue"), // will be restored by the function push_marker_to_stack(), // allows the function to revert stack to find $\texttt{fun_return}$ assign("val", list(op("compiled_function_entry"), reg("fun"))), go_to(reg("val")), // eventually reverts stack, restores and jumps to $\texttt{continue}$ "fun_return", // the function returns to here assign($\mathit{target}$, reg("val")), // included if target is not $\texttt{val}$ go_to(label($linkage$)), // linkage code

or like this if the linkage is return. this—saving the caller's continuation at the start in order to restore and go to it at the end—if the linkage is "return" (that is, if the application is in a return statement and its value is the result to be returned):

Original

JavaScript

(save continue) (assign continue (label proc-return)) (assign val (op compiled-procedure-entry) (reg proc)) (goto (reg val)) proc-return (assign $\langle target\rangle$ (reg val)) ; included if target is not $\texttt{val}$ (restore continue) (goto (reg continue)) ; linkage code

$\texttt{ }\texttt{ }$save("continue"), // save the caller's continuation assign("continue", label("fun_return")), // where function should return to save("continue"), // will be restored by the function push_marker_to_stack(), // allows the function to revert stack to find $\texttt{fun_return}$ assign("val", list(op("compiled_function_entry"), reg("fun"))), go_to(reg("val")), // eventually reverts stack, restores and jumps to $\texttt{continue}$ "fun_return", // the function returns to here assign($\mathit{target}$, reg("val")), // included if target is not $\texttt{val}$ restore("continue"), // restore the caller's continuation go_to(reg("continue")), // linkage code

This code sets up continue so that the procedure function will return to a label proc-return fun_return and jumps to the procedure's function's entry point. The code at proc-return fun_return transfers the procedure's function's result from val to the target register (if necessary) and then jumps to the location specified by the linkage. (The linkage is always return "return" or a label, because compile-procedure-call compile_function_call replaces a next "next" linkage for the compound-procedure branch by an after-call compound-function branch by an after_call label.)

Original		JavaScript
		Before jumping to the function's entry point, we save `continue` and execute `push_marker_to_stack()` to enable the function to return to the intended location in the program with the expected stack. Matching `revert_stack_to_marker()` and `restore("continue")` instructions are generated by `compile_return_statement` for each return statement in the body of the function.[2]

In fact, if the target is not val, that is the above is exactly the code our compiler will generate.[3] Usually, however, the target is val (the only time the compiler specifies a different register is when targeting the evaluation of an operator a function expression to proc), fun), so the procedure function result is put directly into the target register and there is no need to return jump to a special location that copies it. Instead we simplify the code by setting up continue so that the procedure called function will return directly to the place specified by the caller's linkage:

Original	JavaScript
`$\langle set\ up$ continue $for\ linkage\rangle$ (assign val (op compiled-procedure-entry) (reg proc)) (goto (reg val))`	`$\langle{}$set up continue for linkage and push the marker$\rangle$ assign("val", list(op("compiled_function_entry"), reg("fun"))), go_to(reg("val")),`

If the linkage is a label, we set up continue so that the procedure function will return tocontinue at that label. (That is, the (goto (reg continue)) go_to(reg("continue")) the procedure called function ends with becomes equivalent to the (goto (label linkage)) go_to(label($linkage$)) at proc-return fun_return above.)

Original	JavaScript
`(assign continue (label linkage)) (assign val (op compiled-procedure-entry) (reg proc)) (goto (reg val))`	`assign("continue", label($linkage$)), save("continue"), push_marker_to_stack(), assign("val", list(op("compiled_function_entry"), reg("fun"))), go_to(reg("val")),`

If the linkage is return, "return", we don't need to set up continue at all: assign continue: It already holds the desired location. (That is, the (goto (reg continue)) go_to(reg("continue")) the procedure called function ends with goes directly to the place where the (goto (reg continue)) go_to(reg("continue")) at proc-return fun_{\hspace{0pt}}return would have gone.)

Original	JavaScript
`(assign val (op compiled-procedure-entry) (reg proc)) (goto (reg val))`	`save("continue"), push_marker_to_stack(), assign("val", list(op("compiled_function_entry"), reg("fun"))), go_to(reg("val")),`

With this implementation of the return "return" linkage, the compiler generates tail-recursive code. Calling a procedure as the final step in a procedure body A function call in a return statement whose value is the result to be returned does a direct transfer, without saving any information on the stack. unnecessary information on the stack.

Suppose instead that we had handled the case of a procedure function call with a linkage of return"return" and a target of val in the same way as for a non-val target. This would destroy tail recursion. Our system would still give return the same value for any expression. function call. But each time we called a procedure, function, we would save continue and return after the call to undo the (useless) save. These extra saves would accumulate during a nest of procedure function calls.[4]

Compile-proc-appl The function compile_fun_appl generates the above procedure-application function-application code by considering four cases, depending on whether the target for the call is val and whether the linkage is return. "return". Observe that the instruction sequences are declared to modify all the registers, since executing the procedure function body can change the registers in arbitrary ways.[5]

Original		JavaScript
Also note that the code sequence for the case with target `val` and linkage `return` is declared to need `continue`: Even though `continue` is not explicitly used in the two-instruction sequence, we must be sure that `continue` will have the correct value when we enter the compiled procedure.

Original

JavaScript

(define (compile-proc-appl target linkage) (cond ((and (eq? target 'val) (not (eq? linkage 'return))) (make-instruction-sequence '(proc) all-regs `((assign continue (label ,linkage)) (assign val (op compiled-procedure-entry) (reg proc)) (goto (reg val))))) ((and (not (eq? target 'val)) (not (eq? linkage 'return))) (let ((proc-return (make-label 'proc-return))) (make-instruction-sequence '(proc) all-regs `((assign continue (label ,proc-return)) (assign val (op compiled-procedure-entry) (reg proc)) (goto (reg val)) ,proc-return (assign ,target (reg val)) (goto (label ,linkage)))))) ((and (eq? target 'val) (eq? linkage 'return)) (make-instruction-sequence '(proc continue) all-regs '((assign val (op compiled-procedure-entry) (reg proc)) (goto (reg val))))) ((and (not (eq? target 'val)) (eq? linkage 'return)) (error "return linkage, target not val - - COMPILE" target))))

function compile_fun_appl(target, linkage) { const fun_return = make_label("fun_return"); return target === "val" && linkage !== "return" ? make_instruction_sequence(list("fun"), all_regs, list(assign("continue", label(linkage)), save("continue"), push_marker_to_stack(), assign("val", list(op("compiled_function_entry"), reg("fun"))), go_to(reg("val")))) : target !== "val" && linkage !== "return" ? make_instruction_sequence(list("fun"), all_regs, list(assign("continue", label(fun_return)), save("continue"), push_marker_to_stack(), assign("val", list(op("compiled_function_entry"), reg("fun"))), go_to(reg("val")), fun_return, assign(target, reg("val")), go_to(label(linkage)))) : target === "val" && linkage === "return" ? make_instruction_sequence(list("fun", "continue"), all_regs, list(save("continue"), push_marker_to_stack(), assign("val", list(op("compiled_function_entry"), reg("fun"))), go_to(reg("val")))) : // $\texttt{target !== "val" \&\& linkage === "return"}$ error(target, "return linkage, target not val -- compile"); }

Original		JavaScript
		We have shown how to generate tail-recursive linkage code for a function application when the linkage is `"return"`—that is, when the application is in a return statement and its value is the result to be returned. Similarly, as explained in section 5.4.1, the stack-marker mechanism used here (and in the explicit-control evaluator) for the call and return produces tail-recursive behavior only in that situation. These two aspects of the code generated for function application combine to ensure that when a function ends by returning the value of a function call, no stack accumulates.

Original		JavaScript
		Compiling return statements The code for a return statement takes the following form, regardless of the given linkage and target: revert_stack_to_marker(), restore("continue"), // saved by $\texttt{compile\char`_fun\char`_appl}$ $\langle{}$evaluate the return expression and store the result in val$\rangle$ go_to(reg("continue")) // $\texttt{"return"}$-linkage code The instructions to revert the stack using the marker and then restore `continue` correspond to the instructions generated by `compile_fun_appl` to save `continue` and mark the stack. The final jump to `continue` is generated by the use of the `"return"` linkage when compiling the return expression. The function `compile_return_statement` is different from all other code generators in that it ignores the target and linkage arguments—it always compiles the return expression with target `val` and linkage `"return"`. `function compile_return_statement(stmt, target, linkage) { return append_instruction_sequences( make_instruction_sequence(null, list("continue"), list(revert_stack_to_marker(), restore("continue"))), compile(return_expression(stmt), "val", "return")); }`

[1] Because the execution of a function body always ends with a return, there is no need here for a mechanism like the return_undefined entry point from section 5.4.1.

[2] Elsewhere in the compiler, all saves and restores of registers are generated by preserving to preserve a register's value across a sequence of instructions by saving it before those instructions and restoring it after—for example over the evaluation of the predicate of a conditional. But this mechanism cannot generate instructions to save and restore continue for a function application and the corresponding return, because these are compiled separately and are not contiguous. Instead, these saves and restores must be explicitly generated by compile_fun_appl and compile_return_statement.

[3] Actually, we signal an error when the target is not val and the linkage is return, "return", since the only place we request returnlinkages a "return" linkage is in compiling procedures, return expressions, and our convention is that procedures functions return their values in val.

[4] Making a compiler generate tail-recursive code might seem like a straightforward idea. But code is desirable, especially in the functional paradigm. However, compilers for common languages, including C and Pascal, C and C++, do not always do this, and therefore these languages cannot represent iterative processes in terms of procedure function call alone. The difficulty with tail recursion in these languages is that their implementations use the stack to store procedure function arguments and local variables names as well as return addresses. The Scheme JavaScript implementations described in this book store arguments and variables names in memory to be garbage-collected. The reason for using the stack for variables names and arguments is that it avoids the need for garbage collection in languages that would not otherwise require it, and is generally believed to be more efficient. Sophisticated Lisp compilers can, in fact, use the stack for arguments without destroying tail recursion. (See Hanson 1990 for a description.) There is also some debate about whether stack allocation is actually more efficient than garbage collection in the first place, but the details seem to hinge on fine points of computer architecture. (See Appel 1987 and Miller and Rozas 1994 for opposing views on this issue.)

[5] The variable constant all-regs all_regs is bound to the list of names of all the registers:

Original	JavaScript
`(define all-regs '(env proc val argl continue))`	`const all_regs = list("env", "fun", "val", "argl", "continue");`

< Previous

Next >

5.5.3 Compiling Combinations Applications and Return Statements

Applying procedures functions

Applying compiled procedures functions

Compiling return statements