2.1.3 What Is Meant by Data? - SICP Comparison Edition

Permalink copied!

We began the rational-number implementation in section 2.1.1 by implementing the rational-number operations add-rat, add_rat, sub-rat, sub_rat, and so on in terms of three unspecified procedures: functions: make-rat, make_rat, numer, and denom. At that point, we could think of the operations as being defined in terms of data objects—numerators, denominators, and rational numbers—whose behavior was specified by the latter three procedures. functions.

But exactly what is meant by data? It is not enough to say whatever is implemented by the given selectors and constructors. Clearly, not every arbitrary set of three procedures functions can serve as an appropriate basis for the rational-number implementation. We need to guarantee that, if we construct a rational number x from a pair of integers n and d, then extracting the numer and the denom of x and dividing them should yield the same result as dividing n by d. In other words, make-rat, make_rat, numer, and denom must satisfy the condition that, for any integer n and any nonzero integer d, if x is (make-rat n d), make_rat(n, d), then

Original		JavaScript
\[ \begin{array}{lll} \dfrac{(\texttt{numer}~\texttt{x})}{(\texttt{denom}~\texttt{x})} &=& \dfrac{\texttt{n}}{\texttt{d}} \end{array} \]		\[ \begin{array}{lll} \dfrac{\texttt{numer}(\texttt{x})}{\texttt{denom}(\texttt{x})} &=& \dfrac{\texttt{n}}{\texttt{d}} \end{array} \]

In fact, this is the only condition make-rat, make_rat, numer, and denom must fulfill in order to form a suitable basis for a rational-number representation. In general, we can think of data as defined by some collection of selectors and constructors, together with specified conditions that these procedures functions must fulfill in order to be a valid representation.[1]

This point of view can serve to define not only high-level data objects, such as rational numbers, but lower-level objects as well. Consider the notion of a pair, which we used in order to define our rational numbers. We never actually said what a pair was, only that the language supplied procedures functions cons, pair, car, head, and cdr tail for operating on pairs. But the only thing we need to know about these three operations is that if we glue two objects together using cons pair we can retrieve the objects using car head and cdr. tail. That is, the operations satisfy the condition that, for any objects x and y, if z is (cons x y) pair(x, y) then (car z) head(z) is x and (cdr z) tail(z) is y. Indeed, we mentioned that these three procedures functions are included as primitives in our language. However, any triple of procedures functions that satisfies the above condition can be used as the basis for implementing pairs. This point is illustrated strikingly by the fact that we could implement cons, pair, car, head, and cdr tail without using any data structures at all but only using procedures. functions. Here are the definitions:[2]

Original	JavaScript
`(define (cons x y) (define (dispatch m) (cond ((= m 0) x) ((= m 1) y) (else (error "Argument not 0 or 1 -- CONS" m)))) dispatch) (define (car z) (z 0)) (define (cdr z) (z 1))`	`function pair(x, y) { function dispatch(m) { return m === 0 ? x : m === 1 ? y : error(m, "argument not 0 or 1 -- pair"); } return dispatch; } function head(z) { return z(0); } function tail(z) { return z(1); }`

This use of procedures functions corresponds to nothing like our intuitive notion of what data should be. Nevertheless, all we need to do to show that this is a valid way to represent pairs is to verify that these procedures functions satisfy the condition given above.

The subtle point to notice is that the value returned by (cons x y) pair(x, y) is a procedure—namely function—namely the internally defined procedure function dispatch, which takes one argument and returns either x or y depending on whether the argument is 0 or 1. Correspondingly, (car z) head(z) is defined to apply z to 0. Hence, if z is the procedure function formed by (cons x y), pair(x, y), then z applied to 0 will yield x. Thus, we have shown that (car (cons x y)) head(pair(x, y)) yields x, as desired. Similarly, (cdr (cons x y)) tail(pair(x, y)) applies the procedure function returned by (cons x y) pair(x, y) to 1, which returns y. Therefore, this procedural functional implementation of pairs is a valid implementation, and if we access pairs using only cons, pair, car, head, and cdr tail we cannot distinguish this implementation from one that uses real data structures.

The point of exhibiting the procedural functional representation of pairs is not that our language works this way (Scheme, and Lisp systems in general, implement pairs directly, for efficiency reasons) (an efficient implementation of pairs might use JavaScript's native vector data structure) but that it could work this way. The procedural functional representation, although obscure, is a perfectly adequate way to represent pairs, since it fulfills the only conditions that pairs need to fulfill. This example also demonstrates that the ability to manipulate procedures functions as objects automatically provides the ability to represent compound data. This may seem a curiosity now, but procedural functional representations of data will play a central role in our programming repertoire. This style of programming is often called message passing, and we will be using it as a basic tool in chapter 3 when we address the issues of modeling and simulation.

Exercise 2.4 Here is an alternative procedural functional representation of pairs. For this representation, verify that (car (cons x y)) head(pair(x, y)) yields x for any objects x and y.

Original	JavaScript
`(define (cons x y) (lambda (m) (m x y))) (define (car z) (z (lambda (p q) p)))`	`function pair(x, y) { return m => m(x, y); } function head(z) { return z((p, q) => p); }`

What is the corresponding definition of cdr? tail? (Hint: To verify that this works, make use of the substitution model of section 1.1.5.)

Original	JavaScript
	`function tail(z) { return z((p, q) => q); }`

Exercise 2.5 Show that we can represent pairs of nonnegative integers using only numbers and arithmetic operations if we represent the pair $a$ and $b$ as the integer that is the product $2^a 3^b$. Give the corresponding definitions of the procedures functions cons, pair, car, head, and cdr. tail.

Original	JavaScript
	`function pair(a, b) { return fast_expt(2, a) * fast_expt(3, b); } function head(p) { return p % 2 === 0 ? head(p / 2) + 1 : 0; } function tail(p) { return p % 3 === 0 ? tail(p / 3) + 1 : 0; }`

Exercise 2.6 In case representing pairs as procedures functions (exercise 2.4) wasn't mind-boggling enough, consider that, in a language that can manipulate procedures, functions, we can get by without numbers (at least insofar as nonnegative integers are concerned) by implementing 0 and the operation of adding 1 as

Original	JavaScript
`(define zero (lambda (f) (lambda (x) x))) (define (add-1 n) (lambda (f) (lambda (x) (f ((n f) x)))))`	`const zero = f => x => x; function add_1(n) { return f => x => f(n(f)(x)); }`

This representation is known as Church numerals, after its inventor, Alonzo Church, the logician who invented the $\lambda$ calculus.

Define one and two directly (not in terms of zero and add-1). add_1). (Hint: Use substitution to evaluate (add-1 zero)). add_1(zero)). Give a direct definition of the addition procedure + function plus (not in terms of repeated application of add-1). add_1).

Original	JavaScript
	`const one = f => x => f(x); const two = f => x => f(f(x)); function plus(n, m) { return f => x => n(f)(m(f)(x)); } // testing const three = plus(one, two); function church_to_number(c) { return c(n => n + 1)(0); } church_to_number(three);`

[1] Surprisingly, this idea is very difficult to formulate rigorously. There are two approaches to giving such a formulation. One, pioneered by C. A. R. Hoare (1972), is known as the method of abstract models. It formalizes the procedures plus conditions functions plus conditions specification as outlined in the rational-number example above. Note that the condition on the rational-number representation was stated in terms of facts about integers (equality and division). In general, abstract models define new kinds of data objects in terms of previously defined types of data objects. Assertions about data objects can therefore be checked by reducing them to assertions about previously defined data objects. Another approach, introduced by Zilles at MIT, by Goguen, Thatcher, Wagner, and Wright at IBM (see Thatcher, Wagner, and Wright 1978), and by Guttag at Toronto (see Guttag 1977), is called algebraic specification. It regards the procedures functions as elements of an abstract algebraic system whose behavior is specified by axioms that correspond to our conditions, and uses the techniques of abstract algebra to check assertions about data objects. Both methods are surveyed in the paper by Liskov and Zilles (1975).

[2] The function error introduced in section 1.3.3 takes as optional second argument a string that gets displayed before the first argument—for example, if m is 42: Error in line 7: argument not 0 or 1 -- pair: 42

< Previous

Next >

2.1.3 What Is Meant by Data?