The manipulation of symbolic algebraic expressions is a complex
process that illustrates many of the hardest problems that occur in
the design of large-scale systems. An
algebraic expression, in
general, can be viewed as a hierarchical structure, a tree of
operators applied to operands. We can construct algebraic expressions
by starting with a set of primitive objects, such as constants and
variables, and combining these by means of algebraic operators, such
as addition and multiplication. As in other languages, we form
abstractions that enable us to refer to compound objects in simple
terms. Typical abstractions in symbolic algebra are ideas such as
linear combination, polynomial, rational function, or trigonometric
function. We can regard these as compound types,
which are
often useful for directing the processing of expressions. For example, we
could describe the expression
\[ x^{2}\, \sin (y^2+1)+x\, \cos 2y+\cos (y^3 -2y^2) \]
as a polynomial in $x$ with coefficients that
are trigonometric functions of polynomials in
$y$ whose coefficients are integers.
We will not attempt to develop a complete algebraic-manipulation system here. Such systems are exceedingly complex programs, embodying deep algebraic knowledge and elegant algorithms. What we will do is look at a simple but important part of algebraic manipulation: the arithmetic of polynomials. We will illustrate the kinds of decisions the designer of such a system faces, and how to apply the ideas of abstract data and generic operations to help organize this effort.
Our first task in designing a system for performing arithmetic on polynomials is to decide just what a polynomial is. Polynomials are normally defined relative to certain variables (the indeterminates of the polynomial). For simplicity, we will restrict ourselves to polynomials having just one indeterminate (univariate polynomials).[1] We will define a polynomial to be a sum of terms, each of which is either a coefficient, a power of the indeterminate, or a product of a coefficient and a power of the indeterminate. A coefficient is defined as an algebraic expression that is not dependent upon the indeterminate of the polynomial. For example, \[ 5x^2 +3x +7 \] is a simple polynomial in $x$, and \[ (y^2 +1)x^3 +(2y)x+1 \] is a polynomial in $x$ whose coefficients are polynomials in $y$.
Already we are skirting some thorny issues. Is the first of these
polynomials the same as the polynomial
$5y^2 +3y +7$, or not? A reasonable answer
might be yes, if we are considering a polynomial purely as a
mathematical function, but no, if we are considering a polynomial to be a
syntactic form.
The second polynomial is algebraically equivalent
to a polynomial in $y$ whose coefficients are
polynomials in $x$. Should our system recognize
this, or not? Furthermore, there are other ways to represent a
polynomial—for example, as a product of factors, or (for a
univariate polynomial) as the set of roots, or as a listing of the values
of the polynomial at a specified set of points.[2]
We can finesse these questions by deciding that in our
algebraic-manipulation system a polynomial
will be a
particular syntactic form, not its underlying mathematical meaning.
Now we must consider how to go about doing arithmetic on polynomials. In this simple system, we will consider only addition and multiplication. Moreover, we will insist that two polynomials to be combined must have the same indeterminate.
We will approach the design of our system by following the familiar discipline of data abstraction. We will represent polynomials using a data structure called a poly, which consists of a variable and a collection of terms. We assume that we have selectors variable and term-list term_list that extract those parts from a poly and a constructor make-poly make_poly that assembles a poly from a given variable and a term list. A variable will be just a symbol, string, so we can use the same-variable? is_same_variable procedure function of section 2.3.2 to compare variables. The following procedures functions define addition and multiplication of polys:
Original | JavaScript |
(define (add-poly p1 p2) (if (same-variable? (variable p1) (variable p2)) (make-poly (variable p1) (add-terms (term-list p1) (term-list p2))) (error "Polys not in same var - - ADD-POLY" (list p1 p2)))) (define (mul-poly p1 p2) (if (same-variable? (variable p1) (variable p2)) (make-poly (variable p1) (mul-terms (term-list p1) (term-list p2))) (error "Polys not in same var - - MUL-POLY" (list p1 p2)))) | function add_poly(p1, p2) { return is_same_variable(variable(p1), variable(p2)) ? make_poly(variable(p1), add_terms(term_list(p1), term_list(p2))) : error(list(p1, p2), "polys not in same var -- add_poly"); } function mul_poly(p1, p2) { return is_same_variable(variable(p1), variable(p2)) ? make_poly(variable(p1), mul_terms(term_list(p1), term_list(p2))) : error(list(p1, p2), "polys not in same var -- mul_poly"); } |
To incorporate polynomials into our generic arithmetic system, we need to supply them with type tags. We'll use the tag polynomial, "polynomial", and install appropriate operations on tagged polynomials in the operation table.
Original | JavaScript | |
We'll embed all our code in an installation procedure for the polynomial package, similar to the ones in section 2.5.1: (define (install-polynomial-package) ;; internal procedures ;; representation of poly (define (make-poly variable term-list) (cons variable term-list)) (define (variable p) (car p)) (define (term-list p) (cdr p)) $langle$ procedures same-variable? and variable? from section 2.3.2 $\rangle$ ;; representation of terms and term lists $langle$ procedures adjoin-term $\ldots$ coeff from text below $langle$ (define (add-poly p1 p2) $\ldots$) $langle$ procedures used by add-poly $langle$ (define (mul-poly p1 p2) $\ldots$) $langle$procedures used by mul-poly $langle$ ;; interface to rest of the system (define (tag p) (attach-tag 'polynomial p)) (put 'add '(polynomial polynomial) (lambda (p1 p2) (tag (add-poly p1 p2)))) (put 'mul '(polynomial polynomial) (lambda (p1 p2) (tag (mul-poly p1 p2)))) (put 'make 'polynomial (lambda (var terms) (tag (make-poly var terms)))) 'done) | We'll embed all our code in an installation function for the polynomial package, similar to the installation functions in section 2.5.1: function install_polynomial_package() { // internal functions // representation of poly function make_poly(variable, term_list) { return pair(variable, term_list); } function variable(p) { return head(p); } function term_list(p) { return tail(p); } $\langle{}$functions is_same_variable and is_variable from section 2.3.2$\rangle$ // representation of terms and term lists $\langle{}$functions adjoin_term...coeff from text below$\rangle$ function add_poly(p1, p2) { ... } $\langle{}$functions used by add_poly$\rangle$ function mul_poly(p1, p2) { ... } $\langle{}$functions used by mul_poly$\rangle$ // interface to rest of the system function tag(p) { return attach_tag("polynomial", p); } put("add", list("polynomial", "polynomial"), (p1, p2) => tag(add_poly(p1, p2))); put("mul", list("polynomial", "polynomial"), (p1, p2) => tag(mul_poly(p1, p2))); put("make", "polynomial", (variable, terms) => tag(make_poly(variable, terms))); return "done"; } |
Polynomial addition is performed termwise. Terms of the same order (i.e., with the same power of the indeterminate) must be combined. This is done by forming a new term of the same order whose coefficient is the sum of the coefficients of the addends. Terms in one addend for which there are no terms of the same order in the other addend are simply accumulated into the sum polynomial being constructed.
In order to manipulate term lists, we will assume that we have a constructor the-empty-termlist the_empty_termlist that returns an empty term list and a constructor adjoin-term adjoin_term that adjoins a new term to a term list. We will also assume that we have a predicate empty-termlist? is_empty_termlist that tells if a given term list is empty, a selector first-term first_term that extracts the highest-order term from a term list, and a selector rest-terms rest_terms that returns all but the highest-order term. To manipulate terms, we will suppose that we have a constructor make-term make_term that constructs a term with given order and coefficient, and selectors order and coeff that return, respectively, the order and the coefficient of the term. These operations allow us to consider both terms and term lists as data abstractions, whose concrete representations we can worry about separately.
Here is the procedure function that constructs the term list for the sum of two polynomials:polynomials;[3]
Original | JavaScript | |
note that we slightly extend the syntax of conditional statements described in section 1.3.2 by admitting another conditional statement in place of the block following else: |
Original | JavaScript |
(define (add-terms L1 L2) (cond ((empty-termlist? L1) L2) ((empty-termlist? L2) L1) (else (let ((t1 (first-term L1)) (t2 (first-term L2))) (cond ((> (order t1) (order t2)) (adjoin-term t1 (add-terms (rest-terms L1) L2))) ((< (order t1) (order t2)) (adjoin-term t2 (add-terms L1 (rest-terms L2)))) (else (adjoin-term (make-term (order t1) (add (coeff t1) (coeff t2))) (add-terms (rest-terms L1) (rest-terms L2))))))))) | function add_terms(L1, L2) { if (is_empty_termlist(L1)) { return L2; } else if (is_empty_termlist(L2)) { return L1; } else { const t1 = first_term(L1); const t2 = first_term(L2); return order(t1) > order(t2) ? adjoin_term(t1, add_terms(rest_terms(L1), L2)) : order(t1) < order(t2) ? adjoin_term(t2, add_terms(L1, rest_terms(L2))) : adjoin_term(make_term(order(t1), add(coeff(t1), coeff(t2))), add_terms(rest_terms(L1), rest_terms(L2))); } } |
In order to multiply two term lists, we multiply each term of the first list by all the terms of the other list, repeatedly using mul-term-by-all-terms, mul_term_by_all_terms, which multiplies a given term by all terms in a given term list. The resulting term lists (one for each term of the first list) are accumulated into a sum. Multiplying two terms forms a term whose order is the sum of the orders of the factors and whose coefficient is the product of the coefficients of the factors:
Original | JavaScript |
(define (mul-terms L1 L2) (if (empty-termlist? L1) (the-empty-termlist) (add-terms (mul-term-by-all-terms (first-term L1) L2) (mul-terms (rest-terms L1) L2)))) (define (mul-term-by-all-terms t1 L) (if (empty-termlist? L) (the-empty-termlist) (let ((t2 (first-term L))) (adjoin-term (make-term (+ (order t1) (order t2)) (mul (coeff t1) (coeff t2))) (mul-term-by-all-terms t1 (rest-terms L)))))) | function mul_terms(L1, L2) { return is_empty_termlist(L1) ? the_empty_termlist : add_terms(mul_term_by_all_terms( first_term(L1), L2), mul_terms(rest_terms(L1), L2)); } function mul_term_by_all_terms(t1, L) { if (is_empty_termlist(L)) { return the_empty_termlist; } else { const t2 = first_term(L); return adjoin_term( make_term(order(t1) + order(t2), mul(coeff(t1), coeff(t2))), mul_term_by_all_terms(t1, rest_terms(L))); } } |
This is really all there is to polynomial addition and multiplication. Notice that, since we operate on terms using the generic procedures functions add and mul, our polynomial package is automatically able to handle any type of coefficient that is known about by the generic arithmetic package. If we include a coercion mechanism such as one of those discussed in section 2.5.2, then we also are automatically able to handle operations on polynomials of different coefficient types, such as \[ \begin{array}{l} {\left[3x^2 +(2+3i)x+7\right] \cdot \left[x^4 +\frac{2}{3}x^2 +(5+3i)\right]} \end{array} \]
Because we installed the polynomial addition and multiplication
procedures
functions
add-poly
add_poly
and
mul-poly
mul_poly
in the generic arithmetic system as the add
and mul operations for type
polynomial, our system is also automatically
able to handle polynomial operations such as
\[
\begin{array}{l}
{\left[ (y+1)x^2 +(y^2 +1)x+(y-1)\right]\cdot \left[(y-2)x+(y^3 +7)\right]}
\end{array}
\]
The reason is that when the system tries to combine coefficients, it
will dispatch through add and
mul. Since the coefficients are themselves
polynomials (in $y$), these will be combined
using
add-poly
add_poly
and
mul-poly.
mul_poly.
The result is a kind of
data-directed recursion
in which, for example, a call to
mul-poly
mul_poly
will result in recursive calls to
mul-poly
mul_poly
in order to multiply the coefficients. If the coefficients of the
coefficients were themselves polynomials (as might be used to represent
polynomials in three variables), the data direction would ensure that the
system would follow through another level of recursive calls, and so on
through as many levels as the structure of the data dictates.[4]
Finally, we must confront the job of implementing a good representation for term lists. A term list is, in effect, a set of coefficients keyed by the order of the term. Hence, any of the methods for representing sets, as discussed in section 2.3.3, can be applied to this task. On the other hand, our procedures functions add-terms add_terms and mul-terms mul_terms always access term lists sequentially from highest to lowest order. Thus, we will use some kind of ordered list representation.
How should we structure the list that represents a term list? One
consideration is the density
of the polynomials we intend
to manipulate. A polynomial is said to be
dense if it has nonzero coefficients in terms of most orders.
If it has many zero terms it is said to be
sparse. For example,
\[ A:\quad x^5 +2x^4 +3x^2 -2x -5 \]
is a dense polynomial, whereas
\[ B:\quad x^{100} +2x^2 +1 \]
is sparse.
The term lists of dense polynomials are most efficiently represented as lists of the coefficients. The term list of a dense polynomial is most efficiently represented as a list of the coefficients. For example, the polynomial $A$ above would be nicely represented as (1 2 0 3 -2 -5). list(1, 2, 0, 3, -2, -5). The order of a term in this representation is the length of the sublist beginning with that term's coefficient, decremented by 1.[5] This would be a terrible representation for a sparse polynomial such as $B$: There would be a giant list of zeros punctuated by a few lonely nonzero terms. A more reasonable representation of the term list of a sparse polynomial is as a list of the nonzero terms, where each term is a list containing the order of the term and the coefficient for that order. In such a scheme, polynomial $B$ is efficiently represented as ((100 1) (2 2) (0 1)). list(list(100, 1), list(2, 2), list(0, 1)). As most polynomial manipulations are performed on sparse polynomials, we will use this method. We will assume that term lists are represented as lists of terms, arranged from highest-order to lowest-order term. Once we have made this decision, implementing the selectors and constructors for terms and term lists is straightforward:[6]
Original | JavaScript |
(define (adjoin-term term term-list) (if (=zero? (coeff term)) term-list (cons term term-list))) (define (the-empty-termlist) '()) (define (first-term term-list) (car term-list)) (define (rest-terms term-list) (cdr term-list)) (define (empty-termlist? term-list) (null? term-list)) (define (make-term order coeff) (list order coeff)) (define (order term) (car term)) (define (coeff term) (cadr term)) | function adjoin_term(term, term_list) { return is_equal_to_zero(coeff(term)) ? term_list : pair(term, term_list); } const the_empty_termlist = null; function first_term(term_list) { return head(term_list); } function rest_terms(term_list) { return tail(term_list); } function is_empty_termlist(term_list) { return is_null(term_list); } function make_term(order, coeff) { return list(order, coeff); } function order(term) { return head(term); } function coeff(term) { return head(tail(term)); } |
Users of the polynomial package will create (tagged) polynomials by means of the procedure: function:
Original | JavaScript |
(define (make-polynomial var terms) ((get 'make 'polynomial) var terms)) | function make_polynomial(variable, terms) { return get("make", "polynomial")(variable, terms); } |
Original | JavaScript |
(define (div-terms L1 L2) (if (empty-termlist? L1) (list (the-empty-termlist) (the-empty-termlist)) (let ((t1 (first-term L1)) (t2 (first-term L2))) (if (> (order t2) (order t1)) (list (the-empty-termlist) L1) (let ((new-c (div (coeff t1) (coeff t2))) (new-o (- (order t1) (order t2)))) (let ((rest-of-result compute rest of result recursively )) form complete result )))))) | function div_terms(L1, L2) { if (is_empty_termlist(L1)) { return list(the_empty_termlist, the_empty_termlist); } else { const t1 = first_term(L1); const t2 = first_term(L2); if (order(t2) > order(t1)) { return list(the_empty_termlist, L1); } else { const new_c = div(coeff(t1), coeff(t2)); const new_o = order(t1) - order(t2); const rest_of_result = $\langle{}$compute rest of result recursively$\rangle$; $\langle{}$form and return complete result$\rangle$ } } } |
Our polynomial system illustrates how objects of one type
(polynomials) may in fact be complex objects that have objects of many
different types as parts. This poses no real difficulty in defining
generic operations. We need only install appropriate generic operations
for performing the necessary manipulations of the parts of the
compound types. In fact, we saw that polynomials form a kind of
recursive data abstraction,
in that parts of a polynomial may
themselves be polynomials. Our generic operations and our
data-directed programming style can handle this complication without
much trouble.
On the other hand, polynomial algebra is a system for which the data
types cannot be naturally arranged in a tower. For instance, it is
possible to have polynomials in $x$ whose
coefficients are polynomials in $y$. It is also
possible to have polynomials in $y$ whose
coefficients are polynomials in $x$. Neither of
these types is above
the other in any natural way, yet it is
often necessary to add together elements from each set. There are several
ways to do this. One possibility is to convert one polynomial to the type
of the other by expanding and rearranging terms so that both polynomials
have the same principal variable. One can impose a towerlike structure on
this by ordering the variables and thus always converting any polynomial
to a
canonical form
with the highest-priority variable
dominant and the lower-priority variables buried in the coefficients.
This strategy works fairly well, except that the conversion may expand
a polynomial unnecessarily, making it hard to read and perhaps less
efficient to work with. The tower strategy is certainly not natural
for this domain or for any domain where the user can invent new types
dynamically using old types in various combining forms, such as
trigonometric functions, power series, and integrals.
It should not be surprising that controlling coercion is a serious problem in the design of large-scale algebraic-manipulation systems. Much of the complexity of such systems is concerned with relationships among diverse types. Indeed, it is fair to say that we do not yet completely understand coercion. In fact, we do not yet completely understand the concept of a data type. Nevertheless, what we know provides us with powerful structuring and modularity principles to support the design of large systems.
We can extend our generic arithmetic system to include rational
functions. These are fractions
whose numerator and
denominator are polynomials, such as
\[
\begin{array}{l}
\dfrac{x+1}{x^3 -1}
\end{array}
\]
The system should be able to add, subtract, multiply, and divide
rational functions, and to perform such computations as
\[
\begin{array}{lll}
\dfrac{x+1}{x^3 -1}+\dfrac{x}{x^2 -1} & = & \dfrac{x^3 +2x^2 +3x +1}{x^4 +
x^3 -x-1}
\end{array}
\]
(Here the sum has been simplified by removing common factors.
Ordinary cross multiplication
would have produced a
fourth-degree polynomial over a fifth-degree polynomial.)
If we modify our rational-arithmetic package so that it uses generic operations, then it will do what we want, except for the problem of reducing fractions to lowest terms.
Original | JavaScript |
(define p1 (make-polynomial 'x '((2 1)(0 1)))) (define p2 (make-polynomial 'x '((3 1)(0 1)))) (define rf (make-rational p2 p1)) | const p1 = make_polynomial("x", list(make_term(2, 1), make_term(0, 1))); const p2 = make_polynomial("x", list(make_term(3, 1), make_term(0, 1))); const rf = make_rational(p2, p1); |
We can reduce polynomial fractions to lowest terms using the same idea
we used with integers: modifying
make-rat
make_rat
to divide both the numerator and the denominator by their greatest common
divisor. The notion of
greatest common divisor
makes sense for polynomials. In
fact, we can compute the GCD of two polynomials using essentially the
same Euclid's Algorithm that works for integers.[7] The
integer version is
Original | JavaScript |
(define (gcd a b) (if (= b 0) a (gcd b (remainder a b)))) | function gcd(a, b) { return b === 0 ? a : gcd(b, a % b); } |
Original | JavaScript |
(define (gcd-terms a b) (if (empty-termlist? b) a (gcd-terms b (remainder-terms a b)))) | function gcd_terms(a, b) { return is_empty_termlist(b) ? a : gcd_terms(b, remainder_terms(a, b)); } |
Original | JavaScript |
(define p1 (make-polynomial 'x '((4 1) (3 -1) (2 -2) (1 2)))) (define p2 (make-polynomial 'x '((3 1) (1 -1)))) (greatest-common-divisor p1 p2) | const p1 = make_polynomial("x", list(make_term(4, 1), make_term(3, -1), make_term(2, -2), make_term(1, 2))); const p2 = make_polynomial("x", list(make_term(3, 1), make_term(1, -1))); greatest_common_divisor(p1, p2); |
$P_{1}$: | $x^2 - 2x + 1$ |
$P_{2}$: | $11x^2 + 7$ |
$P_{3}$: | $13x + 5$ |
We can solve the problem exhibited in exercise 2.99 if we use the following modification of the GCD algorithm (which really works only in the case of polynomials with integer coefficients). Before performing any polynomial division in the GCD computation, we multiply the dividend by an integer constant factor, chosen to guarantee that no fractions will arise during the division process. Our answer will thus differ from the actual GCD by an integer constant factor, but this does not matter in the case of reducing rational functions to lowest terms; the GCD will be used to divide both the numerator and denominator, so the integer constant factor will cancel out.
More precisely, if $P$ and $Q$ are polynomials, let $O_1$ be the order of $P$ (i.e., the order of the largest term of $P$) and let $O_2$ be the order of $Q$. Let $c$ be the leading coefficient of $Q$. Then it can be shown that, if we multiply $P$ by the integerizing factor $c^{1+O_{1} -O_{2}}$, the resulting polynomial can be divided by $Q$ by using the div-terms div_terms algorithm without introducing any fractions. The operation of multiplying the dividend by this constant and then dividing is sometimes called the pseudodivision of $P$ by $Q$. The remainder of the division is called the pseudoremainder.
Thus, here is how to reduce a rational function to lowest terms:
Original | JavaScript |
(define (reduce-integers n d) (let ((g (gcd n d))) (list (/ n g) (/ d g)))) | function reduce_integers(n, d) { const g = gcd(n, d); return list(n / g, d / g); } |
Original | JavaScript |
(define p1 (make-polynomial 'x '((1 1)(0 1)))) (define p2 (make-polynomial 'x '((3 1)(0 -1)))) (define p3 (make-polynomial 'x '((1 1)))) (define p4 (make-polynomial 'x '((2 1)(0 -1)))) (define rf1 (make-rational p1 p2)) (define rf2 (make-rational p3 p4)) (add rf1 rf2) | const p1 = make_polynomial("x", list(make_term(1, 1), make_term(0, 1))); const p2 = make_polynomial("x", list(make_term(3, 1), make_term(0, -1))); const p3 = make_polynomial("x", list(make_term(1, 1))); const p4 = make_polynomial("x", list(make_term(2, 1), make_term(0, -1))); const rf1 = make_rational(p1, p2); const rf2 = make_rational(p3, p4); add(rf1, rf2); |
The GCD computation is at the heart of any system that does operations on rational functions. The algorithm used above, although mathematically straightforward, is extremely slow. The slowness is due partly to the large number of division operations and partly to the enormous size of the intermediate coefficients generated by the pseudodivisions. One of the active areas in the development of algebraic-manipulation systems is the design of better algorithms for computing polynomial GCDs.[10]
numberto a polynomial by regarding it as a polynomial of degree zero whose coefficient is the number. This is necessary if we are going to perform operations such as \[ {\left[ x^2 +(y+1)x+5\right]+ \left[ x^2 +2x+1\right]} \] which requires adding the coefficient $y+1$ to the coefficient 2.
measure$m(x)$ with the properties that $m(xy)\geq m(x)$ for any nonzero $x$ and $y$ and that, given any $x$ and $y$, there exists a $q$ such that $y=qx+r$ and either $r=0$ or $m(r) < m(x)$. From an abstract point of view, this is what is needed to prove that Euclid's Algorithm works. For the domain of integers, the measure $m$ of an integer is the absolute value of the integer itself. For the domain of polynomials, the measure of a polynomial is its degree.