We've seen that the difficulty in dealing with concurrent processes threads is rooted in the need to consider the interleaving of the order of events in the different processes threads. For example, suppose we have two processes threads, one with three ordered events $(a,b,c)$ and one with three ordered events $(x,y,z)$. If the two processes threads run concurrently, with no constraints on how their execution is interleaved, then there are 20 different possible orderings for the events that are consistent with the individual orderings for the two processes threads: \[ \begin{array}{cccc} (a,b,c,x,y,z) & (a,x,b,y,c,z) & (x,a,b,c,y,z) & (x,a,y,z,b,c)\\ (a,b,x,c,y,z) & (a,x,b,y,z,c) & (x,a,b,y,c,z) & (x,y,a,b,c,z)\\ (a,b,x,y,c,z) & (a,x,y,b,c,z) & (x,a,b,y,z,c) & (x,y,a,b,z,c)\\ (a,b,x,y,z,c) & (a,x,y,b,z,c) & (x,a,y,b,c,z) & (x,y,a,z,b,c)\\ (a,x,b,c,y,z) & (a,x,y,z,b,c) & (x,a,y,b,z,c) & (x,y,z,a,b,c) \end{array} \] As programmers designing this system, we would have to consider the effects of each of these 20 orderings and check that each behavior is acceptable. Such an approach rapidly becomes unwieldy as the numbers of processes threads and events increase.
A more practical approach to the design of concurrent systems is to devise general mechanisms that allow us to constrain the interleaving of concurrent processes threads so that we can be sure that the program behavior is correct. Many mechanisms have been developed for this purpose. In this section, we describe one of them, the serializer.
Serialization implements the following idea: Processes Threads will execute concurrently, but there will be certain collections of procedures functions that cannot be executed concurrently. More precisely, serialization creates distinguished sets of procedures functions such that only one execution of a procedure function in each serialized set is permitted to happen at a time. If some procedure function in the set is being executed, then a process thread that attempts to execute any procedure function in the set will be forced to wait until the first execution has finished.
We can use serialization to control access to shared variables. For example, if we want to update a shared variable based on the previous value of that variable, we put the access to the previous value of the variable and the assignment of the new value to the variable in the same procedure function. We then ensure that no other procedure function that assigns to the variable can run concurrently with this procedure function by serializing all of these procedures functions with the same serializer. This guarantees that the value of the variable cannot be changed between an access and the corresponding assignment.
Original | JavaScript | |
To make the above mechanism more concrete, suppose that we have extended Scheme JavaScript to include a procedure function called parallel-execute: concurrent_execute:
Original | JavaScript |
(parallel-execute p$_{1}$ p$_{2}$ $\ldots$ p$_{k}$) | concurrent_execute($f_{1}$, $f_{2}$, $\ldots$, $f_{k}$) |
As an example of how this is used, consider
Original | JavaScript |
(define x 10) (parallel-execute (lambda () (set! x (* x x))) (lambda () (set! x (+ x 1)))) | let x = 10; concurrent_execute(() => { x = x * x; }, () => { x = x + 1; }); |
Original | JavaScript | |||||||||||||||||||||||||
This creates two concurrent
processes—$P_1$, which
sets x to
x times x,
and $P_2$, which
increments x. After execution is
complete, x will be left with one of five
possible values, depending on the interleaving of the events of
$P_1$ and $P_2$:
|
This creates two concurrent
threads—$T_1$, which sets
x to x times
x, and $T_2$,
which increments x. After execution is
complete, x will be left with one of five
possible values, depending on the interleaving of the events of
$T_1$ and $T_2$:
|
We can constrain the concurrency by using serialized procedures, functions, which are created by serializers. Serializers are constructed by make-serializer, make_serializer, whose implementation is given below. A serializer takes a procedure function as argument and returns a serialized procedure function that behaves like the original procedure. function. All calls to a given serializer return serialized procedures functions in the same set.
Thus, in contrast to the example above, executing
Original | JavaScript |
(define x 10) (define s (make-serializer)) (parallel-execute (s (lambda () (set! x (* x x)))) (s (lambda () (set! x (+ x 1))))) | let x = 10; const s = make_serializer(); concurrent_execute(s(() => { x = x * x; }), s(() => { x = x + 1; })); |
Here is a version of the make-account make_account procedure function from section 3.1.1, where the deposits and withdrawals have been serialized:
Original | JavaScript |
(define (make-account balance) (define (withdraw amount) (if (>= balance amount) (begin (set! balance (- balance amount)) balance) "Insufficient funds")) (define (deposit amount) (set! balance (+ balance amount)) balance) (let ((protected (make-serializer))) (define (dispatch m) (cond ((eq? m 'withdraw) (protected withdraw)) ((eq? m 'deposit) (protected deposit)) ((eq? m 'balance) balance) (else (error "Unknown request - - MAKE-ACCOUNT" m)))) dispatch)) | function make_account(balance) { function withdraw(amount) { if (balance > amount) { balance = balance - amount; return balance; } else { return "Insufficient funds"; } } function deposit(amount) { balance = balance + amount; return balance; } const protect = make_serializer(); function dispatch(m) { return m === "withdraw" ? protect(withdraw) : m === "deposit" ? protect(deposit) : m === "balance" ? balance : error(m, "unknown request -- make_account"); } return dispatch; } |
Original | JavaScript |
(define x 10) (define s (make-serializer)) (parallel-execute (lambda () (set! x ((s (lambda () (* x x)))))) (s (lambda () (set! x (+ x 1))))) | let x = 10; const s = make_serializer(); concurrent_execute( () => { x = s(() => x * x)(); }, s(() => { x = x + 1; })); |
Original | JavaScript |
(define x 10) (parallel-execute (lambda () (set! x (* x x))) (lambda () (set! x (* x x x)))) | let x = 10; concurrent_execute(() => { x = x * x; }, () => { x = x * x * x; }); |
Original | JavaScript |
(define x 10) (define s (make-serializer)) (parallel-execute (s (lambda () (set! x (* x x)))) (s (lambda () (set! x (* x x x))))) | let x = 10; const s = make_serializer(); concurrent_execute(s(() => { x = x * x; }), s(() => { x = x * x * x; })); |
Original | JavaScript |
(define (make-account balance) (define (withdraw amount) (if (>= balance amount) (begin (set! balance (- balance amount)) balance) "Insufficient funds")) (define (deposit amount) (set! balance (+ balance amount)) balance) (let ((protected (make-serializer))) (define (dispatch m) (cond ((eq? m 'withdraw) (protected withdraw)) ((eq? m 'deposit) (protected deposit)) ((eq? m 'balance) ((protected (lambda () balance)))) ; serialized (else (error "Unknown request - - MAKE-ACCOUNT" m)))) dispatch)) | function make_account(balance) { function withdraw(amount) { if (balance > amount) { balance = balance - amount; return balance; } else { return "Insufficient funds"; } } function deposit(amount) { balance = balance + amount; return balance; } const protect = make_serializer(); function dispatch(m) { return m === "withdraw" ? protect(withdraw) : m === "deposit" ? protect(deposit) : m === "balance" ? protect(() => balance)(undefined) // serialized : error(m, "unknown request -- make_account"); } return dispatch; } |
Original | JavaScript |
(define (make-account balance) (define (withdraw amount) (if (>= balance amount) (begin (set! balance (- balance amount)) balance) "Insufficient funds")) (define (deposit amount) (set! balance (+ balance amount)) balance) (let ((protected (make-serializer))) (let ((protected-withdraw (protected withdraw)) (protected-deposit (protected deposit))) (define (dispatch m) (cond ((eq? m 'withdraw) protected-withdraw) ((eq? m 'deposit) protected-deposit) ((eq? m 'balance) balance) (else (error "Unknown request - - MAKE-ACCOUNT" m)))) dispatch))) | function make_account(balance) { function withdraw(amount) { if (balance > amount) { balance = balance - amount; return balance; } else { return "Insufficient funds"; } } function deposit(amount) { balance = balance + amount; return balance; } const protect = make_serializer(); const protect_withdraw = protect(withdraw); const protect_deposit = protect(deposit); function dispatch(m) { return m === "withdraw" ? protect_withdraw : m === "deposit" ? protect_deposit : m === "balance" ? balance : error(m, "unknown request -- make_account"); } return dispatch; } |
Serializers provide a powerful abstraction that helps isolate the complexities of concurrent programs so that they can be dealt with carefully and (hopefully) correctly. However, while using serializers is relatively straightforward when there is only a single shared resource (such as a single bank account), concurrent programming can be treacherously difficult when there are multiple shared resources.
To illustrate one of the difficulties that can arise, suppose we wish to swap the balances in two bank accounts. We access each account to find the balance, compute the difference between the balances, withdraw this difference from one account, and deposit it in the other account. We could implement this as follows:[2]
Original | JavaScript |
(define (exchange account1 account2) (let ((difference (- (account1 'balance) (account2 'balance)))) ((account1 'withdraw) difference) ((account2 'deposit) difference))) | function exchange(account1, account2) { const difference = account1("balance") - account2("balance"); account1("withdraw")(difference); account2("deposit")(difference); } |
This procedure function works well when only a single process thread is trying to do the exchange. Suppose, however, that Peter and Paul both have access to accounts $a_1$, $a_2$, and $a_3$, and that Peter exchanges $a_1$ and $a_2$ while Paul concurrently exchanges $a_1$ and $a_3$. Even with account deposits and withdrawals serialized for individual accounts (as in the make-account make_account procedure function shown above in this section), exchange can still produce incorrect results. For example, Peter might compute the difference in the balances for $a_1$ and $a_2$, but then Paul might change the balance in $a_1$ before Peter is able to complete the exchange.[3] For correct behavior, we must arrange for the exchange procedure function to lock out any other concurrent accesses to the accounts during the entire time of the exchange.
One way we can accomplish this is by using both accounts' serializers to serialize the entire exchange procedure function. To do this, we will arrange for access to an account's serializer. Note that we are deliberately breaking the modularity of the bank-account object by exposing the serializer. The following version of make-account make_account is identical to the original version given in section 3.1.1, except that a serializer is provided to protect the balance variable, and the serializer is exported via message passing:
Original | JavaScript |
(define (make-account-and-serializer balance) (define (withdraw amount) (if (>= balance amount) (begin (set! balance (- balance amount)) balance) "Insufficient funds")) (define (deposit amount) (set! balance (+ balance amount)) balance) (let ((balance-serializer (make-serializer))) (define (dispatch m) (cond ((eq? m 'withdraw) withdraw) ((eq? m 'deposit) deposit) ((eq? m 'balance) balance) ((eq? m 'serializer) balance-serializer) (else (error "Unknown request - - MAKE-ACCOUNT" m)))) dispatch)) | function make_account_and_serializer(balance) { function withdraw(amount) { if (balance > amount) { balance = balance - amount; return balance; } else { return "Insufficient funds"; } } function deposit(amount) { balance = balance + amount; return balance; } const balance_serializer = make_serializer(); return m => m === "withdraw" ? withdraw : m === "deposit" ? deposit : m === "balance" ? balance : m === "serializer" ? balance_serializer : error(m, "unknown request -- make_account"); } |
We can use this to do serialized deposits and withdrawals. However, unlike our earlier serialized account, it is now the responsibility of each user of bank-account objects to explicitly manage the serialization, for example as follows:[4]
Original | JavaScript |
(define (deposit account amount) (let ((s (account 'serializer)) (d (account 'deposit))) ((s d) amount))) | function deposit(account, amount) { const s = account("serializer"); const d = account("deposit"); s(d(amount)); } |
Exporting the serializer in this way gives us enough flexibility to implement a serialized exchange program. We simply serialize the original exchange procedure function with the serializers for both accounts:
Original | JavaScript |
(define (serialized-exchange account1 account2) (let ((serializer1 (account1 'serializer)) (serializer2 (account2 'serializer))) ((serializer1 (serializer2 exchange)) account1 account2))) | function serialized_exchange(account1, account2) { const serializer1 = account1("serializer"); const serializer2 = account2("serializer"); serializer1(serializer2(exchange))(account1, account2); } |
Original | JavaScript |
(define (transfer from-account to-account amount) ((from-account 'withdraw) amount) ((to-account 'deposit) amount)) | function transfer(from_account, to_account, amount) { from_account("withdraw")(amount); to_account("deposit")(amount); } |
Original | JavaScript |
(define (make-account-and-serializer balance) (define (withdraw amount) (if (>= balance amount) (begin (set! balance (- balance amount)) balance) "Insufficient funds")) (define (deposit amount) (set! balance (+ balance amount)) balance) (let ((balance-serializer (make-serializer))) (define (dispatch m) (cond ((eq? m 'withdraw) (balance-serializer withdraw)) ((eq? m 'deposit) (balance-serializer deposit)) ((eq? m 'balance) balance) ((eq? m 'serializer) balance-serializer) (else (error "Unknown request - - MAKE-ACCOUNT" m)))) dispatch)) | function make_account_and_serializer(balance) { function withdraw(amount) { if (balance > amount) { balance = balance - amount; return balance; } else { return "Insufficient funds"; } } function deposit(amount) { balance = balance + amount; return balance; } const balance_serializer = make_serializer(); return m => m === "withdraw" ? balance_serializer(withdraw) : m === "deposit" ? balance_serializer(deposit) : m === "balance" ? balance : m === "serializer" ? balance_serializer : error(m, "unknown request -- make_account"); } |
Original | JavaScript |
(define (deposit account amount) ((account 'deposit) amount)) | function deposit(account, amount) { account("deposit")(amount); } |
We implement serializers in terms of a more primitive synchronization mechanism called a mutex. A mutex is an object that supports two operations—the mutex can be acquired, and the mutex can be released. Once a mutex has been acquired, no other acquire operations on that mutex may proceed until the mutex is released.[5] In our implementation, each serializer has an associated mutex. Given a procedure p, function f, the serializer returns a procedure function that acquires the mutex, runs p, f, and then releases the mutex. This ensures that only one of the procedures functions produced by the serializer can be running at once, which is precisely the serialization property that we need to guarantee.
Original | JavaScript | |
To apply serializers to functions that take an arbitrary number of arguments, we use JavaScript's rest parameter and spread syntax. The ... in front of the parameter args collects the rest (here all) of the arguments of any call of the function into a vector data structure. The ... in front of args in the application f(...args) spreads the elements of args so that they become separate arguments of f. |
Original | JavaScript |
(define (make-serializer) (let ((mutex (make-mutex))) (lambda (p) (define (serialized-p . args) (mutex 'acquire) (let ((val (apply p args))) (mutex 'release) val)) serialized-p))) | function make_serializer() { const mutex = make_mutex(); return f => { function serialized_f(...args) { mutex("acquire"); const val = f(...args); mutex("release"); return val; } return serialized_f; }; } |
The mutex is a mutable object (here we'll use a one-element list, which we'll refer to as a cell) that can hold the value true or false. When the value is false, the mutex is available to be acquired. When the value is true, the mutex is unavailable, and any process thread that attempts to acquire the mutex must wait.
Our mutex constructor make-mutex make_mutex begins by initializing the cell contents to false. To acquire the mutex, we test the cell. If the mutex is available, we set the cell contents to true and proceed. Otherwise, we wait in a loop, attempting to acquire over and over again, until we find that the mutex is available.[6] To release the mutex, we set the cell contents to false.
Original | JavaScript |
(define (make-mutex) (let ((cell (list false))) (define (the-mutex m) (cond ((eq? m 'acquire) (if (test-and-set! cell) (the-mutex 'acquire))) ; retry ((eq? m 'release) (clear! cell)))) the-mutex)) (define (clear! cell) (set-car! cell false)) | function make_mutex() { const cell = list(false); function the_mutex(m) { return m === "acquire" ? test_and_set(cell) ? the_mutex("acquire") // retry : true : m === "release" ? clear(cell) : error(m, "unknown request -- mutex"); } return the_mutex; } function clear(cell) { set_head(cell, false); } |
Test-and-set! The function test_and_set tests the cell and returns the result of the test. In addition, if the test was false, test-and-set! test_and_set sets the cell contents to true before returning false. We can express this behavior as the following procedure: function:
Original | JavaScript |
(define (test-and-set! cell) (if (car cell) true (begin (set-car! cell true) false))) | function test_and_set(cell) { if (head(cell)) { return true; } else { set_head(cell, true); return false; } } |
However, this implementation of test-and-set! test_and_set does not suffice as it stands. There is a crucial subtlety here, which is the essential place where concurrency control enters the system: The test-and-set! test_and_set operation must be performed atomically. That is, we must guarantee that, once a process thread has tested the cell and found it to be false, the cell contents will actually be set to true before any other process thread can test the cell. If we do not make this guarantee, then the mutex can fail in a way similar to the bank-account failure in figure 3.53. (See exercise 3.46.)
The actual implementation of test-and-set! test_and_set depends on the details of how our system runs concurrent processes. threads. For example, we might be executing concurrent processes threads on a sequential processor using a time-slicing mechanism that cycles through the processes, threads, permitting each process thread to run for a short time before interrupting it and moving on to the next process. thread. In that case, test-and-set! test_and_set can work by disabling time slicing during the testing and setting.[7] Alternatively, multiprocessing computers provide instructions that support atomic operations directly in hardware.[8]
Now that we have seen how to implement serializers, we can see that account exchanging still has a problem, even with the serialized-exchange serialized_exchange procedure function above. Imagine that Peter attempts to exchange $a_1$ with $a_2$ while Paul concurrently attempts to exchange $a_2$ with $a_1$. Suppose that Peter's process thread reaches the point where it has entered a serialized procedure function protecting $a_1$ and, just after that, Paul's process thread enters a serialized procedure function protecting $a_2$. Now Peter cannot proceed (to enter a serialized procedure function protecting $a_2$) until Paul exits the serialized procedure function protecting $a_2$. Similarly, Paul cannot proceed until Peter exits the serialized procedure function protecting $a_1$. Each process thread is stalled forever, waiting for the other. This situation is called a deadlock. Deadlock is always a danger in systems that provide concurrent access to multiple shared resources.
One way to avoid the deadlock in this situation is to give each account a unique identification number and rewrite serialized-exchange serialized_exchange so that a process thread will always attempt to enter a procedure function protecting the lowest-numbered account first. Although this method works well for the exchange problem, there are other situations that require more sophisticated deadlock-avoidance techniques, or where deadlock cannot be avoided at all. (See exercises 3.48 and 3.49.)[9]
We've seen how programming concurrent systems requires controlling
the ordering of events when different
processes
threads
access shared state, and we've seen how to achieve this control
through judicious use of serializers. But the problems of concurrency
lie deeper than this, because, from a fundamental point of view, it's
not always clear what is meant by shared state.
Mechanisms such as test-and-set! test_and_set require processes threads to examine a global shared flag at arbitrary times. This is problematic and inefficient to implement in modern high-speed processors, where due to optimization techniques such as pipelining and cached memory, the contents of memory may not be in a consistent state at every instant. In contemporary some multiprocessing systems, therefore, the serializer paradigm is being supplanted by new other approaches to concurrency control.[10]
The problematic aspects of shared state also arise in large, distributed
systems. For instance, imagine a distributed banking system where
individual branch banks maintain local values for bank balances and
periodically compare these with values maintained by other branches. In
such a system the value of the account balance
would be
undetermined, except right after synchronization. If Peter deposits money
in an account he holds jointly with Paul, when should we say that the
account balance has changed—when the balance in the local branch
changes, or not until after the synchronization? And if Paul accesses the
account from a different branch, what are the reasonable constraints to
place on the banking system such that the behavior is
correct
? The only thing that might matter for correctness
is the behavior observed by Peter and Paul individually and the
state
of the account immediately after synchronization.
Questions about the real
account balance or the order of
events between synchronizations may be irrelevant or
meaningless.[11]
The basic phenomenon here is that synchronizing different processes, threads, establishing shared state, or imposing an order on events requires communication among the processes. threads. In essence, any notion of time in concurrency control must be intimately tied to communication.[12] It is intriguing that a similar connection between time and communication also arises in the Theory of Relativity, where the speed of light (the fastest signal that can be used to synchronize events) is a fundamental constant relating time and space. The complexities we encounter in dealing with time and state in our computational models may in fact mirror a fundamental complexity of the physical universe.
Original | JavaScript | |
Parallel-execute is not part of standard Scheme, but it can be implemented in MIT Scheme. In our implementation, the new concurrent processes also run concurrently with the original Scheme process. Also, in our implementation, the value returned by parallel-execute is a special control object that can be used to halt the newly created processes. | The function concurrent_execute is not part of the JavaScript standard, but the examples in this section can be implemented in ECMAScript 2020. |
mutexis an abbreviation for mutual exclusion. The general problem of arranging a mechanism that permits concurrent processes threads to safely share resources is called the mutual exclusion problem. Our mutex is a simple variant of the semaphore mechanism (see exercise 3.47), which was introduced in the
THEMultiprogramming System developed at the Technological University of Eindhoven and named for the university's initials in Dutch (
busy-waitingas above. Instead, the system schedules another process thread to run while the first is waiting, and the blocked process thread is awakened when the mutex becomes available.
back outof the deadlocked state and try again. Deadlock-recovery mechanisms are widely used in data-base-management systems, a topic that is treated in detail in
barriers) through which no process thread can proceed until all the processes threads have reached the barrier. Modern Some processors provide machine instructions that permit programmers to establish synchronization points at places where consistency is required. The PowerPC$^{\textrm{TM}}$, for example, includes for this purpose two instructions called SYNC and EIEIO (Enforced In-order Execution of Input/Output).
global clocksthat can be used to establish orderings on events in distributed systems.