Chapter 6, Concurrent Processes, Ida A. Flynn and Ann McIver McHoes, Understanding Operating Systems, Second Edition, PWS Publishing Company (1993)
Lesson B: Process Cooperation, Producers and Consumers, Readers and Writers, Concurrent Programming, Applications of Concurrent Programming
Problem 4. Describe "explicit parallelism."
Explicit parallelism is parallelism achieved by placing the burden of identifying and controlling parallel processes exclusively upon the applications programmer.
Problem 5. Describe "implicit parallelism".
Implicit parallelism is parallelism achieved by placing the burden of identifying and implementing parallel processes exclusively upon the compiler and the operating system.
Remarks regarding explicit and implicit parallelism.
There has been vigorous debate within the computer science community about whether compilers should require explicit parallelism or permit implicit parallelism. Implicit parallelism has the potential of making parallel processing accessible to a wider audience of programmers. It can provide a systematic approach to splitting a task into parallel processes based upon the structure of algorithms provided by a programmer. This is an argument that is similar to that for optimizing compilers. It places a lower bound upon achievable performance that can reduce the level of required skill and knowledge by the programmer, and yet still achieve (hopefully) acceptable results.
It is usually true that any process controlled by a skilled and knowledgable programmer can achieve higher levels of desired performance than can be achieved by optimization routines implemented by an optimizing compiler or compiler that implements implicit parallel processing.
In either case, the competence of the programmer is required to even make implicit parallelism possible. The classical example is that of performing computations with matrices. A knowledgable programmer can be alert to opportunities to perform an eigenvalue decomposition on a matrix and split a task into independent processes. This is something that a compiler that implements implicit parallel processing is unlikely to do on its own. Even if this were done, there are examples where this is undesirable. There really is no adequate substitute for the programmer knowing the problem set which the programmer is trying to solve. It is not enough to merely know the syntax of a programming language, just as it is not sufficient to merely know a language or teaching methods (although these are important). You must also know something worthwhile to say. In your development as information system professionals, you must learn not only how the tools of the trade work, you must also know what the tools are used for. (Soap Box 13.2A, change 2 incorporated.)
Problem 6. Rewirte each of the following arithmetic expressions to take advantage of concurrent processing and then code each. Use the terms COBEGIN and COEND to delimit the sections of concurrent code.
a. (X (Y * Z * W * R) + M + N + P)
Recall from algebra that parentheses have higher precedence than arithmetic operators. Multiplication is higher precedence than addition. See the course web site link to Arithmetic.
Original Equation |
(X | (Y | * Z | * W | * R) | + M | + N | + P) | Processor Assignments |
Time Slice #1 | (X | (T1 P1 |
*T2) P2 |
+T3 P3 |
+ P) | Compute T1=Y*Z with Processor P1 Compute T2=R*W with Processor P2 Compute T3=M+N with Processor P3 |
|||
Time Slice #2 | (X | *T4 P1 |
+T5) P3 |
Compute T4=T1 * T2 with Processor P1 Compute T5=T3 + P with Processor P3 |
|||||
Time Slice #3 | (T6 P1 |
+T5) | Compute T6=X * T4 with Processor P1 | ||||||
Time Slice #3 | T7 P1 |
Compute T7=T6 + T5 with Processor P1 |
T7 is the final result.
Main
Begin
COBEGIN
T1 = Y * Z
T2 = W * R
T3 = M + N
COEND
COBEGIN
T4 = T1 * T2
T5 = T3 + P
COEND
T6 = X * T4
T7 = T6 + T5
End
b. ( (J + K * L * M * N ) * I)
Original Equation |
( (J | +K | *L | *M | *N) | *I) | Processor Assignments |
Time Slice #1 | ( (J | +T1 P1 |
*T2) P2 |
*I) | Compute T1=K*L with Processor P1 Compute T2=M*N with Processor P2 |
||
Time Slice #2 | ( (J | +T3) P1 |
*I) | ||||
Time Slice #3 | (T4 P1 |
*I) | |||||
Time Slice #4 | T5 P1 |
The final answer is in T5.
Main
Begin
COBEGIN
T1 = K * L
T2 = M * N
COEND
T3 = T1 * T2
T4 = J + T3
T5 = T4 * I
End
Problem 7. Use the P and V semaphore operations to simulate the traffic flow at the intersection of two one-way streets. The following rules should be satisfied:
Rule (b) does not make sense. If obeyed and more than one car is at the intersection, this rule generates indefinite postponement.
The critical region is the intersection. When a car arrives at the intersection, it needs to include direction of travel when issuing the probe. This is used when the operating system chooses which car from the queue to send through the intersection next.
Procedure Main Begin Intersection = 1 COBEGIN Repeat East Repeat North COEND End |
Procedure East Begin Probe(Intersection, East) Transit Intersection Vacate(Intersection, East) End |
Procedure North Begin Probe(Intersection, North) Transit Intersection Vacate(Intersection, North) End |
Probe(Intersection, Direction) Begin Intersection = Intersection - 1 If Intersection < 0 Then move token and Direction to Waiting Queue. End |
Vacate(Intersection, Direction) Begin Intersection = Intersection + 1 If Intersection <1 // If someone is in the waiting queue Then If Other_Direction is in Waiting Queue // Alternate directions Then select token with Other_Direction from Waiting Queue to move to Ready Queue Else select token with Direction from Waiting Queue to move to Ready Queue End If End If End |
Problem 8. Consider the following program segments for two different processes executing concurrently:
Procedure P1 Begin Do A = 1, 3 X = X + 1 End Do End |
Procedure P2 Begin Do B = 1, 3 X = X + 1 End Do End |
Variable X starts at zero and is a shared variable. Variables A and B are not shared variables.
If Processes P1 and P2 execute only once at any speed, what are the possible resulting values of X? Explain your answer.
Answer: The starting value of X is 0. The final value of X is 6. During execution of P1 and P2, X takes on all integer values beginning with 0 and ending with 6. We cannot tell which procedure is incrementing X by observing only the value of X.
Problem 10. Consider the following segment taken from a FORTRAN program:
DO I = 1, 12 READ *, X IF (X .EQ. 0) Y(I) = 0 IF (X .NE. 0) Y(I) = 10 ENDDO |
See the course web site for a tutorial on FORTRAN to get the needed background for this problem.
a. Recode it so it will run more efficiently in a single processor system.
DO I = 1, 12 READ *, X IF (X .NE. 0) X = 10 Y(I) = X ENDDO |
b. Given that a multiprocessing environment with four symmetrical processors is available, recode the segment as an efficient concurrent program that performs the same function as the original FORTRAN program.
DO I = 1, 3 READ *, W IF (W .NE. 0) W = 10 Y(I) = X ENDDO |
DO I = 4, 6 READ *, X IF (X .NE. 0) X = 10 Y(I) = X ENDDO |
DO I = 7, 9 READ *, Y IF (Y .NE. 0) Y = 10 Y(I) = X ENDDO |
DO I = 10, 12 READ *, Z IF (Z .NE. 0) Z = 10 Y(I) = Z ENDDO |
c. Given that all processors have identical capabilities, compare the execution speeds of the original FORTRAN segment with the execution speeds of your segments for parts (a) and (b).
Answer: If you ignore overhead in setting up the do-loops and only concentrate on the productive operations inside the loops, the parallel implementation in part (b) will complete execution in about 25% of the time required for the single processor implementation in part (a).