Chapter 6B

Chapter 6, Concurrent Processes, Ida A. Flynn and Ann McIver McHoes, Understanding Operating Systems, Second Edition, PWS Publishing Company (1993)

Lesson B: Process Cooperation, Producers and Consumers, Readers and Writers, Concurrent Programming, Applications of Concurrent Programming

Problem 4. Describe "explicit parallelism."

Explicit parallelism is parallelism achieved by placing the burden of identifying and controlling parallel processes exclusively upon the applications programmer.

Problem 5. Describe "implicit parallelism".

Implicit parallelism is parallelism achieved by placing the burden of identifying and implementing parallel processes exclusively upon the compiler and the operating system.

Remarks regarding explicit and implicit parallelism.

There has been vigorous debate within the computer science community about whether compilers should require explicit parallelism or permit implicit parallelism. Implicit parallelism has the potential of making parallel processing accessible to a wider audience of programmers. It can provide a systematic approach to splitting a task into parallel processes based upon the structure of algorithms provided by a programmer. This is an argument that is similar to that for optimizing compilers. It places a lower bound upon achievable performance that can reduce the level of required skill and knowledge by the programmer, and yet still achieve (hopefully) acceptable results.

It is usually true that any process controlled by a skilled and knowledgable programmer can achieve higher levels of desired performance than can be achieved by optimization routines implemented by an optimizing compiler or compiler that implements implicit parallel processing.

In either case, the competence of the programmer is required to even make implicit parallelism possible. The classical example is that of performing computations with matrices. A knowledgable programmer can be alert to opportunities to perform an eigenvalue decomposition on a matrix and split a task into independent processes. This is something that a compiler that implements implicit parallel processing is unlikely to do on its own. Even if this were done, there are examples where this is undesirable. There really is no adequate substitute for the programmer knowing the problem set which the programmer is trying to solve. It is not enough to merely know the syntax of a programming language, just as it is not sufficient to merely know a language or teaching methods (although these are important). You must also know something worthwhile to say. In your development as information system professionals, you must learn not only how the tools of the trade work, you must also know what the tools are used for. (Soap Box 13.2A, change 2 incorporated.)

Problem 6. Rewirte each of the following arithmetic expressions to take advantage of concurrent processing and then code each. Use the terms COBEGIN and COEND to delimit the sections of concurrent code.

a. (X (Y * Z * W * R) + M + N + P)

Recall from algebra that parentheses have higher precedence than arithmetic operators. Multiplication is higher precedence than addition. See the course web site link to Arithmetic.

Original Equation	(X	(Y	* Z	* W	* R)	+ M	+ N	+ P)	Processor Assignments
Time Slice #1	(X	(T1 P1		*T2) P2		+T3 P3		+ P)	Compute T1=YZ with Processor P1 Compute T2=RW with Processor P2 Compute T3=M+N with Processor P3
Time Slice #2	(X	*T4 P1				+T5) P3			Compute T4=T1 * T2 with Processor P1 Compute T5=T3 + P with Processor P3
Time Slice #3	(T6 P1					+T5)			Compute T6=X * T4 with Processor P1
Time Slice #3	T7 P1								Compute T7=T6 + T5 with Processor P1

T7 is the final result.

Main
Begin
    COBEGIN
      T1 = Y * Z
      T2 = W * R
      T3 = M + N
    COEND
    COBEGIN
      T4 = T1 * T2
      T5 = T3 + P
    COEND
      T6 = X * T4
      T7 = T6 + T5
End

b. ( (J + K * L * M * N ) * I)

Original Equation	( (J	+K	*L	*M	*N)	*I)	Processor Assignments
Time Slice #1	( (J	+T1 P1		*T2) P2		*I)	Compute T1=KL with Processor P1 Compute T2=MN with Processor P2
Time Slice #2	( (J	+T3) P1				*I)
Time Slice #3	(T4 P1					*I)
Time Slice #4	T5 P1

The final answer is in T5.

Main
Begin
    COBEGIN
      T1 = K * L
      T2 = M * N
    COEND
    T3 = T1 * T2
    T4 = J + T3
    T5 = T4 * I
End

Problem 7. Use the P and V semaphore operations to simulate the traffic flow at the intersection of two one-way streets. The following rules should be satisfied:

Only one car can be crossing at any given time.
A car should be allowed to cross the intersection only if there are no cars coming from the other street.
When cars are coming from both streets, they should take turns to prevent indefinite postponements in either street.

Rule (b) does not make sense. If obeyed and more than one car is at the intersection, this rule generates indefinite postponement.

Only one car can be in the intersection at a time. (Otherwise, you get a paint sample.)
When cars are present from both streets, alternate direction of permitted travel through the intersection to prevent starvation. (They are going to grocery stores.)

The critical region is the intersection. When a car arrives at the intersection, it needs to include direction of travel when issuing the probe. This is used when the operating system chooses which car from the queue to send through the intersection next.

Procedure Main
Begin
Intersection = 1
COBEGIN
Repeat East
Repeat North
COEND
End

Procedure East
Begin
    Probe(Intersection, East)
    Transit Intersection
    Vacate(Intersection, East)
End

Procedure North
Begin
    Probe(Intersection, North)
    Transit Intersection
    Vacate(Intersection, North)
End

Probe(Intersection, Direction)
Begin
    Intersection = Intersection - 1
    If Intersection < 0
      Then move token and Direction to Waiting Queue.
End

Vacate(Intersection, Direction)
Begin
    Intersection = Intersection + 1
    If Intersection <1    // If someone is in the waiting queue
      Then
        If Other_Direction is in Waiting Queue    // Alternate directions
          Then select token with Other_Direction from Waiting Queue to move to Ready Queue
          Else select token with Direction from Waiting Queue to move to Ready Queue
          End If
      End If
End

Problem 8. Consider the following program segments for two different processes executing concurrently:

Procedure P1
Begin
    Do A = 1, 3
      X = X + 1
      End Do
End

Procedure P2
Begin
    Do B = 1, 3
      X = X + 1
      End Do
End

Variable X starts at zero and is a shared variable. Variables A and B are not shared variables.

If Processes P1 and P2 execute only once at any speed, what are the possible resulting values of X? Explain your answer.

Answer: The starting value of X is 0. The final value of X is 6. During execution of P1 and P2, X takes on all integer values beginning with 0 and ending with 6. We cannot tell which procedure is incrementing X by observing only the value of X.

Problem 10. Consider the following segment taken from a FORTRAN program:

DO I = 1, 12
READ *, X
IF (X .EQ. 0) Y(I) = 0
IF (X .NE. 0) Y(I) = 10
ENDDO

See the course web site for a tutorial on FORTRAN to get the needed background for this problem.

a. Recode it so it will run more efficiently in a single processor system.

DO I = 1, 12
READ *, X
IF (X .NE. 0) X = 10
Y(I) = X
ENDDO

b. Given that a multiprocessing environment with four symmetrical processors is available, recode the segment as an efficient concurrent program that performs the same function as the original FORTRAN program.

DO I = 1, 3
READ *, W
IF (W .NE. 0) W = 10
Y(I) = X
ENDDO

DO I = 4, 6
READ *, X
IF (X .NE. 0) X = 10
Y(I) = X
ENDDO

DO I = 7, 9
READ *, Y
IF (Y .NE. 0) Y = 10
Y(I) = X
ENDDO

DO I = 10, 12
READ *, Z
IF (Z .NE. 0) Z = 10
Y(I) = Z
ENDDO

c. Given that all processors have identical capabilities, compare the execution speeds of the original FORTRAN segment with the execution speeds of your segments for parts (a) and (b).

Answer: If you ignore overhead in setting up the do-loops and only concentrate on the productive operations inside the loops, the parallel implementation in part (b) will complete execution in about 25% of the time required for the single processor implementation in part (a).