

 Delta

Parallel Computing - Doing several computation tasks simultaneously.

$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \times \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{bmatrix} = \begin{bmatrix} a_{11}b_{11} + a_{12}b_{21} + a_{13}b_{31} \\ a_{21}b_{11} + a_{22}b_{21} + a_{23}b_{31} \\ a_{31}b_{11} + a_{32}b_{21} + a_{33}b_{31} \end{bmatrix} = \begin{bmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \\ c_{31} & c_{32} & c_{33} \end{bmatrix}$$

- If we have single processor, it will have 'n' computations.
- And we have ' $n^2$ ' terms, so it will be  $(O(n^3))$  complexity.
- If we are having 'n' processors, then the operations can be done simultaneously or parallelly.
- To multiply 2 matrices, if we have 3 processors:  $p_1, p_2, p_3$
- For computing 1st row i-

| Processor ID  | $K=1$          | $K=2$          | $K=3$          |
|---------------|----------------|----------------|----------------|
| Initialize    | $c_{11} = 0$   | $c_{12} = 0$   | $c_{13} = 0$   |
| 1st iteration | $a_{11}b_{11}$ | $a_{12}b_{12}$ | $a_{13}b_{13}$ |
| 2nd iteration | $a_{12}b_{21}$ | $a_{12}b_{22}$ | $a_{12}b_{23}$ |
| 3rd iteration | $a_{13}b_{31}$ | $a_{13}b_{32}$ | $a_{13}b_{33}$ |

- It is of  $O(n)$ , ∴ faster
- Similarly, we can compute other rows.
- Total computation time =  $O(n^2)$ , ∴ faster than sequential.

e.g! supercomputer: Sunway Taihu delight (1.45GB)

Q How the processor will know that it has to perform a certain operation?

Ans: We have parallel algorithms

∴ the processors should have unique ID which is fixed.

```

for (K=1; K<=n; K++)
    for (i=1; i<=n; i++)
        C[i][K] = 0
    for (j=1; j<=n; j++)
        C[i][j] += a[i][K]*b[K][j]
    
```

26-7-19

## Pipelining





Let  
 No. of Inst<sup>r</sup> = N  
 No. of stages = M

① Time = M + (N-1) cc  $\rightarrow$  clock cycle  
 for 1st  $\leftarrow$  for remaining N-1 inst<sup>r</sup> bcoz in each  
 cycle ~~at~~ one inst<sup>r</sup> is executed

$$T = \frac{M+N-1}{\text{Frequency}}$$

② Speedup =  $\frac{\text{Time taken w/o pipeline}}{\text{Time taken w/ pipeline}} = \frac{MN}{M+N-1} \text{ cc}$

$$S = MN$$

$$M+N-1$$

- Benefit of pipeline is only there when pipeline is filled up, i.e., after  $M$  cycles.
- ∵ more the no. of inst<sup>r</sup>, more the speedup

$$\text{Max Speedup} = \lim_{N \rightarrow \infty} \frac{MN}{M+N-1} = \lim_{N \rightarrow \infty} \frac{M}{\frac{M+1-1}{N}} = M$$

Max Speedup =  $M$  = no. of segments.

- ③ Throughput - No. of inst<sup>r</sup> executed per sec.  
 $= \frac{\text{Total inst}^r}{\text{Total time}}$

$$T = \frac{Nf}{M+N-1}$$

$$\text{Max } T = \lim_{N \rightarrow \infty} \frac{Nf}{M+N-1} = f \lim_{N \rightarrow \infty} \frac{1}{\frac{M+1-1}{N}} = f$$

Max. Throughput =  $f$

- ∵ higher the frequency, higher the throughput
- Hence, we prefer system w/ higher freq

Q How to increase frequency?

- By increasing the no. of stages/ segments, since amount of work to be done in each stage decreases.
- Higher the  $\nu$ , higher the heat generation.  
 $\therefore$  increase  $\nu$  as per available cooling.

And hence, concept of multicore was generated

29-7-19

① Average No. of inst<sup>r</sup> executed per clockcycle

$$= \frac{N}{M+N-1} = \frac{MN}{M+N-1}$$

$$\text{Max} = M \quad [\text{1 inst}^r \text{ cycle (IC)} = M \text{ cc}]$$

→ If multicore, you use  $\nu$  of  $1/2$  the original  $\nu$ .



$$4 \times f/2 = 2f \rightarrow \text{faster and heat also less}$$

② Performance Cost Ratio (PCR) =  $\frac{\text{Throughput} \rightarrow \text{frequency}}{\text{Total Cost}}$

$$\text{PCR} = \frac{f}{C + mh}$$

$c$  = cost of digital gates

$h$  = cost of latch

$m$  = no. of stages

$f$  = freq. of pipeline



After  $m_o$ , even if throughput increases, PCR decreases since cost increases. If we use no. of stages as  $m_o$  (Optimal no. of stages)



Then for each stage, time taken =  $t/m + d$

$$f = \frac{1}{t/m + d}$$

$$\text{PCR} = 1$$

$$(t/m + d) * (c + mh)$$

At what value of  $m$  will PCR be max?

$$\frac{dP_{CH}}{dm} = \frac{h}{(t/m + d)(c + mh)^2} - \frac{t}{m^2(t/m + d)^2}$$

$$\text{Maximizing } \frac{dP_{CH}}{dm} = 0$$

$$\frac{h}{c + mh} = \frac{t}{(t/m + d) m^2}$$

$$mh = \frac{tc + mht}{tm + m^2d}$$

$$mht + m^2hd = tc + mht$$

$$m = \sqrt{\frac{tc}{dh}}$$

$\frac{d^2 P_{CR}}{dm^2} < 0$  hence maximum value of  
m is

$$m = \sqrt{\frac{tc}{dh}}$$

31-7-19

Dt.:  
Pg.:  
**Delta**

- Frequency decided by every stage must have approx. same work.
- Time period of pipeline =  $\max(t_1 + t_2 + t_3 + \dots + t_n) + \text{latch}$

One branch causes  $M-1$  extra CC

$$\text{No. of instr}^r = N$$

$$\text{No. of stages} = M$$

Prob. that an "inst" will cause branching = S

$$\text{No. of instr}^r = N$$

$$\text{No. of stages} = M + \text{latches}$$

 $I_1$  $I_2$  $I_3$  $I_4$  $I_5$  $I_6 \rightarrow$  branch jump next $I_7$  $I_8$  went  $I_{12}$  $I_9$  $I_{13}$  $I_{14}$  $I_{15}$  $I_{16}$ 

extra CC

|          |                                                               |               |
|----------|---------------------------------------------------------------|---------------|
| Store    | 1 2 3 4 5 6                                                   | ① ② ③ ④ 12 11 |
| executed | 1 2 3 4 5 6 7                                                 | 12 11 10      |
| DI       | 1 2 3 4 5 6 7 8                                               | 12 11 10 9    |
| ID       | 1 2 3 4 5 6 7 8 9                                             | 12 11 10 9 8  |
| IF       | I <sub>1</sub> I <sub>2</sub> 3 4 5 6 7 8 9 10 11 12 10 9 8 7 |               |

$$\textcircled{1} \quad \text{Time} = M + N - 1 + NS(M-1)$$

↳ how many branching  
(avg)

$$\textcircled{2} \quad \text{speedup} = \frac{MN}{M + N - 1 + NS(M-1)}$$

$$\text{Max. lim. } \lim_{N \rightarrow \infty} \frac{M}{\frac{M + 1 - 1}{N} + S(M-1)} = \frac{M}{1 + S(M-1)}$$

if  $S = 0$ , Max. speedup = M

$$\textcircled{3} \quad \text{throughput} = \frac{Nf}{M + N - 1 + SN(M-1)}$$

$$\text{Max. lim. } \lim_{N \rightarrow \infty} \frac{f}{1 + S(M-1)}$$

$$\textcircled{4} \quad \text{Avg. no. of instr. executed per cc} =$$

$$\frac{MN}{N + M - 1 + NS(M-1)}$$

$$= \frac{M}{1 + S(M-1)}$$

→ PQP question  
1st part

# Parallel & Distributed Systems

No. of Instructions: N  
steps in a P

No. of Instructions:  $N$   
No. of stages in a pipeline =  $m$

No. 8  
Probability that an instruction is a branch  
instruction =  $\phi$

Prob. that a branch inst' is unconditional

conditional branch =

, or a conditional branch will cause branching = s

Find : Time, Throughput, speedup, Avg. no. of instructions executed per instr. set.

~~Max Arg. no. of branch Instructions = NP~~

Arg. no. ref conditional branch Inst<sup>v</sup> = NPr  
" " " unconditional " " " " = Npg

Aug. no. of brain chipping from cond. branch = Npns

Aug no. of branch =  $N_p g + N_p rs$

$$\text{no. of branch} = \text{NPG} + \text{NPS}$$

Throughput =

NF

$$m + N - 1 + NP(g + rs)(m-1)$$

speedup = Max Throughput =

lt NF

$$N \rightarrow \infty \quad m + N - 1 + NP(g + rs)(m-1)$$

~~Rate~~ =

$$f$$
  
$$1 + p(g + rs)(m-1)$$

if  $p=0$ , then max. throughput =  $f$

Speedup = NM

$$m + N - 1 + NP(g + rs)(M-1)$$

Max Speedup = lt MN  
 $N \rightarrow \infty$   $m + N - 1 + NP(g + rs)(M-1)$

= M

$$1 + p(g + rs)(M-1)$$

If  $p=0$ , max speedup =  $M$ .

Frequency

Dt.:

Pg.:

Delta

Scalar Pipeline  $\rightarrow$  1 clock 1 Job  
every clock 1 inst<sup>r</sup> executor.  $T_{\text{Pc.}}^{\text{D.L.}}$  13-9-19  
~~Delta~~

### Superscalar Pipeline

In one clock cycle, more than 1 inst<sup>r</sup> can be completed. Process will be fast.  
Suppose we have 2 pipeline:



Can ~~or~~ have multiple pipelines.

Degree  $\rightarrow$  No. of pipeline.

Degree  $n \rightarrow$  It can perform  $n$  Jobs per clock cycle.

### Disadvantage!

As instructions are stored in memory & at one time only one inst<sup>r</sup> can access memory hence there will be bottleneck.

Soln :-

Have a common hardware for fetching inst<sup>r</sup> from memory.



fetch      next inst<sup>r</sup>

~~& store in processor~~

Fetch & issue to free execution unit



All  $I_1, I_2, I_3$  are executing at same time but they should be independent of one another.

Hence we perform Dependency analysis to know which inst<sup>x</sup> should be performed parallel execution.

If one inst<sup>x</sup> is dependent on other so the next inst<sup>y</sup> is execute Hence it goes out of order.

Correctness of program should be maintained.

Commit Unit

By using out of order execution, it becomes faster and store the result in order and present it in correct order to user.

All processors are using out of order execution to save time.

~~Delta~~

Split cache for structure hazard.

Memory of data

& memory of instr are ~~of~~ separated

### Dependency Analysis

Input set = I

Output Set = O

Instr = S

I

S<sub>1</sub>



Output of S<sub>1</sub> is input of S<sub>2</sub>

S<sub>2</sub>

O<sub>2</sub>

- ① Flow Dependency → for S<sub>2</sub> to execute,  
it is imp to get output  
O<sub>1</sub> ∩ I<sub>2</sub> ≠ φ of S<sub>1</sub>.

Flow independent if O<sub>1</sub> ∩ I<sub>2</sub> = φ  
Representation: →

- ② Anti Dependency O<sub>2</sub> ∩ I<sub>1</sub> ≠ φ

Output of S<sub>2</sub> (O<sub>2</sub>) is overlapping with  
input of S<sub>1</sub> (I<sub>1</sub>)

Anti independent if O<sub>2</sub> ∩ I<sub>1</sub> = φ  
Representation: + →

- ③ Output Dependency O<sub>2</sub> ∩ O<sub>1</sub> ≠ φ

Both try to write at same position/file

Representation:  $\rightarrow$

DL:  
Pg:

Delta

Output independent if  $O_2 \cap O_1 = \emptyset$

Berstein's Conditions

For  $S_1$  &  $S_2$  to execute simultaneously if

$$\boxed{O_1 \cap I_2 = \emptyset}$$
$$O_2 \cap I_1 = \emptyset$$
$$O_2 \cap O_1 = \emptyset$$

e.g.  $\circled{1}$   $S_1$ : Load R1, A  
 $S_2$ : Add , R2, A1  
 $S_3$ : Mov R1, R3  
 $S_4$ : Store B, R1



$p_1 \parallel p_2 \parallel p_3 \parallel p_4 \dots \parallel p_k$  can be scheduled for parallel execution if and only if  $p_i \parallel p_j$  &  $i \neq j$

↳ each pair can be parallelly executed

(2)

$$P_1 : C = D \times E$$

$$P_2 : M = G + C$$

$$P_3 : A = B + C$$

$$P_4 : c = C + M$$

$$P_5 : F = G - E$$



Q8

$$S_1 : A = B + D$$

$$S_2 : C = A \times 3$$

$$S_3 : A = A + C$$

$$S_4 : E = A/2$$

Dr.:  
Pg.:  
Delta



- Q      S<sub>1</sub>: Load R<sub>1</sub>, 1024    R<sub>1</sub> ← 1024  
       S<sub>2</sub>: Load R<sub>2</sub>, M(10)                          R<sub>2</sub> ← MEM(10)  
       S<sub>3</sub>: Add R<sub>1</sub>, R<sub>2</sub>                                  R<sub>1</sub> ← R<sub>2</sub> + R<sub>1</sub>  
       S<sub>4</sub>: Store M(1024), R<sub>1</sub>                          MEM(1024) ∈ R,  
       S<sub>5</sub>: Store MEM(R<sub>1</sub>), 1024                          MEM(R<sub>1</sub>) ∈ 1024



Dt.:  
Pg.: Delta

### Reservation Table of Pipeline



output X

|                | 1              | 2              | 3              | 4              | 5              | 6              |
|----------------|----------------|----------------|----------------|----------------|----------------|----------------|
| S <sub>1</sub> | X <sub>1</sub> | X <sub>2</sub> | X <sub>3</sub> |                |                |                |
| S <sub>2</sub> |                | X <sub>1</sub> | X <sub>2</sub> | X <sub>3</sub> |                |                |
| S <sub>3</sub> |                |                | X <sub>1</sub> | X <sub>2</sub> | X <sub>3</sub> |                |
| S <sub>4</sub> |                |                |                | X <sub>1</sub> | X <sub>2</sub> | X <sub>3</sub> |
| S <sub>5</sub> |                |                |                |                | X <sub>1</sub> | X <sub>2</sub> |
|                |                |                |                |                | X <sub>1</sub> | X <sub>2</sub> |



Dynamic Pipeline and non-linear

Reservation Table for X      Reservation Table for X

|                | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |                | 1 | 2 | 3 | 4 | 5 | 6 |
|----------------|---|---|---|---|---|---|---|---|----------------|---|---|---|---|---|---|
| S <sub>1</sub> | X |   |   |   |   |   |   |   | S <sub>1</sub> | X |   |   |   |   |   |
| S <sub>2</sub> |   | X | X |   |   |   |   |   | S <sub>2</sub> |   | X |   |   |   |   |
| S <sub>3</sub> |   |   | X | X | X | X |   |   | S <sub>3</sub> | X | X | X | X | X |   |

Choose time latency (min) so that there is no collision. Latency should be as min as possible as then throughput will be maximum.

(MAL)  $\rightarrow$  Minimum Average Latency

After 1 cycle latency of  $\Delta$

Dt.  
Pg.

$\Delta$



Reservation Table of X

|                | 1              | 2              | 3              | 4              | 5              | 6              | 7              | 8              |
|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|
| S <sub>1</sub> | x <sub>1</sub> |                | x <sub>2</sub> |                |                | x <sub>1</sub> |                | x <sub>2</sub> |
| S <sub>2</sub> |                | x <sub>1</sub> |                | x <sub>2</sub> | x <sub>1</sub> | x <sub>2</sub> |                |                |
| S <sub>3</sub> |                |                | x <sub>1</sub> | x <sub>2</sub> | x <sub>1</sub> | x <sub>2</sub> | x <sub>1</sub> | x <sub>2</sub> |

After 2 C.C.  $\Rightarrow$  Latency = 2



|                | 1              | 2              | 3              | 4                              | 5 | 6                              | 7 | 8                              |
|----------------|----------------|----------------|----------------|--------------------------------|---|--------------------------------|---|--------------------------------|
| S <sub>1</sub> | x <sub>1</sub> |                | x <sub>2</sub> |                                |   | x <sub>1</sub>                 |   | x <sub>1</sub> /x <sub>2</sub> |
| S <sub>2</sub> | x <sub>1</sub> |                |                | x <sub>1</sub> /x <sub>2</sub> |   | x <sub>2</sub>                 |   |                                |
| S <sub>3</sub> |                | x <sub>1</sub> |                | x <sub>1</sub> /x <sub>2</sub> |   | x <sub>1</sub> /x <sub>2</sub> |   |                                |

This latency which causes collision  
Forbidden latency

" " " not " "  $\rightarrow$   
Permissible latency

Latency  $\geq 3$

|                | 1              | 2              | 3              | 4              | 5              | 6              | 7              | 8              | 9              | 10 | 11             |
|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----|----------------|
| S <sub>1</sub> | x <sub>1</sub> |                |                | x <sub>2</sub> |                | x <sub>1</sub> |                | x <sub>1</sub> | x <sub>2</sub> |    | x <sub>2</sub> |
| S <sub>2</sub> | x <sub>1</sub> |                | x <sub>1</sub> | x <sub>2</sub> | x <sub>1</sub> | x <sub>2</sub> | x <sub>1</sub> | x <sub>2</sub> |                |    |                |
| S <sub>3</sub> |                | x <sub>1</sub> |                | x <sub>1</sub> | x <sub>2</sub> | x <sub>2</sub> | x <sub>1</sub> | x <sub>2</sub> |                |    |                |

Permissible latency = 1, 3, 6, 8, ...  
Forbidden " = 2, 4, 5, 7

### Collision vector

Collision Free scheduling.  
→ max for bidding.

4 6 5 4 3 2 1      Initial  
1 0 1 1 0 1 0      Collision vector

|    | 1 | 2 | 3 | 4 | 5 | 6 | 7       | 8         |
|----|---|---|---|---|---|---|---------|-----------|
| S1 | X |   |   |   | X | X | 2, 5, 7 | Forbidden |
| S2 |   | X |   |   | X |   | 2       | Latency   |
| S3 |   |   | X | X | X |   | 2, 4    | J.        |

By getting min  
time difference.

Starting stage  $\Rightarrow$  initial collision vector.



1011011



1011010



1011011



1011010



1011011



1011010



1011011



1011010



1011011



1011010



1011011



1011010



1011011



1011010



1011011



1011010



1011011



1011010



1011011



1011010



1011011



1011010



1011011



1011010



1011011



1011010



1011011



1011010



1011011



1011010



1011011



1011010



1011011



1011010



that simple cycle which contain  
min edge sum  
min edge sum  
Delta each

10  
PQ  
Delta node

Simple cycle

Complex cycle

(1, 8)

(1, 8)

(3, 6)

(3, 6)

(6, 8)

(6, 8)

(3)

$$\text{Avg long latency} = \frac{1+8}{2} = 4.5$$

Ans = 3.

Reservation Table for Y

|                | 1 | 2 | 3 | 4 | 5    | 6 |
|----------------|---|---|---|---|------|---|
| S <sub>1</sub> | X |   |   | X |      |   |
| S <sub>2</sub> |   | X |   | X |      |   |
| S <sub>3</sub> |   |   |   | X | 2, 4 |   |

4 3 2 1  
1 0 1 0      S<sup>t</sup>  
0 1 0 1  
1 0 1 0



3. f → 3

outgoing edges

1 0 1 0

Simple cycle

length  
(1, 5) → 1 + 5 = 3

5

3

→ 3

(3, 5)

نیا  
کوہ  
لے

1888  
3 X 150  
9 18  
" 88  
Ounces

|               |   |     |   |
|---------------|---|-----|---|
| A line period | 1 | 2 * | 3 |
|               |   |     |   |

A vertical column of handwritten numbers and symbols on white paper with black horizontal lines. The column starts with a large '1' at the top, followed by a small '1'. Below that is a large '2' with a small asterisk (\*) to its right. Then there is a large '3', followed by a large '4' at the bottom. To the right of the '4' is a small '1'. To the left of the '3' is a small '1'. There are also some faint, illegible markings and smudges on the paper.

Liane Remond

→ Min & Max value

### Bounds on MAC

- ① The MAC is lower bounded by maximum no. of check marks in any row of Representation Table.

minimum no of check marks in any Row of RT :

$\leq \text{MAC} \leq \text{Avg latency}$   
of any greedy cycle in the SD.

- ② MAC is lower than or equal to the average latency of any greedy cycle in the State diagram (SD).

- ③ Average latency of any greedy cycle is upper bounded by no. of 1's in initial collision vector + 1

$\dots \leq \text{no. of 1's in Initial collision vector} + 1$

~~For X~~ For X

$$\frac{3}{-} \leq \text{MAC} \leq 5$$

$\Rightarrow$  Ans as calculated  
before = 3  
(optimized)

For Y

$$\frac{3}{-} \leq \text{MAC} \leq 3$$

$\Rightarrow$  Ans as calculated  
before = 3  
(optimized)

PipelineReservations Table

|    | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
|----|---|---|---|---|---|---|---|---|---|----|----|----|----|----|----|
| S1 | X |   |   |   | X |   |   |   |   |    |    |    | 1  | 0  | 1  |
| S2 |   | X |   | X |   |   |   |   |   |    |    |    | 2  | 3  | 4  |
| S3 |   |   | X | X |   |   |   |   |   |    |    |    | 0  | 1  | 0  |

Bounds

$$2 \leq \text{MAL} \leq 4$$

1011

0001

1011

1011

3

Simple cycle or greedy.

$$\text{MAL} = 3$$

Hence not optimal as if  $\text{MAL} = 2$  we get max throughput.

Hence to optimize  $\text{MAL}$  we insert delays

- ① Find  $R^T$  after inserting delay
- ② Then find  $\text{MAL}$ .

We can get optimal  
no. of delays whenever  
Delays can  
improve leptimer M AL.

Dr.:  
Pg.:  
*eltern*

Repetitio notitia pro modis

A hand-drawn grid diagram consisting of 6 horizontal rows and 7 vertical columns. The grid is defined by thick black lines. Several handwritten labels are present:

- Row 1: Top-left cell contains a large 'D'. To its right are 'S' and 'B'.
- Row 2: Contains 'X' in the 4th column.
- Row 3: Contains 'X' in the 3rd column.
- Row 4: Contains 'D' in the 2nd column and 'X' in the 4th column.
- Row 5: Contains 'X' in the 3rd column and 'S' in the 6th column.
- Row 6: Contains 'X' in the 4th column and 'T' in the 7th column.

Below the grid, there is a sequence of numbers: 198, 216, 2, 1, 6, 5, 4, 3, 2, 1, 0, 0, 0, 0, 0, 1.



卷之三

三

5

七

10

1

10

DELTA Notebook

Assignment

/ simulate

Q. was to find a collision free scheduling  
of a pipeline

Dr. Shrikant  
Pg. 1

| Q1 | 1 | 2 | 3 | 4 | 5 | 6  |       |
|----|---|---|---|---|---|----|-------|
| S1 | x |   |   |   | x |    |       |
| S2 |   | x |   |   |   |    |       |
| S3 |   |   | x |   |   |    |       |
| S4 |   |   |   | x | x |    |       |
|    |   |   |   |   |   | x  |       |
|    |   |   |   |   |   | 6+ |       |
|    |   |   |   |   |   |    | 00010 |

5 4 3 2 1  
1 0 0 1 1

MAL = 3

Throughput =  $1$

$$= \frac{1}{3 \times 20 \text{ ms}} = \frac{10^3}{60} \text{ Jobs/sec}$$

$$= \frac{1000 \times 10^5}{5} \text{ Jobs/sec}$$

Q2

|    |   |   |   |   |   |   |      |      |
|----|---|---|---|---|---|---|------|------|
| S1 | 1 | 2 | 3 | 4 | 5 | 6 | 7    | 8    |
| S2 | x | x | x | x | x | x | 2    | 7, 5 |
| S3 |   |   |   |   |   |   | 2, 4 |      |

7 6 5 4 3 2 1

1 0  $\phi$ ,  $\phi$ , 0, 1, 0

P 8+

PP  
Delta



$$m_4 \rightarrow 1+8 = 4.5$$

OR (3)

mar = 3