

$$5 = \frac{1}{(-0.8) + 0.8} N$$

$$1 = \frac{1}{-0.8 + \frac{4}{N}} \quad (\text{Not possible})$$

~~+ 4/N~~

### Tutorial

Consider a hypothetical processor which runs crystal oscillator on 1 MHz frequency. If a code with 2 M instructions in it is executed in 0.5 sec. Then find the ELPIT.

$$f = 10^6 \text{ Hz}$$

$$\text{Execution time} = IC \times CPI_{\text{avg}} \times \tau$$

$$0.5 = 2 \times 10^6 \times CPI \times \frac{1}{10^6}$$

$$0.5 = 2 \times CPI$$

$$CPI = 1.25$$

What is the mix rating (MIPS)

$$\text{MIPS} = \frac{f}{CPI \times 10^6}$$

$$= \frac{10^6}{1.25 \times 10^6} = \frac{100^4}{1+25.5} = 0.8$$

Q: This microprocessor can be used to execute two types of instructions Class A and Class B of the code containing 40% Class B instructions. If Class B instruction execution time is what will be the overall speedup if Class B instruction execution mode is enhanced with the speed up of 2.5?

$$\text{Speedup} = \frac{T_{old}}{T_{new}}$$

$$T_{old} = f_A \cdot N$$

$$T_{new} = \frac{f_A \cdot N}{2.5}$$

$$Speedup = \frac{f_A \cdot N}{\frac{f_A \cdot N}{2.5}} = 2.5$$

$$\text{Speedup} = \frac{T_{old}}{T_{new}}$$

$$T_{new} = 0.4N$$

Q:- Consider a hypothetical microprocessor on which a code with 10 M instructions are executed in time of 8.5 sec. A

code has two types of instructions i.e. integer type and floating point type. It uses microprocessor is introduced when an enhancement in the integer type instruction mode such that the execution time of these instructions is reduced by the factor of 0.75 (15%). If floating point instructions trap participation of 40% in the overall code then find the overall speedup.

$$\text{Speedup} = \frac{f_A \cdot N}{f_B \cdot N}$$

$$(1-f) + fA$$

$$= \frac{1}{N}$$

$$(1-f) + fA$$

$$= \frac{0.4}{N}$$

$$\text{Speedup} = \frac{f}{(1-f) + fB \cdot f}$$

$$= \frac{1}{N \cdot \text{Sum}}$$

$$= \frac{1}{1.66}$$

$$= 0.4 + 0.6$$

$$= \frac{1}{0.4 + 0.6}$$

$$= 1.66$$

$$S_{\text{en}} = \frac{T_{\text{old}}}{T_{\text{new}}} = \frac{T_{\text{old}}}{0.75 \times T} = \frac{4}{3}$$

$$\text{Speed up} = \frac{1}{(1-f) + f} = \frac{1}{(1 - 0.3) + 0.3 \times 3} = 1.08$$

Flynn's  
Taxonomy for parallel computing architecture.

Based on no. of instruction stream and data stream

- 1) Single Instruction Single Data (SISD)
- 2) Multiple " single (SIMD)
- 3) Multiple " single (MISD)
- 4) Multiple " multiple (MIMD)

### (a) MISD



Ex:-  $z = \sin(x) + \cos(x) + \tan(x)$ .

### (b) MIMD :-



Ex:- Multiprocessor Multicomputer.

### (c) SISD:



### (d) SIMD:



vector and matrix operation  
super computers.

### Modern architecture classification.

- 1) Shared Memory Multiprocessor
- 2) Distributed "
- 3) Message passing Multicomputer.

### UMA (Uniform Memory Access Time) architecture



| Register Name | Number       |
|---------------|--------------|
| \$ zero       | \$ 0         |
| \$ at         | \$ 1         |
| \$ v0, \$ v1  | \$ 2, \$ 3   |
| \$ a0 - \$ a9 | \$ 4 - \$ 7  |
| \$ t0 - \$ t7 | \$ 8 - \$ 15 |

(Temporary variable)



MIPS Architecture. 32 registers.  
 RISC architecture → reduce instruction set,  
 all instructions are of same length (32 bits)  
 ISA (Instruction Set Architecture).

32 registers (\$ 0 - \$ 31)



Memory access model.

Shifted  $\Rightarrow$  amount  
 $\downarrow$  - target  
 $\downarrow$  - destination

## MIPS 32-bit ISA

$$f = n + A[i]$$

①

$$\begin{array}{c} \nearrow \\ f+1 \end{array} \quad \begin{array}{c} \nearrow \\ f+2 \end{array} \quad \begin{array}{c} \nearrow \\ f+3 \end{array} \quad \text{(displacing by } g\text{)}$$

$\$t_0$

$$\$t_2 = add(\text{Mem}[f_0]) \quad 32\text{-bit}$$

$$A[0] = 01101101011011010110110101101101$$

$$A[0] = 01101101011011010110110101101101$$



Memory

direct addressable,

$$01000_a (S10)$$

$$10000_2 (16_{10})$$

$\leftarrow$  shift (left logical).

load and store.

1 word = 32 bits = 4 bytes

| Mnemonic | Format | Encoding    | Offset |
|----------|--------|-------------|--------|
| lw       | I      | 32bit rs rt | "      |

$\leftarrow$   $\leftarrow$   $\leftarrow$

word  
byte  
halfword  
word

lw

lw

lw

lw

| Type | opcode | Shift | 3-bit | RD    | Shamt | Function          |
|------|--------|-------|-------|-------|-------|-------------------|
| R    | 6 bit  | RS    | RT    | Shift | Stype | Stype             |
| I    | 6 bit  | RS    | RT    | Shamt | Stype | Immmediate(16bit) |
| J    | 6 bit  |       |       |       |       | Address (16 bit)  |

MIPS 32-bit architecture

$$A[i] = n + A[i]$$

add  $\$t_5, \$t_8, \$t_5$  I forward:

add  $\$t_6, 0 (\$t_5)$

sw  $\$t_0, 0 (\$t_5)$

I format:

lw

lw

lw

lw

SL → logical shift left

SRL → " " " right

SAR → arithmetic shift right  
(sign extended)

Charlie

Date \_\_\_\_\_  
Page No. \_\_\_\_\_

### Arithmetic and logical instructions.

| Mnemonic | Format | Encoding                               |
|----------|--------|----------------------------------------|
| add      | R      | 010 rs rt rd 010 3210                  |
| addu     | R      | 010 rs rt rd 010 3310                  |
| sub      | R      | 010 rs rt rd 010 3410                  |
| and      | R      | 010 rs rt rd 010 3610                  |
| or       | R      | 010 rs rt rd 010 3710                  |
| XOR      | R      | 010 rs rt rd 010 3810                  |
| addi     | I      | 1210 rs rt <small>intermediate</small> |
| andi     | I      | 1210 rs rt <small>intermediate</small> |

### Shift.

| Mnemonic | Format | Encoding                |
|----------|--------|-------------------------|
| sll      | R      | 010 010 rs rt rd sa 010 |
| srl      | R      | 010 010 rs rt rd sa 210 |
| sra      | R      | 010 010 rs rt rd sa 310 |

### Jump and Branch.

| Mnemonic            | Format | Encoding                                 |
|---------------------|--------|------------------------------------------|
| jump to add.        | J      | 210 <small>→ destination address</small> |
| jump & link         | Jal    | 310 <small>→ ,</small>                   |
| branch if equal     | beq    | 410 rs rt <small>offset</small>          |
| branch if not equal | bne    | 510 rs rt <small>offset</small>          |

Q: if ( $i == j$ )

{

$f = g + h$

Translate in MIPS assembly code.

Suppose  $\$f_0 \rightarrow i, \$f_1 \rightarrow j$

lne  $\$s_0, \$s_1$ , end

add  $\$s_2, \$s_3, \$s_4$

end

if ( $i == j$ )

{

$f = g + h$ ; }  
else

$f = g - h$ ; }

lne  $\$s_0, \$s_1$ , else

add  $\$s_2, \$s_3, \$s_4$

end

else lne  $\$s_2, \$s_3, \$s_4$

end;

for ( $i = 0; i < 100; i+1$ )

{

$a[i] = b[i] + 10;$

}

$\$s_0$

$\$s_1$

$a[i] = b[i] + 10;$

$\$s_0, i < 400, a[i+4]$

+

set  $\rightarrow$  multiply opr.

Date \_\_\_\_\_  
Page No. \_\_\_\_\_  
**Charlie**

addi \$t2, \$0, 100 //  $$t2 = 100$

addi \$t1, \$t0, & //  $$t4 = $t0 * 4$   
=  $i * 4$ .

add \$t3, \$t1, \$t1 //  $$t3 = t\$t1 + $t1$ .

add (\$t0) & i \* 4 = addi (\$t0)

lw \$t4, 0(\$t3) //  $$t4 = b(t^3)$ .

addi \$t5, \$t4, 10 //  $$t5 = b(t^3) + 10$

add \$t6, \$t5, \$t0 //  $$t6 = add$

(array) + i \* 4 = addi (array)

sw \$t5, 0(\$t6) //  $a(ij) = b(r^j + 10)$

addi \$t0, \$t0, 1 //  $$t0 = $t0 + 1 = i+1$

bne \$t0, \$t2, loop

Q: while (i != j)

if (i > j) see

i = i - 1, set less than

use

j = j - 1;

II. sw R1, 32(R2)

1 instruction fetch.

2 execute  $(32 + R2)$  (addressing)

3 less than \$t1

4 memory access in data memory pointing at 1000.

5 write back i.e. R1 is stored at 1000.

else ! sw \$t1, \$t1, \$t0.

END.

Pipelining (MIPS- 5 stage)

assembly ip in instruction memory



Date \_\_\_\_\_  
Page No. \_\_\_\_\_  
**Charlie**

→ Time (cycle

|    | P | D | E | M | W       |
|----|---|---|---|---|---------|
| I2 | F | D | E | M | W       |
| I3 |   | F | D | E | M W     |
| I4 |   |   | F | D | E M W   |
| I5 |   |   |   | F | D E M W |

Instruction type:

5 steps required.

Non-pipeline than for 5 instruction taking 95 cycles.

beg IF ID EX  
R-Type IF ID EX-WB

Pipe pipeline than for 5 instruction take 9 cycles.

SW LW IF ID EX Mem WB.

(all 5 steps)

CPI = 95/5 = 19  
CDI = 9/5 = 1.8 (less Best)

SW R1, D(R2).

IPC (instruction per cycle) = throughput

① Instruction fetch  
decide

② execute =  $O + R_2$

$$\text{③ } \text{IPC} = 5/9 = 0.55 \quad \text{Best}$$

$$\text{④ } \text{IPC} = 9/25 = 0.36$$

Memory access to store

Latency is not improved in pipelining but throughput is improved.

$$\text{latency} = \text{CPI}$$



IP

IF ID EX Mem WB.  
Take max time from two domain

Take max time from two domain

IF ID EX MEM WB

equal time according to max. time

RAW (Read after write) :  $\rightarrow$   
 WAR (Write after read)  $\rightarrow$   
 WAW (Write after write)  $\rightarrow$

add  $R_1, R_2, R_3$  | F D E M W

sub  $R_5, R_6, R_7$  | F D E M W

mul  $R_8, 0(50)$  | F D E M W.

WAR.  $M = C_1 + C_2 \rightarrow$  Read

WAW.  $C = D \times E$ .  $M = C_1 + C_2 \rightarrow$  Read

RAW (Read after write) :  $C = L + M$  (Anti dependency)

Total no. of cycles taken

To complete :  $C = D \times E$ .  $C = L + M$

caused of p dependency

Dependency graph : RAW.

add  $R_1, R_2, R_3$  | F D E M W.

sub  $R_5, R_6, R_7$  | F - - - - - D E M W.

mul  $R_8, 0(50)$  | F - - - - - D E M W.

### Pipeline hazard .

- ① Structural hazard
- ② Data hazard (data depending)
- ③ Control or Branch hazard



$Q_1: C = D \times E$



$S_2: M = C_1 + C$



$S_3: A = B + C$



$S_4: C = L + M$



$S_5: R = A \div E$



3 depending

|          |       |
|----------|-------|
| Date     | Chall |
| Page No. |       |

Bernstein conditions of parallelism:

$$I_1, I_2, O_1, O_2$$

$$I_1 \cap O_2 = \emptyset, T_2 \cap O_1 = \emptyset, O_1 \cap O_2 = \emptyset$$

$$S_1: add R_1, R_2, R_3$$

$$R_1 = R_2 + R_3$$

$$S_2: sub R_4, R_1, R_3$$

$$R_4 = R_1 - R_3$$

Load use hazard.

$$lw R_1, O(R_2) f D E M W$$

$$add R_2, R_1, R_3 f D - E M W$$

$$lw R_1, O(R_5) f D E M W$$

$$sub R_1, O(R_6) f D E M W$$

$$lw R_4 O(R_8) f D E M W$$

$$add R_3, R_1, R_2 f D E M W$$

$$lw R_3, O(R_9) f D E M W$$

$$sub R_6, R_4, R_5 f D E M W$$

$$lw R_6, O(R_{10}) f D E M W$$

$$add R_4, R_1, R_3 f D E M W$$

$$sub R_6, R_4, R_5 f D E M W$$

$$speedup = 8/9 = 0.9$$

Q2:

$$add R_1, R_2, R_3 f D E M W$$

$$sub R_4, R_1, R_3 f D E M W$$

$$add R_6, R_1, R_2 f D E M W$$

$$sub R_8, R_1, R_9 f D E M W$$

$$add R_{10}, R_1, R_{11} f D E M W$$

$$CP1 = 9/5$$

|          |       |
|----------|-------|
| Date     | Chall |
| Page No. |       |

$$lw R_1, O(50)$$

$$lw R_2, O(50)$$

$$add R_3, R_1, R_2$$

$$lw R_4, O(52)$$

$$lw R_5, O(53)$$

$$lw R_6, R_4, R_5$$

$$lw R_6, O(54) \quad 14 \text{ cycles}$$

$$lw R_7, O(55)$$

$$lw R_8, O(56)$$

$$lw R_9, O(57)$$

$$lw R_{10}, O(58)$$

$$lw R_{11}, O(59)$$

$$lw R_{12}, O(60)$$

$$lw R_{13}, O(61)$$

$$lw R_{14}, O(62)$$

$$lw R_{15}, O(63)$$

$$lw R_{16}, O(64)$$

$$lw R_{17}, O(65)$$

$$lw R_{18}, O(66)$$

$$lw R_{19}, O(67)$$

$$lw R_{20}, O(68)$$

$$lw R_{21}, O(69)$$

$$lw R_{22}, O(70)$$

$$lw R_{23}, O(71)$$

$$lw R_{24}, O(72)$$

$$lw R_{25}, O(73)$$

$$lw R_{26}, O(74)$$

$$lw R_{27}, O(75)$$

$$lw R_{28}, O(76)$$

$$lw R_{29}, O(77)$$

$$lw R_{30}, O(78)$$

$$lw R_{31}, O(79)$$

$$lw R_{32}, O(80)$$

$$lw R_{33}, O(81)$$

$$lw R_{34}, O(82)$$

$$lw R_{35}, O(83)$$

$$lw R_{36}, O(84)$$

$$lw R_{37}, O(85)$$

$$lw R_{38}, O(86)$$

$$lw R_{39}, O(87)$$

## MIPS 5 stage pipeline.

lw R1, 0(R2)

lw R3, 0(R4)

add R5, R1, R3

sw R5, 0(R6)

lw R7, 0(R8)

lw R9, 0(R10)

sub R11, R7, R9

sw R11, 0(R12)

F D E M W

F D E M W

F D E M W

F D E M W

F D E M W

F D E M W

14 cycles

Code Generation

LWD R1, 0(R2)

SW R3, 0(R4)

LW R7, 0(R8)

SW R9, 0(R10)

ADD RS, R1, R3

SW R5, 0(R6)

SW R11, R7, R9

SW R11, 0(R12)

F D E M W

F D E M W

F D E M W

F D E M W

F D E M W

F D E M W

F D E M W

F D E M W

Control or Branch Target

loop: lwd \$11, \$13, 36 'F D E'

and \$12, \$13, \$5

OR \$16, \$11, \$7

add \$8, \$11, \$9

XOR \$10, \$11, \$11

beq \$10, \$13, loop

- Branch prediction
- Branch Taken
- Branch Not Taken.

Prediction of delayed control or branch target.

for( i=1000 ; i&gt;0 ; i=i-1 )

$$x[i] = x[i] + s$$

3.

l. add \$t1, \$0, \$0 // t1=0

2. add \$t3, \$t2, \$t0 // t3 = add(t2)

3. sub \$t4, 0(\$t3) // t4 = x[0] / t3

4. add \$t5, \$t4, \$t1 // t5 = x[i] + s

5. SW \$t5, 0(\$t5) // x[0] = x[i] + s

6. subi \$t0, \$t0, 1 // t0 = t0 - 1 = i-1

7. bne \$t0, \$0, loop

8. bne \$t0, \$0, loop

9. F D E M W

10. F D E M W

11. F D E M W

12. F D E M W

13. F D E M W

14. F D E M W

→ Super scalar computer  
VLIN processor.

Date Charlie  
Page No.

### Loop unrolling / Register Renaming

Date Charlie  
Page No.

$fml(i = 1000; i \geq 0; i = i - 1)$   
 $x[i] = x[i-1] + s;$   
 $x[i-1] = x[i-2] + s;$   
 $x[i-3] = x[i-4] + s;$   
 $\vdots$   
 $x[0] = x[-1] + s;$   
 $x[-1] = x[-2] + s;$   
 $x[-3] = x[-4] + s;$   
 $\vdots$   
 $x[-50] = x[-51] + s;$

### Register Renaming

1. add \$t1, \$0, \$1000 //  $t_1 = 0$
2. sub \$t2, \$t0, \$ //  $t_2 = i \times s$ .
3. add \$t3, \$50, \$t2 //  $t_3 = add(x[i])$
4. sub \$t4, \$t3, \$ //  $t_4 = x[i]$
5. sub \$t5, -4(\$t3) //  $t_5 = x[i-1]$
6. sub \$t6, \$t4, \$ //  $t_6 = x[i-2]$
7. sub \$t7, -12(\$t3) //  $t_7 = x[i-3]$
8. sub \$t8, \$t6, \$ //  $t_8 = x[i-4]$
9. add \$t9, \$t5, \$ //  $t_9 = x[i-5]$
10. add \$t10, \$t6, \$ //  $t_{10} = x[i-6]$
11. add \$t11, \$t7, \$ //  $t_{11} = x[i-7]$
12. sub \$t12, \$t8, \$ //  $t_{12} = x[i-8]$
13. sub \$t13, \$t9, -4(\$t3) //  $t_{13} = x[i-9]$
14. sub \$t14, \$t10, -8(\$t3) //  $t_{14} = x[i-10]$
15. sub \$t15, \$t11, -12(\$t3) //  $t_{15} = x[i-11]$
16. sub \$t16, \$t12, -16(\$t3) //  $t_{16} = x[i-12]$
17. done \$t0, \$t1, \$t2, loop.

### Super scalar computer

$fml(i = 1000; i \geq 0; i = i - 1)$   
 $x[i] = x[i] + s;$   
 $x[i] = x[i] + s;$   
 $x[i] = x[i] + s;$   
 $\vdots$   
 $x[0] = x[0] + s;$



Reservation Table for function X.

| -  | 1 | 2 | 3 | 4              | 5              | 6              | 7 | 8              |
|----|---|---|---|----------------|----------------|----------------|---|----------------|
| S1 | X |   |   | I <sub>2</sub> | X              |                | X |                |
| S2 |   | X |   | X              | I <sub>2</sub> | I <sub>2</sub> |   |                |
| S3 |   |   | X |                | X              | I <sub>2</sub> | X | I <sub>2</sub> |

Forbidden latencies.

Stage 1 :- 2, 5, 7.

Stage 2 :- 2.

Stage 3 :- 2, 4.

8, 4, 5, 7.

Collision vector (7 inputs).

|                |                |                |                |                |                |                |
|----------------|----------------|----------------|----------------|----------------|----------------|----------------|
| C <sub>7</sub> | C <sub>6</sub> | C <sub>5</sub> | C <sub>4</sub> | C <sub>3</sub> | C <sub>2</sub> | C <sub>1</sub> |
| 1              | 0              | 1              | 1              | 0              | 1              | 0              |



Simple cycles,

(1, 8), (3, 8), (6, 8), (3, 0),

(3, 3), (6, 6), (6, 3).

Avg. latency :- 5, 6, 7, 5, 3.

Min. Avg latency (MAL) = 3.

00000  
 00001000  
 00011111  
 01010  
 01010  
 Charlie  
 Date \_\_\_\_\_  
 Page No. 0/0

greedy cycles :-  $(3,3)$ ,  $(1,8)$ ,  $(3,2)$ ,  $(3)$ .

Throughput = Instruction / cycle =  $\frac{1}{3}$ .

Reservation Table for function Y.

|    | 1 | 2 | 3 | 4 | 5 | 6 |
|----|---|---|---|---|---|---|
| S1 | Y |   |   | Y |   |   |
| S2 |   | Y |   |   | Y |   |
| S3 |   |   | Y |   |   | Y |

Stage 1  $\rightarrow$  4

Stage 2 = 0

Stage 3 = 0, 4.

|    | 1 | 2 | 3 | 4 | 5 | 6 |
|----|---|---|---|---|---|---|
| S1 | X |   |   | X | X | X |
| S2 |   | X |   | X |   | X |
| S3 |   |   | X |   | X |   |

Collision vector

| A | C6 | C5 | C4 | C3 | C2 | C1 |
|---|----|----|----|----|----|----|
| 0 | 0  | 0  | 1  | 0  | 1  | 0  |
| 1 | 0  | 0  | 1  | 0  | 1  | 0  |



Stage 1 :- 5

Stage 2 :- 3.

Stage 3 :- 0

Stage 4 :- 0

5 := 4

3, 4, 5

Simple cycles :-  $(1,5)$ ,  $(3,3)$ ,  $(5,5)$ .

Throughput =  $\frac{1}{3}$ .

greedy cycle :-  $(3,3)$ ,  $(1,5)$ .



00000  
 00001000  
 00011111  
 01010  
 01010  
 Charlie  
 Date \_\_\_\_\_  
 Page No. 0/0

Decision vector

C5 C4 C3 C2 C1

1 1 1 0 0

F D E M W.  
F D E M W.



01111

Code Reordering and loop unrolling of:

for(i=0; i<1000; i++)

{ A[i] = h+A[i]; }

}

add \$t1, \$0, 1000 // t4=1000.

loop: add \$t2, \$t0, 1000 // t2=i\*4

add \$t3, 0(\$t3) // t3=addr[A[i]].

add \$t4, 0(\$t3) // t4=A[i]

add \$t5, \$t1, \$t4 // t5=h+A[i]

sw \$t5, 0(\$t3) // A[i]=h+A[i]

addi \$t0, \$t0, 1 // i=i+1

bne \$t0, \$t1, loop.

\$t4 → used was hazard.

If I do code reordering then it will solve the problem of scaling.

Code Reordering:

1. addi \$t1, \$0, 1000 // t4=1000.

2. loop: add \$t2, \$t0, \$t1 // t2=i\*4

add \$t3, 0(\$t2) // t3=addr[A[i]]

add \$t4, 0(\$t3) // t4=A[i]

add \$t5, \$t1, \$t4 // t5=h+A[i]

sw \$t5, 0(\$t2) // A[i]=h+A[i]

addi \$t0, \$t0, 1 // i=i+1

lone \$t0, \$t1, loop.

F D E M W.

F D E M W

F D E M W

F D E M W

F D E M W

$$4(\$t3) = 4 + t3.$$

Date — Charlie  
Page No. •

$$\$t0 = \$t1.$$

Date — Charlie  
Page No. •

loop unrolling.

~~for ( i=0 ; i < 1000 ; i = i+4 ) .~~

~~\\$~~  
~~A[i] = h + A[i];~~

~~A[i+1] = h + A[i+1];~~

~~A[i+2] = h + A[i+2];~~

~~A[i+3] = h + A[i+3];~~

~~}~~

~~addi \$t1, \$0, 1000~~

~~lw \$t8, \$t0, 2~~

~~add \$t3, \$t8, \$t2~~

~~lw \$t4, 0(\$t3)~~

~~lw \$t5, 4(\$t3)~~

~~lw \$t6, 8(\$t3)~~

~~lw \$t7, 12(\$t3)~~

~~addi \$t0, \$t0, 4~~

~~add \$t2, \$t1, \$t4~~

~~add \$t9, \$t1, \$t5~~

~~add \$t8, \$t1, \$t6~~

~~add \$t3, \$t1, \$t7~~

~~lw \$t5, 0(\$t3)~~

~~lw \$t3, 12(\$t3)~~

~~bne \$t0, \$t1, loop~~

$$F \quad D \quad E \quad M \quad W.$$