

12/3/18

## UNIT - 5

### code optimization:

- Takes i/p as. o/p of intermediate code.
- Resulting pgm takes less space & less execution time
- construct - group of statements
  - |
  - equivalent efficient

→ The time  
code should  
be more  
than the  
time should  
be code.

### principle source of optimization

#### (1) Inner Loops (Loop variants)

- identify the statements and move point to the outside is called "Loop variants".

- (2) → efficient use of Registers and instructions
- (3) → common sub expressions

$$\text{eg: } A[j+1] = B[j+1]$$

$$P = j + 1$$

$$A[P] = B[P]$$

## (1) For Loops:-

Begin

prod := 0

I := 1

do

begin

Prod = prod + A(I) \* B(I)

I = I + 1

end

while I ≤ 20

end

## Three address code :-

1. Prod := 0 ✓ )

2. I := 1 !

3. T<sub>1</sub> = 4 \* I ~

4. T<sub>2</sub> = addr(A) - 4

5. T<sub>3</sub> = T<sub>2</sub>[T<sub>1</sub>]

6. T<sub>4</sub> = addr(B) - 4

7. T<sub>5</sub> = T<sub>4</sub>[T<sub>1</sub>]

8. T<sub>6</sub> = T<sub>3</sub> \* T<sub>5</sub>

9. Prod = Prod + T<sub>6</sub>

\* each A; occupies  
4 bytes.

Eg → T<sub>2</sub>[0] ↑  
I<sub>2</sub>[4] ↑  
I<sub>2</sub>[8] ↑

→ T<sub>2</sub>[0]  
↓ base ↳ offset

10.  $J = I + 1$

11. IF ( $I \leq 20$ ) GO TO 3.

divide the pgm into basic blocks & construct the flow graph.

In next page.  
Basic blocks is a group of statements, when entered at the beginning can executed in the sequence without halt (on branch)

Q

### Basic block

first Identify the leader

1) first stmt is a leader

group

2) Any stmt target of conditional is a leader.  
(or)  
unconditional

3) The stmt which immediately follows the unconditional is a leader

Block:  $(B_1, B_2)$

→ Start with the leader and upto (or but not including the next leader) is basic block.

Flowgraph (In next page)

→ These basic blocks are the nodes of the graph. (Flowgraph)

$B_1 \rightarrow B_2$

## Basic block:

- A sequence of consecutive stmts which when entered at the begining can be executed in sequence without halt (or any possibility of branch).

## Leader:

- The first stmt is a leader.
- Any stmt which is the target of conditional (or unconditional goto) is a leader.
- Any stmt which immediately follows the unconditional is a leader.

## block: ( $B_1, B_2 \dots$ )

- consists of leaders and all the stmts upto and but no including the next leader

## To construct the flow graph:

- " " " " if there is a con/uncon jump from the last stmt of  $b_1$  to the first stmt of  $b_2$ .

$$b_1 \rightarrow b_2$$

→  $b_2$  imm follows  $b_1$  in the order of the pgm and  $b_1$  doesn't end in an uncond jump..



Code motion:  
 → Identify the states which have same values  
 → Moving the point <sup>into</sup> outside the loop is called  
 "code motion".

motion is



remove I and rewrite the code motion: [which is called as  
Elimination of  
Induction variables]



Reduction in strength:

→ Replace of costlier operation by cheaper  
Operation is called as "Reduction in strength".

DAG (Directed Acyclic Graph)

→ This is also for minimization of code.

13/3/18

DAG: (DiRect Acyclic Graph)

- used for the representation of basic blocks,
- " to identify common sub expression and
- variant of syntax tree      minimize the code

Eg:-  $A = B \text{ op } c$   
      ;  
      ;  
 $x = B \text{ op } c$



$B \text{ op } c$  is also existed for A. So write x on b-side of A.

Q) 1:  $s_1 = 4 * I$

$$s_2 = \text{addr}(A) - 4$$

$$s_3 = s_2[s_1]$$

$$s_4 = 4 * I$$

$$s_5 = \text{addr}(B) - 4$$

$$s_6 = s_5(s_4)$$

$$s_7 = s_3 * s_6$$

$$s_8 = \text{prod} + s_7$$

$$\text{prod} = s_8$$

$$s_9 = I + 1$$

$$I = s_9$$

if  $I \leq 20$  go to (1)

Construct DAG for this

DAG?



In

Advantages of DAG:

- (1) Identify the common sub expressions and minimize the code.
- (2) Gives the list of variables which are used in the Wout. [leaf nodes]
- (3) List of variables which are computed in the block [inner nodes]

→ We are eliminating  $S_4$ ,  $S_8$ , and  $S_9$  from the code. Now rewrite the code.

$$1: S_1 = 4 * I$$

$$S_2 = \text{addr}(A) - 4$$

$$S_3 = S_2 [S_1]$$

$$S_5 = \text{addr}(B) - 4$$

$$S_6 = S_5 [S_1]$$

$$S_7 = S_3 * S_6$$

$$\text{prod} = \text{prod} + 7$$

$$I = I + 1$$

if  $I \leq 20$  go to 1

Data Flow Analysis :- (U & Chaining)

↓      ↓  
use definition

Reaching definition:

→ If a def reaches a particular point in a pgm then we can replace that ~~definition~~ variable with its value of the ~~definition~~ in a pgm and removed the ~~definition~~ is called "reaching definition".

Eg:- ~~A := 3~~      [ A := 3 is definition ]  
          |

$x = A + 1$       } Here A will be replaced  
 $y = A + 5$       } with 3 and we remove  
                        A which is ~~the~~ first stat.

IN(B) - set of all definitions reaching a point just before the first stmt of Block B.

OUT(B) - set of definitions reaching the point just after the last stmt of Block B.

GEN(B) - is a set of generated definitions within Block B that reach the end of the Block B.

KILL(B) - is a set of Definitions outside of Block B that define identifiers which have definitions within Block B.

Calculations:  $\text{[data Flow Equation]} \cap \text{[control Flow Eq's]}$

$$\text{IN}(B) = \bigcup_{p \in \text{predecessors}} \text{OUT}(p)$$

$$\text{OUT}(B) = \text{IN}(B) - \text{KILL}(B) \cup \text{GEN}(B)$$

→ These two equations are called as data flow equation.



$\rightarrow d_1 \rightarrow$  definition 1

$\rightarrow$  Represent definitions with Bit vector:

$d_1, d_2, d_3, d_4, d_5$

| Block | The definition<br>in last block | Gen(B)<br>bit vector | more $d_i$ and $d_k$ values 1 and 0 for<br>remaining | KILL(B)<br>The definitions that are<br>affected by another block wh<br>bit vector |
|-------|---------------------------------|----------------------|------------------------------------------------------|-----------------------------------------------------------------------------------|
| B1    | { $d_1, d_2$ }                  | 11000                | { $d_3, d_4, d_5$ }                                  | '00111                                                                            |
| B2    | { $d_3$ }                       | 00100                | { $d_2$ }                                            | 10000                                                                             |
| B3    | { $d_4$ }                       | 00010                | { $d_2, d_5$ }                                       | 01001                                                                             |
| B4    | { $d_5$ }                       | 00001                | { $d_3, d_4$ }                                       | 01010                                                                             |
| B5    | { $d_3$ }                       | 00000                | { $d_3$ }                                            | 00000                                                                             |

1/3/11

It is not easy to calculate the  $IN(B)$  and  $OUT(B)$  easily for blocks, so that for first parts we consider  $IN(B)$  as NULL set.

$OUT(B)$  as  $IN(B)$

def that are generated by

Port(1) input

Port(2) using formula  $B_1 \cup B_2$

| Block | $IN(B)$ | $OUT(B)$ | $IN(B)$ | $OUT(B)$ |
|-------|---------|----------|---------|----------|
| B1    | 00000   | 11000    | 00100   | 11000    |
| B2    | 00000   | 00100    | 11000   | 01100    |
| B3    | 00000   | 00010    | 01100   | 00110    |
| B4    | 00000   | 00001    | 00110   | 00101    |
| B5    | 00000   | 00000    | 00111   | 00111    |

$$\begin{aligned}
 OUT(B) &= 00100 - (00111) \cup (11000) \\
 &= 00100 \wedge (1(00111)) \cup (11000)
 \end{aligned}$$

$$\begin{aligned}
 B_2 & 11000 - 10000 \cup \\
 & 00100
 \end{aligned}$$

$IN(B_1) \Rightarrow$  same  
 $IN(B_2)$

↓  
The blocks which are coming to  $B_2$  is taken  $B_1, B_5$   
Perform one

$$\begin{array}{r}
 00100 \\
 11000 \\
 \hline
 00100
 \end{array} \quad \text{Perform AND}$$

$$\begin{array}{r}
 00100 \\
 11000 \\
 \hline
 11000
 \end{array} \quad \text{Perform OR}$$

11000

$$\begin{array}{r}
 11000 \\
 01111 \\
 \hline
 01000
 \end{array}$$

$$\begin{array}{r}
 01000 \\
 00100 \\
 \hline
 01100
 \end{array}$$

01100

$$OUT(B_3) = 01100 - 01001 \cup 00010$$

$$\begin{array}{r}
 01100 \\
 10110 \\
 \hline
 00100 \\
 00010 \\
 \boxed{00110}
 \end{array}$$

→ The pass (1) and pass (2) values for  $IN(B)$  and  $OUT(B)$  are not equal. So again do pass (3).

|    | <u>Pass (3)</u> |          |
|----|-----------------|----------|
|    | $IN(B)$         | $OUT(B)$ |
| B1 | 01100           | 11000    |
| B2 | 11111           | 01111    |
| B3 | 01111           | 00110    |
| B4 | 00110           | 00101    |
| B5 | 00111           | 00111    |

$IN(B_2)$  is  
 B1 and B5  
 Perform OR b/w  
 B1 and B5  
 $B_1 = 11000$  (new value @ current value)  
 $B_5 = 00111$  (not yet find so take value in previous pass)  
 OR 11111

$$OUT(B_1) = 01100 - 00111 \cup 11000$$

$$\begin{array}{r}
 01100 \\
 11000 \\
 \hline
 01000
 \end{array}
 \quad
 \begin{array}{r}
 01000 \\
 11000 \\
 \hline
 11100
 \end{array}$$

~~pass(4)~~ Here pass(2) and pass(3) are not same. So  
do one more pass.

| pass(4) | JNC(B) | OUT(B) |
|---------|--------|--------|
| B1      | 01111  | 11000  |
| B2      | 11111  | 01111  |
| B3      | 01111  | 00110  |
| B4      | 00110  | 00101  |
| B5      | 00111  | 00111  |

Pass 4 and  
pass 5 has  
same values.  
So, stop the  
table.

| pass(5) | JNC(B) | OUT(B) |
|---------|--------|--------|
| B1      | 01111  | 11000  |
| B2      | 11111  | 01111  |
| B3      | 01111  | 00110  |
| B4      | 00110  | 00101  |
| B5      | 00111  | 00111  |

20/3/18

## Applications of Ud chaining (on Reaching def.)

1. Minimize the code by using definitions

$A := 3$  by replace A with 3 in all block

2. Undefined variables.

Eg:-  $A = 1$  (dummy def)



$x = A + 1$  (when  $A = 1$  which is  
in dummy def reaches  
 $x = A + 1$ , then this is  
known as undefined  
variables).

(If you have some other  
value for A in between  
the terms, then the  
dummy def will be discarded)

Algorithm for Ud chaining (on data flow)

For each block B do

begin

$$IN(B) = \emptyset$$

$$OUT(B) = GEN(B)$$

end

change = true

while change do

```

begin
  for each block B do
    → change = false
begin
  NEWIN = ∅ out(p)
  if NEWIN ≠ IN(B) do
begin
  change = true;
  IN(B) = NEWIN()
  OUT(B) = IN(B) - KILL(B). UGEN(B)
end
end
end

```

### Live Variable Analysis:-

if x at pt p<sub>.</sub> is live at pt p



If the variable at point p is used further in the example then we can say that var x is live at pt p otherwise dead at pt p  
↑ naturally

USE - Set of variables used before their definitions.

definition - set of variables defined before their use.

$$OUT(B) = \cup IN(s)$$

$$IN(B) = USE(B) \cup (OUT(B) - \text{Def}_e)$$



a)



b) Identify the Reaching definitions



22/3/18

## \* \* imp question Code Generation Algorithm:

I/p is three address code,

O/p is Assembly Language.

→ The <sup>choosing the instruction, Registers and order of evaluation is</sup> imp for this code generation, is

(1) Instruction set

⇒ LOAD B

}

ADD 1

STORE B

AOS B

(2) Registers

(3) Order of Evaluation.

↓  
Same order.

⇒ R<sub>1</sub> R<sub>3</sub>

8bit new

Reg pair / Reg

R<sub>0</sub>, R<sub>1</sub>, R<sub>2</sub>, R<sub>3</sub> → quotient  
↓      ↓      ↓      ↓  
Remainder    Quotient    Remainder

⇒ Even Register [Remainder  
is even  
Register]

⇒ Odd Register [contains  
quotient]

$$a) w = (A - B) + (A - C) + (A - C)$$

$$T = A - B$$

$$U = A - C$$

$$V = T + U$$

$$W = V + U$$

$A = B \oplus C$

MOV  $B, R_0$

$\oplus \leftarrow / R_0$

$\Rightarrow R_0 \in R_0 + C$

$$T = A - B$$

MOV  $A, R_0$

SUB  $B, R_0$

$R_0 \in A - B$

$$U = A - C$$

MOV  $A, R_1$

SUB  $C, R_1$

$R_1 \in A - C$

$U \text{ is in } R_1$

$$V = T + U$$

ADD  $R_1, R_0$

$R_0 = R_0 + R_1$

$V \text{ is in } R_0$

$$W = V + U$$

ADD  $R_1, R_0$

$\hookrightarrow$  Here we are not moving  $T$  &  $U$

because those two variables are already exist in  $R_1, R_0$  [which is concept of live variable].



$\rightarrow$  In this we are using two functions. They are :-

(1) Register descriptor.

(2) Address descriptor.

Here we are not using  $T$  for further uses

(or next uses, so we are not moving it).

(1) Register descriptor

~~~~~

→ A Register descriptor keeps track of what is currently in each register.

(2) Address descriptor:-

→ Keeps track of the location where the current value of the name can be found.

Algorithm for identifying the location of Register

(00) GetReg( )

Q)  $A := B \text{ op } c$ .

The value of this is represented by L

GetReg( )

(1) If the value of B is in a register and B holds no other names and B is not live and has no next use after the execution of B op c then use same register to store A. Adjust the address descriptor indicating that B is no longer in particular register R.

(2) Failing (1) Return an empty register.

(3) Failing (2) Identify a suitable register move

the value of register to a memory location and return the Register.

- (1) If no suitable register is found, then return A to store the result.

Code Generation Algorithm:

- (1) Call the function GetReg to identify the Location L where the result  $A = B \text{ op } C$  could be stored.

L is Register or memory location.

- (2) Consult the address descriptor for B to determine  $B'$ , prefer Reg for  $B'$ .

If  $B'$  is different from L then generate  $\text{MOV } B', L$

- (3) Then Generate  $\text{OP } C', L$  where  $C'$  is the current location of C, prefer Reg for  $C'$  if present in Reg and mov. Adjust the addr descriptor indicating that it is in L'.

- (4) If B and C have no next use then adjust the Reg descriptor correspondingly

$$Q) W = (A - B) + (A - C) + (A - C)$$

|           |                                     | Reg descriptor<br>R <sub>0</sub> contains A | Addr descriptor<br>T is in R <sub>0</sub> |
|-----------|-------------------------------------|---------------------------------------------|-------------------------------------------|
| T = A - B | MOV A, R <sub>0</sub>               |                                             |                                           |
|           | SUB B, R <sub>0</sub>               | R <sub>0</sub> Contains T                   |                                           |
| U = A - C | MOV A, R <sub>1</sub>               | R <sub>1</sub> contains A                   |                                           |
|           | SUB C, R <sub>1</sub>               | R <sub>1</sub> contains U                   | U is in R <sub>1</sub>                    |
| V = T + U | Add R <sub>1</sub> , R <sub>0</sub> | R <sub>0</sub> Contains V                   | V is in R <sub>0</sub>                    |
| W = V + U | Add R <sub>1</sub> , R <sub>0</sub> | R <sub>0</sub> Contains W                   | W is in R <sub>0</sub>                    |
|           | MOV R <sub>0</sub> , W              |                                             |                                           |

24/3/18

Register Assignment Global Reg Allocation

→ Next page.

~~Answer to C~~



$\text{use}(a, B)$  - No. of times  $a$  is used in  $B$  before its def

$\text{live}(a, B)$  -  
 Live on exist from the block and  
 defined in the block  
 otherwise it is "0".

$$\text{Usage Count} = \text{use}(a, B) + 2 * \text{live}(a, B)$$

| Variable | B1  | Live | B2  | Live | B3  | Live | B4  | Live |
|----------|-----|------|-----|------|-----|------|-----|------|
|          | use | Live | use | Live | use | Live | use | Live |
| a        | 0   | 1    | 1   | 0    | 1   | 0    | 0   | 0    |
| b        | 0   | 0    | 0   | 0    | 0   | 1    | 0   | 1    |
| c        | 1   | 0    | 0   | 0    | 1   | 0    | 1   | 0    |
| d        | 1   | 1    | 1   | 0    | 1   | 0    | 1   | 0    |
| e        | 0   | 0    | 0   | 0    | 0   | 1    | 0   | 0    |
| f        | 1   | 0    | 0   | 1    | 1   | 0    | 0   | 0    |

last row

|   | Usage count |
|---|-------------|
| a | 4           |
| b | 6           |
| c | 3           |
| d | 6           |
| e | 4           |
| f | 4           |

$R_0, R_1, R_2, R_3$

→ If you have only 3 registers then

Store max of 3 usage count for first three  
 registers  $(R_0, R_1, R_2)$  and  $R_3$  is for temporary storage.

$a \quad b \quad c \quad d \quad e \quad f$   
 $4 \quad 6 \quad 3 \quad 6 \quad 4 \quad 4$   
 $\underline{-}$

$R_0, R_1, R_2 \quad R_3$   
 $a = R_0$

$b = R_1$

$d = R_2$

code is

$MOV \quad b, R_1$   
 $MOV \quad d, R_2$

*b<sub>def</sub>*  
 We are moving only  
 b and d to R<sub>1</sub> and R<sub>2</sub>  
 but not a bcoz  
 a is not live on it.

$MOV \quad R_1, R_0$   
 $ADD \quad C, R_0$   
 $SUB \quad R_1, R_2$   
 $MOV \quad R_0, R_3$   
 $ADD \quad F, R_3$   
 $MOV \quad R_3, E$

$MOV \quad R_0, R_3$   
 $SUB \quad R_2, R_3$   
 $MOV \quad R_3, F$

$MOV \quad R_2, R_1$   
 $ADD \quad F, R_1$   
 $MOV \quad R_0, R_3$   
 $SUB \quad C, R_3$   
 $MOV \quad R_3, E$

$MOV \quad R_2, R_1$   
 $ADD \quad C, R_1$

*order*  
 $MOV \quad R_1, b$   
 $MOV \quad R_2, d$

*b<sub>def</sub>*  
 This involves  
 more variables  
 are live in  
 not block  
 and are used  
 from previous

$MOV \quad R_0, b$   
 $MOV \quad R_2, d$

## Peep hole optimization:

- Optimization is performed by looking small piece of code.
- There are 4 techniques. They are:

### (1) Redundant instruction elimination:

{ MOV R<sub>0</sub>, a ⇒ copy stmt which moves  
R<sub>0</sub> → a.  
MOV a, R<sub>0</sub>

In this case you eliminate, mov R<sub>0</sub>, a

→ If you have Label for the instruction  
then don't eliminate it.

MOV R<sub>0</sub>, a  
L<sub>1</sub>: MOV a, R<sub>0</sub>

Goto L<sub>1</sub>

## (2) Control flow optimization:

(1) Any unlabelledstmt after an unconditional GOTO may be removed.

(2) (dead code elimination)  $\Rightarrow$  The stmts will never be executed.

Ex: if debug == 1 go to L1  
      go to L2

L1: print -----

L2:

After changing:

if debug != 1 go to L2

print: -----

L2:

Ex: go to L1

L1: Go to L2

After changing:

go to L2

L2: Go to L2

Ex: L1: if a < b  $\xrightarrow{\text{Goto L2}}$  go to L2

L3:

A variable is live at a point in a pgm if its value can be used subsequently in the pgm otherwise it is dead at that point

Here we are not removing L2 bcoz there may be a chance of other instruction for L2:

for L2:

after changing:

if  $a < b$  go to L2

go to L3

L3: —

### (3) Algebraic Simplification:

$$x = x + 0$$

$$x = x * 1$$

use of machine idioms

$$\begin{aligned} x = &x + 1 \\ x = &x - 1 \end{aligned} \quad \left. \begin{array}{l} \text{Auto increment} \\ \text{Auto decrement} \end{array} \right.$$

### (4) Common Sub Expression Elimination:

$$x = \underline{A + B + C}$$

$$y = \underline{A + B + D}$$

$$T_1 = A + B$$

$$x = T_1 + C$$

$$y := T_1 + D$$

(5) Copy Propagation: [These 5 and 6 techniques are also come under peephole optimization]  
→ same as sub expression little bit changes.



(6) constant folding:

→ Evaluate the constant expressions at compile time and replace constant expression by their value.

Eg:  $x = 2 * 3 + 4$

$$x = 6 + 28$$

## Available expression:

An expression  $x+y$  is available at point  $p$  if every path from entry node to  $p$  evaluates  $x+y$  and after the last evaluation for ~~before~~ reaching  $p$ , there are no subsequent assignment to  $x$  or  $y$ .

$$\text{Ex: } a = b + c \quad \emptyset$$

$$b = a - d$$

$$c = b + c$$

$$d = a - d$$

$$b + c$$

$$a - d$$

$$\emptyset$$