

## Backpatching

- It is the activity of filling up unspecified information of labels using appropriate semantics actions in during the code generation process.

e.g.  $x < 100 \text{ || } y > 200 \text{ && } x \neq y$

we know what is logical OR, if any one of the expression is true then the entire result is true

whereas logical AND means if all the expressions are true then only the result is true otherwise the result is false.

Let  $x < 100$  subsitute at 100.

100: if  $x < 100$  then blank specifies there is some label here if  $x < 100$  then entire result is true but we don't know the label for true result.

→ if  $x < 100$  is false then we have to evaluate  $y > 200$  but we don't know what is the address of this instruction.

let 101 be the address of next instruction.

101: goto \_\_\_\_\_

102: if  $y > 200$  then \_\_\_\_\_

103: goto \_\_\_\_\_

104: if  $x \neq y$  then \_\_\_\_\_

105: goto \_\_\_\_\_

106: true

107: false

- In the process of intermediate code generation we may not know all the labels in the first pass so in order to overcome this problem we need second pass, so that 2nd pass is nothing but backpatching.
- Backpatching is the process of filling these missing labels, so these missing labels are filled with the help of backpatching.
- So let us fill the labels in the 2nd pass.
- So in the first pass we don't know what are the labels here.

- In order to fill these labels we use backpatching.

2nd pass

100 : if  $x < 100$  then 106

101 : goto 102

102 : if  $y > 200$  then 104

103 : goto 107

104 : if  $x = y$  then 106

105 : goto 107

106 : true

107 : false

→ let us write the translation rule.

$B \rightarrow B_1 \parallel M B_2$

logical OR operator  
it has 2 arguments  
B - boolean expression  
 $B_1 \& B_2$

Backpatch ( $B_1$ .falselist,  $M$ .inv),  $M$  - Marker <sup>NT</sup> it is a NT  
 $B$ .truelist = merge ( $B_1$ .truelist,  $B_2$ .truelist), it can be epsilon, so  
 $B$ .falselist =  $B_2$ .falselist. epsilon means it is nothing

M gives address of  $B_2$  instr.  
address of  $B_2$  is specified  
with help of  $M$  here.

$\rightarrow$  truelist contains list of  
addresses of true st<sup>m</sup>

$\rightarrow$   $B$ .falselist contains list of  
addresses of false st<sup>m</sup>.

$B \rightarrow B_1 \& M B_2$

{

Backpatch ( $B_1$ .truelist,  $M$ .instruction),

$B$ .truelist =  $B_2$ .truelist,

$B$ .falselist = merge ( $B_1$ .falselist,  $B_2$ .falselist),

3.

(3)  $B \rightarrow ! B_1$ 

1.  $B.\text{trueList} = B_1.\text{falseList};$   
 $B.\text{falseList} = B_1.\text{trueList};$

3.

(4)  $B \rightarrow (B_1)$ 

1.  $B.\text{tl} = B_1.\text{tl};$   
 $B.\text{fl} = B_1.\text{tl};$

3.

(5)  $M \rightarrow E$ 

2.  $m.\text{instr} = \text{nextinstr};$

3.

Annotated parse tree

## Basic Blocks

- The basic block is a sequence of consecutive statements in which flow of control enters at the beginning and leaves at the end without halt or possibility of branching.

e.g:-  $t_1 := a * 5$

$t_2 := t_1 + 7$

$t_3 := t_2 - 5$

$t_4 := t_1 + t_3$

$t_5 := t_2 + b$

- A basic block is a collection of statements which are always executed in a sequential manner i.e nothing but one one one.
- Let us assume that we have 11 statements:

1.  $PROD = 0$

2.  $I = 1$  (initialization of loop counter variable)

3.  $T_2 = \text{addr}(A) - 4$  (load operation)

4.  $T_4 = \text{addr}(B) - 4$  (load operation)

leader 5.  $T_1 = 4 * I$

6.  $T_3 = T_2[T_1]$

leader 7.  $PROD = PROD + T_3$

8.  $I = I + 1$

9. If  $I \leq 20$  goto (5)

leader 10.  $j = j + 1$

11.  $K = K + 1$

- Any basic block mainly contains one entry point and one exit point. So let us assume that this is a block.

↑ entry point



- So first the control will be at the 1st Stmt. i.e Stmt to be executed.
- Then the control will be at the 2nd Stmt.
- Likewise all the stmts will be executed.

- Once all the stmt in a block are executed then the control exits from the block.
  - Any basic block contains one entry point as well as one exit point.
  - \*- Basic block should not contain conditional control statements as well as unconditional control statements in the middle of the block.  
 conditional control stmt  $\rightarrow$  simple if, if-else-etc,  
 else ladder, switch, while, do-while.
  - unconditional control stmt  $\rightarrow$  break, continue, exit etc.
- $\rightarrow$  We can have conditional control stmt as the first statement of the block or as the last stmt of the blk.
- But we should not have conditional control stmt and unconditional control stmt in the middle of the block. ie intermediate level of the block.

Eg:  $x = a + b + c$ .  $\rightarrow$  This is not three address code.  
 convert it to TAC inst?

|                 |                                                                                           |
|-----------------|-------------------------------------------------------------------------------------------|
| $t_1 = a + b$   |                                                                                           |
| $t_2 = t_1 + c$ | $\Rightarrow$ is a basic blk.                                                             |
| $x = t_2$       | $\Rightarrow$ if we place these stmt in basic blk all<br>stmt will be executed one by one |

\* Purpose of forming basic block is for register allocation and for optimization.

## Algorithm for partitioning 3 Address code into Basic block

### Rule 1) Determining the leaders.

- a) First  $\text{stmt}$  of a block is always treated as a leader.
- b) The target of conditional (or) unconditional  $\text{jump}$  <sup>stmt</sup>  $\text{addr}$  is a leader.
- c) The statement following leader conditional or unconditional jump is a leader.

### Rule 2) Determining the basic blocks :- starts at leader & ends before the next leader.

$$1. \text{ PROD} = 0 \quad - \text{leader}.$$

$$2. I = 1$$

$$3. T_2 = \text{addr}(A) - 4$$

$$4. T_4 = \text{addr}(B) - 4$$

$$5. T_1 = 4 * i \quad - \text{leader}.$$

$$6. T_3 = T_2 [T_1]$$

$$7. \text{PROD} = \text{PROD} + T_3 \quad - \text{leader}.$$

$$8. I = I + 1$$

$$9. \text{If } I <= 20 \text{ goto } (5) \quad \rightarrow \text{conditional jump target}$$

$$10. j = j + 1 \quad - \text{leader}.$$

$$11. K = K + 1.$$

$$12. \text{If } j \leq 5 \text{ goto } (*) \quad - \text{leader}.$$

$$13. i = i + j \quad - \text{leader. rule 1(c).}$$

$$\boxed{\text{PROD} = 0}$$

$$I = 1$$

$$T_2 = \text{addr}(A) - 4$$

$$T_4 = \text{addr}(B) - 4.$$

B1

leaders  $\text{stmt}$

1

.5

7

10

13

$$\boxed{T_1 = 4 * i}$$

$$\boxed{T_3 = T_2 [T_1]}$$

B2

$\text{PROD} = \text{PROD} + T_3$

$I = I + 1$

If  $I \leq 20$  goto  $B_2$

$B_3$

$j = j + 1$

$K = K + 1$

If  $j \leq 5$  goto  $B_3$

$B_4$

$i = i + j$

$B_5$

$$C = 3029 \cdot 1$$

$$I = 4 \cdot 18$$

$$P - (A) \cdot EDD = ST \cdot C$$

$$P - (9) \cdot EDD = ST \cdot 18$$

$$18 \cdot P = ST \cdot C$$

$$(ST) \cdot ST = ST \cdot 18$$

$$ST \cdot 3029 = 3 \cdot 18 \cdot 1$$

$$ST \cdot I = I \cdot 18$$

$$(I) \cdot ST = I \cdot 18 \cdot 1$$

$$I \cdot ST = I \cdot 18$$

$$(I) \cdot ST = I \cdot 18 \cdot 1$$

$$(I) = I \cdot 18$$

$$I = 3029 \cdot 1$$

$$P = 3029 \cdot 18 \cdot 1$$

$$P = 54462 \cdot 1$$

### Flow Graph

- A flow graph is a directed graph in which the flow control information is added to the graph or basic blocks. directed graph means there are some directions from one node to another.
- A flow graph is a collection of nodes where we can have some edges from one node to another node.
- Here the basic blocks becomes the nodes of flow graph.
- So if we take previous example we have 5 nodes i.e.  $B_1, B_2, B_3, B_4, B_5$ .
- The edges will specify the flow of information from one block to another block.
- If there is any reference from  $B_1$  to  $B_2$  then we can say that  $B_2$  block immediately follows  $B_1$ .
- If there is a reference  $B_5$  to  $B_4$  then we can say that  $B_4$  immediately follows  $B_5$ .
- In previous eg  $B_1$  is called initial block because it contains the first leadin.



Loop

- Loop is a collection of nodes in the flow graph such that
- i) All such nodes are strongly connected. That means always there is a path from any node to any other node within that loop.
- ii) The collection of nodes has unique entry. That means there is only one path from a node outside the loop to the node inside the loop.
- iii) The loop that contains no other loop is called inner loop.

Begin

prod := 0

i = 1;

do begin

prod := prod + a[i] \* b[i];

i = i + 1;

end

while i &lt;= 20

end .

Three address code is

- B1 { 1. prod := 0                          1, 3  
   2. i := 1  
   3. t1 := 4 \* i  
   4. t2 := a[t1]  
   5. t3 := 4 \* i  
   6. t4 := b[t3]  
   7. t5 := t2 \* t4  
   8. t6 := prod + t5  
   9. prod := t6.  
 10. t7 := i + 1  
 11. i := t7  
 12. if i <= 20 goto (3)
- . . . . .

B<sub>1</sub>: 1. prod := 0  
 2. i := 1  
 3. t<sub>1</sub> := 4\*i  
 4. t<sub>2</sub> := a[t<sub>1</sub>]  
 5. t<sub>3</sub> := 4\*i  
 6. t<sub>4</sub> := b[t<sub>3</sub>]  
 7. t<sub>5</sub> := t<sub>2</sub> \* t<sub>4</sub>  
 8. t<sub>6</sub> := prod + t<sub>5</sub>  
 9. prod := t<sub>6</sub>.  
 10. t<sub>7</sub> := i + 1  
 11. i := t<sub>7</sub>  
 12. if i <= 20 goto (3).

B<sub>2</sub>

B<sub>1</sub> is the predecessor of B<sub>2</sub> and B<sub>2</sub> is the successor of B<sub>1</sub>.

Loop (Example)

## Transformations on Basic blocks

- Basic block computes set of expressions
- Values of the variables outside the block is decided by the computation inside the block.
- Two basic blocks are equivalent if they compute the same set of expressions.

## Equivalence of Basic blocks

$$\begin{array}{|l|l|} \hline b = 0 & \\ \hline t_1 = a + b & \\ \hline t_2 = c * t_1 & \\ \hline a = t_2 & \\ \hline \end{array}$$

↓

$$\begin{array}{l} a = c * a \\ b = 0 \end{array}$$

$$\begin{array}{|l|l|} \hline a = c * a & \\ \hline b = 0 & \\ \hline \end{array}$$

↓

$$\begin{array}{l} a = c * a \\ b = 0 \end{array}$$

### \* Next-use Information

- Next-use information is a collection of all the names that are useful for next subsequent statement in a block. The use of a name is defined as follows:- consider a statement

$x := i$

$j := x \text{ op } y$

that means the statement  $j$  uses value of  $x$ .

- The next-use information can be collected by making the backward scan of the programming code in that specific block.

- Suppose the 3-address statement is as given below.

L:  $a := b \text{ op } c$  then the steps in backward scan are -

- i) The currently found information in symbol table regarding the next use and liveness of  $a$ ,  $b$  and  $c$  is annotated with the statement.
- ii) In the symbol table, set ' $a$ ' to "not live" and "no next-use".
- iii) Set ' $b$ ' and ' $c$ ' to live and next uses of  $b$  and  $c$  in symbol table.

### \* Register Allocation and Assignment

- The use of register operands instead of memory operands is always the faster and shorter. Proper use of registers help in generating the good code.
- Certain strategies are adopted by the compiler for register allocation and assignment.

- The most commonly used strategy to register allocation & assignment is to assign specific values to specific registers. For instance for base addresses separate set of registers can be used. Similarly for storing the stack pointers again a separate set of registers can be assigned; arithmetic computations can be done using separate set of registers.

Adv: - the design process of compiler for code generation becomes simplified.

disadvantages - the design of compiler becomes complicated because of restrictive use of registers.

### Strategies used in register allocation & Assignment

(1) Global register allocation.

(2) Usage count

(3) Register assignment for outer loop.

(4) Graph coloring for register assignment

#### 1) Global register allocation

- while generating the code the registers are used to hold the values for the duration of single block.
- All the live variables are stored at the end of each block.
- for the variables that are used consistently we can allocate specific set of registers.
- Allocation of variables to specific registers that is consistent across the block boundaries is called global register allocation.

→ Strategies adopted while doing the global register allocation

- \* It stores the most frequently used variables in fixed registers throughout the loop.
- \* Another strategy is to assign some fixed number of global registers to hold the most active values in each inner loop.
- \* The registers not already allocated may be used to hold values local to one block.
- \* In certain languages programmer can do the register allocation by using register declaration.

#### 2) Usage Count

- The usage count is the count for the use of some variable  $x$  in some register used in any basic block.
- The usage count gives the idea about how many

units of cost can be saved by selecting a specific variable for global register allocation.

- The approximate formula is given as,

$$\sum_{\text{block } B \in L} (\text{use}(x, B) + 2 * \text{live}(x, B))$$

(Where  $\text{use}(x, B)$  = no of times  $x$  is used in block  $B$  prior to any definition of  $x$ )

$\text{live}(x, B) = 1$  if  $x$  is live on exit from  $B$ ; otherwise  $\text{live}(x) = 0$

### 3) Register Assignment for Outer loop

- Consider that there are 2 loops.  $L_1$  is outer loop and  $L_2$  is an inner loop.
- Allocation of variable  $a$  is to be done to some register.
- Consider the scenario.



Nested loop

⇒ following criteria should be adopted for register assignment for outer loop.

- 1) If  $a$  is allocated in Loop  $L_2$  then it should not be allocated in  $L_1 - L_2$ .
- 2) If  $a$  is allocated in  $L_1$  and it is not allocated in  $L_2$  then store  $a$  on a entrance of  $L_2$  and load  $a$  while leaving  $L_2$ .
- 3) If  $a$  is allocated in  $L_2$  and not in  $L_1$  then load  $a$  on entrance of  $L_2$  and store  $a$  on exit from  $L_2$ .

## → Directed Acyclic Graph

### DAG Representation of Basic blocks:

- DAG represents the structure of a basic block.
- 1) In a DAG, internal node represents operators.
- 2) Leaf node represents <sup>(names of variables)</sup> identifiers, constants <sup>(values)</sup>.
- 3) Internal node also represents result of expressions.

### Applications of DAG: → If expression is repeated several times.

- 1) Determining the common sub expressions.
- 2) Determining which names are used inside the block and computed outside the block.
- 3) Determining which statements of the block could have their computed value outside the block.
- 4) Simplifying the list of Quaduples by eliminating common sub expressions.

Eg' let us consider our block contains only one Stmt.  
 $t = a + b$       internal node also represents result of expression



good b/t & result

### Construction of DAG for a basic block.

Eg' construct DAG for the expression

$a + a * (b - c) + (b - c) * d$ .      right child      parenthesis is having higher precedence so (b-c) will be evaluated first



Eg: Construct DAG for the block.

$$\textcircled{1} \quad a = b + c \quad \textcircled{3} \quad c = b + c$$

$$\textcircled{2} \quad b = a - d \quad \textcircled{4} \quad d = a - b$$



Since  $a-d$  is also represented and stored in  $b$  it is  $b, d$ .

$$(+a+b) - (+a-b) = b$$

Eg: Construct DAG for the following.

$$a = b + c$$

$$b = b - d$$

$$c = c + d$$

$$e = b + c$$



$$d = b * c$$

$$e = a + b$$

$$b = b * c$$

$$a = e - d$$



$b$  value doesn't change  
hence can be written as  $d, b$ .

Unary minus operator have higher priority than \* and +.  
Page No. 20

e.g:

$$a = b * -c + b * -c$$



e.g:

$$a = (a * b + c) - (a * b + c)$$



Ex:  
 Consider:  $\text{sum} = 0;$

$\text{for } (i=0; i \leq 10; i++)$

$\text{sum} = \text{sum} + a[i];$

The 3-address code is:

- 1)  $\text{sum} = 0$       a)  $i = 0$
- 2)  $t_1 = 4 * i$       b)  $t_2 = a[t_1]$
- 3)  $t_3 = \text{sum} + t_2$       c)  $\text{sum} = t_3$
- 4)  $t_4 = i + 1$       d)  $i = t_4$
- 5) if  $i \leq 10$  goto(3)

Basic blocks are:



B1



B2

let us consider block B2 for construction of DAG.



DAG for block B2.

## Peephole optimization

- instead of examining entire program we examine short sequence & if there is some portion L/H replace by short sequence of instructions or fastest one.

- This technique is applied to improve the performance of program by examining a short sequence of instructions in a window (Peephole) & replace the instructions by a faster or short sequence of instructions techniques.

### Characteristics of Peephole optimization

- (1) Redundant instruction elimination
- (2) Flow of control optimization
- (3) Algebraic simplification
- (4) Use of machine idioms
- (5) Removal of unreachable code

#### 1) Redundant instruction elimination

e.g. Consider the instructions:

$\text{MOV R0, A}$  } value is moved to R0 - R0 contains a value  
 $\text{MOV R0, A}$  } value is moved to R0 - R0 contains a value  
 $\text{MOV A, R0}$  } R0 contains a value

we can eliminate 2nd instn since 'A' is already in R0.

Ex:

#### 2) Removal of unreachable code

- eliminate the statement which are unreachable i.e never executed.

e.g.  $i = 0;$  }  $i = 0;$  throughout the program i value  
 $\{ \text{if } (i == 1)$  }  $i = 0;$  should be 0 but what is  
 $\{ \quad \{ \text{sum} = 0;$  } the condtn i value is 1  
 $\}$

can eliminate  
this block

so after optimization code is  $i = 0;$

e.g.  $\text{int add (int a, int b)}$

{  $\text{int c} = a + b;$

$\text{return c;}$

}  $\text{printf (" %d ", c);}$  → eliminate :

↓ is never executed.

- 3) Flow-of-control optimizations - using peephole optimizations.  
 - unnecessary jumps can be eliminated.

eg: goto L1

L1: goto L2

L2: goto L3

L3: ~~Mov a, R0~~

Multiple jumps can make the code inefficient. Above code can be replaced by

goto L3

L1: ~~goto L3~~

L2: ~~goto L3~~

L3: ~~goto Mov a, R0~~

#### 4) Algebraic Simplifications

eg:  $x = x + 0$  or  $(x = x * 1)$   $\Rightarrow$  is nothing but  $x$  only. result of  $x$  won't change by executing these statements.

The above statements can be eliminated because by executing those statements, the result ( $x$ ) won't change.

$\Rightarrow$  power operator

$a = x^2$  replaced with  $a = x * x \Rightarrow$  replace expensive operator with cheapest operator.

eg:  $b = y \gg 3$  " "  $b = y \gg 3 \Rightarrow$  divide by 2.

$\Rightarrow$  expensive operator

#### 5) Use of Machine Idioms

- It is the process of using powerful features of CPU instructions.

eg: auto inc/dec features can be used to inc/dec variable.

eg:  $a = a + 1$ ;  $y = inc a;$

$a = a - 1$ ;  $y = dec a;$

- \* A Simple code Generator
- Code generation is the last phase of compiler.
- It generates target code for a sequence of what?
- It uses a function getReg() to assign registers to variables.
- It uses 2 data structures : ① Register descriptor  
② Address descriptor

Register descriptor :- used to keep track of which variable is stored in a register. Initially all registers are empty.

Address descriptor :- used to keep track of location where variable is stored. Location may be register, memory address, stack ...

### A Simple code generator Algorithm

- The following actions are performed by code generator for an instruction  $x = y \text{ op } z$ . Assumes that L is the location where the output of  $y \text{ op } z$  is to be saved.
- 1) Call function getReg() to get the location of L.
  - 2) Determine the present location of 'y' by consulting Address descriptor of y. If y is not present in getReg() by L location, then generate the instruction MOV y, L to copy value of y to L.
  - 3) The present location of z is determined using step 2 and the instruction is generated as op z, L.
  - 4) Now L contains the value of  $y \text{ op } z$ , ie Assigned to x. So, if L is a register then update its descriptor that it contains value of x. update Address descriptor of x to indicate that it is stored in 'L'.
  - 5) If y, z have no future use, then update the descriptors to release remove y & z.

$$\text{eg: } d = (a-b) + (a-c) + (a-c)$$

Three address code:  $t_1 = a-b$

$$t_2 = a-c$$

$$t_3 = t_1 + t_2$$

$$d = t_3 + t_2$$

### Statement

### Code Generation

### Register Descriptor

### Address Descriptor

|       |             |                                              |                             |                       |
|-------|-------------|----------------------------------------------|-----------------------------|-----------------------|
| $R_0$ | $t_1 = a-b$ | $\text{Mov } a, R_0$ <small>register</small> | $R_0 \text{ contains } t_1$ | $t_1 \text{ in } R_0$ |
| $R_1$ | $t_2 = a-c$ | $\text{Sub } b, R_0$                         | $a-b$                       | $t_1 \text{ in } R_0$ |

|       |             |                      |                             |                       |
|-------|-------------|----------------------|-----------------------------|-----------------------|
| $R_0$ | $t_2 = a-c$ | $\text{Mov } a, R_1$ | $R_0 \text{ contains } t_1$ | $t_1 \text{ in } R_0$ |
| $R_1$ | $t_1 = a-b$ | $\text{Sub } c, R_1$ | $R_1 \text{ contains } t_2$ | $t_2 \text{ in } R_1$ |

$t_3 = t_1 + t_2$  Add.  $R_1, R_0$  our choice  $\rightarrow R_0 \text{ contains } t_3$   $\rightarrow$   $t_3 \text{ in } R_0$

$R_0$  or  $R_0, R_1$

$\rightarrow R_1 \text{ contains } t_2$   $\rightarrow$   $t_2 \text{ in } R_1$

$d = t_3 + t_2$  add.  $R_1, R_0$  in R\_0  $\rightarrow$   $R_0 \text{ contains } d$   $\rightarrow$   $d \text{ in } R_0$

|       |                             |                                            |
|-------|-----------------------------|--------------------------------------------|
| $R_0$ | <del><math>t_1</math></del> | $R_1$ contains $t_2$ <small>in R_0</small> |
| $R_1$ | <del><math>t_2</math></del> | $t_2 = ?$                                  |
| $R_2$ | <del><math>t_3</math></del> | $t_3 = ?$                                  |

$(d+3) + (3+3) = (6+6) = 12$

and were obtained with the help of both registers  $R_0$  and  $R_1$ .

Implementation of both methods is as follows:

and if one register is available:

## Generating code from DAG

- Generating code from DAG is much simpler than the linear sequence of 3 address code.
- Using DAG we can rearrange some sequence of instructions and generate an efficient code.
- Various algorithms used for generating code from DAG



### (1) Rearranging Order.

- The order of 3 address code affects the cost of the object code being generated.
- By changing the order in which computations are done we can obtain the object code with minimum cost.

Ex:

$$t_1 = a + b$$

$$t_2 = c - d$$

$$t_3 = e + t_2$$

$$t_4 = t_1 + t_3$$

For the expression  $(a+b) + (e + (c-d))$

- A DAG can be constructed for the above sequence.



- The code can be generated by translating the 3 address code line by line.

Mov a, R0

ADD b, R0

Mov c, R1

SUB d, RI  
 MOV RO, t<sub>2</sub>, t<sub>1</sub>  
 MOV e, RO  
 ADD RO, RI  
 MOV t<sub>1</sub>, RO  
 ADD RI, RO  
 MOV RO, t<sub>4</sub>.

Now if we change the ordering sequence of the above TAC.

$$t_2 = c - d$$

$$t_3 = e + t_2$$

$$t_1 = a + b$$

$$t_4 = t_1 + t_3$$

then we get an improved code as

MOV C, RO

SUB d, RO; moving bottom after addition with following

MOV e, RI

ADD RI, RI; performing bottom addition with right

MOV a, RO

ADD RI, RO; moving left to right for summing

MOV RO, t<sub>4</sub>.

2)

### Heuristic Ordering

- The heuristic ordering algorithm is as follows:
- (1) obtain all the interior nodes. consider these interior nodes as unlisted interior nodes.
- (2) while (unlisted interior nodes remain)
  - {
  - (3) pick up an unlisted node n, whose parents have been listed.
  - (4) list n; (and its number of leftmost child node list)
  - (5) while (the leftmost child m of n has no unlisted parent AND is not leaf)
    - {
    - (6) list m;
    - (7) n = m; (i.e. m is now the parent of n, and m is leftmost)
    - }
- }

- First we will draw a DAG.



- The DAG is first numbered from top to bottom and from left to right. Then consider the unlisted interior nodes 1 2 3 4 5 6 8

Initially the only node with unlisted parent is 1

∴ Set  $n=1$  by line 4) of algorithm.

Now left argument of 1 is 2 and parent of 2 is 1 which is listed. Hence list 2.

∴ Set  $n=2$  by line 4) of algorithm.

+ 2 3 4 5 6 8

Now we will find the leftmost node of 2 and that is 6. But 6 has unlisted parent 5. Hence we cannot select 6. We can switch to 3. The parent of 3 is 1 which is listed one. Hence list 3.

Set  $n=3$ .

+ 2 3 4 5 6 8

The left of 3 is 4. As parent of 4 is 3 and that is listed hence list 4. Left of 4 is 5 which has listed parent (i.e. 4) hence list 5. similarly list 6.

+ 2 3 4 5 6 8

As now only 8 is remaining from the unlisted interior nodes we will list it.

Hence the resulting list is 1 2 3 4 5 6 8. Then the order of computation is decided by reversing this list.

We get the order of evaluation as 8 6 5 4 3 2 1. That also means that we have to perform the computations at these nodes in the given order.

$$t_8 = d/e$$

$$t_6 = a - b$$

$$t_5 = t_6 + c$$

$$t_4 = t_5 * t_8$$

$$t_3 = t_4 - e$$

$$t_2 = t_6 + t_4$$

$$t_1 = t_2 * t_3$$

This gives the optimized code for DAG even though there are any no of registers.