

Tuesday

23/08/22



Date \_\_\_\_\_

Page \_\_\_\_\_

# Compiler Design

## Unit-1



Question - Difference between application and system software.

It is a system soft.

Topic-2 Translation → convert one lang to another.  
3 types

(1) compiler - HLL to object language (may be machine level or not)

HLL → compiler → object lang. or LL  
line by line

(2) Interpreter (more efficient)

HLL → ~~compiler~~ → object Lang  
Interpreter

(3) Assembler

Assembly lang → ~~compiler~~ → object lang  
Assembler

e.g.: O1 → near to the system

MOV A,B  
ADD A,B

⇒ A=B  
A=A+B

by user  
neater.



- Q What is Translator
- Q What is the need of translator
- Q Explain types of translator
- Q Differentiate between compiler and interpreter.

$\# \text{define}$ ,  $\# \text{end}$   
First it deals with macros ( $\# \text{define}$  in C)



Wednesday 22/03/22

Date \_\_\_\_\_ / \_\_\_\_\_ / \_\_\_\_\_  
Page \_\_\_\_\_

## Topic: Assembler

- ① Specify the problem
- ② Identify the data structure
- ③ Define the format of data structure
- ④ Specify algorithm
- ⑤ Look for modularity
- ⑥ Repeat steps 1 to 5 for each module

- ⑦ Specify the problem

Assembly lang.  $\rightarrow$  [Assembler]  $\rightarrow$  object code.

numeric  $\rightarrow$  ADD A,B } Addressing mode  
 $\downarrow$  { mov A,B }

- o Immediate addressing mode
- o Data is provided within instruction

- o Direct addressing

MOV R1, #3100

~~MOV~~ providing the address.

- o Indirect addressing mode

MOV R1, R2

R2 = R3

Providing address of address

- o Base address

gender address

- ① Requirements analysis
- ② Planning & analysis
- ③ Design and analysis
- ④ Implementation
- ⑤ Testing

→ Generate Instruction

→ Evaluate opode  
→ Evaluate operand

ADD  
A, B  
variable  
size  
fixed  
size

→ Evaluate pseudops

~~evaluate~~

get act as instruction to

the assembler.

• Single pass algorithm

• Two pass algorithm

Out put of first pass is given as input to second pass (like pipelining)

→ speed will increase

→ complexity increase

comprises

multiple phases → one pass

→ Pass is one set of iteration of module

→ ~~operation of pass is pseudops~~ we will combine phases into passes

Q Difference between pass and phase

pass

pseudops  
JOHN  
START 0  
USING \*, 15  
L 1, FIVE  
A 1, FOUR  
ST 1, TEMP

Labels  
JOHN  
FOUR  
FIVE  
TEMP

FOUR DC F'4' FOUR=4 Define constant  
FIVE DC F'B' FIVE=5 Define storage  
TEMP DS 1F one fullword  
END

Load (L)

R1 = FIVE

Addl (A)

Store (ST)

R1 = R1+FOUR

TEMP = R1

→ Register 15 act as base register

→ Load the current value of location counter is the value base register.

→ ST store at location 0 (START 0)

Q) what is pseudops

Q) Difference between single pass and two pass.

Q) what are pseudops

Difference between DC and DS

Q) what is the size of fullword  
now we can define a constant

F = fullword = 2 or 4 bytes

HW = Halfword = 2 byte

B = 1B

nibble = 1/2 B

0W

Double word  
= 8B

Thursday 01/09/22

RX  
base register  
Registration

size = 4 bytes

DS

F'4'  
in in number  
fullword  
size = 4B

we can define storage using

DS F 'a', 'b', 'c', 'd'  
 in    constant  
 fullword just like an array)

$$\text{LC} \rightarrow 0$$

$$\text{RIG} \rightarrow 0$$

### Conversion

JOHN START 0  
 USING \*15  
 L A, FIVE  
 A 1, FOUR  
 ST 1, TEMP  
 FOUR DC F'4'  
 FIVE DC F'1'  
 TEMP DF 1F  
 END

| Relative Location |    | LC = 8, 4          |
|-------------------|----|--------------------|
| 0                 | L  | 1 → (0, 15)        |
| 4                 | A  | 1 → (0, 15) offset |
| 8                 | ST | 1 → (0, 15)        |
| 12                | 4  |                    |
| 16                | 5  |                    |
| 20                |    |                    |

1-pass

\* Pseudops: ~~are~~ converted into machine code.

offset (index, base)

| Pass - 2 |                  |
|----------|------------------|
| 0        | L 1, 18 (0, 15)  |
| 4        | A 1, 12 (0, 15)  |
| 8        | ST 1, 20 (0, 15) |
| 12       | 4                |
| 15       | 5                |
| 20       | -                |



• Symbols are stored in symbol table

• Literals are either constant or value of variables  
 ex. A(4), A(5)

• If any pseudops is left in pass-1 it will be ~~resolved~~ in pass-2 process

\* Information that has to be maintained stored in

(1) Machine instruction.  $\Rightarrow$  we create machine op Table (mop T)  
 (2) Pseudo ops  $\Rightarrow$  Pseudo op table (POT)

(3) Location counter  $\Rightarrow$  store next available location

(4) Symbol table - To store information about ~~defined~~ symbols.

(5) Literal table - stores literals

(6) Base table - ~~which~~ particular registers will act as base registers

Friday 09/09/22

## Format of Datastructure in Assemblies

① Machine op table (MOT) → 6 bytes

| Mnemonic<br>(Character) | Binary code<br>(1B)<br>decimal) | Instruction<br>size<br>2 bits | Instru-<br>ction<br>format<br>3 bits | unused<br>bits |
|-------------------------|---------------------------------|-------------------------------|--------------------------------------|----------------|
| 'Albbb'                 | 5A                              | 01 → 2B                       | 000 - RR Y                           | 2B             |
| 'STbbb'                 |                                 | 10 → 9B                       | 001 - RX Y                           | 4B             |
| 'STRbb'                 |                                 | 11 → 6B                       | 010 - PS Y                           |                |
|                         |                                 |                               | 011 → SI Y                           | 6B             |
|                         |                                 |                               | 100 → SS Y                           |                |

Pseudo op Table (1B)

|                   |                                                     |
|-------------------|-----------------------------------------------------|
| Pseudo<br>op (1B) | 3 Bytes Address<br>of routine to<br>process pseudop |
|-------------------|-----------------------------------------------------|

'START'

'USING'

'DROP'

Symbol op Table (1B)

| Symbol<br>char | Name<br>(1B)<br>Hexadecimal | Length<br>1B                       | Allocation<br>1B<br>Relative/Absolute | Relocatable<br>→ Address can be<br>changed |
|----------------|-----------------------------|------------------------------------|---------------------------------------|--------------------------------------------|
| 'JOHNbbbb'     | Relative loc#<br>(address)  | R/A → Address cannot<br>be changed |                                       |                                            |
| FOUR!          |                             | L:                                 |                                       |                                            |

## Literal op Table

| Literal<br>character | Value<br>Hexadecimal | Length | Allocation<br>Relative/Absolute |
|----------------------|----------------------|--------|---------------------------------|
| 'B'                  | 4B                   |        |                                 |
| 'F4'                 | 4                    | value  |                                 |

## Base op Table

|                              |                                   |      |
|------------------------------|-----------------------------------|------|
| Available<br>(Y/N) character | Content<br>3 Bytes<br>Hexadecimal | 0 N  |
|                              |                                   | 1 N  |
|                              |                                   | 19 Y |

## Base Table

|                                   |                                |      |
|-----------------------------------|--------------------------------|------|
| Available (1B)<br>(Y/N) character | Content 3 Bytes<br>Hexadecimal | 0 N  |
|                                   |                                | 1 N  |
|                                   |                                | 19 N |

8420

A  
PART 1

|     |    |
|-----|----|
| 21  | 12 |
| 26  | 0  |
| 23  | 0  |
| 100 |    |

## Symbol Table

| Symbol     | Value | Length | Allocation |
|------------|-------|--------|------------|
| 'JOHNbbbb' | 0000  | 1      | R          |
| 'FOURbbbb' | 0000C | 4      | R          |
| 'FIVEbbbb' | 00010 | 4      | R          |
| 'TEMPbbbb' | 0014  | 4      | R          |

## External Register

| Literal | Value | Symbol  |
|---------|-------|---------|
| F '4'   |       | literal |
| F '5'   |       | literal |

SR - subtract

LA - load address

Wednesday 08 September '22

```

LC
0 ① PROGRAM2 START: 0
0 2.
0 3.
4 4.
5.
6.
7.
8.
9.
10.
11.
12.
13.

PROGRAM2 START: 0
USING R15
LA 15/SETUP
SR TOTAL,TOTAL
AC EQU 2
INDEX EQU 3
TOTAL EQU 4
DATABASE EQU 13
SETUP EQU 12
USING SETUP15
DATABASE = A(DATA1)
USING DATAAREA, DATABASE
SR INDEX, INDEX

```

BNE - Branch not equal

|    |          |                               |                  |
|----|----------|-------------------------------|------------------|
| 10 |          | AC                            | = DATA1(INDEX)   |
| 12 | 19       | LOOP                          |                  |
| 13 | 15       | AR                            | TOTAL, AC        |
| 18 | 16       | A                             | AC, = F'5'       |
| 22 | 1A       | ST                            | AC, SAVE(INDEX)  |
| 28 | 18       | A                             | INDEX, = F'4'    |
| 30 | 19       | C                             | INDEX, = F'8000' |
| 34 | 20       | BNE                           | LOOP             |
| 36 | 81       | LR                            | , TOTAL          |
| 38 | 02       | RR                            | 14               |
| 40 | 83       | L7ORG - Pseudops - Relocation | of byte          |
| 42 | 09       | SAYE                          | 2000F            |
| 45 | DATAAREA | EQU                           | *                |
| 46 | DATA1    | DC                            | F'25', 26, ---   |
| 47 | END      | END                           | (Double word)    |

BR - Branch register

Base Table

L7ORG - Literal table organizer

## Symbol Table

| Symbol   | Value |   |   |
|----------|-------|---|---|
| PROGRAM2 | 0020  | 1 | A |
| AC       | 2     | 1 | A |
| INDEX    | 3     | 1 | A |
| TOTAL    | 4     | 1 | A |
| DATABASE | 13    | 1 | A |
| SETUP    | 12    | 1 | R |
| LOOP     | 12    | 1 | R |
| SAYE     | 04    | 1 | R |
| DATAAREA | 8084  | 1 | A |
| DATA1    | 8084  | 1 | R |

## L1 C. LITERAL TABLE

|         |    |   |   |
|---------|----|---|---|
| A(DATA) | 48 | 9 | A |
| F'F'    | 72 | 9 | A |
| F'4'    | 76 | 4 | A |
| F'8000' | 60 | 9 | A |

## BASE TABLE

|             | Base register |           |
|-------------|---------------|-----------|
| 1. BR = 15  | 0200          |           |
| 10. BR = 15 | 6             | Setup     |
| 12. BR = 13 | 8084          | DATA AREA |

## QUESTIONS

- (1) Construct tables - 18 marks
- (2) Identify different literals
- (3) Various functionality of pass-1 and 2
- (4) Draw algs of pass-1 and pass-2
- (5) Discuss the 2-pass assembler
- (6) What are the modules/datastructure for successful implementation of 2-pass assembler
- (7) Difference between single and multipass assembler.

Friday 09/09/22

```

TEST      START
BEGIN    BALR 15,0   112 bytes
         USING BEGIN1D,15
         LR 4,9
         L 3FF10'
         LOOP   L 2,DATA(4)
         A 4,=F14'
         BT 3,*-16
         BCR 14
         LTORG
DATA    DC F11',3,3

```

Branch and link register instruction to transfer computer to load a register with the next address branch to the address in next field R15 < R0

114 Bytes  
112 Bytes

| Symbol | Value | Length | Allocation |
|--------|-------|--------|------------|
| TEST   | 0000  | 1      | A          |
| BEGIN  | 0000  | 1      | A          |
| LOOP   | 8     | 1      | R          |

## LITERAL TABLE

| Literal | Value | Length | Allocation |
|---------|-------|--------|------------|
| F110'   | 38    | 4      |            |
| F14'    | 42    | 4      |            |

Base table

Base registers

BR=15

2

DATA1 DATA2 DATA3

DATA1 DATA2 DATA3



Tuesday 13/09/22

## Macro processor

Assembler

macro definition

macro call

how to define macro -

I/P parameters: ARG1, ARG2, ARG3  
 MACRO MACRO NAME: ~~8ARG1, 8ARG2~~  
 number of arguments  
 set of instruction  
 MEND

START

A 1, DATA1  
A 2, DATA1  
A 3, DATA1

A 1, DATA2  
A 2, DATA2  
A 3, DATA2

MEND

MACRO

MACRO1 & ARG1, & ARG2,  
& ARG3

A 1, DATA1  
A 2, DATA2  
A 3, DATA3

LOOP1

A 1, DATA1  
A 2, DATA2  
A 3, DATA3

LOOP2

A 1, DATA1  
A 2, DATA2  
A 3, DATA3

MEND

START

MACRO1 DATA1, DATA2  
DATA3, LOOP1

A 1, DATA1  
A 2, DATA2  
A 3, DATA3

MACRO1 DATA1, DATA2  
DATA3, LOOP2

END

Mapping: LOOP1, DATA1, DATA2, DATA3  
MACRO1

Macro call

① positional argument

8ARG1 MACRO1 8ARG1  
↓ ↓ ↑  
LOOP MACRO1 DATA1

one to one  
mapping

### ③ Keyword argument

call → MACRO1 &ARG1 = DATA1  
 &ARG4 = LOOP

(A) Date / /  
 by sum(a,b)

arg  
 sum(a=9,b=5)

### ④ Conditional macros

~~macro definition branches~~  
 AGO - goto

→ AIF (Condition) Label  
 conditional branch

→ AGO: Label  
 unconditional branch

### Example

CBPART

|       | MACRO                                                                     |
|-------|---------------------------------------------------------------------------|
| LOOP1 | &ARG1 MACRO1 &COUNT,&ARG2<br>&ARG1 A. 1,&ARG2<br>A 2, DATA1<br>A 3, DATA1 |
|       | AIF (&COUNT EQU 1),FINI<br>A 2,&ARG2                                      |
| LOOP2 | AIF (&COUNT EQU 2),FINI<br>A 3,&ARG2                                      |
|       | FINI MEND                                                                 |
| LOOP3 | A 1, DATA1                                                                |
|       | This belongs to macro definition only                                     |
| END   |                                                                           |

### CBPART

LOOP1 MACRO1 3,DATA1

!  
 LOOP2 MACRO1 2,DATA1

!  
 LOOP3 MACRO1 1,DATA1

!

END

- ② Specify the syntax of a macro definition
- ③ Difference between positional and keyword argument
- ④ Difference between AIF and AGO
- ⑤ Discuss with the example how ~~selection~~ conditional macro can be constructed.
- ⑥ Discuss with the example how to create macros.

Thursday 15/09/22

### MACRO

&ARG1 MACRO1 &ARG2,&ARG3  
 &ARG1 A 1,&ARG2  
 A 2,&ARG3  
 A 3,&ARG2

MEND

role of part 1 definition gets evaluated  
 role of part 2 encounter macro call

Argument List Array (ALA)

Macro Definition Table (MDT)

Macro Name Table (MNT)

### ALA

| Index | Arguments (BB) |
|-------|----------------|
| 0     | 'ARG1bbbb'     |
| 1     | 'ARG2bbbb'     |
| 2     | 'ARG2bbbb'     |

### MDT

counter

| MDTC<br>(Assuming) | Index | MD (80 bytes)             |
|--------------------|-------|---------------------------|
| 15                 |       | &ARG1 MACRO1 &ARG2 ,&ARG3 |
| 16                 | #0    | A 1, #1                   |
| 17                 |       | A 2, #2                   |
| 18                 |       | A 3, #1                   |
| 19                 |       | MEND                      |

### MNT

| Index | MacroName (BB) | MDT index |
|-------|----------------|-----------|
| 04    | MACRO1         | 15        |

ALA

index

0  
1  
2

Arguments (BB)

COOP1  
DATA1  
DATA2

one to one mapping

### MACRO call

LOOP1 MACRO1 DATA1, DATA2

= {  
    LOOP1 A 1, DATA1  
    A 2, DATA2  
    A 3, DATA1

Friday 16/09/20

### Loaders

- ① Compile and Go
- ② General Loader scheme
- ③ Absolute Loader
- ④ Relocating Loader
- ⑤ Direct Linking Loaders

### Concepts

- Dynamic loading
- Overlay
- Dynamic linking

- Allocation
- Linking
- Relocation
- Loading

### Overlay



memory size = 70K



- Q) Write a P-E over the symbol abc  
starts with ab ends with ab & has a  
scattering ab

(ab)(a+b+c)\* (ab)(a+b+c)\* (ab)

(ab)(a+b+c)\* (ab)(a+b+c)\* (ab)

Monday 19/09/22

### Phases of compiler



example  
 $a = a + b$

$T_1 := a$

$T_2 := b$

$T_3 := T_1 + T_2$

$c := T_3$

syntax tree



parse tree cannot generate without grammar.

- ① Difference between lexeme, token and pattern?
- ② Difference between syntax and parse tree?
- ③ What is the need of lexical analysis in syntax?
- ④ Define compiler explain its phases in detail.
- ⑤ For the given expression show the output of each phase and explain each phase.
- ⑥ Which of the phase is optional phase in compiler.
- ⑦ Can we combine which phases can be combined to form multipass compiler.

Ans. Lexical and syntax analysis in pass-1 and in the next pass rest of the phases is combined.

Tuesday

20/09/22

convert it into

Lexical Analysis  $\rightarrow$  stream of tokens

Lexical analyser or scanner

Specification of token

Recognition of token

for the specification of token regular expression is used  
 for recognition  $\Rightarrow$  finite automata

R.E.  $\longrightarrow$  F.A./DFA  
 regular expression  $\downarrow$   
 finite automata.

R.E.  $\longrightarrow$  NFA  $\xleftarrow{\text{with } \epsilon}$  NFA  $\xrightarrow{\text{without } \epsilon}$  DFA  $\xrightarrow{\text{minimized}}$  DFA

a\* (bc)\*abb + (cb)\*

