

CZ1106  
CE1106

## Chapter 4

# Addressing Modes

©2020 SCSE/NTU

1

CZ1106  
CE1106

# Addressing Modes

## Introduction to Assembly Programming and Addressing Modes

### Learning Objectives (4.1)

1. Identify why and when to use assembly language programming.
2. Describe what are addressing modes.

©2020 SCSE/NTU

2

each mnemonic is represented  
by 4 bytes ; one-to-one  
correspondence ; every mnemonic  
has a series of machine codes  $\Rightarrow$  stored in memory

CZ1106  
CE1106

## What is an Assembly Program?

- Unlike high-level programming languages, assembly level statements:
- Are known as **mnemonics**. Each has a one-to-one correspondence with a binary pattern (**machine code**) that is directly understood by CPU.
- Are **hardware-dependent** and address the architecture of processor directly. (e.g. they are CPU register-aware and reference them by name).
- Are converted to machine code by an **assembler**.

not  
compile,  
"assemble"  
is the  
correct word

can be anything  
`if (a > c)  
 b = a;  
else  
 b = c;`

C program example

standard name  
`CMP R0, R2  
BLE Else  
MOV R1, R0 ;b = a  
B Skip  
Else MOV R1, R2 ;b = c  
Skip :`

ARM assembly program equivalent

©2020 SCSENTU

3

CZ1106  
CE1106

## Why Use Assembly Language?

- More efficient codes can be created:
- Codes with faster execution **speed**.  
e.g. Algorithms for **real-time** signal processing in handheld devices can be **computationally demanding**.
- More compact program **size**.  
e.g. Low cost embedded devices may have **small memory** capacity but require many functionalities.
- Exploit **optimized features** of processor's ISA.  
e.g. High-level language compiled codes may not **exploit optimized** instructions, addressing modes and features available in the processor instruction set architecture to produce efficient run-time code.
- Many cybersecurity jobs needs good knowledge in assembly programming.

©2020 SCSENTU

4

```

16 Start   MOV    R1,#Count_A      ;initialise pointer R1 to mem var Count_A
17       ADD    R2,R1,#4      ;initialise pointer R2 to start of String_A
18       MOV    R3,R2           ;get current string element
19       STM    R3,[R1]         ;clear mem var Count_A before counting
20       LDRB   R3,[R2]         ;get current element in string from memory
21       CMP    R3,#null_char  ;compare with Null value
22       BEQ    Done            ;branch to done if Null character
23       LDR    R4,[R1]         ;get mem var Count_A
24       ADD    R4,R4,#1      ;increment string length count
25       STM    R4,[R1]         ;save new length back to mem var Count_A
26       ADD    R2,R2,#1      ;increment pointer to next string element
27       B     Loop             ;loop back to loop
28 Done    END

```

ARM assembly code for counting string length (In Lab Expt 1)



167  
cycles  
Given  
code

83  
cycles  
My best  
effort

Exploit **optimized features** of processor's ISA.

e.g. High-level language compiled codes may not **exploit optimized** instructions, addressing modes and features available in the processor instruction set architecture to produce efficient run-time code.

CZ1106  
CE1106

## When to Use Assembly Language?

- Critical parts of the operating system's software.  
Especially parts of system kernel that are constantly being executed (e.g. scheduler, interrupt handlers).
- Input/Output intensive codes.  
Device drivers and "loopy" segments of code that processes streaming data (e.g. video decoders, etc).
- Time-critical codes.  
Code that detect incoming sensor signals and respond rapidly, e.g. Anti-lock brake system (ABS) in cars.

**Learn More:** Google "Is Linux kernel written in assembly"

©2020 SCSE/NTU

5

CZ1106  
CE1106

## Addressing Modes

In memory  
in memory  
or in the  
instruction  
register

- Addressing mode (AM) is concerned with how data is **accessed**, not the way data is processed.
- The correct AM allows the CPU to identify the actual operand or the address location where operand is stored.
- The ARM processor instruction set architecture supports many different addressing modes.
  - Register direct
  - Immediate data
  - Register indirect
  - Register indirect with offset
  - Register indirect with index register
  - Pre and post auto-indexing

**Learn More:** Google "addressing modes"

©2020 SCSE/NTU

6

CZ1106  
CE1106

## Addressing Mode Examples

| Addressing Mode               | ARM              | Intel           |
|-------------------------------|------------------|-----------------|
| Absolute (Direct)             | None             | MOV AX, [1000h] |
| Register Direct               | MOV R1, R0       | MOV AX, DX      |
| Immediate                     | MOV R1, #3       | MOV AX, 0003h   |
| Register Indirect             | LDR R1, [R0]     | MOV AX, [BX]    |
| Register Indirect with Offset | LDR R1, [R0, #4] | MOV AX, [BX+4]  |
| Register Indirect with Index  | LDR R1, [R0, R2] | MOV AH, [BX+DI] |
| Implied                       | BNE LOOP         | JMP -8          |

©2020 SCSE/NTU

7

CZ1106  
CE1106

## Summary

- Codes written well in assembly language can usually execute **faster** and are smaller in **size**.
- Code for **low-level** OS kernels, **I/O** intensive and **time-critical** operations can benefit significantly from assembly-level coding.
- Understanding the **characteristics** and **application** of different **addressing modes** available in a processor's ISA allows programmers to write efficient codes.

©2020 SCSE/NTU

8

CZ1106  
CE1106

## Chapter 4

# Addressing Modes

## Register Direct and Immediate Addressing

### Learning Objectives (4.2)

1. Describe what is register direct.
2. Describe what is immediate data and its application.

©2020 SCSE/NTU

9

CZ1106  
CE1106

## Register Direct

- Operand is the content of the specified **register**.
- Register direct can be used for both destination and source operand.
- In the **MOV** instruction, the right operand is the source and left operand is the destination.



|    |            |
|----|------------|
| R0 | 0x12345678 |
| R1 | 0x00000000 |

Before execution

|    |            |
|----|------------|
| R0 | 0x12345678 |
| R1 | 0x12345678 |

After execution

- A **fast** addressing mode since no further memory access is involved during execution.
- Should be used to optimise execution speed.

©2020 SCSE/NTU

10

↓  
try to use register direct

CZ1106  
CE1106

## Register Direct (cont)

- All ARM's 16 registers can be a register direct operand.
- These registers can be either a source or destination operand.

```
MOV R3, LR ;make copy of LR in R3
MOVS R0, R0 ;test for N or Z condition in R0
MOV PC, R1 ;make a jump to address in R1
```

|     |
|-----|
| R0  |
| R1  |
| R2  |
| R3  |
| R4  |
| R5  |
| R6  |
| R7  |
| R8  |
| R9  |
| R10 |
| R11 |
| R12 |
| SP  |
| LR  |
| PC  |

©2020 SCSE/NTU

11

CZ1106  
CE1106

## Immediate Addressing

- Operand is directly specified **within the instruction itself**.
- "#" symbol precedes the immediate value.
- Example: **MOV R1, #3** *(can be hexadecimal / decimal → remember will convert for you)* *(32 bits value of 3)*
- After execution, the **immediate value is copied** into the destination register (left operand).
- Immediate addressing can only be used as a **source operand**. *→ cannot move register to the value 3, etc*
- Used for loading **constant values** into registers. Values must be known at the time of coding (e.g. load loop count into a loop counter register).

R1 0x12345678

Before execution

R1 0x00000003

After execution

©2020 SCSE/NTU

12

not a destination operand

CZ1106  
CE1106

## Immediate Addressing (cont)

- How is the 32-bit immediate value encoded?
- The immediate value is specified **within the instruction** bit pattern itself.



- How can a 12-bit operand encode a 32-bit immediate value?
- It can only describe a **subset** of all  $2^{32}$  possible values.
- Immediate value is a number between (0..255) rotated right by  **$2n$**  bits, where the value of **n** is given by 4 bits ( $0 \leq n \leq 15$ ).

©2020 SCSE/NTU

13

CZ1106  
CE1106

## Immediate Addressing (cont)

- Assembler does the necessary calculations and gives warning if requested immediate value cannot be encoded.

**MOV R3, #0xFF** ;immediate values within 8 bits always valid

**MOV R0, #0x100** ;right rotate 8-bit value of 0x01 with  $n=12$

**X MOV R1, #0x102** ;this is not a valid immediate value

all is in 8 bit ; no rotation

24 bit right rotate

- A combination of instructions can be used to achieve the desired immediate values that is not valid.

**MOV R1, #0x100** ; load 0x100 to R1  
**ADD R1, R1, #2** ; add 2 to R1

at divide by 4^n when cannot

Only possible with this

14



not possible

1 bit only (right rotate are in 2n bits)

# Slide 13 (cont)

## Immediate Addressing (cont)



2n = 0

2n = 2

2n = 30



with 4 bits , you  
can have the  
value 0 to 15

CZ1106  
CE1106

## Quiz: The MOV mnemonic

- Which of the following **MOV** mnemonic(s) is(are) **not valid** ARM instructions?
- You may select more than one answer.



Current Processor Status Register (CPSR) cannot be accessed using the **MOV** operator. \*

Register **R13** is also known as the stack pointer (**SP**). Both notations are valid for register direct. ✓

**MOV R1, R13**



**MOV R1, SP**



**MOV #0, R0**



**MOV R2, CPSR**



↓  
**R0-R15 only**



11:02

-03:46

3:17 PM Thu 21 Jan 70% 3:17 PM Thu 21 Jan 70%

Chapter\_04\_2-Register Direct and Immediate.mp4

CZ1106 CE1106 Quiz: Immediate Addressing

• Which of the following immediate value(s) of  $I$  is(are) **not valid** for the ARM mnemonic **MOV R3, #I**?



Bit 11 2 5 8 Bit 0  
0 0 1 0 0 1 0 1 1 0 0 0  
**0x96**

Decimal value of **0x100** ✓  
 $I = 256$

Bit 11 3 F 0 Bit 0  
0 0 1 1 1 1 1 1 0 0 0 0  
**0x3F**

8-bit hex value of **0x96** rotated right by 30 ( $n=15$ ) bits ✓  
 $I = 0x258$

8-bit hex value of **0x3F** rotated right by 28 ( $n=14$ ) bits ✓  
 $I = 0x3F0$

**b**

**conversion of hex**

**c** **d**

12:48 -01:58

3:17 PM Thu 21 Jan 70% 3:17 PM Thu 21 Jan 70%

Chapter\_04\_2-Register Direct and Immediate.mp4

CZ1106 CE1106 Quiz: Immediate Addressing

• Which of the following immediate value(s) of  $I$  is(are) **not valid** for the ARM mnemonic **MOV R3, #I**?



Bit 11 2 5 8 Bit 0  
0 0 1 0 0 1 0 1 1 0 0 0  
**0x96**

Decimal value of **0x100** ✓  
 $I = 256$

Bit 11 0 Bit 0  
0 0 1 1 1 1 1 0 0 0 0 0  
**0x3F**

8-bit hex value of **0x96** rotated right by 30 ( $n=15$ ) bits ✓  
 $I = 0x258$

8-bit hex value of **0x3F** rotated right by 28 ( $n=14$ ) bits ✓  
 $I = 0x3F0$

**a** **b** **c** **d**

CZ1106  
CE1106

## Summary

- Register direct is **efficient** as its execution involves no access to memory.
- Immediate addressing encode the operand **within the instruction**.
- Like register direct, immediate addressing in the ARM is **efficient** as memory access is not incurred during execution, only when fetching the instruction.
- Because data is encoded within fixed-length instruction, **only a subset** of immediate values are available.
- Immediate addressing is used when the operand **value is known** during the time of coding (e.g. loading known constants into registers).

1 memory  
cycle instruction

©2020 SCSE/NTU

15

CZ1106  
CE1106

©2020 SCSE/NTU

16

CZ1106  
CE1106

## Chapter 4

# Addressing Modes

## Register Indirect with Base Register

### Learning Objectives (4.3)

1. Describe what is register indirect and the ARM instructions that support this addressing mode.
2. Describe the variants and application of register indirect that uses base plus offset and index register.
3. Compare the relative pros and cons of register direct and register indirect addressing modes.

©2020 SCSE/NTU

17

CZ1106  
CE1106

## Limitation of Register Direct and Immediate Addressing

- Register direct and immediate addressing **do not** allow CPU to access operands stored in **memory**.  
*e.g. C arrays*  
*integer array*
- C variables are usually allocated memory for storage (especially large arrays).
- How do you specific a 32-bit address in memory using a 32-bit long instruction?
- The ARM specifies the 32-bit address of the operand in a 32-bit **register**.
- The register with the memory address **points to the memory** location where the operand is stored.
- Memory operand is fetched during instruction execution using **register indirect** addressing.
- The ARM uses the **LDR** and **STR** mnemonics to access memory operands.

©2020 SCSE/NTU

18

CZ1106  
CE1106

## The LDR Instruction

- The **LDR** operator is used to copy **memory** content to a register.
- The left operand the destination register.
- The right source operand is a memory location whose address is contained in a register (register indirect addressing).



©2020 SCSE/NTU

19

CZ1106  
CE1106

## The STR Instruction

- The **STR** operator is used to copy register content to **memory**.
- The left operand is always a source register.
- The right destination operand is a memory location whose address is contained in the indirect register.



©2020 SCSE/NTU

20

← little endian  
→ big endian

CZ1106  
CE1106

## Data Alignment Constraints

- Access of 32-bit operand from memory must follow data alignment constraints.
- The **4-byte data** read or written to memory must start at an address that is a **multiple of 4**.
- The effects of an unaligned memory access depend on the ARM architecture but they invariably result in **performance degradation**.



©2020 SCSE/NTU

21

CZ1106  
CE1106

## Register Indirect with Offset

- Adds** a specific **offset value** to the indirect register to compute the effective address (EA) in memory.
- Base Plus Offset** addressing does not change indirect register's content.
- Offset value allows required element in an array to be retrieved with respect to its base address (**BA**) in **R0**.



©2020 SCSE/NTU

22

CZ1106  
CE1106

## Program Example Accessing Array Elements

- Use base plus offset to access array element whose index is known during coding time.

```
main()
{
// assume base address
// of array i is 0x100
int i[5];
i[0]=7; - in the base address
i[4]=7;
}
```

(6 bytes array)

### C program example

Assign first & last elements of array **i** with value of 7.

|                   |   |
|-------------------|---|
| MOV R2, #0x100    | 1 |
| MOV R1, #7        | 2 |
| STR R1, [R2, #0]  | 3 |
| STR R1, [R2, #16] | } |

### Using register indirect with base plus offset

- Initialize base address of array **i** into register **R2**. *→ indirect register*
  - Load value of 7 into register **R1**. *→ temp register*
  - Store the value of 7 into **i[0]** and **i[4]** using offsets of 0 and 16 of register **R2** respectively . *→ in memory location*
- In computing offset, note that each integer element occupies 4 bytes in memory.

©2020 SCSE/NTU

23

CZ1106  
CE1106

## Register Indirect with Index Register

- This variant **adds** the content of the **index register** to the indirect register to compute EA.
- Base Plus Index Register** does not change base register's content.
- Modifiable** index value in **R2** allow different array elements to be retrieved with respect to base address (**BA**) during program execution.

**LDR R1, [R0, R2]**

Destination (Register)      Source (Memory)



©2020 SCSE/NTU

24

CZ1106  
CE1106**Program Example****Clearing All Array Elements**

- Use base plus index register to access each array element in turn.

```
main() {
    // assume base address
    // of array i is 0x100
    int i[400];
    int n=0;
    while (n < 400) {
        i[n] = 0;
        n = n + 1;
    }
}
```

**C program example**

Initialise all 400 elements in array **i** with zero.

©2020 SCSE NTU

**Using register indirect with base plus index**

- Initialize base address of array **i** into register **R2**. *base register*
- Load value of 0 into register **R0** and **R1** (index register). *temp register required to do the copy with STR instruction*
- Store 0 in **R0** into **i[n]** using current index value in **R1** plus base address in **R2**. *initialized to 0 to start with* *Base + current index (0) ⇒ value 0 is copied into the address 0x100*
- Increment index by 4, the size of each integer element in array.

*↳ to get the next element*

25

CZ1106  
CE1106**Summary**

Memory to Register  
Register to Memory Location

- Register indirect (with the **LDR** and **STR** operators) allows memory operands to be accessed.
- There are two variants of register indirect using base register.
  - Register indirect with **offset** (base plus offset)
  - Register indirect with **index** (base plus index register)
  - Contents in base register **do not change** after execution.
- Given the **base address** of an array, register indirect with base addressing is useful for accessing the contents of the array.
  - Use base plus offset if position of array element is known during coding time
  - Use base plus index if array element position is computed during run time.

index register

©2020 SCSE NTU

26

# Quiz: Register Indirect



- Given the initial states shown, which are the correct states of **R0** and **R2** after the execution of the instruction **STR R2, [R0, #4]**?

| Initial States |            |
|----------------|------------|
| R0             | 0x00000100 |
| R2             | 0x12345678 |
| <br>           |            |
| Address        | Memory     |
| 0x100          | 0x00       |
| 0x101          | 0x00       |
| 0x102          | 0x00       |
| 0x103          | 0x00       |
| 0x104          | 0x88       |
| 0x105          | 0x88       |
| 0x106          | 0x88       |
| 0x107          | 0x88       |
| :              | :          |

|    |            |
|----|------------|
| R0 | 0x00000100 |
| R2 | 0x12345678 |



|    |            |
|----|------------|
| R0 | 0x00000100 |
| R2 | 0x88888888 |

Adding **R0** with offset **4** to get the effective address of memory operand does not alter content of indirect register **R0** after execution.

|    |            |
|----|------------|
| R0 | 0x00000104 |
| R2 | 0x1245678  |

base don't  
↑ change

|    |            |
|----|------------|
| R0 | 0x00000100 |
| R2 | 0x00000000 |

**STR** operator copies register content to memory. So source register is not altered after execution. Only destination memory location is modified.

→ register itself does not get altered  
(unless maybe LDR)

CZ1106  
CE1106

## Quiz: Execution Speed

- With the initial states shown, which instruction will be **able** to load the value **0x102** into register **R2** with the **least** number of clock cycles?



| Initial States |             |
|----------------|-------------|
| R0             | 0x000000102 |
| R2             | 0x12345678  |
| Address Memory |             |
| 0x100          | 0x02        |
| 0x101          | 0x01        |
| 0x102          | 0x00        |
| 0x103          | 0x00        |
| 0x104          | 0x00        |
| 0x105          | 0x00        |
| :              | :           |

**MOV R2, #0x102**

↙  
The immediate value  
#0x102 is not valid

**MOV R2, R0**

↙  
0x102 in R0 is copied into R2.  
only 1 memory cycle



**LDR R2, [R0]**

↙  
Address 0x102 in R0  
does not meet data  
alignment constraints  
for 4-bytes memory access.

**LDR R2, [R0, #-2]**

↙  
Effective address 0x100 meets  
data alignment constraints but  
LDR takes 2 memory cycles  
(not the least no. of clk cycles)



17:51

-03:47



# Lecture (22 / 1)

**CZ1106  
CE1106 Review Previous Session (VisUAL)**

Which ARM register(s) contents will change after the execution of **MOVS R0,R1** given the initial states shown below?

A. R0 → Should have program counter (PC)  
 ✓ B. R0, PC  
 C. R0, PC, CPSR  
 D. R0, R1, PC, CPSR

*(S)hould be changed (S)ource*  
*introduce the NZCV flag*

Initial value → probably negative

Initial States: R0 = 0x12345678, R1 = 0xFFFFFFFF, R2 = 0x0, R3 = 0x0, R4 = 0x0

Clock Cycles: Current Instruction: 1 Total: 5  
 CPSR Status Bits (NZCV): 1 0 0 0

N Z C V      1 0 0 0

1.4x

register to register:  
 LDR won't work

if its:  
 0 0 0 0  
 ⇒ ANSWER: C

**CZ1106  
CE1106 MOV and LDR Instructions**

Which of this ARM mnemonic below will not result in **R1=0x00000200** given the initial states shown.

Note: Little Endian byte-ordering is used  
 MUST USE MOV R1, R0  
 to do register direct

✓ A. **LDR R1, R0**  
 B. **MOV R1, #0x200**  
 C. **LDR R1, [R0]**  
 D. **None of the above**

Address Memory: 0x200, 0x201, 0x202, 0x203, ..., Initial States: R0 = 0x00000200, R1 = 0x12345678

44% A, 15% B, 17% C, 24% D

1.4x

**CZ1106** **CE1106** Register Indirect with Base & Offset

**CX1106**

**MOV R0, R1**  
**MOV R0, #0x100**  
**LDR R0, [R0, #4]**

Given the initial states below, what is the value in R0 after executing the code segment shown.

Note: Byte-ordering is based on Little Endian

R0 **0x12345678**  
R1 **0x12345678**

| Address | Memory |
|---------|--------|
| 0x100   | 0x00   |
| 0x101   | 0x11   |
| 0x102   | 0x22   |
| 0x103   | 0x33   |
| 0x104   | 0xDD   |
| 0x105   | 0xCC   |
| 0x106   | 0xBB   |
| 0x107   | 0xAA   |
| 0x108   | :      |

Initial States

Little Endian

Register indirect!

$EA = 0 \times 100 + 4 = 0 \times 04$

A. R0 **0x00000000**  
B. R0 **0x00000100**  
C. R0 **0x12345678**  
D. R0 **0x33221100**  
E. **✓ R0 0xAABBCCDD**

77%  
9%  
1%  
11%

©2020 SCSENTU

21:17 / 58:34

1.4x

**CZ1106** **CE1106** Review of Addressing Mode (Part 1)

**CX1106**

**R2 0x00001100**

Which of the following instruction will not give this result in R2 after execution.

Note: Byte-ordering is based on Little Endian

A. **MOV R2, R0**  
B. **MOV R2, #0x1100**  
C. **LDR R2, [R0, R1]** EA =  $0 \times 1100 + 4 = 0 \times 1104$   
D. **LDR R2, [R0, #1]** EA =  $0 \times 1100 + 1 = 0 \times 1101$   
E. None of the above

So that no data alignment violation!  
use LDRB  
fix size  
index

it is like  
0x1001  
will have problem

R0 **0x00001100**  
R1 **0x00000004**

| Address | Memory |
|---------|--------|
| 0x1100  | 0x00   |
| 0x1101  | 0x00   |
| 0x1102  | 0x11   |
| 0x1103  | 0x00   |
| 0x1104  | 0x00   |
| 0x1105  | 0x11   |
| 0x1106  | 0x00   |
| 0x1107  | 0x00   |
| 0x1108  | :      |

Initial States

however,  
data alignment  
violation  
over placed  
into R2

63%  
2%  
6%  
14%  
15%

©2020 SCSENTU

25:52 / 58:34

1.4x

CZ1106  
CE1106

## Chapter 4

# Addressing Modes

## Register Indirect with Autoindexing and Stacks

### Learning Objectives (4.4)

1. Describe what is autoindexing feature of ARM's register indirect addressing mode.
2. Describe the differences between pre-index and post-index addressing modes.
3. Describe the various stack implementations and operations using the ARM addressing modes.

©2020 SCSE/NTU

27

CZ1106  
CE1106

## Register Indirect with Autoindexing

- Recap – Register indirect with base register:
  - The base address in the indirect register can be added with an offset or the contents of index register to compute effective address of operand in memory.
  - Base address is **never modified** after execution as it is assumed to be the sole reference to the start of the array.
- What if we allow the indirect register to be modified before or after computing the effective address?
  - Keep a copy of the array's base address elsewhere.
  - **Autoindexing** allows the indirect register's content to be **modified** during execution.
  - Autoindexing provides an efficient way to access **consecutive** array elements.

©2020 SCSE/NTU

28

CZ1106  
CE1106

## Offset with Autoindexing

- Adds offset value to the autoindex register (AR) to compute effective address (EA) and AR gets modified.
- "!" in mnemonic causes autoindex register to be modified with the EA.
- Offset value added to autoindex register **R0** to compute the EA in memory. **R0** takes on EA value after execution.

**LDR R1, [R0, #4]!**

Destination (Register)      Source (Memory)

indicates auto-index

| Address | Memory |      |
|---------|--------|------|
| 0x100   | 0x00   | i[0] |
| 0x101   | 0x00   |      |
| 0x102   | 0x00   |      |
| 0x103   | 0x03   |      |
| 0x104   | 0xDD   | i[1] |
| 0x105   | 0xCC   |      |
| 0x106   | 0xBB   |      |
| 0x107   | 0xAA   |      |
| 0x108   | :      |      |

Integer Array



©2020 SCSE/NTU

29

CZ1106  
CE1106

## Program Example (Optimised Version)

### Clearing All Array Elements

- Use offset with autoindex to efficiently access each array element in turn.

```
main() {
    // assume base address
    // of array i is 0x100
    int i[400];
    int n=0;

    while (n < 400) {
        i[n] = 0;
        n = n + 1;
    }
}
```

#### C program example

Initialise all 400 elements in array **i** with zero.



#### Using register indirect plus offset with autoindex

- Initialize base address of array **i** into register **R2**.
- Load value of 0 into source register **R1**.
- Store 0 in **R1** into first element of array **i[0]**.
- Store 0 in **R1** into **i[n]** using current effective address (EA) of autoindex register **R2** plus offset **4**. Then put this EA into **R2**.

©2020 SCSE/NTU

30

CZ1106  
CE1106

## Index Register with Autoindex

- Adds index register value to autoindex register (AR) to compute effective address (EA) and AR gets modified.
- Use “!” in mnemonic to update autoindex register with EA value.
- Index value (-8) in R2 added to autoindex register R0 to compute next EA in memory. R0 takes on EA value after execution.

**LDR R1, [R0, R2] !**

Destination (Register)      Source (Memory)



## Pre-index and Post-index

- In **pre-index**, the indirect register is **autoindex** **before** being used to compute effective address.

**LDR R1, [R0, #4] ! ; R0 = R0+4 ; R1 = mem[R0]**

Offset with Autoindexing (pre-index)

**LDR R1, [R0, R2] ! ; R0 = R0+R2 ; R1 = mem[R0]**

Index with Autoindexing (pre-index)

- In **post-index**, the indirect register is used to compute the effective address **after** it is **autoindexed**.

**LDR R1, [R0], #4 ; R1 = mem[R0] ; R0 = R0+4**

Offset with Autoindexing (post-index)

**LDR R1, [R0], R2 ; R1 = mem[R0] ; R0 = R0+R2**

Index with Autoindexing (post-index)

©2020 SCSE/NTU

CZ1106  
CE1106

## Program Example (Alternative Version) Clearing All Array Elements

- Use offset with **post-index autoindex** to keep all array access within loop.

```
main() {
    // assume base address
    // of array i is 0x100
    int i[400];
    int n=0;
    while (n < 400) {
        i[n] = 0;
        n = n + 1;
    }
}
```

### C program example

Initialise all 400 elements in array **i** with zero.

©2020 SCSE/NTU



### Using offset with post-index autoindexing

- Initialize base address of array **i** into register **R2**.
- Load value of 0 into source register **R1**.
- Store 0 in **R1** into **i[n]** using current effective address (EA) in indirect register **R2**. Then add offset **4** to the current EA in **R2** so that **R2** is now pointing to the next array element.

33

CZ1106  
CE1106

## The System Stack

- A stack is a first-in, last-out linear data structure that is maintained in the memory's data area.
- The system stack in the ARM is maintained by a **dedicated stack pointer (SP)** or **R13**.
- The **FD** stack grows towards **lower memory addresses**. (e.g. by default, **SP** starts at **0xFF000000** in VisUAL ARM simulator).
- In the **FD** stack, the **SP** points to the **top item (full)** on the stack (but SP can also point to the next **empty** space on the stack).
- The 3 basic stack operations are **push**, **pop** and **access** items on the stack.

©2020 SCSE/NTU

**Learn More:** Google "ARM stack implementation"

### Full Descending (FD) Stack



34

Empty stack from the top  
 Items on the stack from the top  
 put addition

CZ1106  
CE1106

## ARM Stack Implementation (FD)

- The are 4 possible stack implementations supported by the ARM instruction set.
- Full Descending, Full Ascending, Empty Descending and Empty Ascending**

Example of **Full Descending (FD)** stack implementation:

35

CZ1106  
CE1106

## Empty Ascending Implementation (EA)

- EA is an alternative stack implementation.
- Empty** means **SP** points to an available **unoccupied** stack space.
- Ascending** means stack grows toward **higher** memory address.



36

→ offset one in values of 4,  
cause items on stack is 32 bits

1 /  
must use complementary  
addressing modes

# Quiz: Autoindexing

- Given the initial states shown, which are the correct states of **R0** and **R2** after the execution of the instruction **LDR R2 , [R0] , #4?**



**Initial States**

|    |            |
|----|------------|
| R0 | 0x00000100 |
| R2 | 0x12345678 |

| Address | Memory |
|---------|--------|
| 0x100   | 0x00   |
| 0x101   | 0x00   |
| 0x102   | 0x00   |
| 0x103   | 0x00   |
| 0x104   | 0x88   |
| 0x105   | 0x88   |
| 0x106   | 0x88   |
| 0x107   | 0x88   |
| :       | :      |

**X R0 0x00000100      R2 0x00000000**

Autoindex will modify the indirect register R0.

**X R0 0x00000100      R2 0x88888888**

This is offset with post-index autoindexing.  
This mean EA is given by 0x100 in R0 before offset 4 is added to R0.

**R0 0x00000104      R2 0x88888888**

Offset with pre-index autoindexing given by LDR R2 , [R0 , #4] ! gives this answer.

©2020 SCSE NTU

LDR R2 , [R0] , #4



# Quiz: The ED Stack

- Give the mnemonic to **push** the register **R1** on to an **Empty Descending (ED)** stack.



Then, add -ve offset of -4 to R13 to move SP towards lower memory address (**descending**)

**STR R1 , [R13 , #4]**

First, copy R1 to the empty stack space **SP** is currently pointing to before autoindexing. Need **post-index**.

**STR R1 , [R13 , #-4]!**

|                   |
|-------------------|
| Empty Stack Space |
| 0xFE000000        |
| 0xFE000001        |
| 0xFE000002        |
| 0xFE000003        |
| 0x88              |
| 0x77              |
| 0x66              |
| 0x55              |
| 0x44              |
| 0x33              |
| 0x22              |
| 0x11              |
| 0xFE00000C        |

R1 0x55667788

Grow towards lower memory address

**STR R1 , [R13] , #-4**



**STR R1 , [R13] , #4**

ascending stack

CZ1106  
CE1106

## Summary

- Autoindexing modifies the indirect register besides just computing the effective address.
- The autoindexing can use either **offset** or **index register**.
- **Pre-index** does the autoindexing first before computing the effective address.
- **Post-index** computes the effective address first, then does the autoindexing.
- Autoindexing can be used to implement stacks.
- There are 4 possible stack implementations (FD, FA, ED, EA).
- For a given stack implementation, the Push and Pop operation must complement each other to ensure the stack grows and collapses correctly.

©2020 SCSE/NTU

37

CZ1106  
CE1106

©2020 SCSE/NTU

38

CZ1106  
CE1106

## Chapter 4

# Addressing Modes

## PC-related Addressing Modes

### Learning Objectives (4.5)

1. Describe difference between absolute & relative jump.
2. Describe the concept of position-independent code and how it is achieved.
3. Describe how data can be accessed using PC relative addressing

©2020 SCSE/NTU

39

CZ1106  
CE1106

## Absolute Jump

- A **new address** can be loaded into the **PC** to alter the sequential order of program execution.
- An **absolute jump** to a new code position is done by loading the address to jump to into the **PC**.
- Example: **MOV PC, #0x060 ;Jump to CodeB**



- Absolute jump is not **position-independent**. This code can only execute correctly in this specific area of code memory.

©2020 SCSE/NTU

40

CZ1106  
CE1106

## Relative Jump

- An **offset** can be added to the **PC** to alter the sequential order of program execution.
  - A **relative jump** is done using the branch instruction (e.g. **B**) with an appropriate **signed offset**. (Note: the range of this offset in ARM is **+/- 32 Mbytes**).
  - Example: **B CodeB ; Jump to CodeB**
- 
- Note:** In the ARM processor the PC points 8 bytes ahead of the current executed instruction due to its pipeline architecture.

©2020 SCSE/NTU

41

⇒ end address unknown; unlikely absolute jump

Absolute jump : "Move to 3"

Relative Jmp : "Move 2 step forward"

CZ1106  
CE1106

## Position-Independent Code

- Such programs can be loaded anywhere in memory and still execute correctly (i.e. relocatable).

*not position independent*



Does not jump to **CodeB** after **PROG** is shifted to **0x090**



B still branch to **CodeB** after **PROG** is shifted to **0x090**

⇒ makes use of relative jump

many result in unpredictable outcomes

©2020 SCSE/NTU

42

CZ1106  
CE1106

## ADD Instruction (Introduction)

- This instruction does the **addition** operation.
  - The 3-operand **ADD** instruction adds 2 source operands and puts result into a 3rd destination operand.
  - The right and middle operands are added and the result is placed in the destination register.
- ADD R2, R0, R1**
- 
- Destination → + → R2
- Destination and middle operands must be registers but rightmost operand can be a **register** or an **immediate value**.

|                  |            |
|------------------|------------|
| R0               | 0x00000003 |
| R1               | 0x00000006 |
| R2               | 0x00000000 |
| Before execution |            |

|                 |            |
|-----------------|------------|
| R0              | 0x00000003 |
| R1              | 0x00000006 |
| R2              | 0x00000009 |
| After execution |            |

©2020 SCSE/NTU

43

Key idea to achieve position independent code:  
 → make sure that all access to your memory variables are relative to current program counter value.

CZ1106  
CE1106

## Accessing Data

- Position-independent (P-I) programs require data to be accessed relative to the **PC**.
- PC-relative addressing** is used to access variables in the data segment of program in memory.
- E.g.: **ADD R0, PC, #0x0F8** ; Get P-I address of Var1 in Data Seg into R0

PC-relative offset of **0x0F8** is added since PC has incremented by 8 when executing **ADD** instruction.

PC-relative Offset:  
 $\text{Var address} - (\text{PC value} + 8)$

- Referencing absolute address of variable in whatever ways will violate **P-I** requirements.



Note: In the ARM processor the **PC** points 8 bytes ahead of the current executed instruction.

Ex. **Mov R0, #0x100**  
**Mov R1, [R0]**  
 Using Absolute address of Memory Variable

Register indirect will not be position independent

(Mov R1, #0x100) Immediate addressing

CZ1106  
CE1106

## Summary

- Non-sequential execution of code can be achieved by modifying the PC contents directly.
- The Branch instruction does this by **adding a signed offset**.
- Such **relative jumps** create **position-independent** code.
- PC-relative addressing with appropriate offsets** allows memory data to be accessed in a position-independent manner.

Y offset needs to take into account the additional 8 bits that PC would have moved ahead.

©2020 SCSE/NTU

45

CZ1106  
CE1106

## Quiz: Position-Independent

- Select the instructions below that can be used at label Q1, do what is in the comments and are **position-independent (P-I)**.

**MOV R1, R0 ;copy R0 to R1**

Changing any register other than the PC is always position independent



Register - Register  
do not involve  
memory

**MOV PC, #0x20 ;jump to Jmp**

Address ARM code segment  
0x00 MOV R0, #0x20  
0x04 Q1 ???  
:  
0x20 Jmp MOV R0, R2  
:  
:



**B Jmp ;jump to Jmp**

The ARM branch instruction B uses a PC relative offset to carry out the jump and is therefore position-independent



**MOV PC, R0 ;jump to Jmp**

Loading PC with the absolute start address (i.e. **0x20**) of the instruction to jump to violates the P-I requirement that jump must be relative to the current PC value. It does not matter whether immediate or register direct is used to modify the PC.

©2020, SCSE NTU

# Lecture (25/1)

CZ1106  
CE1106

## Register Indirect with Autoindexing

CX1106

Which ARM register(s) maybe modified by the instruction below based on the given values in memory.

**LDR R1, [R0], #4**

A.

R1

Destination register

B.

R1, R0

Port-index  
Auto-indexing  
Register

✓C.

R1, R0, PC

R0 = 0x104  
After Execution

Program Counter (PC) increments  
after instruction execution

D.

R1, R0, PC, CPSR

LDR does not influence CC  
flags (No "S" suffix option)

37% 39%

R1 = (-1)

| CPSR 0 0 0 0 |        |
|--------------|--------|
| Address      | Memory |
| 0x100        | 0xFF   |
| 0x101        | 0xFF   |
| 0x102        | 0xFF   |
| 0x103        | 0xFF   |
| 0x104        | 0xDD   |
| 0x105        | 0xCC   |
| 0x106        | 0xBB   |
| 0x107        | 0xAA   |
| 0x108        | :      |

©2020 SCSE NTU



15:08 / 54:36

1.2x



CZ1106  
CE1106

CX1106

## ARM Stack Implementations

Based on the description of the **Push** and **Pop** operations shown, which ARM stack type is being implemented here?

**Push - STR R0, [SP, #-4]!**

**Pop - LDR R0, [SP], #4**

- ✓A. Full Descending (FD)
- B. Empty Descending (ED)
- C. Full Ascending (FA)
- D. Empty Ascending (EA)



©2020 SCSE NTU



19:25 / 54:36

1.2x



Solution Explanation next page →

# A) Full Descending:

⇒ Grows towards lower memory address

⇒ Stack Pointer pointing at very top

↳ points to a valid item

⇒ Full means you are pointing to occupied space



**Push - STR R0, [SP, #-4]!**  
**Pop - LDR R0, [SP], #4**

a clue that it is descending;  
would be +4 if ascending  
↳ Pre-indexing to decrement  
SP  
⇒ to point to empty space

CZ1106  
CE1106

## ARM Stack Implementation

**Push - STR R0, [SP, #-4]!**  
**Pop - LDR R0, [SP], #4**

Note: the Procedure Call Standard for the ARM Architecture (AAPCS), and **armcc** always use a full descending (FD) stack.

### A. Full Descending (FD) stack implementation:



©2020, SCSE NTU



28.33 / 54.36

1.2x



**PC-relative Data Access**

What PC offset value is needed to access VarA in a position independent manner?

Note: The PC points 8 bytes ahead of the current executed instruction.

- A. 0x0000
- B. 0x2000
- C. 0x2000 – 0x0020 – 8
- D. 0x2000 – 0x0020 + 8
- E. 0x2000 – 0x0024 – 8



©2020, SCSE NTU

|| 35:07 / 54:36

1.2x

