

CZ1106  
CE1106

## Chapter 5

# Instruction Set

©2020 SCSE/NTU

1

CZ1106  
CE1106

## Chapter 5

# Instruction Set

## Data Transfer Instructions

### Learning Objectives (5.1)

1. Describe how data in register and memory can be efficiently transferred.
2. Describe how byte-sized data can be access in memory.

©2020 SCSE/NTU

2

CZ1106  
CE1106

## Instruction Set – Basic Categories

- Non system-level instructions in a processor can be typically classified into three basic groups:

### Data Transfer

#### ARM examples:

`MOV R1, R0`  
`STR R0, [R2, #4]`  
`LDR R1, [R2]`

### Data Processing

#### ARM examples:

`ADD R0, R1, R2`  
`SUB R1, R2, #3`  
`EOR R3, R3, R2`

### Program Control

#### ARM examples:

`B Back`  
`BNE Loop`  
`BL Routine`

- Data transfer** – instructions that move data between registers and/or memory.
- Data processing** – instructions that modify the data in register through arithmetic or logical operations.
- Program control** – instructions that alter the normal sequential execution flow of a program. *→ branching & subroutine calls*

©2020 SCSE/NTU

3

CZ1106  
CE1106

## Register Data Transfer

- Moves source operand to the destination register.
- With **MOV**, the source operand can use either **register direct** or **immediate** addressing.

**MOV R1, R0** ; make copy of R0 in R1

**MOVS R0, #0** ; move 0 into R0 and **set Z flag**

*↑ to influence the CPSR (Condition code flags)*

- With move complement (NOT) **MVN**, the source operand is bit-wise inverted before moving into the destination register.

**MVN R1, R0** ; R1 = NOT (R0)

**MVN R0, #0** ; move 32-bit value of -1 into R0

**R0 = 0xFFFFFFFF after execution**

*NOT (0) = -1*

©2020 SCSE/NTU

4

CZ1106  
CE1106

## Memory Data Transfer

- Data in memory can be transferred to or from a register.
- With **LDR**, the memory data at the effective address is moved to the **destination register** using various **indirect register** addressing modes.

**LDR R1, [R0]** ; copy 32-bit value pointed by R0 into R1

**LDRB R2, [R0]** ; copy 8-bit value pointed by R0 into R2 (byte zero-extends to 32 bits)



©2020 SCSE NTU

5

CZ1106  
CE1106

## Memory Data Transfer

- Data in memory can be transferred to or from a register.
- With **LDR**, the memory data at the effective address is moved to the **destination register** using various **indirect register** addressing modes.

**LDR R1, [R0]** ; copy 32-bit value pointed by R0 into R1

**LDRB R2, [R0]** ; copy 8-bit value pointed by R0 into R2 (byte **zero-extends** to 32 bits)

- With **STR**, the content in **source register** is copied to the effective address in memory using various indirect addressing modes.

**STR R1, [R0]** ; copy R1 (4 bytes) starting at address pointed by R0

**STRB R2, [R0, #1] !** ; copy **byte** in R2 to only **one** address at [R0+1]; then R0=R0+1

©2020 SCSE NTU

6

bit transfer

not 4 since its by bits

CZ1106  
CE1106**Program Example****Copying a Block of Memory**

- Block copy is used to replicate a contiguous segment of memory from one location to another.
- The **MOV**, **LDR** and **STR** data transfer mnemonics are required to perform the block copy operations.

```

initial pointers {
    MOV R0, #0x100 ; setup source pointer
    MOV R1, #0x200 ; setup destination pointer
loop LDR R2, [R0] ; memory to register transfer
        STR R2, [R1] ; register to memory transfer
        ADD R0, R0, #4 ; increment source pointer
        ADD R1, R1, #4 ; increment destination pointer
}
loop back 5 times

```

**Block copy 5 words starting at address 0x100 to 0x200**

©2020 SCSE/NTU

CZ1106  
CE1106**Program Example (optimised version)****Copying a Block of Memory**

- The code can be further optimised for speed and size by using the autoindexing feature.
- The register indirect with **post-index** autoindexing will automatically add the 4 offset to the array pointers after memory access.

```

initial {
    MOV R0, #0x100 ; setup source pointer
    MOV R1, #0x200 ; setup destination pointer
loop LDR R2, [R0], #4 ; memory to register transfer
        ; with post index autoindexing
        STR R2, [R1], #4 ; register to memory transfer
        ; with post index autoindexing
}
loop back 5 times

```

**Block copy 5 words starting at address 0x100 to 0x200**

©2020 SCSE/NTU

20 cycles  
(4x5)

using post index  
as we are  
starting at  
the start of  
it block and  
we want to  
use that  
before we add 4

if pre index was  
used, will miss  
out first block  
item on

**CZ1106**  
**CE1106**

## Program Example (further optimisation version)

# Copying a Block of Memory

- Further minor optimisation by using only **one pointer register** and make use of block copy offset.
  - Be careful when using this technique as the immediate offset range for register indirect is limited to only **+/- 4096 bytes**.



**MOV R0, #0x100 ;** setup source and destination pointer

**Use one register less, save 1 cycle**

**loop** LDR R2, [R0], #4 ; memory to register transfer  
STR R2, [R0, #0xFC] ; register to memory transfer  
EA =  $108 + 0 \times 4$  & EA =  $0 \times 100$   
+  $204 \times 4$  } 20 cycles  
(4x5)

$$EA = 0 \times 100$$

R R2, [R0, #0xF]

$$EA = 0 \times 100\text{e} + 0 \times 0\text{FC}$$

; memory to register transfer  
; with post index autoindexing

; register to memory transfer

: with base plus offset of **(0x2**

• [View basic plus sheet](#) or [View](#)

**20 cycles  
(4x5)**

(4x5)

**Block copy 5 words starting at address 0x100 to 0x200**

©2020, SCSE/NTU

If have memory blocks that are far apart, this method not advisable

use index register

**CZ1106**  
**CE1106**

## Summary

- The **efficient** data transfer instruction **MOV** and **MVN** are probably the most commonly used instructions.
  - Can be used with the **register direct** and **immediate** addressing modes.
  - Memory data transfer requires the use of **LDR** and **STR** instructions.
    - Numerous variants of **register indirect** addressing modes can be used.
    - Memory data transfer instructions require **two** clock cycles to execute.
  - Byte-sized memory access can be done using **LDRB** and **STRB**.
    - Byte moved into register is zero-extended. → *What if we have a byte in the low address?*
    - Byte access of memory does not have data alignment restrictions. *in reg*

©2020 SCSE/NTU

the lowest byte  
in register to  
memory, it  
~~just goes to~~  
that one  
byte  
no zero extension)

## Quiz: The MVN Instruction

- Given the 32-bit hexadecimal value in R0 after executing the instruction MVN R0, #0x3F0.

MVN will complement the 32-bit immediate value before moving it to the destination register.

### Invalid mnemonic

a

R0 0xFFFFFC0F



32-bit value of 0x3F0  
is 0x000003F0.

Its complement is  
0xFFFFFFFFC0F.

Reminder:

$$\begin{aligned}3 &= 0011_2 \\ \text{Not } (3) &= 1100_2 \\ &= 0xC\end{aligned}$$



R0 0x000003F0



R0 0x00000C0F



1.4x ⚙ ✖ ↻

**Quiz: Data Transfer Instructions**

- Based on the initial states, which instruction(s) (after execution) will give this result:

Note: You may select more than one answer.

**Initial States**

R0 0x00000100

R2 0xFFFFFFF07

**Address Memory**

0x100 0xF8

0x101 0x00

0x102 0x00

0x103 0x00

0x104 0x00

0x105 0x00

0x106 0x00

0x107 0xF8

: :

©2020, SCSE NTU

little  
Endian {

MVN R2, #0x7

LDRB R2, [R0, #7]

a

b

MVN R2, R2

LDR R2, [R0], #4

c

d



14:03 / 17:59

1.4x



(a) The 32-bit complement of 0x7 copied into R2 is 0xFFFFFFF8

(b) The register indirect with post-index will load 0x00000F8 into R2 but R0 will be changed to 0x104 due to autoindexing

(b) EA is 0x107, so 0xF8 zero-extends in R2. Byte access — no need data alignment

(c) 0xFFFFFFF7 in R2 when complemented will become 0x00000F8. This value moves back into itself in R2.

CZ1106  
CE1106

## Chapter 5

# Instruction Set

## Arithmetic Instructions

### Learning Objectives (5.2)

1. Describe the operation and uses of the basic arithmetic instructions in the ARM instruction set.
2. Describe how arithmetic operations influence the status of Condition Code flags.

©2020 SCSE/NTU

11

CZ1106  
CE1106

## Instruction Set – Basic Categories

- Non system-level instructions in a processor can be typically classified into three basic groups:

### Data Transfer

**ARM examples:**  
**MOV R1,R0**  
**STR R0, [R2 , #4]**  
**LDR R1, [R2 ]**

### Data Processing

**ARM examples:**  
**ADD R0,R1,R2**  
**SUB R1,R2,#3**  
**EOR R3,R3,R2**

### Program Control

**ARM examples:**  
**B Back**  
**BNE Loop**  
**BL Routine**

- Data transfer – instructions that move data between registers and/or memory.
- Data processing – instructions that modify the data in register through arithmetic, logical or shift operations.
- Program control – instructions that alter the normal sequential execution flow of a program.

©2020 SCSE/NTU

12

CZ1106  
CE1106

## Arithmetic Instructions

- The basic arithmetic operations are **Add** and **Subtract**.
- These arithmetic instructions involve **three** operands.
- Add and subtract can only involve **registers** or **immediate values** (as source operand).
- The ARM instruction set provides some **variants** of the basic add and subtract operations to provide more flexibility during programming.

**ADD R2 ,R0 ,R1****ADD R2 ,R0 ,#4**

**ADD** (addition)  
**SUB** (subtraction)  
**RSB** (reverse subtraction)  
**ADC** (add with carry)  
**SBC** (subtract with carry)  
**RSC** (reverse subtract with carry)

©2020 SCSE/NTU

13

CZ1106  
CE1106

## ADD Instruction

- Addition is a **commutative** operation that adds two source operands (order is immaterial).
- But only **rightmost** operand can take on an **immediate** value. The other operands must be registers



|                  |            |
|------------------|------------|
| R0               | 0x00000003 |
| R1               | 0x00000006 |
| R2               | 0x12345678 |
| Before execution |            |

|                 |            |
|-----------------|------------|
| R0              | 0x00000003 |
| R1              | 0x00000006 |
| R2              | 0x0000000B |
| After execution |            |

©2020 SCSE/NTU

14

CZ1106  
CE1106

## Condition Code Flags and ADD

- ADD can affect all the **N, Z, V, C** flags.

|       |            | Signed Number | Unsigned Number | Signed Number | Unsigned Number |
|-------|------------|---------------|-----------------|---------------|-----------------|
| (+ve) | 0000 0001  | (1)           | (1)             | (+ve)         | (1)             |
| (+ve) | +0111 1111 | (127)         | (127)           | (-ve)         | (-1)            |
| (-ve) | 1000 0000  | (-128)        | (128)           | (+ve)         | (0)             |

N=1, V=1 ← 2's complement oVerflow  
Incorrect, hence

curr) Z=1, C=1 ← unsigned overflow

- The **V** flag when set, indicates an **overflow** when adding **signed** numbers.
- Overflow is detected when both **signed numbers** added have the **same sign** but the result has the **opposite sign**.
- The **C** flag when set, indicates an **overflow** when adding **unsigned** numbers.

©2020 SCSE/NTU

15

CZ1106  
CE1106

## SUB Instruction

$$R2 = R0 - R1 \neq R1 - R0$$

$$R2 = R0 - R1$$

↓      ↓  
Minuend   Subtrahend

- Subtraction is a **non-commutative** operation.
- All operands are register but the **rightmost** operand (subtrahend) can take on an **immediate** value.



- To influence all the condition code flags (**N,Z,V,C**), the **"S"** suffix must be added to the **SUB** mnemonic.
- RSB** can be used to reverse the subtraction order.

|                  |            |
|------------------|------------|
| R0               | 0x00000009 |
| R1               | 0x00000006 |
| R2               | 0x12345678 |
| Before execution |            |

|                     |            |
|---------------------|------------|
| R0                  | 0x00000009 |
| R1                  | 0x00000006 |
| R2                  | 0x00000005 |
| After executing SUB |            |

q - 4

©2020 SCSE/NTU

16

**CZ1106**  
**CE1106**

## SUB Instruction (cont)

- Subtraction is done by **adding** the minuend to the **negated** subtrahend.

$$A - B = A + (-B)$$

- **2's complement** is used to negate subtrahend.

**SUB R2,R0,R1 ; R2=R0-R1**

## 2's complement operation

|            |            |    |
|------------|------------|----|
| Minuend    | 0x00000003 | R0 |
| Subtrahend | 0x00000001 | R1 |
| $-$        |            |    |
|            | 0x00000002 | R2 |

=

|                     |      |
|---------------------|------|
| <b>0x00000003</b>   | (3)  |
| <b>+ 0xFFFFFFFF</b> | (-1) |
| <hr/>               |      |
| <b>0x00000002</b>   | (2)  |

|    |            |
|----|------------|
| R0 | 0x00000003 |
| R1 | 0x00000001 |
| R2 | 0x12345678 |

Before execution

|    |            |
|----|------------|
| R0 | 0x00000003 |
| R1 | 0x00000001 |
| R2 | 0x00000002 |

After execution

17

**CZ1106**  
**CE1106**

# Condition Code Flags and SUB

- **SUB** can affect all the **N, Z, V, C** flags.
  - In the ARM's **SUB** instruction, the **C** flag clears (**C=0**) if the subtraction produce a **borrow** and **C** sets (**C=1**)otherwise.
  - A borrow occurs in subtraction when the **unsigned value** of the Minuend is **less** than the unsigned value of the Subtrahend.



**unsigned (A) < unsigned (B)**      C=0

**unsigned (A) ≥ unsigned (B)**      C = 1

- The **v** flag is set (**v=1**) when the result is out of the signed 32-bit range.
  - An **unsigned underflow** is indicated by (0).

$-2^{31} < (\text{A} - \text{B}) > +2^{31}$  **V=1**

**I earn More:** Google “ARM subtraction carry flag”

19

Deal with the unknown

⇒ anytime have borrowed  
that's a sign of  
an overshoot

CZ1106  
CE1106**Program Example****Convert Negative to Positive**

- The code converts an integer array of 10 negative values to its equivalent positive values.
- The 2's complement operation on each retrieved word converts it from -ve to +ve. The -ve word in memory is then replace with its +ve equivalent.



©2020 SCSE/NTU

19

CZ1106  
CE1106**Program Example (optimized version)****Convert Negative to Positive**

- The **SUB** instruction can be used to do the negation operation more efficiently.
- By **subtracting** the value **0** with the -ve value retrieved from memory, the result will be its +ve equivalent.

$$\text{e.g. } 0 - (-5) = +5$$



©2020 SCSE/NTU

20

CZ1106  
CE1106**Program Example (further optimisation version)****Convert Negative to Positive**

- The **RSB** instruction can be used to do the negation with an additional register.
- The **reverse subtract** instruction allow the immediate value of **0** to be used as the minuend, thereby removing the need for a zeroed register.

```

MOV R0, #0x100 ; setup source pointer
                ← Use one register less, save 1 cycle
Loop LDR R1, [R0] ; get next -ve int in mem to R1
                RSB R1, R1, #0 ; do (0 – R1) to get +ve value
                STR R1, [R0], #4 ; put result back then increment
                ; to next array element
loop back 9 times R1=0-R1
  
```



Convert 10 words starting at address 0x100 to its +ve values

©2020 SCSE/NTU

21

CZ1106  
CE1106**Carry-based Arithmetic Instructions**

- ARM provides arithmetic instructions that takes the **carry bit** into consideration.
- These instructions are mainly used to support **multi-precision** arithmetic that involves data size larger than the 32-bit registers in the ARM CPU.

**ADC R2, R0, R1 ; R2=R0+R1+C** ADD with carry

**SBC R2, R0, R1 ; R2=R0–R1+NOT(C)** SUB with carry

**RSC R2, R0, R1 ; R2=R1–R0+NOT(C)** RSB with carry

- Like the other arithmetic instructions, the “**S**” suffix can be added to the mnemonic to influence the condition code flags (**N,Z,V,C**,).

©2020 SCSE/NTU

22

CZ1106  
CE1106

## Summary

- The **ADD** and **SUB** instructions are 3-operand instructions.
- Supports **register direct** and **immediate** addressing (rightmost operand only).
- Influence all condition code flags (**N,Z,V,C**) when “**S**” suffix is used.
- **SUB** is non-commutative. **RSB** allows the minuend to be an **immediate value**.
- Arithmetic instructions that incorporate the carry flag (**C**) can be employed for multi-precision arithmetic.

©2020 SCSE/NTU

23

CZ1106  
CE1106

©2020 SCSE/NTU

24

24

## Quiz: Binary Subtraction

- What is the result of the 5-bit binary subtraction  $01001_2 - 11100_2$  if both these numbers are signed numbers?

00101<sub>2</sub>

a

| Subtraction              | Decimal value (+9) | Addition equivalent | Decimal value (+9) |
|--------------------------|--------------------|---------------------|--------------------|
| 0 1 0 0 1<br>- 1 1 1 0 0 | - (-4)             | + 0 0 1 0 0         | + (+4)             |

(13)      (13)

Negate using two's complement operation

00011<sub>2</sub>

b

01101<sub>2</sub>

C

$$01001_2 - 11100_2 = 01101_2$$

Subtraction can be done using addition.  
The subtrahend is negated by using 2's complement. Then add the 2 numbers.

10010<sub>2</sub>

1.4x

©2020, SCSE NTU



20:22 / 23:21



## Quiz: The ADD Instruction

- What would be the value in register R0 after the ADD instruction shown is executed 8 times.



a

R0 0x00000001

C

R0 0x00000100

Iteration n:

1. R0 = 1 + 1 = 2
  2. R0 = 2 + 2 = 4
  3. R0 = 4 + 4 = 8
  - n. R0 = 2<sup>n</sup>
  8. R0 = 2<sup>8</sup> = (256)
- R0 = 0x00000100



R0 0x00000008

R0 0x00000256

Now  
decimal 256

Haha!

©2020, SCSE NTU



22:06 / 23:21



CZ1106  
CE1106**Chapter 5**

# Instruction Set

## Logical, Shift and Rotate Instructions

**Learning Objectives (5.3)**

1. Describe the operation and uses of the various logical instructions.
2. Describe the operation and uses of the various shift and rotate instructions.
3. Describe how multiplication and division can be done using bit shift.

©2020 SCSE/NTU

25

CZ1106  
CE1106

## Instruction Set – Basic Categories

- Non system-level instructions in a processor can be typically classified into three basic groups:

**Data Transfer**

**ARM examples:**  
**MOV R1,R0**  
**STR R0 , [R2 , #4]**  
**LDR R1 , [R2 ]**

**Data Processing**

**ARM examples:**  
**ADD R0 ,R1 ,R2**  
**SUB R1 ,R2 ,#3**  
**EOR R3 ,R3 ,R2**

**Program Control**

**ARM examples:**  
**B Back**  
**BNE Loop**  
**BL Routine**

- Data transfer – instructions that move data between registers and/or memory.
- Data processing – instructions that modify the data in register through arithmetic, logical or shift operations.
- Program control – instructions that alter the normal sequential execution flow of a program.

©2020 SCSE/NTU

26

CZ1106  
CE1106

## Logical Instructions

- Logical instructions provide various **Boolean** operators.
- MVN** is a **two-operand** instruction that does the **NOT** operation.

Example: **MVNS R2, R2**

R2 **0x00000000** R2 **0xFFFFFFF** *all 1's*  
 Before execution After execution *N=1 and Z=0*

- The **AND**, **ORR** and **EOR** operators are **three-operand** instructions for the **AND**, **OR** and **EX-OR** operations respectively.

Example: **ORRS R1, R1, #0x0000FFFF**

R1 **0x12345678** R1 **0x1234FFFF** *after execution is neither -ve or zero*  
 Before execution After execution *N=0 and Z=0*

- The “**S**” suffix can be used to influence the **N** and **Z** bits in the CC flags.

©2020 SCSE/NTU

27

CZ1106  
CE1106

## Logical Instructions AND, OR and EOR Applications

- The basic logical instructions can be used to:
  - AND** – **clear** specific bits in destination operand.
  - ORR** – **set** specific bits in destination operand.
  - EOR** – **complement** specific bits in destination operand.

*Exclusive OR*

AND truth table

| A | B   | Z = A · B |
|---|-----|-----------|
| 0 | 0 * | 0         |
| 0 | 1   | 0         |
| 1 | 0 * | 0         |
| 1 | 1   | 1         |

\* Binary **0 mask** is used to **clear** the bit

OR truth table

| A | B   | Z = A+B |
|---|-----|---------|
| 0 | 0   | 0       |
| 0 | 1 * | 1       |
| 1 | 0   | 1       |
| 1 | 1 * | 1       |

\* Binary **1 mask** is used to **set** the bit

EX-OR truth table

| A | B   | Z = A ⊕ B |
|---|-----|-----------|
| 0 | 0   | 0         |
| 0 | 1 * | 1         |
| 1 | 0   | 1         |
| 1 | 1 * | 0         |

\* Binary **1 mask** is used to **complement** the bit

©2020 SCSE/NTU

28

Example: **AND R1, R1, #0x0000FFFF**

R1 **0x12345678** R1 **0x00005678**  
 Before execution After execution

F < 1 : allow values to be retain

O = 0 : clear the remaining part

CZ1106  
CE1106

## Instruction examples AND, ORR and EOR Applications

Bits 7 6 5 4 3 2 1 0  
R0 ..01010101

Initial condition of least significant 8 bits register R0

- e.g. AND R1, R0, #...11110000 (e.g. AND R1, R0, #0xF0)
 

R1 ..01010000 Bits 0 to 3 cleared after execution
- e.g. ORR R0, R0, #...11110000 (e.g. ORR R0, R0, #0xF0)
 

R0 ..11110101 Bits 4 to 7 set after execution
- e.g. EOR R2, R0, #...11110000 (e.g. EOR R2, R0, #0xF0)
 

R2 ..10100101 Bits 4 to 7 inverted after execution

©2020 SCSE/NTU

29

CZ1106  
CE1106

## Program Example Alternate LED Flashing

- Turn Green and Red LEDs on and off alternately.
- Green and Red LED states are mapped to bits 3 and 2 of R0 respectively. All other bits must not be affected during pattern change.
- AND is used to turn on the active-low LEDs and ORR is used to turn them off. Two patterns per cycle is needed to alternate the ON and OFF between LEDs.

*battery lights will be on for a while*

Loop AND R0, R0, #0xFFFFFFF7 ; turn on Red (bit 2 = 0)  $\Rightarrow 1011_2$   
                                   ORR R0, R0, #0x00000008 ; turn off Green (bit 3 = 1)  $\Rightarrow 1000_2$   
                                   output pattern in R0 and time delay  $\Rightarrow$  will run so fast that it seems on all the time  
                                   AND R0, R0, #0xFFFFFFF7 ; turn on Green (bit 3 = 0)  $\Rightarrow 0111_2$   
                                   ORR R0, R0, #0x00000004 ; turn off Red (bit 2 = 1)  $\Rightarrow 0100_2$   
                                   output pattern in R0 and time delay

*loop back*

Alternating patterns for bits 3 and 2 in R0

©2020 SCSE/NTU



R0 ..... 4 3 2 1 0  
Bit ..... 4 3 2 1 0

Set a 1 : use ORR with mask 1

Clear 0 : use AND with mask 0

*need to make sure only bit 2 is affected, others may be mapped to other functions*

30

 $\geq 2^3$ 

others

*may be mapped to other functions*

CZ1106  
CE1106**Program Example (Improved version)****Alternate LED Flashing**

- Turn **Green** and **Red** LEDs on and off alternately.
- Green** and **Red** LED states are mapped to **bits 3 and 2** of **R0** respectively. All other bits must not be affected during pattern change.
- Once the alternate pattern for the two LEDs are in place, **EOR** can be used to **flip** or **invert** their state after each cycle.

```

AND R0,R0,#0xFFFFFFFFFB ; turn on Red (bit 2 = 0)
ORR R0,R0,#0x00000008 ; turn off Green (bit 3 = 1)
output pattern in R0 and time delay
Loop EOR R0,R0,#0x0000000C ; flip state of bits 3 and 2
output pattern in R0 and time delay
    
```

loop back

EOR mask = 00 .....1100<sub>2</sub>

mask of 1



R0 ..... 4 3 2 1 0

Alternating patterns for bits 3 and 2 in R0

©2020 SCSE/NTU

31

CZ1106  
CE1106**Shift and Rotate Instructions**

- ARM has several shift and rotate operations:
- Logical Shift Left (**LSL**) and Logical Shift Right (**LSR**).



- Arithmetic Shift Right (**ASR**)



- Rotate Right (**RRX**) and Rotate Right Extended (**RRX**).



↑ w left in ARM

©2020 SCSE/NTU

32

CZ1106  
CE1106

## Doing Arithmetic with Shift

- Shift performs multiply (shift left) or divide (shift right) by a factor of  $2^N$ , where  $N$  is the no. of bits shifted.



- In signed or unsigned **multiply**, binary "0" is shifted into the LSB of the register from the right using Logical Shift Left (**LSL**).
- In **unsigned divide**, binary "0" is shifted into the MSB of the register from the left using Logical Shift Right (**LSR**).
- In **signed divide**, the sign bit is shifted into the MSB from the left using Arithmetic Shift Right (**ASR**).
- The "S" suffix is used on the data processing operator to influence the **C** flag.

©2020 SCSE/NTU

33

CZ1106  
CE1106

## Rotate Operations

- Rotate is also called **cyclical shift**, as no bits in the register is lost during the shifting operation.
- In basic rotate right (**ROR**), the bit shifted out of register is returned in at the leftmost end and is also placed into the **C**-flag.



- In rotate right extended (**RRX**), the **C**-flag is shifted into the register at the leftmost end, while the bit shifted out replaces the current **C**-flag.



©2020 SCSE/NTU

34

CZ1106  
CE1106

## Shift and Rotate Mnemonics

- The ARM **efficiently combines** the shift operation with the data transfer or processing instruction.
- Shift operation is applied to the **rightmost** operand (2<sup>nd</sup> source operand).
- Number of bits to shift is specified as an **immediate** value or a value within a **register** (dynamic shift):

|                                               |                                                                                                |
|-----------------------------------------------|------------------------------------------------------------------------------------------------|
| <b>MOV R0, R0, LSL #1</b> ; R0=R0<<1          | Shift R0 left by 1 bit                                                                         |
| <b>ADD R2, R1, R0, LSR #2</b> ; R2=R1+R0>>1   | ADD R1 with 2-bit right shifted R0. Put result in R2.                                          |
| <b>ADDS R2, R1, R0, LSL R4</b> ; R2=R1+R0<<R4 | Shift R0 by R4 bits before ADD with R1. Update <b>N,Z,V,C</b> flags and result in R2.          |
| <b>ORRS R0, R0, R0, ROR #1</b> ; R0=R0  R0>>1 | OR R0 with a 1-bit right rotated version of itself. Set <b>C</b> flag if bit rotated out is 1. |

©2020 SCSE/NTU

35

↓  
2nd source operand (optional)

CZ1106  
CE1106

**AMR7TDMI  
Processor Logic**

©2020 SCSE/NTU

36

CZ1106  
CE1106**Program Example****Count Number of 1's**

- Count the number of binary 1s in register **R0**.
- Rotate **R0** 32 times (**RRR**), at each rotate use **AND** to mask all bits except LSB. Then add the LSB. The cumulated total will be the number of 1s in **R2**.

```

MOV R2, #0           ; clear 1-counter for binary 1s
MOV R3, #32          ; set loop counter to 32 times
Loop MOV R0, R0, ROR #1 ; rotate right 1 bit
        AND R1, R0, #1 ; clear all bits except LSB
        ADD R2, R2, R1 ; add LSB to 1-counter
        SUB R3, R3, #1 ; decrement loop counter
    loop back 31 times
    (32 loops)          when it gets to 0, exit loop

```

**Count the number of binary 1s in R0**

©2020 SCSE/NTU



37

R0 not destroyed; each rotation — mask everything except LSB

CZ1106  
CE1106**Program Example (optimized version)****Count Number of 1's**

- Count the number of binary 1s in register **R0**.
- **LSR** is used to shift content in **R0** 1 bit at a time into the **C** flag. Then **ADDC** can be used to sum the 1s going into the **C** flag. Loop ends when **R0** has no more 1s and **Z** flag is set.

```

MOV R2, #0           ; clear 1-counter for binary 1s
Loop MOVS R0, R0, LSR #1 ; 1-bit right shift to move LSB
                    ; into C flag
        ADC R2, R2, #0 ; add C flag to 1-counter
    loop back if not zero
    (loop until R0 == 0)

```

**Count the number of binary 1s in R0**

©2020 SCSE/NTU



38

## Quiz: Logical and Shift Instructions

- Given the initial states shown, which is the correct state of R2 after executing the instruction **EOR R2, R0, R1, ASR R3**?

*signed bit*

Initial States

|    |            |
|----|------------|
| R0 | 0x11223344 |
| R1 | 0xF0000000 |
| R2 | 0x12345678 |
| R3 | 0x00000004 |

$$R2 = R0 \text{ EOR } (0xFF000000)$$

$$R2 = 0x\text{EE}223344$$

Flip 8 MSB

$$R2 = 0x12345678$$

$$\begin{aligned} &= (R1 \gg_{\text{sign}} R3) \\ &= (R1 \gg_{\text{sign}} 4) \\ &= (0xF0000000 \gg_{\text{sign}} 4) \\ &= (0xFF000000) \end{aligned}$$

$$R2 = 0x11223344$$

$$R2 = 0x\text{EE}223344$$

*↙ "flipped to EE"*

$$R2 = 0x1E223344$$



©2020 SCSE NTU



26:05 / 30:25

1.5x



## Quiz: An ARM Code Segment

- R0 contains a zero-extended byte read from memory. What would be the value in R0 after executing the code segment shown?

$$R0 \xrightarrow{\text{for extension}} 0x00000080$$

MOV R1, #0  
SUB R2, R1, R0, LSR #7  
ORR R0, R0, R2, LSL #8



$$R0 \dots 000000000001$$

$$\begin{aligned} R0 &= R0 \text{ OR } (R2 \ll 8) \\ R0 &= R0 \text{ OR } 0xFFFFFFF00 \\ R0 &= 0xFFFFFFFF80 \end{aligned}$$

$$\begin{aligned} R2 \text{ left by 8 bits} \\ &= (R2 \ll 8) \\ &= 0xFFFFFFF00 \end{aligned}$$

$$\begin{aligned} R2 &= R1 - (R0 \gg 7) \\ R2 &= 0 - (R0 \gg 7) \\ R2 &= 0 - (1) \leftarrow \text{signed bit} \\ R2 &= 0xFFFFFFFFF0 - 1 \end{aligned}$$

results in signed bit being the least significant bit

This code sign-extends the zero-extended byte fetched from memory to 32-bits

$$R0 = 0x00000000$$

$$R0 = 0xFFFFFFFF$$

$$R0 = 0x00000080$$

$$R0 = 0xFFFFFFFF80$$



©2020 SCSE NTU



29:40 / 30:25

1.5x



CZ1106  
CE1106

## Summary

- Logical instructions such as **AND**, **ORR**, **EOR** can be used to **clear**, **set** and **complement** specific bits in a register, respectively.
- Arithmetic shift instruction can be used as a fast way of implementing **multiplication** and **division** by values of  $2^N$ .
- In the ARM, shift and rotate operations are used in **conjunction** with data transfer and data processing operations.

©2020 SCSE/NTU

39

CZ1106  
CE1106

©2020 SCSE/NTU

40

40

# Lecture (2a/1)

CZ1106  
CE1106

**Adds R0, R0, R1**

What condition code (CC) flag(s) will be set after executing this instruction given the initial states.

**A. No flags**

**B. N**

**C. N and C** *X*

**D. ✓N and V**

**E. N, C, and V**

**C is not set as there is no carry at MSB.**

**N is set as MSB is 1**

**V is set to indicate signed overflow as sign of numbers added are similar but sign of result is different.**

**R0: 0x80000000**  
**R1: 0x00000000**  
**Initial States**

*Note: 1000 was 40000000  
positive + positive cannot be negative*

©2021 SCSE NTU

10:53 / 57:06

1.5x

CZ1106  
CE1106

## Subtraction

$R0 = R1 - R0$

Which option does not give the equivalent of this expression?

**A. SUB R0, R1, R0**

**B. SUB R0, R0, R1  
RSB R0, R0, #0**

**C. MVN R0, R0  
ADD R0, R1, R0  
ADD R0, R0, #1**

**✓D. None of the above**

$\Rightarrow R0 = R1 - R0$  (if it was RSB instead, then it will be wrong)

$\Rightarrow R0 = R0 - R1$

$\Rightarrow R0 = 0 - (R0 - R1) = R1 - R0$

$R0 = \text{NOT}(R0)$

$\Rightarrow R0 = R1 + \text{NOT}(R0)$

$\Rightarrow R0 = R1 + \text{NOT}(R0) + 1 = R1 - R0$

*→ correct in this question*

*two's complement operation = Negate R0*

©2021 SCSE NTU

15:07 / 57:06

1.5x

## Logical Instruction

AND R3, R3, R0

Restriction

Given the initial states below, what will be the result of executing this instruction.

CX1106

A.

|    |              |
|----|--------------|
| R0 | 0xFFFF000000 |
| R3 | 0xFF00000FF  |

R0 cannot change bc destination is R3



B.

|    |              |
|----|--------------|
| R0 | 0xFFFFF00000 |
| R3 | 0xFF0000000  |

R0 acts as a mask.  
For AND, a "0" bit will clear

|    |              |
|----|--------------|
| R0 | 0xFFFFF00000 |
| R3 | 0xFFFFF00FF  |

Correct if operator is ORR.  
A "1" bit in mask will set.

|    |              |
|----|--------------|
| R0 | 0xFFFFF00000 |
| R3 | 0x00FF00FF   |

Correct if operator is EOR.  
A "1" bit in mask will flip

|    |             |
|----|-------------|
| R0 | 0xFFFFE0000 |
| R3 | 0xFF0000000 |

Initial States

PASS / Clear

R0 0xFFFFE0000

R3 0xFF0000000

©2021 SCSE NTU

22:03 / 57:06

1.5x

## Rotate Right Extended

MOV R0, R1, RRX

Given the initial states below, what is the value in R0 after executing this instruction.

CX1106

RRX can only rotate right 1 bit



used to be "1"

R0 0x00000000

R1 0x00000022

N Z C V

CPSR 0 0 1 0

Initial States

C flag is set

- A. R0 0x80000000
- B. R0 0x00000011
- C. ✓ R0 0x80000011
- D. R0 0x00000045



©2021 SCSE NTU

34:42 / 57:06

1.5x

C flag extends the whole register  
⇒ 33 bit rotate (0 - 32)

CZ1106  
CE1106

## Chapter 5

# Instruction Set

## Program Control Instructions

### Learning Objectives (5.4)

1. Describe the various conditional branch instructions and its uses.
2. Describe how conditional test can be implemented.

©2020 SCSE/NTU

41

CZ1106  
CE1106

## Instruction Set – Basic Categories

- Non system-level instructions in a processor can be typically classified into three basic groups:

**Data Transfer**

**ARM examples:**  
**MOV R1,R0**  
**STR R0, [R2 ,#4]**  
**LDR R1, [R2 ]**

**Data Processing**

**ARM examples:**  
**ADD R0,R1,R2**  
**SUB R1,R2,#3**  
**EOR R3,R3,R2**

**Program Control**

**ARM examples:**  
**B Back**  
**BNE Loop**  
**BL Routine**

**Covered in  
Modular  
Programming**

- Data transfer – instructions that move data between registers.
- Data processing – instructions that modify the data through arithmetic, logical or shift operations.
- **Program control** – instructions that alter the normal sequential execution flow of a program.

©2020 SCSE/NTU

42

*Labels must  
be unique*



CZ1106  
CE1106

## Program Control Instructions

- These instructions facilitate the **disruption** of a program's normal **sequential** flow.
- The disruption of sequential flow is implemented by modifying the contents of the Program Counter (**PC**).
- The content of the **PC** can be modified **directly** or by using a **Branch** instruction.
- A jump can be executed based on a given condition (e.g. if result of previous execution is negative) and this is called a **conditional branch**. ↘
- Conditional branch is useful for implementing:
  - conditional constructs (e.g. **if** or **if-else**)
  - loop constructs (e.g. **for** or **while** loops)



©2020 SCSE/NTU

43

CMP subtracts the source operand from the destination register and sets the CC flags according to the results

CZ1106  
CE1106

## Conditional Branch (Bcc)

- ARM provides conditional branch using **Bcc**.
- If the condition specified in the condition field (**cc**) is **true**, a displacement is added to the **PC**, otherwise next instruction is executed.
- Bcc** uses **PC-relative** addressing mode with a displacement range of **±32MB**.
- The **PC** value used to compute required displacement is **8 bytes** ahead of the current **Bcc** being executed.
- Bcc** is used with **address labels** that allows the assembler to compute the required displacement values.



©2020 SCSE/NTU

44

**CZ1106**  
**CE1106**

## Test Conditions for Bcc

- ARM provides **different** conditional branch options.
  - 15 possible conditions is permitted in the condition field (**cc**) using combinations of the **N, Z, V, C** flags.

| e.g. Bcc               | Operation and CC flag conditions             |                                                                                                   |
|------------------------|----------------------------------------------|---------------------------------------------------------------------------------------------------|
| <b>B</b> or <b>BAL</b> | $PC \leftarrow PC \pm n$ <i>displacement</i> | Branch Always  |
| <b>BEQ</b>             | If Z = 1, $PC \leftarrow PC \pm n$           | Branch Equal                                                                                      |
| <b>BVS</b>             | If V = 1, $PC \leftarrow PC \pm n$           | Branch Overflow Set                                                                               |

- **Flexible** conditional branch can be programmed based on outcome of instructions **prior** to **Bcc**.
  - The choice of condition (**cc**) is dependent on whether the test is for a **signed** or **unsigned** computation.



©2020. SCSE/NTU

45

**CZ1106**  
**CE1106**

## Different Bcc Conditions

- There are 15 possible conditional tests for **Bcc**.

| Suffix          | Flags              | Meaning                                                  |
|-----------------|--------------------|----------------------------------------------------------|
| <b>EQ</b>       | Z = 1              | Equal                                                    |
| <b>NE</b>       | Z = 0              | Not equal                                                |
| <b>CS or HS</b> | C = 1              | Higher or same, <b>unsigned</b>                          |
| <b>CC or LO</b> | C = 0              | Lower, <b>unsigned</b>                                   |
| <b>MI</b>       | N = 1              | Negative                                                 |
| <b>PL</b>       | N = 0              | Positive or zero                                         |
| <b>VS</b>       | V = 1              | Overflow                                                 |
| <b>VC</b>       | V = 0              | No overflow                                              |
| <b>HI</b>       | C = 1 and Z = 0    | Higher, <b>unsigned</b>                                  |
| <b>LS</b>       | C = 0 or Z = 1     | Lower or same, <b>unsigned</b>                           |
| <b>GE</b>       | N = V              | Greater than or equal, <b>signed</b>                     |
| <b>LT</b>       | N != V             | Less than, <b>signed</b>                                 |
| <b>GT</b>       | Z = 0 and N = V    | Greater than, <b>signed</b>                              |
| <b>LE</b>       | Z = 1 and N != V   | Less than or equal, <b>signed</b>                        |
| <b>AL</b>       | Can have any value | Always. This is the default when no suffix is specified. |



©2020 SCSE NTU

46

V flag : overflow in signed operations  
C flag : overflow in unsigned  
N/A deletion

CZ1106  
CE1106**Program Example****Count Number of 1's**

- Count the number of binary 1s in register **R0**.
- Loop counter **R3** is initialized to 32 at the start.
- **SUBS** is used to decrement **R3** to zero and set the **Z** flag when that happens.
- **BNE** is used to test for **Z=0**, until **Z=1** (i.e. **R3=0**), it will keep looping back.

```

MOV R2, #0           ; clear 1-counter for binary 1s
MOV R3, #32          ; set loop counter to 32 times
Loop MOV R0, R0, ROR #1 ; rotate right 1 bit
loop back AND R1, R0, #1 ; clear all bits except LSB
31 times ADD R2, R2, R1 ; add LSB to 1-counter
SUBS R3, R3, #1       ; decrement loop counter
BNE Loop             ; loop back until R3=0

```

**Count the number of binary 1's in R0**

©2020 SCSE/NTU

47

CZ1106  
CE1106**Program Example (Alternative Count Loop)****Count Number of 1's**

- Count the number of binary 1s in register **R0**.
- Loop counter **R3** is initialized to **31** at the start.
- **SUBS** decrements **R3** to negative and sets the **N** flag when that happens.
- **BPL** is used to test for **N=0**, until **N=1** (i.e. **R3=-1**), it will keep looping back.

```

MOV R2, #0           ; clear 1-counter for binary 1s
MOV R3, #31          ; set loop counter to 31
Loop MOV R0, R0, ROR #1 ; rotate right 1 bit
loop back AND R1, R0, #1 ; clear all bits except LSB
31 times ADD R2, R2, R1 ; add LSB to 1-counter
SUBS R3, R3, #1       ; decrement loop counter
BPL Loop             ; loop back until R3=-1

```

Instead of testing  
for 0, test  
for N flag

**Count the number of binary 1's in R0**

©2020 SCSE/NTU

48

CZ1106  
CE1106

## Comparing Signed & Unsigned Values

- Appropriate conditional test must be selected based on the number representation used.

- For testing **signed** values, use **GT, LT, GE, LE**.

- e.g.
 

```
SUBS R1,R1,R2 ;R1 = R1 - R2
BGE R1≥R2      ;jump to R1≥R2 if result is positive
:
R1≥R2 :
```

Destination register gets modified when we try to find out if  $R1 \geq R2$

- For testing **unsigned** values, use **HI, LO, HS, LS**.

- e.g.
 

```
SUBS R1,R1,R2 ;R1 = R1 - R2
BHS R1≥R2      ;jump to R1≥R2 if R1 higher or equal to R2
:
R1≥R2 :
```

Note: **R1≥R2** label is only for illustration. The “ $\geq$ ” is not a valid label symbol in the VisUAL ARM simulator

©2020 SCSE/NTU

49

CZ1106  
CE1106

## Conditional Test using CMP

- Use (**CMP**) instead of (**SUBS**) to compare values of two operands without affecting the operands.
- Comparing a register value (signed) to an immediate value.

eg.  $R1 = 4$

```
CMP R1,#4          ; test (R1 - 4), where R1 is a signed no.
BGE R1≥4          ; branch to R1≥4 if result is positive (i.e. R1 ≥ 4)
:
R1≥4 :
```

- Finding C string terminator (0) in memory pointed to by **R0**.

```
Loop LDRB R1,[R0],#1    ; read mem byte using post-index autoindex → load bit in memory
      CMP R1,#0          ; test (R1 - 0)
      BEQ Found          ; branch to Found if value is 0
      B Loop             ; keep branching back to start of Loop
Found :
```

©2020 SCSE/NTU

50

CZ1106  
CE1106

## CMP

- **CMP subtracts** the source operand from the destination register and sets the **CC** flags according to the results.
- Destination register remain **unmodified** after **CMP**.
- CC flags affected in the same manner as the **subtract** instruction (**SUBS**).



R1 modified to achieve desired flow control

achieves  
same  
goal



Same flow control but R1 unchanged

©2020 SCSE/NTU

51

CZ1106  
CE1106

## Other Conditional Test Instructions

- ARM provides several other operators that can be used to influence the conditional test flags.
- These conditional test instructions do not modify the destination operand.
- They do not need the “**s**” suffix to influence the condition code flags (**N,Z,V,C**).

**CMN R0, R1 ; set (N,Z,C,V) based on R0 + R1** Compare Negative

→ “Boolean” operation

**TST R0, R1 ; set (N,Z,C) based on R0 AND R1** Test Bits

→ E.g., TST R0, R1, ROR #1

**TEQ R0, R1 ; set (N,Z,C) based on R0 EOR R1** Test Equivalence

→ e.g. TEQ R0, R1, LSL R2

- The **C** flag for **TST** and **TEQ** can be influenced by applying the shift and rotate operations on the source operand (rightmost).

©2020 SCSE/NTU

52

## Quiz: Loop Count



- How many times will the **BMI Loop** instruction be executed in the code segment shown?

```

MOV R1, # -2
Loop ADDS R1, R1, #1
BMI Loop
N=0
  
```



| Suffix   | Flags | Meaning                  |
|----------|-------|--------------------------|
| EQ       | Z = 1 | Equal                    |
| NE       | Z = 0 | Not equal                |
| CS or HS | C = 1 | Higher or same, unsigned |
| CC or LO | C = 0 | Lower, unsigned          |
| MI       | N = 1 | <b>Negative</b>          |
| PL       | N = 0 | Positive or zero         |

0

1

3



Execute twice. Once branch taken and once branch not taken.

©2020, SCSE NTU



21.09 / 25:19

1.5x



## Quiz: The CMP Instruction



- Which of the following values in **R0** and **R1** will cause the **MOV R2, R3** instruction to be executed?

BHS test unsigned numbers. If unsigned value in  $(R0 \geq R1)$ , BHS will skip over the MOV instruction



| Suffix   | Flags | Meaning                         |
|----------|-------|---------------------------------|
| EQ       | Z = 1 | Equal                           |
| NE       | Z = 0 | Not equal                       |
| CS or HS | C = 1 | Higher or same, <b>unsigned</b> |
| CC or LO | C = 0 | Lower, unsigned                 |

a      R0: 0xABCDFFFF  
R1: 0xABCDFFFF

b      R0: 0x00000001  
R1: 0x00000005

c      R0: 0x00000005  
R1: 0x00000001

d      R0: 0x00000005  
R1: 0xFFFFFFFF

©2020, SCSE NTU



23:44 / 25:19

1.5x



CZ1106  
CE1106

## Summary

- Conditional branch (**Bcc**) allows us to implement conditional and loop constructs.
- Appropriate (**Bcc**) conditions must be selected for the conditional test used.
- The (**cc**) choice needs to take into account of data type being used (i.e. signed or unsigned numbers).
- Appropriate operations (e.g. **CMP** or **SUBS**) are used to set the **N**, **Z**, **V**, **C** flags before conditional test can be done.

©2020 SCSE/NTU

53

CZ1106  
CE1106

©2020 SCSE/NTU

54

54

CZ1106  
CE1106

## Chapter 5

# Program Example

## Finding the Largest Number

### Learning Objectives (5.5)

1. Use appropriate data transfer instructions to retrieve memory arrays efficiently.
2. Use appropriate program control instructions to determine flow of program based on desired outcomes.
3. Implement a simple find max algorithm in ARM assembly.

©2020 SCSE/NTU

55

CZ1106  
CE1106

### Program Example

#### Find Largest Number (FindMax)

- Write an assembly language program to :
  - Find the **largest value** in an integer array and store the result in register **R3**.
  - The array consists of **10 unsigned** numbers stored starting at address **0x100**.
  - **Things to note:**
    - Use correct conditional test for comparing unsigned number.
    - Use appropriate register indirect to access each array element efficiently.
    - Set up appropriate count loop to access all 10 numbers

| Largest Value |            |
|---------------|------------|
| R3 0x00000007 |            |
| Number Array  |            |
| Address       | Memory     |
| [R0] → 0x100  | 0x00000003 |
| 0x104         | 0x00000007 |
| 0x108         | 0x00000004 |
| 0x10C         | 0x00000002 |
| :             | :          |
| :             | :          |
| 0x120         | 0x00000005 |
| 0x124         | 0x00000001 |

©2020 SCSE/NTU

56

CZ1106  
CE1106**Possible Solution****Find Largest Number (FindMax)**

```

MOV R0, #0x100 ;setup pointer to first array element
MOV R1, #9 ;load 9 into counter register
LDR R3, [R0] ;assume 1st no. in array is current max

Loop
Initialisation of registers
    SUBS R1, R1, #1
    BNE Loop

```

| Number Array Memory |            |
|---------------------|------------|
| Address             | Memory     |
| [R0] → 0x100        | 0x00000003 |
| 0x104               | 0x00000007 |
| 0x108               | 0x00000004 |
| 0x10C               | 0x00000002 |
| :                   | :          |

Loop Count R1 0x00000009  
Temp Reg R2  
Current Max R3 0x00000003

R1 > 0  
(8 Times)

R0 = Address pointer for current array element.

R1 = Loop counter register

R2 = Temporary register holding current no.

R3 = Current maximum value (i.e. the result).

©2020 SCSE/NTU

57

CZ1106  
CE1106**Possible Solution****Find Largest Number (FindMax)**

UnSigned

```

MOV R0, #0x100 ;setup pointer to first array element
MOV R1, #9 ;load 9 into counter register
LDR R3, [R0] ;assume 1st no. in array is current max

Loop
    ADD R0, R0, #4 ;increment array pointer to next element
    LDR R2, [R0] ;get next no. in array
    CMP R2, R3 ;compare R3 and R2 (i.e. R2-R3)
    BLS Skip ;branch if R2 ≤ current max (i.e. R3)
    MOV R3, R2 ;update current max. in R3 with R2

Skip
    SUBS R1, R1, #1 ;decrement 1 from counter register
    BNE Loop ;jump back to Loop if not zero

```

sets CC flag  
based on R2-R3

R0 = Address pointer for current array element.

R1 = Loop counter register

R2 = Temporary register holding current no.

R3 = Current maximum value (i.e. the result).

©2020 SCSE/NTU

58

## Quiz: Code Optimisation

- Which mnemonic in **FindMax** can you remove by using a more optimised addressing mode?



```

MOV R0, #0x100 ;setup pointer to first array element
MOV R1, #9 ;load 9 into counter register
LDR R3, [R0] a ;assume 1st no. in array is current max

Loop ADD R0, R0, #4 b ;increment array pointer to next element
LDR R2, [R0] ;get next no. in array
CMP R2, R3 ;compare R3 and R2 (i.e. R2-R3)
BLS Skip c ;branch if R2 < current max (i.e. R3)
MOV R3, R2 ;update current max. in R3 with R2

Skip SUBS R1, R1, #1 ;decrement 1 from counter register
BNE Loop ;jump back to Loop if not zero
    
```

©2020, SCSE-NTU

|| 7:47 / 15:16

1.5x

} refer to next slide  
(conditional Execution)

↓ save a cycle  
to increment point register

Loop LDR R2, [R0, #4]!

⇒ pre-index auto index to next element

⇒ then get array element from memory

MOVH I R3, R2

⇒ current max is update with R2

If R2 > R3

CZ1106  
CE1106

## Conditional Execution

- ARM instructions can be conditionally executed based on the CC flags.

ARM code example

```
; C code
if (R0 == 1)
    R1 = 3;
else
    R1 = 5;
```



```
CMP    R0, #1      ; set CC based on r0 -1
BNE    ELSE        ; if (R0 == 1)
                  ; then { R1 := 3}
MOV    R1, #3      ; skip over else code seg
B     SKIP         ; else { R1 := 5}
ELSE   MOV    R1, #5
SKIP   .....      ;
```

- The conditional execution feature allows us to make the execution of each instruction dependent on the current status of the **N**, **Z**, **V**, **C** flags.

```
CMP    R0, #1      ; if (r0 == 1)
Execute if Z=1    MOVEQ  R1, #3  / will be ; then { r1 := 3}
Execute if Z=0    MOVNE  R1, #5  / will be ; else { r1 := 5}
SKIP   .....      ;
```

↑ will be executed

©2020 SCSE/NTU

59

CZ1106  
CE1106

## Summary

- Register indirect** addressing modes (with and without autoindexing) can be used to access array elements in memory.
- Conditional branch (**Bcc**) allows us to implement conditional and loop constructs.
- Appropriate conditions (e.g. **LS** and **NE**) must be selected to implement the required test.
- Appropriate operations (e.g. **CMP** or **SUBS**) are used to set the (**N**, **Z**, **V**, **C**) flags before conditional test can be done.
- Conditional execution (e.g. **MOVHI** or **ADDEQ**) can be used to avoid doing conditional branching.

©2020 SCSE/NTU

↳ good for optimisation

60

## Analysing a Loop Construct



CX1106

Jump to  
Loop

```

    MOV R1, #0xFFFFFFFF
Loop ADDS R1, R1, #1
      BNE Loop
  
```

Z = 1

- A. 0  
B. 1  
C. ✓2  
D. 3

How many times is the ADDS instruction executed?

(-2)

(1)

All is &amp; negative

```

    MOV R1, #0xFFFFFFFF
      ADDS R1, R1, #1
      ADDS R1, R1, #1
  
```

0 Z flag

1 Z flag

R1 goes from -2 to -1 (not zero yet).

R1 zero, so exit loop



©2021, SCSE NTU

- B & BTL is the same

## The CMP Instruction



```

Loop SUB R1, R1, R1 → becomes 0 → R1 cleared
      CMP R1, #1
      BEQ Exit
      ADD R1, R1, #1
      B Loop
Exit :
  
```

- A. 0  
B. 1  
C. ✓2  
D. 3

How many times is the CMP instruction executed?

and 0

Z = 1

```

      SUB R1, R1, R1
      CMP R1, #1
      ADD R1, R1, #1
      CMP R1, #1
  
```

0 Z flag

1 Z flag

not possible  
achieving

R1 is 0 and not 1, so CMP with #1 does not set Z flag.

R1 is 1, so CMP with #1 will set the Z flag.



## The Conditional Test (Bcc)



Which content in R1 will avoid the execution of the MOV R2, R3 instruction?

- A. ✓ R1=0xFFFFFFF D
- B. R1=0xFFFFFFF E
- C. R1=0xFFFFFFF F
- D. R1=0x00000000

R1 = -3

R1 = -2

R1 = -1

R1 = 0

→ signed comparison  
(link of table)

```
CMP R1, #0xFFFFFFFF
BLT NEXT
MOV R2, R3
:
```

Next

Read this as:  
Branch to NEXT  
(i.e. avoid MOV R2, R3)  
if R1 is less than -2.  
i.e. (R1 < -2) or (R1 LT -2)



©2021, SCSE NTU



34:30 / 56:22

1.5x



## Another Conditional Test (Bcc)



Which content in R2 will avoid the execution of the MOV R2, R3 instruction?

- A. R2=0x00000001
- B. R2=0xFFFFFFF E
- C. R2=0xFFFFFFF F
- D. ✓ R2=0x00000000

R2 = +1

R2 = +big

R2 = +big

R2 = 0

higher than or  
in same  
unrelated

```
MOV R1, #0
CMP R1, R2
BHS Next
MOV R2, R3
:
```

Next

Read this as:  
Branch to NEXT  
(i.e. avoid MOV R2, R3)  
if (R1=0) is higher than or  
same as R2, i.e. (R1 ≥ R2)



©2021, SCSE NTU

