

# ARM Architecture

Matt Lamont

# History

- Acorn tested the top processors of the day
- Decided to create a new processor

# History

- Sophie Wilson (lead engineer at Acorn) designed the ARM instruction set, Acorn's first RISC instruction set, and simulated it in BBC BASIC



# History

- The ARM1 (Acorn RISC Machine) processor was fabricated in April 1985
- In 1987 the ARM2 was first used in Acorn's Archimedes Personal Computer



# History

- The revolutionary architecture of the ARM processor even scored Acorn a Queen's Award for Technology



# RISC Based Architecture

- Pioneered by David Patterson at the University of California, Berkeley
- Found that programs used only about 30% of the available hardware instructions



# RISC Based Architecture

- Remove unused hardware instructions and use the freed space for registers
- The first Berkeley RISC processor (RISC I) had 78 32-bit registers and only 31 instructions
- Compare the Motorola 68000, which had 16 registers and 56 instructions



# RISC Based Architecture

- RISC is a load/store architecture
- All operands are loaded into registers, and instructions operate only on registers
- Instruction hardware takes up less space, but code density is reduced
- Programs are larger than with traditional (CISC) instruction sets

# ARM Registers

- 30 general purpose registers
- 1 dedicated program counter register
- 1 dedicated program status register (CPSR)
- 5 saved program status registers (SPSR), one per exception mode

# CPSR Register



# SPSR Register

- When an exception (such as an interrupt) is taken, the CPSR is saved to the SPSR of the mode being entered
- The contents are transferred back to the CPSR after the exception is handled
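
The save and restore described above can be sketched as a toy Python model. This is an illustration under assumptions, not a description of real hardware timing: the `Cpu` class and its method names are hypothetical, and only the mode field (CPSR bits 4:0) and the IRQ-disable bit (bit 7) are modelled.

```python
# Toy model of CPSR/SPSR handling on exception entry and return.
# The class and method names are illustrative assumptions, not an ARM API.

MODE_MASK = 0x1F          # CPSR bits 4:0 hold the processor mode
MODE_USER = 0b10000
MODE_IRQ = 0b10010
IRQ_DISABLE = 1 << 7      # CPSR I bit: masks normal interrupts
Z_FLAG = 1 << 30          # CPSR Z (zero) flag


class Cpu:
    def __init__(self):
        self.cpsr = MODE_USER
        self.spsr = {}    # one saved status register per exception mode

    def enter_exception(self, mode):
        # Copy the current status into the SPSR of the mode being entered.
        self.spsr[mode] = self.cpsr
        # Switch mode and mask IRQs while the handler runs.
        self.cpsr = (self.cpsr & ~MODE_MASK) | mode | IRQ_DISABLE

    def return_from_exception(self):
        # Restore the status that was saved on entry.
        mode = self.cpsr & MODE_MASK
        self.cpsr = self.spsr[mode]


cpu = Cpu()
cpu.cpsr = MODE_USER | Z_FLAG         # User mode with the Z flag set
before = cpu.cpsr
cpu.enter_exception(MODE_IRQ)
in_handler = cpu.cpsr
cpu.return_from_exception()
after = cpu.cpsr
print(hex(before), hex(in_handler), hex(after))
```

Keeping one SPSR per exception mode is what lets the interrupted program resume with its flags and mode untouched.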

# ARM Instruction Set

- Source and destination operands are registers only
- Every instruction contains a 4-bit condition field that determines whether it executes
- Lower code density
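
As a small illustration of the condition field, the sketch below pulls it out of a 32-bit instruction word. It assumes the classic 32-bit ARM encoding, where the field occupies the top four bits (31:28); the function name and the two sample words are chosen here for illustration.

```python
# Extract the 4-bit condition field from a 32-bit ARM instruction word.
# Assumes the classic ARM encoding, with the field in bits 31:28.

def condition_field(instruction: int) -> int:
    return (instruction >> 28) & 0xF


MOV_R0_R0 = 0xE1A00000   # MOV r0, r0 with condition AL (0b1110)
BEQ_ZERO = 0x0A000000    # BEQ with a zero offset, condition EQ (0b0000)

print(condition_field(MOV_R0_R0))  # 14
print(condition_field(BEQ_ZERO))   # 0
```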

# Condition Field

| Suffix | Flags                         | Meaning                                     |
|--------|-------------------------------|---------------------------------------------|
| EQ     | Z set                         | Equal                                       |
| NE     | Z clear                       | Not Equal                                   |
| CS/HS  | C set                         | Higher or same ( unsigned $\geq$ )          |
| CC/LO  | C clear                       | Lower ( unsigned $<$ )                      |
| MI     | N set                         | Negative                                    |
| PL     | N clear                       | Positive or zero                            |
| VS     | V set                         | Overflow                                    |
| VC     | V clear                       | No overflow                                 |
| HI     | C set and Z clear             | Higher ( unsigned $>$ )                     |
| LS     | C clear or Z set              | Lower or same ( unsigned $\leq$ )           |
| GE     | N and V the same              | Signed $\geq$                               |
| LT     | N and V different             | Signed $<$                                  |
| GT     | Z clear, and N and V the same | Signed $>$                                  |
| LE     | Z set, or N and V different   | Signed $\leq$                               |
| AL     | Any                           | Always ( This suffix is normally omitted. ) |
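
The table can be read as a set of boolean tests on the N, Z, C, and V flags. The Python sketch below encodes each row directly. `condition_passed` and `cmp_flags` are hypothetical helper names; `cmp_flags` assumes the flags are produced by a 32-bit subtraction, as a compare instruction would.

```python
# Evaluate ARM condition suffixes against the N, Z, C, V flags.
# Helper names are illustrative; the flag tests follow the table above.

def condition_passed(suffix: str, n: bool, z: bool, c: bool, v: bool) -> bool:
    checks = {
        "EQ": z, "NE": not z,
        "CS": c, "HS": c, "CC": not c, "LO": not c,
        "MI": n, "PL": not n,
        "VS": v, "VC": not v,
        "HI": c and not z, "LS": (not c) or z,
        "GE": n == v, "LT": n != v,
        "GT": (not z) and n == v, "LE": z or n != v,
        "AL": True,
    }
    return checks[suffix]


def cmp_flags(a: int, b: int):
    """Flags produced by the 32-bit subtraction a - b."""
    a &= 0xFFFFFFFF
    b &= 0xFFFFFFFF
    result = (a - b) & 0xFFFFFFFF
    n = bool(result >> 31)                     # sign bit of the result
    z = result == 0
    c = a >= b                                 # carry set means no borrow
    v = bool(((a ^ b) & (a ^ result)) >> 31)   # signed overflow
    return n, z, c, v


# -1 compared with 1: signed less-than, but unsigned higher.
flags = cmp_flags(-1, 1)
print(condition_passed("LT", *flags), condition_passed("HI", *flags))
```

The last two lines show why separate signed and unsigned suffixes exist: the same bit pattern compares differently depending on interpretation.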

# ARM Instruction Set

## **Traditional Subtract Instruction**

Subtract operand\_X , operand\_Y

## **RISC Subtract Instruction**

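```
load     register1 , mem_location_X   ; fetch operand X into a register
load     register2 , mem_location_Y   ; fetch operand Y into a register
subtract register1 , register2        ; operate on registers only
store    register1 , mem_location_X   ; write the result back to memory
```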
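
The load/store pattern can be mimicked with a toy register machine in Python. Everything here is a hypothetical sketch: the `memory` and `registers` dictionaries, the helper names, and the final store (an assumption that the result is written back to location X).

```python
# Toy load/store machine: arithmetic touches registers only, never memory.
# Names and values are illustrative assumptions.

memory = {"X": 10, "Y": 3}
registers = {}


def load(reg, location):
    registers[reg] = memory[location]


def store(reg, location):
    memory[location] = registers[reg]


def subtract(dst, src):
    registers[dst] = registers[dst] - registers[src]


load("r1", "X")        # memory read
load("r2", "Y")        # memory read
subtract("r1", "r2")   # register-only arithmetic
store("r1", "X")       # memory write
print(memory["X"])     # 7
```

One memory-to-memory instruction becomes several simpler ones, which is where the lower code density comes from.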

# Thumb Instruction Set

- Introduced in 1994 with ARM7TDMI
- 16-bit instruction set
- Uses implicit operands (registers)
- Many instructions map directly onto ARM instructions

# Jazelle Instruction Set

- Allows Java bytecode to be executed more easily on the processor
- Significantly reduces software interpretation of Java instructions



# Processor Modes

Supervisor, FIQ, IRQ, Abort, and Undef are the exception modes.

| Mode             | Description                                                               | Privilege    |
|------------------|---------------------------------------------------------------------------|--------------|
| Supervisor (SVC) | Entered on reset and when a Supervisor call instruction (SVC) is executed | Privileged   |
| FIQ              | Entered when a high priority (fast) interrupt is raised                   | Privileged   |
| IRQ              | Entered when a normal priority interrupt is raised                        | Privileged   |
| Abort            | Used to handle memory access violations                                   | Privileged   |
| Undef            | Used to handle undefined instructions                                     | Privileged   |
| System           | Privileged mode using the same registers as User mode                     | Privileged   |
| User             | Mode under which most applications / OS tasks run                         | Unprivileged |
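
As a rough sketch of how software can tell these modes apart, the Python below decodes the mode field held in the low five bits of the CPSR. The bit patterns are the standard mode encodings; the `MODES` table and `decode_mode` function are names invented for this example.

```python
# Decode the processor mode from the low five bits of a CPSR value.
# The table and function names are illustrative assumptions.

MODES = {
    0b10000: ("User", False),
    0b10001: ("FIQ", True),
    0b10010: ("IRQ", True),
    0b10011: ("Supervisor", True),
    0b10111: ("Abort", True),
    0b11011: ("Undef", True),
    0b11111: ("System", True),
}


def decode_mode(cpsr: int):
    """Return (mode name, is_privileged) for a CPSR value."""
    return MODES[cpsr & 0x1F]


print(decode_mode(0x600000D3))  # ('Supervisor', True)
print(decode_mode(0x00000010))  # ('User', False)
```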

# ARM Endianness

- Supports both little-endian and big-endian data
- Selected by setting or clearing bit 9 (the E bit) of the CPSR
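
To make the two byte orders concrete, the sketch below lays out the same 32-bit word both ways with Python's `struct` module and toggles bit 9 of a CPSR value. The helper names and the sample word are assumptions for illustration; the code only manipulates integers and does not model how a real core changes its endianness.

```python
import struct

# The same 32-bit word stored in both byte orders.
word = 0x11223344
little = struct.pack("<I", word)   # least significant byte first
big = struct.pack(">I", word)      # most significant byte first
print(little.hex(), big.hex())     # 44332211 11223344

E_BIT = 1 << 9                     # CPSR bit 9 selects data endianness


def set_big_endian(cpsr: int) -> int:
    return cpsr | E_BIT


def set_little_endian(cpsr: int) -> int:
    return cpsr & ~E_BIT


def is_big_endian(cpsr: int) -> bool:
    return bool(cpsr & E_BIT)
```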

# ARM Barrel Shifter

- Specialized hardware
- Carries out shift and rotate operations
- Useful for multiplying or dividing by powers of two and for calculating addresses
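
The shifter's operations and two typical uses can be sketched on 32-bit values in Python. The function names are chosen here for illustration; the comments name the ARM instruction forms that each use resembles.

```python
# 32-bit shift and rotate operations, plus two common uses.
# Function names are illustrative assumptions.

MASK32 = 0xFFFFFFFF


def lsl(x: int, n: int) -> int:
    """Logical shift left."""
    return (x << n) & MASK32


def lsr(x: int, n: int) -> int:
    """Logical shift right (zero fill)."""
    return (x & MASK32) >> n


def asr(x: int, n: int) -> int:
    """Arithmetic shift right (sign fill)."""
    x &= MASK32
    if x & 0x80000000:
        x -= 1 << 32
    return (x >> n) & MASK32


def ror(x: int, n: int) -> int:
    """Rotate right."""
    n %= 32
    x &= MASK32
    return ((x >> n) | (x << (32 - n))) & MASK32


def times5(x: int) -> int:
    # Resembles ADD r0, r1, r1, LSL #2: x + 4x in a single instruction.
    return (x + lsl(x, 2)) & MASK32


def element_address(base: int, index: int) -> int:
    # Resembles LDR r0, [r1, r2, LSL #2]: address of a 4-byte array element.
    return (base + lsl(index, 2)) & MASK32


print(times5(7), hex(element_address(0x1000, 3)))  # 35 0x100c
```

Because the shift is applied to an operand on its way into the ALU, the scaled add or address calculation costs no extra instruction.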



# ARM1

- 32-bit instructions
- 26-bit address space
- Mainly RISC-based architecture
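
For a sense of scale, the short calculation below converts an address width into the amount of memory it can reach, assuming byte addressing. The function name is hypothetical; the 32-bit figure is included for comparison with the later ARM6.

```python
# Size of the address space reachable with a given number of address bits,
# assuming each address names one byte.

def address_space_bytes(address_bits: int) -> int:
    return 1 << address_bits


MIB = 1 << 20
print(address_space_bytes(26) // MIB)   # 64   (MiB with 26-bit addresses)
print(address_space_bytes(32) // MIB)   # 4096 (MiB, i.e. 4 GiB, with 32-bit addresses)
```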



# ARM2

- Multiplication instruction added
- Memory controller
- Support for graphics and I/O
- 4 MIPS @ 8 MHz



# ARM3

- 4 KB cache
- 12 MIPS @ 25 MHz



# ARM6

- Address Space extended to 32 bits
- Coprocessor bus added
- 28 MIPS @ 33 MHz



# ARM7

- Thumb Instruction Set
- 130 MIPS @ 40 MHz



# ARM7TDMI

- Introduced hardware for debugging (ICEBreaker)
- Fast multiplier



# ARM7EJ

- 5-stage pipeline
- Jazelle DBX instruction set
- Enhanced DSP instruction set  
(better digital signal processing)



# ARM8

- Double-bandwidth memory
- Memory reads and writes could occur on both the rising and falling edges of the clock
- 84 MIPS @ 72 MHz



# ARM9TDMI

- Larger Memory Management Units
- Memory Protection Units
- 200 MIPS @ 180 MHz



# ARM9E

- Incorporated Tightly Coupled Memory (similar to cache, without the unpredictability)
- ARM996HS was a clockless design
- 220 MIPS @ 200 MHz



# ARM10E

- 6-stage pipeline
- Increased cache sizes
- VFP (Vector Floating Point) instruction set for signal processing and media acceleration



# ARM11

- 8-stage pipeline
- SIMD (single instruction, multiple data) instruction set
- TrustZone hardware implementation
- 965 DMIPS @ 772 MHz
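
To illustrate what "single instruction, multiple data" means here, the sketch below adds four 8-bit lanes packed into one 32-bit word. It is modelled loosely on the ARMv6 `UADD8` instruction; the Python function is an assumption-laden approximation that ignores the status flags the real instruction updates.

```python
# Packed add of four unsigned 8-bit lanes held in one 32-bit word.
# Loosely modelled on ARMv6 UADD8; flag updates are not modelled.

def uadd8(a: int, b: int) -> int:
    result = 0
    for lane in range(4):
        shift = lane * 8
        lane_sum = ((a >> shift) & 0xFF) + ((b >> shift) & 0xFF)
        result |= (lane_sum & 0xFF) << shift   # each lane wraps modulo 256
    return result


print(hex(uadd8(0x01020304, 0x10203040)))  # 0x11223344
print(hex(uadd8(0x000000FF, 0x00000001)))  # 0x0 (no carry into the next lane)
```

An ordinary 32-bit add of the second pair would give `0x100`; keeping the lanes independent is what makes the operation useful for pixel and audio data.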



# Cortex-M

- Made specifically for microcontrollers (used on some Arduino boards)
- Uses Thumb instructions only (mostly 16-bit encodings)
- No cache or TCM
- 1.25 DMIPS per MHz



# Cortex-R

- Made for real-time embedded systems
- Very high error resistance
- High clock frequencies
- Low latency memory units



# Cortex-A

- Superscalar pipeline
- Hardware virtualization
- Dedicated interrupt processing
- Powerful media processing
- 4 DMIPS per MHz



# The Next Generation

- AArch64 64-bit instruction set
- 64-bit address space
- Dedicated hardware for cryptography



# Hot Dog

