

UNIT - VARM ARCHITECTURE AND PROCESSOR

ARM processor: (Advanced RISC Machine)

Used in Music players, Smartphones, wearables, tablets and other Consumable Electronic devices.

1. It needs very few instructions and transistors
2. It is very small size, which is perfect for Small size devices.
3. It has less power Consumption along with reduced Complexity in the circuit
4. It can be Used for 32-Bit devices Eg. Embedded Systems.

Features of ARM processor: (32 bit processor)

1. Multiprocessing System: ARM processors are designed where more than One processor are Used to process information. first ARM Processor introduced by name of ARMv6k had Capability to Support "4 CPU's" along with its hardware.
2. Tightly Coupled Memory: Memory of ARM processor is tightly coupled This has very fast response time. It has low latency that can also be Used in Cases of Cache Memory (latency - the instruction starts Executing to get Correct Obj.)
3. Memory Management: It includes Memory Management Unit and Memory protection Unit.
4. Thumb-2 Technology: Thumb-2 instruction set Combining both

16 bit & 32 bit instructions to increase the performance in new embedded systems and mobile phones with longer battery life.

5. One cycle Execution time: Each instruction is of fixed length that allows time for fetching next instruction before executing the present instruction. ARM has CPI (clock per instruction) of one cycle.

6. Pipelining: Instructions are broken down and decided in one pipeline stage. The pipeline advances one step at a time to increase the throughput (Rate of processing).

7. Large number of registers: Large number of registers are used in ARM processor to prevent large amount of memory interactions. These act as local memory store for all operations.

→ ARM processor families:

| Architecture version   | Processor families | Processor                     | Features                          |
|------------------------|--------------------|-------------------------------|-----------------------------------|
| ARM V4T                | ARM7TDMI           | ARM720T                       | Von-Neumann<br>3-stage pipeline   |
|                        | ARM9TDMI           | ARM920T<br>ARM922T<br>ARM942T | MMU, Hardvard<br>5-stage pipeline |
| ARM V5TE,<br>ARM V5TEJ | ARM9E              | ARM926EJ-S                    | MMU, DSP, Jazelle                 |
|                        |                    | ARM946E-S                     | MPU, DSP                          |
|                        |                    | ARM966 HS                     | MPU (optional), DSP               |
| ARM10F                 | ARM10F             | ARM1020F                      | MMU, DSP                          |
|                        |                    | ARM1026EJ-S                   | MMU (MPU, DSP, Jazelle)           |

|          |        |                |                                                                             |
|----------|--------|----------------|-----------------------------------------------------------------------------|
| ARM V6   | ARM 11 | ARM1156T2(F)-S | MPU, DSP                                                                    |
|          |        | ARM1176JZ(F)-S | MMU, Trust Zone, DSP, Jazelle                                               |
| ARM V6-M | Cortex | ARM11 MP core  | MMU, Multiprocessor Cache                                                   |
|          |        | Cortex-M0      | Support, DSP, Jazelle<br>NVIC                                               |
|          |        | Cortex-M1      | FPGA Tag interface, NVIC                                                    |
| ARM V7-M | Cortex | Cortex-M3      | MPU (optional), NVIC                                                        |
|          |        | Cortex-R4      | MPU, DSP                                                                    |
|          |        | Cortex-R4F     | MPU, DSP, floating point                                                    |
| ARM V7-R | Cortex | Cortex-A8      | MMU, Trust zone, DSP,<br>Jazelle, Neon, floating point                      |
|          |        | Cortex-A9      | MMU, Trust zone, multi-<br>processor, DSP, Jazelle, Neon,<br>floating point |
|          |        |                |                                                                             |

### ARM XYZTDMLFEJFS

X - Series

Y - MMU

Z - Cache

T - Thumb

D - Debugger

M - Multiplier

L - Embedded In-circuit Emulator (ICE) macrocell

E - Enhanced instructions for DSP

J - Java Acceleration by Jazelle

F - floating point

S - Synthesizable version

→ ~~ARM CORTEX - M Series~~ family: [Cortex - M → Cortex M<sub>core</sub> + FPU]

1. CORTEX - M Series are Used in Motion Sense devices, Smart lighting, Automotive, Retail & Healthcare industries, Energy grid.
2. CORTEX - M processor family is based on the M - Architecture, Provides "Low - latency" & highly "deterministic operation" for Embedded Systems.

#### a) CORTEX - M55:

ARM CORTEX - M55 processor is Arm's most AI capable CORTEX - M processor & it uses ARM Helium Vector processing technology"

#### b) CORTEX - M35P:

A Tamper - resistant CORTEX - M processor with optional Software Isolation Using Trust Zone for ARMV8 - M

#### c) CORTEX - M33:

ARM CORTEX - M33 processor has real - time determinism efficiency & Security.

#### d) CORTEX - M23:

ARM CORTEX - M23 is the Smallest & low power microcontroller with trust zone Security.

#### e) CORTEX - M7:

It is the Energy efficient, highest performance processor

#### f) CORTEX-M4:

ARM CORTEX-M4 processor implements a good blend of control and performance for mixed-signal devices.

#### g) CORTEX-M3:

ARM CORTEX-M3 processor is the industry leading "32-bit Processor" for highly deterministic real-time applications.

#### h) CORTEX-M1:

ARM CORTEX-M1 processor targets "FPGA" devices.

#### i) CORTEX-M0+:

It is the most energy efficient ARM processor.

#### j) CORTEX-M0:

It is the smallest ARM processor.

→ ~~ARM~~ Architecture:



ARM Architecture Consists of

1. Load / Store Architecture
2. A Large Array of Uniform Registers
3. Fixed length 32-bit instructions
4. 3 - Address Instructions.

a) Registers:

Only 16 registers are visible to Specific mode. A mode could access

1. A particular set of  $r_0 - r_{12}$
2.  $r_{13}$  (Stack pointer, SP)
3.  $r_{14}$  (Link Register, LR)
4.  $r_{15}$  (Program Counter, PC)
5. Current program Status Register (CPSR)

General-purpose Registers:



1. 6 data types (Signed / Unsigned)
2. All ARM operations are 32-bit. Shorter data types are only supported by data transfer operations.

### b) Program Counter:

1. Store the address of the instruction to be executed.
2. All instructions are 32-bit wide and word aligned.
3. Last two bits of PC are undefined.

### c) program Status Register (PSR):



### c) Processor Modes:

| Processor Mode   | Description                                 |
|------------------|---------------------------------------------|
| User (usr)       | Normal program Execution mode               |
| FIQ (fiq)        | Supports a high speed data transfer         |
| IRQ (irq)        | Used for general purpose interrupt handling |
| Supervisor (svc) | A protected mode for the operating system   |
| Abort (Abt)      | Implements virtual memory                   |
| Undefined (Und)  | Supports Software Emulation                 |
| System (sys)     | Runs privileged system tasks                |

### Barrel Shifter:

A barrel shifter is a digital circuit that can shift a data word a specified number of bits without the use of sequential logic.

### Booth Multipliers:

Booth Multiplier uses Booth Multiplication Algorithm. It is used for both multiplication and division operations used for Load and Store instructions.

### Memory Address Register (MAR):

It holds the address of the memory location to (or) from which the data is to be transferred.

### Memory Buffer Register (MBR):

It is used to store addresses which act as connection between memory address & memory data register.

## → ARM Cortex-M3 Functional description:



### Features:

1. A Low Cost processor Core, with Low Latency interrupt processing - that has:
  - a) Banked Stack pointer (sp)

- b) Hardware divide instructions, SDIV & UDIV
- c) Thumb and Debug states
- d) Support for interruptible Continued instructions LDM, STM, PUSH & POP for low interrupt latency
- e) Automatic processor state Saving & restoration for low latency interrupt Service Routine (ISR).
- f) Support for ARM V6 big Endian byte invariant (or) little Endian process.

## 2. Nested Vectorized interrupt Controller: (NVIC)

- NVIC closely integrated with the processor Core to achieve low latency interrupt processing! features include:
- a) External interrupt, Configurable from 1 to 240.
- b) Bits of priority, Configurable from 3 to 8.
- c) Dynamic reprioritization of interrupt
- d) Support for tail-chaining and late arrival of interrupt  
This enables back-to-back interrupt processing with overhead of state.
- e) optional wake-up interrupt Controller (WUC), providing ultra-low power sleep mode.

## 3. Memory protection Unit (MPU): It consists of

a, Eight Memory regions

b, Sub Region Disable (SRD), enabling efficient use of

## Memory regions.

4. Bus interfaces:
  - a) three advanced High-performance Bus Lite (AHB-Lite) interfaces: S code, D code & System bus interfaces.
  - b) private peripheral Bus (PPB) based on Advanced peripheral Bus (APB) interface.
  - c) Bit-band Support that includes atomic bit band write & Read operation.
  - d) Memory Access alignment.
  - e) Exclusive access transfers for Multiprocessor Systems.
5. Low Cost Debug Solution features:
  - a) Debug access to all memory and registers in the system, including access to Memory mapped devices, when the Core is halted access to internal Core registers and access to debug Control registers even "SYSRESETn" is asserted.
  - b) Serial wire - Debug port (SW-DP) or Serial wire JTAG Debug Port (SWJ-DP) debug access.
  - c) "optional flash patch & Breakpoint (FPB)" Unit for implementing Breakpoints & Code patches.
  - d) "optional Instrumentation Trace Macrocell" (ITM) for Support of "printf" style debugging.
  - e) "optional trace port interface Unit" (TPIU) for bridging to a trace port Analyzer (TPA), including Single wire output (SWO) mode.
  - f) "optional Embedded Trace Macrocell" for instruction trace.

→ Modes of operation and Execution of ARM:

ARM processor supports "THREAD" and "HANDLER" operating modes and may be run in "THUMB" (or) "DEBUG" operating states. In addition, the processor can limit (or) exclude access to some resources by executing code in "privileged" (or) "Unprivileged" mode.

### 1. Operating Modes:

The conditions which cause the processor to enter into Thread (or) Handler mode are as follows:

a) The processor enters "thread mode" on Reset, (or) as a result of an "exception" return. "privileged" and "Unprivileged" code can run in "Thread" mode.

b) The processor enters "Handler mode" as a result of exception. All code is "privileged" in handler mode.

### 2. Operating states:

The processor can operate in thumb (or) Debug state:

a) Thumb state: This is normal execution while running on 16 bit & 32-bit aligned thumb instructions.

b) Debug state: This is the state when the processor is in "Debug".

## → NVIC Programmers Model:

Address of NVIC registers:

|                           |
|---------------------------|
| 0x F000F004               |
| 0x F000F100 — 0x F000F11C |
| 0x F000F180 — 0x F000F19C |
| 0x F000F200 — 0x F000F21C |
| 0x F000F280 — 0x F000F29C |
| 0x F000F300 — 0x F000F31C |
| 0x F000F400 — 0x F000F41F |

## → Loops, Stack and Stack pointer:

Loops:

1. A Loop is a block of statements that are repeatedly executed until a Condition is satisfied.
2. Assembly language uses "JMP" instruction to implement loops
3. "Loop" instruction extract the number of iterations from the "ECX" register.
4. Loop instruction decrements the value of "ECX" and compares it with zero.
5. If the value in the "ECX" is equal to zero, the program jumps to the loop; otherwise the program exits the loop.

Eg:

```
Mov X5, #1  
Mov X4, #100
```

[start index = 1]

[End index = 100]



-for loop: // ... code

X5 <= 100

B.LF for-loop

↓ X5 > 100

// ... code

```
ADD X5, X5, #1
```

[increment index by 1]

```
CMP X5, X4
```

[check whether End index has

~~Stack:~~

1. The processor Uses full descending stack - the stack pointer holds the address of the last stacked item in memory.
2. When the processor pushes a new item onto stack, it decrements the stack pointer & writes the item to the new memory location.
3. The processor implements two stacks, the Main stack and the process stack, with a pointer held in independent registers.
4. In THREAD mode, the control register controls whether the processor uses the Main stack (or) the process stack, in handler mode, the processor uses Main stack.

|                |                    |
|----------------|--------------------|
| Processor Mode | Used to execute    |
| THREAD         | Applications       |
| HANDLER        | Exception handlers |

### Stack pointer:

1. A "Stack pointer" is a small register that stores the address of the last program request in "Stack".
2. "Stack" is a specialized "buffer" which stores data from top down.
3. A stack also called "pushDown STACK" operates in last-in-first-out.
4. When a new data enter (or) pushed onto the top of stack, the stack pointer "increments" to the next memory address.
5. When a data is pulled (or) popped from the top of stack, the data is copied from the address of stack pointer, and the stack pointer "decrements" to the next available item at the top of stack.

### Subroutine and parameter passing:

#### Subroutine:

Group of instructions which are written separately to perform a specific task is called "Subroutine".

#### 1. passing parameter via stack:

To pass parameters to a Subroutine, the calling program

Pushes them on the stack in the reverse Order so that the last parameter to pass is the first one pushed and the first parameter to pass is the last one pushed



### 2. Passing parameters via Registers:

1. when the Subroutine called, it simply access the right register to pass the parameter.
2. By putting the address of some parameter in memory in an address register, the Subroutine neednot access memory in Order to fetch the parameters.

### 3. passing parameters via Memory locations:

1. It is normally used to pass data to memory mapped I/O devices as the interrupt handlers for these devices expect the data transfer to be in a given location, which prohibits the use of stack.

Eg: Mov D0, MAILBOX (put value of D0 in Memory)

BSR POSTMAN

...

POSTMAN : Mov MAILBOX

(fetch the data from memory location)

...

86

## ~~to~~ → Programming techniques:

ARM CORTEX-M3 can be programmed Using either "Assembler" or "C". There might be Compilers for other languages but mostly used Assembler, C, (or) a Combination of both.

### 1. Using Assembly:

for Small projects, it is possible to develop the whole application in Assembly language. In addition, handling Complex data structures (or) function library management is difficult in Assembler.

1. Functions that cannot be implemented in 'C' such as Special register access.
2. Timing Critical routines.
3. Tight Memory requirements.

### 2. Using C:

1. 'C' has the advantage of being portable & easier for implementing Complex operations.
2. It is a Generic Computer language, 'C' does not specify how the processor is initialised.

#### Eg: Using Assembly:

START:

```
Mov r0, #10  
Mov r1, #3  
Add r0, r0, r1
```



Application Execution

```

    Mov r0, # 0x18
    %Mo LDR r1, =0x20026
    SVC # 0x123456 } Application termination
    END

```

program end

Eg: Using C

i = 10

PRINT " \* ";

i = i + 1

### → Parallel I/O:

- 1. "parallel port," multiple bits read/written in parallel by CPU
- 2. "parallel input port," CPU can access information from an external device
- 3. "parallel output port," CPU can send information to an external device
- 4. Multiple I/O ports in most ARM CORTEX - M Support additional I/O ports via expansion buses.
- 5. Each port is configured and accessed via one (or) more registers
- 6. Each register is assigned Unique memory address
- 7. CPU reads/writes data via "data registers" in the port hardware
- 8. "Mode Register" for a port determines whether each pin is input (or) output



## Mode Registers:



1. '3' bits per pin.

00 - Input mode (Reset state)

01 - General purpose output mode

1. write pin value to ODR

2. Read IDR to determine pin state

3. Read ODR for last written value

10 - Alternate function mode

11 - Analog Mode

→ Disable op Buffer, input schmitt trigger, pull resistors.

## Data Registers: a) IDR:

1. Data input through 16 pins

2. Read only

## b) ODR:

1. write data to be output to 16 pins

2. Read last value written to ODR

3. Read / write (for Read - Modify - write operations)

## NVIC functional description:



1. Nested Vector interrupt Controller (NVIC) is a method of prioritizing interrupt, improving the MCU performance and reducing interrupt latency.
2. NVIC provides implementation for handling interrupt that occurs when other interrupt are being Executed (or) when the CPU is in process of restoring its previous state & resuming its suspended process.
3. The term "nested" refers to, in NVIC a number of interrupt can be defined, and each interrupt is assigned a priority, with "0" being the highest priority.
4. Function of "NVIC" is higher priority interrupt are completed before lower priority interrupt, even if the lower priority interrupt is triggered first.
5. The term "vector" refers to, in which the CPU finds the program, or ISR, to be executed when an interrupt occurs.
6. NVIC Uses a vector table that contains address of the ISRs of each interrupt.

7. when the interrupt is triggered, the processor gets the address from the Vector table.
8. prioritization & handling Schemes of NVIC reduce low power Consumption, even with high interrupt loading on the microcontroller.