

## 1. Classification of memory devices:



## 2. Difference between microprocessor & microcontroller:

### Microprocessor

- CPU is stand alone, RAM, ROM, I/O, timer are separate.
- designer can decide on the amount of ROM, RAM and I/O ports.
- Expansive
- general purpose

### microcontroller

- CPU, RAM, ROM, I/O & timer are all on a single chip.
- Fix amount of on chip ROM, RAM, I/O ports
- For applications in which cost, power and space are critical.
- Single purpose.

8051 → CISC - complex instruction set computer.

## 3) Difference between CISC & RISC

### CISC

- Emphasis on hardware
- Multiple instruction formats
- Less registers
- Extensive use of microprogramming
- Instructions take varying amount of cycle time

### RISC

- Emphasis on software
- Instructions of same set with few formats
- More registers
- Complexity in compiler
- Instructions take one cycle time

- pipelining is difficult
- pipelining is easy

#### 4. Classification of microcontroller



#### Types of buses

- data bus
- address bus
- control bus

Address bus - It is unidirectional, it is sent from processor to the memory

Control bus - It is unidirectional. It is sent from control unit to memory.

Data bus - It is bidirectional & sent from memory to control unit or read from memory.

The operation of processor:

(In 3 stage pipeline)

fetch → decode → execute.

ARM7TDMI → 32 bit microprocessor.

ARM7TDMI → 32 bit advanced RISC machine

T → thumb architecture extension.

- Two separate instruction sets 32 bit ARM instructions
- 8 16 bit thumb instructions.

ppf:-

D → debug extension

M - Enhanced multiplier

I - embedded ICE macrocell extension.

Instructions in ~~the~~ ARM get executed in single cycle.

Features of ARM:

- 32 bit processor → 32bit ALU, 32bit data bus, 32bit address bus.
- ICE (in circuit emulation)
- Registers are 32 bit.
- There are total of 37 registers
  - 31 general purpose
  - 6 special function registers.
- RISC instructions
- All most all instructions execute in single cycle.
- Each instruction has a fixed size of 32 bit. (32 bit long)
- Fast multiplier (external to ALU) - Booth algorithm [32x32]
- Scalable shifter (to improve speed of processor)
- It operates in 3 states
  - ARM state - 32 bit instructions
  - THUMB state - 16 bit instructions
- It operates under 7 modes:
  - User (usr) - normal ARM program execution state
  - FIQ (fIQ) - designed to support data transfer or channel process
  - IRQ (irq) - used for general purpose interrupt handling.

- supervisor (svc) - protected mode for the O.S.
- abort mode (abt) - Entered after a data or instruction prefetch error
- system (sys) - a privileged user mode for operating system
- undefined (und) - Entered when an undefined instruction is executed

Among these USER mode is unprivileged, others are privileged.

- Debug interface.
- Von Neumann architecture
- It is a 3 stage pipeline - fetch, decode, execute
- OS & RTOS

### Pipelining :

Ex - executing 4 instructions with pipelining (fetch, decode, execute)



$\therefore$  6 timecycles are required

without pipelining:



12 cycles are required.

OS & RTOS  $\rightarrow$  OS stands for operating system & RTOS stands for Real time OS.

ARM  $\rightarrow$  Advanced RISC Machine.

Based upon RISC architecture with enhancements to meet requirements of embedded applications

• Load store architecture, where data processing operations operate on register contents only

- <sup>Basic</sup>  
<sup>with features</sup>
- Uniform & fixed length instructions
  - Good speed / power consumption ratio
  - High code density.
  - Large uniform register file.
  - Instructions are 32 bit long & single length.

To operate in memory (perform operations) load store architecture is used which is used to load <sup>(read from)</sup> memory & store <sup>(write to)</sup> memory. [Unlike in 8051 operations such as add<sup>register</sup> could be directly performed by mentioning the address]. Load store architecture - ALU cannot access data from memory, they access the registers.

extended  
features

Enhancements to basic RISC features:

- Control over ALU and shifter for every data processing operations to maximize their usage.
- Auto increment & auto decrement addressing modes to optimize program loops
- Load and store multiple instructions to maximize data throughput
- Conditional execution of instruction to maximize execution throughput.

throughput → execution of many instructions in a fixed time

ARM  
Registers

Register set of ARM by Steve Furber.  
ARM organization from Steve Furber.

Difference between simulation & emulation

Simulation

Emulation

- |                                           |                                                |
|-------------------------------------------|------------------------------------------------|
| • Decision making happens instantaneously | • Decision making happens in a series of steps |
| • More approximation                      | • More of accurate calculations                |

JTAG - Joint test action group.

ARM state General registers & program counter.  
[Diagram in txt]

37 registers are divided as 16 in system & user,  
15 registers are banked in general registers & P.C.

The CPSR is common in all ARM state Program Status Registers.  
There are SPSR terms in each column of the state named respectively as SPSR\_fiq, SPSR\_svc, SPSR\_abt, SPSR\_irq, SPSR and SPSR stands for saved program status registers.

(CPSR - current program status registers.)

PSR is used to store the status of the CPSR for further reference.

PC - (program counter) points to the address of the next line of program to be executed.

Registers are used for ALU operations.

State registers - used to update the state.

General purpose registers:

PC → program counter ( $r_{15}$ )

LP → LR → link register ( $r_{14}$ )

SP → stack pointer ( $r_{13}$ )

CPSR → current program state register.

Stack pointer points towards the stack formed in the execution processor of function call in a program. Stores the head of stack in current mode. LP is used to store the value of PC temporarily when the PC moves towards function definition from function call.

$r_1$      $r_6$

$r_1$

$r_2$

$r_3$

$r_4$

$r_5$

$r_6$

$r_7$

$r_8$

$r_9$

$r_{10}$

$r_{11}$

$r_{12}$

$r_{13}$  SP

$r_{14}$  LP

$r_{15}$  PC

Data  
registers

CPSR } processor status  
SPSR } registers.

} special function  
register

- PC can be used as general purpose
- PC also, but when used in abbreviation form (SP, LP, RI)
- PC performs special functions.

PC is an pointer towards the program.

~~Register~~

control field.

|    |    |    |    |               |    |    |    |              |    |    |    |    |     |    |   |   |   |   |   |   |   |   |   |
|----|----|----|----|---------------|----|----|----|--------------|----|----|----|----|-----|----|---|---|---|---|---|---|---|---|---|
| 31 | 30 | 29 | 28 | 27            | 26 | 25 | 24 | 23           | 22 | 21 | 20 | 19 | ... | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 |
| N  | Z  | C  | V  | Do not modify |    |    |    | Read as zero |    |    |    | F  | F   | T  | M | M | M | M | M | M | M | M | M |

N → negative

I → IRQ mode [interrupt masks]

Z → zero

F → FIQ mode

C → carry

T → 0 goes to arm mode / 1 goes to thumb mode.

V → overflow

H → mode. [changed by using BX or BIX instruction]

5 modes → binary decided :  $2^5 \rightarrow 32$  combinations

out of 32 only 7 combinations are valid because of security reason.

fiq → fast interrupt request used to speedup the working of the processor. If it is 1 it disables the FIQ interrupt

If the result is zero Z is set to 1. [all 32 bits of result is zero]

Negative flag is set to 1 when the result is -ve.

Carry is set when there is a carry generated in ALU operation

Methods of indicating signed numbers (for M-N) : two

Interrupt masks are used to stop specific interrupt requests from interrupting the processor.

If IRQ is 1, disables the IRQ interrupt.

## Bit 4 to 0 Mode

|       |            |
|-------|------------|
| 10000 | User mode  |
| 10001 | FIQ mode   |
| 10010 | IRQ mode   |
| 10011 | Supervisor |
| 10111 | abort      |
| 11011 | Undefined  |
| 11111 | System     |

The Memory system:

- The ARM system has memory ~~state~~.
- It is viewed as a linear array of bytes numbered from 0 to  $2^{32}-1$
- Data items may be - 8 bit bytes  
16 bit half words  
32 bit words.
- Words are always aligned on a 4 byte boundary.

bit → binary (0 or 1)

1 nibble → 4 bits

1 byte → 8 bits

halfword → 16 bits

1 word → 32 bits



When lower byte addresses have lower byte data than it is called little endian.

When lower byte address have higher byte data, it is called big endian.

- 32 bit bus - data, address & control

- multiplier

- shift register

- JCF

- interrupt

- register banks - [PC, EPSR, SPSR].

- control & decode logic.

- ALU.

[address bus]

[register bank PC]

bus

shift register

ALU

Buffers are used to affect the mismatch in the speed (speed variation) between local buses and address buses.

### ALU organization (architecture)



## Components :

- Register bank which stores the processor state. It has 2 read ports & one write port. & additional read port & write port that gives access to  $R_{15}$  (program counter) <sup>a register</sup>
- Barrel shifter which shifts one operand by any no. of bits
- ALU which performs arithmetic & logic functions.
- Data register - which hold data passing from ~~one~~ to memory.
- Address register - It selects & holds all memory <sup>addresses</sup> & generates sequential addresses when required.
- Instruction decoder & associated control logic.



2/5

30/1

## Instruction cycle.

### Machine cycle

- The steps that get performed by the processor getting employed in a device and all the instructions that get implemented.
- Fetch, decode, execute & store
- Memory unit & central processing unit
- The steps required by CPU to fetch & execute an instruction is called an instruction cycle.
- The time required by the microprocessor to complete the operation of accessing memory or I/O devices is called machine cycle.
- It process by which a computer takes an instruction given by a program then understands it and executes from memory.
- Fetch, decode, execute and store.
- Arithmetic logic unit, registers, data & memory
- The time required by CPU to fetch & execute an instruction is called instruction cycle.

## Modes of addressing in ARM:

|                                                                 |                                         |
|-----------------------------------------------------------------|-----------------------------------------|
| Register - mov R <sub>1</sub> , R <sub>2</sub>                  | → used in data processing instructions. |
| Indirect - mov R <sub>1</sub> , @R <sub>0</sub> x               |                                         |
| Immediate - mov R <sub>1</sub> , #0x5h                          |                                         |
| Indexed → base, index - the array is accessed [not used in ARM] | J used in d                             |

## Instructions

|                                                                                                                               |
|-------------------------------------------------------------------------------------------------------------------------------|
| data processing (cannot access memory directly or indirectly)<br>register addressing<br>immediate addressing<br>data transfer |
|-------------------------------------------------------------------------------------------------------------------------------|

Data processing instructions :

- i) move instruction (R-R transfer, R-immediate transfer)
- ii) Arithmetic instruction
- iii) Logical instructions
- iv) Compare
- v) jump (branch or looping)

Instruction set (32 bit)

Ex Add R<sub>1</sub>, R<sub>0</sub>, R<sub>1</sub>

First 4 bits are used to hold the pneumatic (add/sub/mul/and)



4 bits to check the condition/carry/zero/be flags

The next 4 bits are used to hold the address of the first operand. & similarly with the consecutive 4 bits of data.

The later bits are used to perform the operation indicated by the pneumatic.

Extra Data can be addressed through barrel shifter also.  
Barrel shifter is connected to B bus. Because it is only connected to second operand because only second <sup>source</sup> operand will be shifted. To speed up the execution the barrel shifter is external & all instructions can include shift instruction in their operation. [Ex Add R<sub>1</sub>, R<sub>0</sub>, R<sub>1</sub>, shift #2].  
The ARM doesn't have actual shift register instructions that is why it has barrel shifter to shift the instruction by default.

Barrel shifter - Left shift.

Shifts left by the specified amount (multiples by the power of two).

Eg -  $LSL \#5$  = Multiply by 32.

LSL - logical shift left.



$LSL \#1 \rightarrow 0010 = 0100 = 4$ .

$LSL \#2 \rightarrow 0010 = 1000 = 8$ .

$RSI \#1 \rightarrow 0010 = 0001 = 1$ .

$RSI \#2 \rightarrow 0010 = 0000 = 0$ .

Barrel shifter - Right shift.

Shifts right by the specified amount (multiple divides by multiples of 2).

Eg  $RSR \#05$



Arithmetic shift right.

Shifts right (divides by power of 2) & preserves the sign bit for 2's complement operation.



Used for signed integers.

### • Rotate right (ROR)

Similar to ASR but the bits wrap around as they leave the LSB & appear as the MSB.



### • Rotate right extended (RRX)

This operation uses the CPSR[12] flag as a 33rd bit.

Rotate right by 1 bit (RRX # 0).



### Features of ARM instruction set:

- All instructions are 32 bit long [fixed length]
- most instructions execute in single cycle.
- 2 sets.
- Thimble  $\rightarrow$  opcode - 16 bits
- ARM  $\rightarrow$  ..  $\rightarrow$  32 bits
- All codes are conditionally executed.
- Orthogonal registers.
- Instructions get executed through 3 stage pipelining system.
- Shifting of second operand can be done through barrel shifter.
- Load store architecture. — combined ALU & shifter for high speed manipulation.
- External multiplier.
- Auto indexing (memory related instruction)
- 3 operand instructions.  $\rightarrow$  most operations.
- Instruction set extension via coprocessor.

### Accessing registers using ARM instruction

- All instructions can access to r14 directly.

- Most instructions also allow the use of the PC.
- Specific instructions to allow access to CPSR & SPSR
- When in privileged mode, it is also used

The condition field:



0000 = EQ - Z set (equal)

0001 = NE - Z clear (not equal)

0010 - HS / CS - C set (unsigned higher or same)

0011 = LO / CC - C clear (unsigned lower)

0100 = MI - N set (negative)

0101 = PL - N clear (positive or zero)

0110 - VS - V set (overflow)

0111 - VC - V clear (no overflow)

1000 - HI - C set & Z clear (unsigned higher)

1001 - LS - C clear or Z (set unsigned lower or same)

1010 - GE - N set & V set or N clear & V clear ( $\geq 0$ )

1011 = LT - N set & V clear or N clear & V set ( $> 0$ )

1100 - GT - Z clear and either N set & V set or N clear & V set ( $> 0$ )

1101 = LE - Z set, or N set and V clear or N clear & V set ( $\leq 0$ )

1110 = AL - always

1111 = NV - reserved.

E)  $0x78 \rightarrow 120$       Flags updated  $\rightarrow N = 0$   
 $+ 0x8A \rightarrow 138$

Result = 102H 258

0001P00000 0010] 8000

Z = 0

C = 1

V = 1

$0x7A \rightarrow 122$ . Flags updated - N = 0  
 $+ 0x1B \quad \underline{27}$  Z = 0  
 Result  $\rightarrow 0x95 \quad 149$  C = 0  
 V = 0

~~Q10.~~ Write down examples for each shift operation.

General format of instruction:

mnemonic { condition } { S } dest. | source 1, source 2  
 code | ↑                                      | operand 1, operand 2  
 flag update.

if ( $a == 0$ ) — condition.  
 {  
 $a = a + 1;$   
 }  
 else:  
 $a = a - 1;$

Flag update.

$$\begin{array}{r}
 R_0 \\
 + R_1 \\
 \hline
 R_2
 \end{array}
 \quad
 \begin{array}{r}
 R_2 \\
 + R_3 \\
 \hline
 R_4
 \end{array}$$

Add S  $R_2, R_1, R_0$

Add S  $R_4, R_2, R_3$

adc  $\rightarrow$  add with carry.

& S update the flags with current values [can check]

There are lesser no. of basic pneumatics, and the instruction  
s. can be altered by updating flag(s)

Eg. Add EO  $\rightarrow$  this will add when 7 flag is set (=1)

if ( $a == 0$ )  
 {  
 $a = a + 1;$  }  $\rightarrow$  SETS  
 }  
 else  
 $a = a - 1;$

This can be written in 3 ways

|                     |                    |
|---------------------|--------------------|
| Add S R2, #0h       | ORG 00:            |
| Add EQ R3, R2, R1   | MOV R0, #00        |
| SUB NE R3, R2, R1   | CMP R0, #00        |
|                     | ADD EQ R0, R0, #01 |
|                     | SUB NE R0, R0, #01 |
|                     | END                |
| MOV S R1, #2ch      | CMP R0, #00        |
| ADDEQ R1, R1, #1;   | ADDEQ R1, R0, #1;  |
| SUBB NE R1, R1, #1; | SUBNE R2, R0, #1;  |
| END                 |                    |

$$\begin{cases} Q & \{a > 0\} \\ & \{a = a + 1\}; \end{cases}$$

$$\text{else } \{a = a - 1\};$$

~~ADD ADDS R0, #00  
#ADDI SUB RL  
ADD PL.~~

→ (CMP R0, #00 (C, Z, N, F))  
 { ADDPL R0, R0, #1  
 SUBMS R0, R0, #1.

i) Check whether a no. is positive or negative, add if +ve,  
 subtract if -ve.

Check whether a no. is positive odd or even add if odd, subtract if even.

### Data processing

i) Move → MOV

MOV [cond. pos/neg] [S] destination, source;

MVN [complement source & put it in destination]

MVN [cond. pos/neg] [S] destination, source, immediate

MVN R0, R1 → R0 ← not R1 (R1' is put in R0)

~~MOV R0, R1~~ ;  $R0 \leftarrow R1$

~~MOV R1, R0, LSL #2~~ ;  $R0 \leftarrow R0 * 4$ .

~~MOV R1, R0, #1~~ ;  $R0 \leftarrow R1 * 00000001$ .

### MVN:

~~MVN R1, R0~~ ;  $R1 \leftarrow \text{NOT } R0$ .

~~MVN R1, R0, LSL #2~~ ;  $R1 \leftarrow \text{NOT}(R0 * 4)$ .

~~MVN R0, #4~~ ;  $R0 \leftarrow \text{NOT } 4$ .

### ③ Arithmetic & Logical

ADD  
adc  
sub

SBC  
RSB (reverse subtract)

and RSC

DTT

EOR

→ write 3 addressing modes  
for each

### ADD:

ADD {cond. postfix} of S<sub>1</sub> dest, op1, op2;

(register) (reg, immediate, barrel shifted)

### Class:

### ARM Development Tools:



Full form of aof, aif, difference between machine & ic. cycle.

The library is included (stdio.h) in the compiler. i.e., the standard functions in the library are extended to the program to be executed in the compiler. (They become a part of the program).

### Assembler directives:

They are the directions given by the user that guide the assembler.

### Some directives:

AREA - It defines a region in the memory (can be named) [Pg 1]

CODE - It is used to select between (data & code memory) [Ex- Named as Pg 1] Read only → used to select the type of file.

AREA Pg 1, CODE, READONLY

ENTRY

: MOV R<sub>0</sub>, #0x0A

: MOV R<sub>1</sub>, #0x0B

: ADD R<sub>2</sub>, R<sub>1</sub>, R<sub>0</sub>

END.

In Area with CODE memory & Read only type is named as Pg 1 in this case

ENTRY - used to indicate the beginning of the code

END - " " " the end of the program code.

HW.

1. CMP R<sub>0</sub>, #00 (C, Z, N, E)

ADD PL R<sub>0</sub>, R<sub>0</sub>, #1

SUBB MI R<sub>0</sub>, R<sub>0</sub>, #1

2.

MOV S R<sub>2</sub>, Q

RORS R<sub>2</sub>, R<sub>2</sub>

ADD HS R<sub>3</sub>, R<sub>2</sub>, #01

ABBLD R<sub>2</sub>, R<sub>3</sub>, #01.

⊕ NXP

↳ LPC 2148

Code for: +ve, -ve:

```

    AREA PGI, CODE, READONLY
    ENTRY
    MOV R0, #DX0A
    cmp R0, #0x00
    ADDPL R0, R1, #1
    stop b stop
    END
  
```

OR

```

    AREA PGI, CODE, READONLY
    ENTRY
    ldr R0, =0x11112222
    cmp R0, #0x00
    ADDPL R1, R2, #1
    SUBMI R1, R1, #1
    stop b stop
    END
  
```

LDR is used instead of MOV because MOV moves 8 bit of immediate data. (that is because of Only 8 bit is left in opcode in the instruction set)

Data processing continued:

iii) Compare

- CMP
- CMN
- TST
- TEQ.

iv) Jump

- B
- PL
- BLX

Add 2 32 bit no., add 2 64 bit no., add series of 5 32 bit no.  
count 0's & 1's in 8 bit no.

check whether a number is greater than a no. (32 bit)  
" " less than a no. (32 bit)

Add 2 32 bit numbers,

AREA A32, CODE, READONLY  
ENTRY

LDR R0, =0X11112222

LDR R1, =0X88885555

ADD R2, R1, R0

STOP B STOP

FEND

Add 2 64 bit numbers

AREA A64, CODE, READONLY

19  
1.2  
21  
R1 11112222 (33234444)  
(55556666) (22223333)  
T6 R4

R2 11112222 88884444-R  
R3 55556666 88883333-R  
R4 66668889 11107777-R  
R5

### Object library:

It is a collection of object modules in a single file in a format that allows selective extraction by linker of those modules that are needed when linking a program.

### Memory:

Program memory → read only  
and once written it cannot  
overwrite.

|         |              |
|---------|--------------|
| Program | = 0x40000000 |
| Data    |              |

### Memory handling Instructions:

LDR destn., constant - (i)

pointer - (ii)

variable name - (iii)

Eg. (i) LDR, R<sub>0</sub>, = 0XAABBCCDD

(ii) LDR R<sub>1</sub>, [R<sub>0</sub>] → here R<sub>0</sub> is used as pointer.

LDRH → loads 2 bytes into register → 16 bit.  
half

LDRB → loads 1 byte into register → 8 bit

LDR → 32 bit (4 bytes) into register

Pointers are used to access a set of data (stored continuously)

(iii) LDR, R<sub>0</sub>, a.

where a=5 is initialized before (a is the variable).

~~ANSWER~~

DCD → directive that indicates the datatype that the variable is holding [declare constant doubleword]  
 → 32 bit (DCD)  
 → 16 bit (DCW) [declare constant halfword]  
 → 8 bit (DCB) [declare constant byte]

Eg. a DCD 0XAABBCCDD [DCD → 32 bit]

a. DCW 0XAABB [DCW → 16 bit]

a DCB 0XAA [DCB → 8 bit]

Write a program to add 2 → 32 bit numbers by declaring variables a & b.

AREA ENTRY

a DCD 0X11112222 } after stop B stop.

b DCD 0X33334444 }

LDR R<sub>0</sub>, a

LDR R<sub>1</sub>, b

ADD R<sub>2</sub>, R<sub>1</sub>, R<sub>0</sub>

stop B stop

END.

AREA A32, CODE, READONLY

ENTRY  
LDR R0, a

LDR R1, b

ADD R2, R1, R0

stop b stop

a dcd 0x11122222

b dcd 0x33344444

end.

LDR R0, =a [load the starting address of a to R0]  
(not the value of a.)

LDR R1, [R0].

AREA A5, CODE, READONLY

ENTRY

LDR R0, =a.

LDR R1, [R0], #0x04. [#0x04 increments the pointer by ~~#~~<sup>A4</sup> because of autoindexing].

LDR R2, [R0], #0x04

LDR R3, [R0], #0x04

LDR R4, [R0], #0x04

LDR R5, [R0]

ADD R6, R1, R2.

ADC R7, R6, R3

ADC R8, R7, R4

ADC R9, R8, R5.

Stop B stop

a dcd 0x61223344, 0x27334455, 0x33445566, 0x11336677

0x44553322

end.

Find the sum of even & odd numbers in an array of 10 32 bit numbers.

Find the no. of positive & negative numbers in an array of 10 32 bit nos.

To check whether a number is greater than the other

AREA GRE, CODE, READONLY.

ENTRY

MOV R0, #05

MOV R1, #15

~~MOV R2~~

CMP R0, R1

MOVGT R2, #01

MOV LT R2, #00

STOP B STOP

END

To check whether a no. is less than the other

AREA LESS, CODE, READONLY.

ENTRY

MOV R0, #05

LDR R0, = 0x11223344

MOV R1, #15

LDR R1, = 0x55663311

CMP R0, R1

MOV LT R2, #01

MOVGT R2, #00

STOP B STOP

END.

for 32 bit

To count no. of 1's & 0's, in number

AREA COUNT, CODE, READONLY

ENTRY

MOV R0, #0x23

MOV R1, #0x8

Label MOV R0, RD, LSR #1

ADD CS R2, R2, #1  
 ADD CC R3, R3 #1  
 sub r1, r1, #1  
 cmp r1, #00  
 bne lable  
 end.

Branch if not equal jump  
 back if & only geno  
 flag is clear.

## SUB

MOV R1, #0x04  
 MOV R2, #0x01  
 SUBS R0, R1, R2

## CMP

MOV R1 #0x01  
 MOV R2, #0x04  
 CMP R1, R2

## RSB

MOV R1, #0x02  
 MOV R2, #0x01  
 RBS R0, R1, R2

FOR  
 MOV R, #0x01  
 MOV R2, #0x04  
 FOR R1, R2, R1

## ORR

MOV R1, #0x02  
 MOV R2, #0x03  
 ORR R0, R1, R2

→ To add 5 32 bit numbers using branch loop  
 AREA ADDS, CODE, READONLY.

## ENTRY

MOV R3, #0x00 ; R3 stores final sum  
 AREA: MOV R0, #0x05

LDR R0, R0 = 0

LABEL: LDR R0, [R1], #0x04

ADDS R3, R3, R2

SUBS R0, R0, #0x01

CMP R0, #0x00

BNF LABEL

B

adc → 6  
 (A)

~~stop B stop~~

a dcd 0x11223344, 0x11224433, 0x33221144, 0x11334422  
0x1234455 .

end

The c in the sub's instruction overwrites the carry flag. ~~leads to wrong answer.~~

To avoid this another loop can be used.

area A:32, code, readonly, entry | area A:32, code, readonly, entry

mov r0, #0x05  
mov r3, #0x00  
ldr r1, =a

OR

mov r0, #0x05  
mov r3, #0x00  
bdr .

TABLE ldr r2, [r1], #0x04 TABLE ldr r2, [r1], #0x04

adds r3, r3, r2

adds r3, r3, r2

beq carry

adds r4, r4, 0x01

sub r0, r0, #0x01

sub r0, r0, #0x01

cmp r0, #0x00

cmp r0, #0x00

bne LABEL

bne LABEL

carry add r3, r3, r2

stop b stop

subs r0, r0, #0x01

a dcd

cmp r0, #0x00

end

bne LABEL

stop B stop

a dcd 0x11223344,

end

HW

1. Odd-even, +ve max-min, 1's & 0's ; mul/div, add/sub
2. To add two array elements (one to one) and store the result in 3rd array
3. Develop a code to find minimum of N numbers.

3x4

4

3

$$3+3+3+3$$

$$4+4+4$$

Code for division (division using subtraction)

Core page Data processing instruction  
Data path activity.



STR → Store instruction.

- STR source, destination (destination → memory pointer/address)

The writable memory area is 8kB.

$$1\text{K} = 10\text{b.} \quad [\because 2^{10} = 1024.]$$

$$111^{\text{bus}} \quad 8\text{kb} = 12\text{b.} \quad [\because 2^13 = 8192]$$

str b R<sub>5</sub>, [R<sub>0</sub>], #0



ARM multicycle instruction pipeline.



These are different ways of handling the stall. The same data is fetched in the next cycle if the data is flushed.

On S-I-R instruction. str R<sub>5</sub> [R<sub>4</sub>, #4]

(Decoder generates control signals.)

1. Add 2 array elements & store the answer in an array.  
Program will add array, code, readonly  
Entry

```
ldr R0, =a
ldr R1, =b
ldi R2, =0x40000000
mov R3, #3
ldy ldr R4, [R0], #4
ldi R5, [R5], #4
add R6, R4, R5
str R6, [R2], #4
subs R3, R3, #1
bne lable
```

stop & stop

a dcd 0xA1, 0xAF0, 0x56F

b dcd 0x116, 0xBAC, 0x74C

end

2. To find the min element in a given array.  
Subarea Min, code, readonly

Entry

```
ldr R0, =a
mov R5, #4
ldr R2, [R0], #4
lmp LDR R4, [R0], #4
cmp R2, R4
movcs R2, R4
subs R5, R5, #1
bne LMPR
```

STOP & STOP

~~a dcd 0x11223344, 0x22334455, 0x12345678~~

END

3. Code to find the sum of even no. in an array of 10 ele.

→ AREA EVEN, CODE, READONLY

ENTRY

MOV R3, #0X00 // R3 is used to store the result

MOV R5, #0X0A // R5 is used as count

LDR R0, =a

even LDR R2, [R0], #4

MVNS R6, R2, RDR, #01

ADDCC R3, R3, R2

SUBS R5, R5, #1

BNE EVEN

STOP B STOP

END

a dcd 0x11223344, —, —, —,

4. To find sum of <sup>odd</sup> negative numbers in a given array of 10 elements.

→ AREA SUMODD, CODE, READONLY

ENTRY

MOV R3, #0X00 // R3 is used to store the result

MOV R5, #0XA // R5 is used as count

LDR R0, #=a

odd LDR R2, [R0], #4

MVNS R6, R2, RDR, #01

ADDCC R3, R3, R2

SUBS R5, R5, #1

BNE DDD

Stop B Stop

a dcd 0x09E, 0xAF, 0X32, 0X33, 0X41 ..

end

5. To find the sum of positive numbers in a given array

→ AREA POS, CODE, READONLY

ENTRY

LDR R0, a  
MOV R5, #0xa  
MOV R6, #00 // R6 is used to store the result.  
POS. LDR R2, [R0], #4  
MOV R3, R2  
MOVS R2, R2, LSL #1  
ADD CC R6, R6, R3  
SUBS R5, R5, #1  
BNE POS

STOP B STOP

a dcd .0x2233,  
end

6. To find the sum of negative numbers in a given array of 100  
AREA NEGATIVE, CODE, RANDOMLY

ENTRY

LDR R0, -a // R0 is used as a pointer to array  
MOV R5, #0a  
MOV R6, #00 // stores the result  
NEG LDR R2, [R0], #4  
MOV R3, R2  
MOVS R2, R2, LSL #1  
ADD CS R6, R6, R3  
SUBS R5, R5, #1  
BNE NEG

STOP B STOP

a dcd .0xfe  
end

7. Code for division using subtraction  
AREA DIV\_SUB, CODE, RANDOMLY

ENTRY

MOV R5, #15  
MOV R8, #3.  
PMP R6, #0  
BEQ EN  
CMP R6, R5  
MOV PL R6, #1.  
BPL EN  
DIV SUB R7, R5, R6  
ADD R1, R1, #1  
CMP RT, R6  
MOVPL R5, RT  
BPL DIV

ENB.

STOP B STOP

END

8. To sort a given array in ascending &amp; descending order.

→ AREA ASCEND, CODE, READONLY

ENTRY A

MOV R0, #5

OUT LDR R5, =0X40000000

· ADD R6, R5, #1

MOV R3, #4

IN LDRB R1, [R5]

LDRB R2, [R6]

CMP R1, R2

BCC IOP

MOV R4, R2

MOV R2, R1

MOV R1, R4

IOP STRB R1, [R5]

STRB R2, [R6]

```
ADD R5, R5, #1  
ADD R6, R6, #1  
SUBS R3, R8, #1  
BNE IN  
SUBS R0, R0, #1  
BNE OUT  
STOP B ITDP  
END.
```

9 To arrange an array in descending order  
AREA DESCEND, CODE, READONLY

```
ENTRY  
MOV R0, #5  
OUT LDR R5, =DX40000000  
ADD R6, R5, #1  
MOV R3, #4  
IN LDRB R1, [R5]  
LDRB R2, [R6]  
CMP R1, R2  
BCS LOP  
MOV R4, R2  
MOV R5, R1  
MOV R1, R4  
LOP STRB R1, [R5]  
STRB R2, [R6]  
ADD R5, R5, #1  
ADD R6, R6, #1  
SUBS R3, R3, #1  
BNE IN  
SUBS R0, R0, #1  
BNE OUT  
STOP B STOP  
END
```

ro. ~~Code~~ → Code to sort array in half ascending & half descending order.

AREA HALF, CODE, READONLY

ENTRY

MOV R0, #8

MOV R9, #2

MOV R10, #0

OUT1 LDR R5, =0x40000000

ADD R6, R5, #1

MOV R3, #2

IN1 LDRB R1, [R5]

LDRB R2, [R6]

CMP R1, R2

BCC LDP

MOV R4, R2

MOV R2, R1

MOV R1, R4

LDP STRB R1, [R5]

STRB R2, [R6]

ADD R5, R5, #1

ADD R6, R6, #1

SUBS R3, R3, #1

BNE IN1

SUBS R0, R0, #1

1 BNE OUT1

BEQ DUT2

OUT2 LDR R5, =0x40B00003

ADD R6, R5, #1

MOV R12, #1

IN2 LDRB R1, [R5]

LDRB R2, [R6]

CMP R1, R2

BCK LOP 2  
MOV R4, R2  
MOV R2, R1  
MOV R1, R4  
(OP 2 STRB R1, [R5])  
STRB R2, [R6]  
ADD R5, R5, #1  
ADD R6, R6, #1  
SUBS R12, R12, #1  
BNF IN2  
SUBS R9, R9, #1  
BNF OUT2  
STOP B STOP  
END

To generate first 10 numbers of fibonacci series  
AREA FIBO, CODE, READONLY

ENTRY  
MOV R10, #10  
LDR R8, =0x40000000  
MOV R0, #0  
SUB R1, R0, #1  
MOV R2, #1  
ANB R6, R2, R1  
STR R6, [R8], #4  
FIB ADD R6, R2, R0  
STR R6, [R8], #4  
MOV R2, R0  
MOV R0, R6  
SUBS R10, R10, #1  
BNF FIB  
STOP B STOP  
ENDP.

To generate first 10 even numbers.

AREA EVEN, CODE, RFADONLY

ENTRY

MOV R10, #10

ldr R3, =0x40000000

MOV RH, #0

lalut add R4, RU, #2

str RU, [R3], #4

sub R10, R10, #1

BNE LABLE

31. Develop a code to find

i)  $R_0 = 5R_1$

ii)  $R_3 = 40R_4$

iii)  $R_7 = 15R_8$

iv) If  $(R_0 > R_1)$

{  
     $R_2 = R_0 + 4R_1$ , }  
else

{  
     $R_2 = R_0 + 7R_1$ , }  
if {  
     $R_5 = !R_6$ , }

{  
     $Y = AB + BC \cdot AC$ , }  
else

{  
     $Y = A \text{nor} B \text{nor} C$ , }  
32. Given random array of 10 nos. separate even numbers.

[string dcb "BVB CET", 0] → null terminated string.  
[string dcb "BVB CET", CR]  
CR equal 0x0d  
→ string terminated with (R=0x0d)

[string dcb "BVB CET", \$] → string terminating with \$

Store instruction occurs in 3 phases

↳ calculation of address : Ex [R<sub>2</sub>, #4]

↳ Store data & auto indexing.

In store instruction immediate data has a size of 12 byte.

STR R<sub>0</sub>, [R<sub>2</sub>, #4] (!) → writeback

writeback helps in updating the address to the autoindexed value. (It only applies to pre-indexing)

Not using (!) in pre-indexing keeps the pointer pointing towards R<sub>0</sub> and repeats the same address.

PC is involved in execution of branch instruction. And the PC part is moved to ALU because it is used to calculate the size of jump in case of loop. The jump size is restricted to 24b.

Word align the

R<sub>14</sub> → link register [used to temporarily store the value of PC]

B1 → branch & link for function call

BX → branch & exchange for different operation.

before branching it gets word aligned. Word alignment.  
word align → addresses having multiples of 4.

Sum of odd & even parity.

LDI R<sub>0</sub>, = A8

MOV R12, = 0x0A

MOV R11, = 0x0B

```

loop    LDRB R1, [R0], #0x01
        MOV   R2, R1
ONES   MOVS R1, R1, ASR, #0x01
        ADDCS R3, R3, #0x01
        SUBS R11, R11, #0x01
BNF    ONES
        MOV   R11, #0x08
        MOV   R4, R3
        ANDS R6, R4, #0x01
        ADDEQ R5, R5, R2.
        ADDNE R7, R7, R2.
        SUBS R12, R12, #0x01
        MOV   R3, #0x00
LDR   R8, =0x40000000
        LDR   .R9, 0x400000004
        STRB R5, [R8]
        STRB R7, [R9]

```

stop to stop

A DCB R01, ... , R0A

end.

CX8 can be done as RSB R1, R0, R0, LSL #3.