

# CEG 3136 Summary Sheet

## Contents

- 1 Data Representation
  - 1.1 Strings
  - 1.2 Fixed Point
  - 1.3 Floating Point
- 2 ARM Instructions
  - 2.1 Arithmetic and Logic
    - 2.1.1 NZCV Flags
    - 2.1.2 Saturation
    - 2.1.3 Other Instructions
  - 2.2 Memory
  - 2.3 Endianness
  - 2.4 Control Flow Instructions
- 3 Subroutines
  - 3.1 Stack
- 4 C and Assembly
  - 4.1 Volatile Datatypes
  - 4.2 Interrupts
- 5 FPU
- 6 GPIO
  - 6.1 Debouncing
  - 6.2 Interrupts
- 7 Timers
- 8 Direct Memory Access (DMA)
- 9 DAC and ADC (Analog Interfacing)
  - 9.1 DAC
  - 9.2 ADC
- 10 Serial Communications
  - 10.1 I2C
  - 10.2 SPI
  - 10.3 UART
  - 10.4 USB
- 11 The Bus (Bus Interfaces)
- 12 Appendix

# 1 Data Representation

One **Byte** is defined as 8 bits. In 32 bit architecture, one **Word** is 32 bits, and therefore a half word is 16 bits, and a double word is 64 bits.

We have unsigned integers and signed integers. Unsigned is simple. Signed can be represented in signed magnitude, ones complement, or twos complement.

**Ex.** Show -5 in 4 bits in all 3 forms

Positive 5: 0101

Signed Magnitude: 1101

Twos Complement: 1011

Ones Complement: 1010
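The three encodings above can be checked with a short C sketch. The function names are made up for illustration, and the helpers assume the value fits in 4 bits (-7 to 7 for signed magnitude and ones complement, -8 to 7 for twos complement).

```c
#include <stdint.h>

/* Signed magnitude: sign bit in bit 3, magnitude in bits 2..0. */
uint8_t sign_magnitude4(int v) {
    uint8_t mag = (uint8_t)((v < 0 ? -v : v) & 0x7);
    return (uint8_t)((v < 0 ? 0x8 : 0x0) | mag);
}

/* Ones complement: a negative value is the bitwise NOT of its magnitude. */
uint8_t ones_complement4(int v) {
    uint8_t mag = (uint8_t)((v < 0 ? -v : v) & 0x7);
    return (uint8_t)(v < 0 ? (~mag & 0xF) : mag);
}

/* Twos complement: keep the low 4 bits of the value taken modulo 2^N. */
uint8_t twos_complement4(int v) {
    return (uint8_t)((unsigned)v & 0xFu);
}
```

Calling these with -5 gives 0xD (1101), 0xA (1010), and 0xB (1011), matching the worked example.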

In an adder/subtractor, we generate a carry flag (C), which is the carry out of the most significant bit (MSB), and an overflow flag (V). For unsigned arithmetic, the carry flag signals that the result is out of range (C=1 after an addition, C=0 after a subtraction, since C=0 means a borrow occurred). For signed arithmetic, we ignore C and use V. If V=1, the result does not fit.

Overflow is the XOR of the carry out of the MSB and the carry in to the MSB.

We typically use twos complement since it makes addition and subtraction able to use the same logic.

## 1.1 Strings

Strings are represented using **ASCII** codes. Strings are compared character by character using the ASCII values, so every uppercase letter sorts before every lowercase letter.

$$\text{CAT} < \text{Cat} < \text{DOG} < \text{Dog} < \text{cat} < \text{dog}$$

| Letter           | A  | B  | C  | ... | X  | Y  | Z  | ... | a  | b  | c  | ... | x  | y  | z  |
|------------------|----|----|----|-----|----|----|----|-----|----|----|----|-----|----|----|----|
| ASCII Code (hex) | 41 | 42 | 43 |     | 58 | 59 | 5A |     | 61 | 62 | 63 |     | 78 | 79 | 7A |
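As a quick check of the ordering above, here is a small C sketch that uses the standard `strcmp`, which compares character codes one at a time. The helper name `is_ascii_sorted` is made up for this example.

```c
#include <string.h>

/* Returns 1 if each word compares strictly less than the next one. */
int is_ascii_sorted(const char *words[], int n) {
    for (int i = 0; i + 1 < n; i++) {
        if (strcmp(words[i], words[i + 1]) >= 0) {
            return 0;   /* out of order, or a duplicate */
        }
    }
    return 1;
}
```

Passing the six words in the order shown above returns 1.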

## 1.2 Fixed Point

The **Q Notation** represents the type of fixed point using UQ<sub>m.n</sub>.

|   |                                         |
|---|-----------------------------------------|
| U | Unsigned. Remove if we are doing signed |
| Q | Means Q notation                        |
| m | Number of integer bits                  |
| n | number of fractional bits               |

**Ex.** Approximate  $-\pi$  using Q3.12

This means we have the sign bit, then 3 integer bits and 12 fractional bits.

We can represent pi as:

0011.001001000100

So after taking the twos complement  $-\pi$  becomes:

1100.110110111100

If it was UQ4.12 we could only represent  $\pi$  not  $-\pi$  but it would be:

0011.001001000100

Adding fixed point numbers is very simple as long as both use the same Q format. We just add them as integers, as if the radix point did not exist.
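A minimal C sketch of Q3.12, assuming the value is stored in a 16 bit signed integer and rounded to the nearest step. The function names are made up for illustration, and there is no range checking.

```c
#include <stdint.h>

#define Q12_FRAC_BITS 12   /* 12 fractional bits, so one step is 1/4096 */

/* Convert a real number to Q3.12, rounding to the nearest step. */
int16_t to_q3_12(double x) {
    double scaled = x * (double)(1 << Q12_FRAC_BITS);
    return (int16_t)(scaled + (scaled >= 0 ? 0.5 : -0.5));
}

/* Convert back to a real number. */
double from_q3_12(int16_t q) {
    return (double)q / (double)(1 << Q12_FRAC_BITS);
}

/* Addition ignores the radix point: it is a plain integer add. */
int16_t q3_12_add(int16_t a, int16_t b) {
    return (int16_t)(a + b);
}
```

With this sketch, pi becomes 0x3244 (0011.001001000100) and -pi becomes 0xCDBC (1100.110110111100) when viewed as 16 bit patterns, the same as the example above.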

## 1.3 Floating Point

Floating point is similar to scientific notation in decimal. We first need to normalize it. So we make it in the form of  $1.xxxxxxx \times 2^{exp}$ . We then hide the 1 as it is implied.

Using single precision, we have 1 bit for the sign, 8 for exponent, and 23 for the fractional.

To represent negative and positive exponents, we add 127 to the exponent.

**Ex.** Express  $(36.5625)_{10}$  as a 32 bit floating point number using the IEEE standard.

The value in binary is 100100.1001 and normalized is  $1.001001001 \times 2^5$ .

1. Sign: 0
2. Exponent:  $5+127 = 132 = 10000100$
3. Mantissa: 00100100100000000000000 (23 bits)

So the 32 bit number is: 0 10000100 00100100100000000000000 (without the spaces), which is 0x42124000 in hex.
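The three fields can be pulled out of a real `float` in C. This sketch assumes the platform stores `float` as IEEE single precision, which is the usual case. The type and function names are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t sign;      /* 1 bit   */
    uint32_t exponent;  /* 8 bits, still biased by 127 */
    uint32_t fraction;  /* 23 bits, hidden 1 not included */
} float_fields;

/* Copy the raw 32 bits of a float into an integer. */
uint32_t float_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Split the raw bits into sign, exponent, and fraction. */
float_fields split_float(float f) {
    uint32_t bits = float_bits(f);
    float_fields out;
    out.sign = bits >> 31;
    out.exponent = (bits >> 23) & 0xFFu;
    out.fraction = bits & 0x7FFFFFu;
    return out;
}
```

For 36.5625 this gives sign 0, exponent 132, and fraction 0x124000, which are the same bits as in the worked example.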

We reserve some values for special cases:

| Sign  | Exponent  | Fraction           | Meaning            |
|-------|-----------|--------------------|--------------------|
| 0 / 1 | 0000 0000 | 0000 ... 0000      | + / - 0            |
| 0 / 1 | 1111 1111 | 0000 ... 0000      | + / - infinity     |
| x     | 1111 1111 | any non zero value | NaN (Not a Number) |

Floating point has a big dynamic range, but for the same number of bits it is less precise and more complicated than fixed point.

# 2 ARM Instructions

ARM has 3 main processor families:

|          |                                                             |
|----------|-------------------------------------------------------------|
| CORTEX-A | High performance Application processors                     |
| CORTEX-R | Reliable Real time processors for mission critical purposes |
| CORTEX-M | Low cost, Low power Microcontroller                         |

ARM has a few different instruction sets. The CORTEX-M series supports the T32 instruction set which includes both space saving 16 bit instructions, and high performance 32 bit instructions.

Other families such as CORTEX-A support T32 and A32 instructions, with some supporting A64 as well.

ARM is a RISC load-store architecture, so arithmetic and logic instructions cannot operate on memory directly. We must first **load** from memory into registers, then **modify**, and finally **store** back into memory.

We have 16 **core registers** and some **special purpose registers** as well. Since we are 32 bit, the registers are all 32 bits wide.

| Register  | Core or Special    | Purpose                                                                   |
|-----------|--------------------|---------------------------------------------------------------------------|
| R0        | C                  | General Purpose                                                           |
| R1        | C                  | General Purpose                                                           |
| R2        | C                  | General Purpose                                                           |
| R3        | C                  | General Purpose                                                           |
| R4        | C                  | General Purpose                                                           |
| R5        | C                  | General Purpose                                                           |
| R6        | C                  | General Purpose                                                           |
| R7        | C                  | General Purpose                                                           |
| R8        | C                  | General Purpose                                                           |
| R9        | C                  | General Purpose                                                           |
| R10       | C                  | General Purpose                                                           |
| R11       | C                  | General Purpose                                                           |
| R12       | C                  | Intra Procedure Call Register (IP)                                        |
| R13       | C                  | Stack Pointer (SP) - Often there are two:<br>MSP (Main) and PSP (Process) |
| R14       | C                  | Link Register (LR)                                                        |
| R15       | C                  | Program Counter (PC)                                                      |
| xPSR      | S                  | Program Status Register                                                   |
| BASEPRI   | S                  | Interrupt Priorities                                                      |
| PRIMASK   | S                  | Enabling and Disabling Interrupts                                         |
| FAULTMASK | S                  | Fault Handling                                                            |
| CONTROL   | S                  | Stack pointer selection (MSP or PSP) and privilege level                  |

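Often we map a hardware device to memory addresses to make it easier to work with. For example, we may have it set up so that bit 7 of the word at a particular address is 1 if an LED is on, or 0 if the LED is off. To test that bit, we first load the word into a register such as R0.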

Assembly has 4 main classes of instructions:

- Arithmetic and Logic
- Data Movement
- Compare and Branching
- Miscellaneous

Each instruction has 4 parts:

| General Form | label | mnemonic | operand(s) | comments                            |
|--------------|-------|----------|------------|-------------------------------------|
| Ex           |       | BX       | LR         | ; branch to LR                      |
| Ex           | LOOP  | CMP      | R1, R2     | ; start of loop, compares R1 and R2 |
| Ex           |       | STR      | R1, [R2]   | ; store R1 to memory at address R2  |
| Ex           |       | ADD      | R3, R5, R8 | ; R3 = R5 + R8                      |

We also have assembly directives which are just information for the assembler such as ALIGN, EXPORT, and ENDP.

| Directive   | Meaning                                                       |
|-------------|---------------------------------------------------------------|
| AREA        | Make a new block of data or code                              |
| ENTRY       | Declare an entry point where the program execution starts     |
| ALIGN       | Align data or code to a particular memory boundary            |
| DCB         | Allocate one or more bytes (8 bits) of data                   |
| DCW         | Allocate one or more half-words (16 bits) of data             |
| DCD         | Allocate one or more words (32 bits) of data                  |
| SPACE       | Allocate a zeroed block of memory with a particular size      |
| FILL        | Allocate a block of memory and fill with a given value        |
| EQU         | Give a symbol name to a numeric constant                      |
| RN          | Give a symbol name to a register                              |
| EXPORT      | Declare a symbol and make it referable by other source files  |
| IMPORT      | Provide a symbol defined outside the current source file      |
| INCLUDE/GET | Include a separate source file within the current source file |
| PROC        | Declare the start of a procedure                              |
| ENDP        | Designate the end of a procedure                              |
| END         | Designate the end of a source file                            |

## 2.1 Arithmetic and Logic

Here are just a few of the arithmetic instructions for T32.

| Mnemonic    | Syntax         | Meaning                        | Operation                                     |
|-------------|----------------|--------------------------------|-----------------------------------------------|
| <b>ADD</b>  | {Rd,} Rn, Op2  | Add                            | $Rd \leftarrow Rn + Op2$                      |
| <b>ADC</b>  | {Rd,} Rn, Op2  | Add w/ carry                   | $Rd \leftarrow Rn + Op2 + Carry$              |
| <b>SUB</b>  | {Rd,} Rn, Op2  | Subtract                       | $Rd \leftarrow Rn - Op2$                      |
| <b>SBC</b>  | {Rd,} Rn, Op2  | Subtract w/ carry              | $Rd \leftarrow Rn - Op2 + Carry - 1$          |
| <b>RSB</b>  | {Rd,} Rn, Op2  | Reverse subtract               | $Rd \leftarrow Op2 - Rn$                      |
| <b>MUL</b>  | {Rd,} Rn, Rm   | Multiply                       | $Rd \leftarrow (Rn \times Rm)[31 : 0]$        |
| <b>MLA</b>  | Rd, Rn, Rm, Ra | Multiply and accumulate        | $Rd \leftarrow (Ra + (Rn \times Rm))[31 : 0]$ |
| <b>MLS</b>  | Rd, Rn, Rm, Ra | Multiply and subtract          | $Rd \leftarrow (Ra - (Rn \times Rm))[31 : 0]$ |
| <b>SDIV</b> | {Rd,} Rn, Rm   | Signed divide                  | $Rd \leftarrow Rn \div Rm$                    |
| <b>UDIV</b> | {Rd,} Rn, Rm   | Unsigned divide                | $Rd \leftarrow Rn \div Rm$                    |

There are also lots of logic ones such as AND, ORR, EOR (XOR), ORN (OR with the complement of the second operand, which is not the same as NOR) and so on.

The ARM T32 instruction set has many more instructions than the ones listed here.

### 2.1.1 NZCV Flags

These flags are stored in bits 28 to 31 of the PSR.

| Flag | Meaning                                   |
|------|-------------------------------------------|
| N    | Negative - Result is Negative             |
| Z    | Zero - Result is Zero                     |
| C    | Carry - Unsigned Arithmetic out of range  |
| V    | Overflow - Signed Arithmetic out of range |

To update these flags, we add an S to the end of the instruction.

#### Ex.

Does not update NZCV flags: ADD, SUB, MUL, etc

Does update NZCV flags: ADDS, SUBS, MULS, etc (MULS only updates N and Z)

Always updates NZCV flags: CMP, CMN, TST, TEQ
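To make the flag meanings concrete, here is a C sketch that models how a 32 bit add such as ADDS would set NZCV. It is a model for study purposes, not how the hardware is built, and the type and function names are made up.

```c
#include <stdint.h>

typedef struct {
    unsigned n, z, c, v;
} nzcv_flags;

/* Add two 32 bit values and report the flags a flag-setting add would produce. */
nzcv_flags adds_flags(uint32_t a, uint32_t b, uint32_t *result) {
    uint32_t sum = a + b;            /* wraps modulo 2^32 */
    nzcv_flags f;
    f.n = sum >> 31;                 /* bit 31 of the result */
    f.z = (sum == 0);
    f.c = (sum < a);                 /* unsigned wrap means carry out of bit 31 */
    /* signed overflow: operands share a sign, result has the other sign */
    f.v = (~(a ^ b) & (a ^ sum)) >> 31;
    if (result) {
        *result = sum;
    }
    return f;
}
```

For example, 0x7FFFFFFF + 1 sets N and V (signed overflow, no carry), while 0xFFFFFFFF + 1 sets Z and C (unsigned wrap, no signed overflow).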

### 2.1.2 Saturation

Saturation is an arithmetic mode that deals with the case where overflow occurs.

Normally, a result that is too large will wrap back around to the lowest value. However, sometimes we want to cap it at the highest value instead.

|                                     |        |
|-------------------------------------|--------|
| With Saturation (4 bits, signed)    | 7+1=7  |
| Without Saturation (4 bits, signed) | 7+1=-8 |
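The difference between wrapping and saturating can be sketched in C for a 4 bit signed value (range -8 to 7). The function names are made up, and the inputs are assumed to already be in the 4 bit range.

```c
#include <stdint.h>

/* Wrapping add: keep only the low 4 bits, then reinterpret them as signed. */
int8_t add4_wrap(int8_t a, int8_t b) {
    unsigned low = (unsigned)(a + b) & 0xFu;
    return (int8_t)(low >= 8u ? (int)low - 16 : (int)low);
}

/* Saturating add: clamp to the largest or smallest representable value. */
int8_t add4_sat(int8_t a, int8_t b) {
    int sum = a + b;
    if (sum > 7) {
        sum = 7;
    }
    if (sum < -8) {
        sum = -8;
    }
    return (int8_t)sum;
}
```

Here `add4_wrap(7, 1)` gives -8 and `add4_sat(7, 1)` gives 7.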



Figure 1: Saturation

### 2.1.3 Other Instructions

| Instruction   | Description                  | Similar Instructions                                                |
|---------------|------------------------------|---------------------------------------------------------------------|
| RBIT Rd, Rn    | Reverses bit order in word   | REV (byte order in word), REV16 (byte order in each half word), REVSH (byte order in low half word, then sign extend) |
| SXTB {Rd,} Rm  | Sign Extension (Byte)        | SXTH (Half word), UXTB/UXTH (Zero extend)                           |
| MOV Rd, Rx     | Move from Rx to Rd           | MVN (move bitwise NOT), MRS (from special reg to general reg), MSR (from general reg to special reg) |
| LSL Rd, Rn, #n | Shift Rn left by n bits and put the result in Rd | LSR (right logical), ASR (Right arithmetic), ROR (rotate right) |

## 2.2 Memory

Memory is byte addressable, but we typically only start a 32 bit word at a multiple of 4, a 16 bit half word at a multiple of 2, and a byte at any point.

|                |                                        |
|----------------|----------------------------------------|
| LDRxx R0, [R1] | Load from memory at R1 into R0         |
| STRxx R0, [R1] | Store contents of R0 into memory at R1 |

If we are loading something smaller than the register width (byte or half word), we need to say how the upper bits of the register are filled: signed loads (add S) [LDRSB, LDRSH] sign extend, and unsigned loads (do not add S) [LDRB, LDRH] zero extend. Stores only need STRB and STRH, since they just write the low bits.

When loading and storing, we can also address bytes at an offset from the location we specify. This is useful for arrays. We have a few modes:

|                  |                   |                                            |
|------------------|-------------------|--------------------------------------------|
| Register Offset  | LDR r0, [r1, r2]  | Target: r1+r2                              |
| Immediate Offset | LDR r0, [r1, #8]  | Target: r1 + 8                             |
| Pre-Index        | LDR r0, [r1, #4]! | Target: r1+4, update r1 to r1+4 after read |
| Post-Index       | LDR r0, [r1], #4  | Target: r1, increase r1 by 4 after read    |
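A rough C analogy for these modes, treating r1 as a pointer to 32 bit words so that moving by one element is a move of 4 bytes. The function names are made up for illustration.

```c
#include <stdint.h>

/* Immediate offset: LDR r0, [r1, #8]. r1 itself is not changed. */
uint32_t load_offset(const uint32_t *r1, unsigned byte_offset) {
    return r1[byte_offset / 4];
}

/* Pre-index: LDR r0, [r1, #4]!. Move r1 forward first, then read. */
uint32_t load_pre_index(const uint32_t **r1) {
    *r1 += 1;            /* one uint32_t element is 4 bytes */
    return **r1;
}

/* Post-index: LDR r0, [r1], #4. Read first, then move r1 forward. */
uint32_t load_post_index(const uint32_t **r1) {
    uint32_t value = **r1;
    *r1 += 1;
    return value;
}
```

Post-index is the natural fit for walking an array from its first element, much like `*p++` in C.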

## 2.3 Endianness

Endianness describes how the bytes of a 32 bit word (or any multi byte data structure) are ordered in memory: is the least significant byte stored at the lowest address (little endian) or at the highest address (big endian)?



Figure 2: Endianness
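The two byte orders can be written out explicitly in C. This sketch stores a word into a byte array by hand, so it behaves the same on any host. The function names are made up.

```c
#include <stdint.h>

/* Little endian: least significant byte at the lowest address. */
void store_le32(uint32_t word, uint8_t out[4]) {
    out[0] = (uint8_t)(word & 0xFFu);
    out[1] = (uint8_t)((word >> 8) & 0xFFu);
    out[2] = (uint8_t)((word >> 16) & 0xFFu);
    out[3] = (uint8_t)((word >> 24) & 0xFFu);
}

/* Big endian: most significant byte at the lowest address. */
void store_be32(uint32_t word, uint8_t out[4]) {
    out[0] = (uint8_t)((word >> 24) & 0xFFu);
    out[1] = (uint8_t)((word >> 16) & 0xFFu);
    out[2] = (uint8_t)((word >> 8) & 0xFFu);
    out[3] = (uint8_t)(word & 0xFFu);
}
```

Storing 0x12345678 gives the bytes 78 56 34 12 in little endian and 12 34 56 78 in big endian, reading from the lowest address up.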

## 2.4 Control Flow Instructions

We have 4 variations of the branch instruction. These branch either to a label, or to an address held in a register.

| Instruction | Operands | Description                   |
|-------------|----------|-------------------------------|
| B           | label    | Branch                        |
| BL          | label    | Branch with link              |
| BLX         | Rm       | Branch and Exchange with link |
| BX          | Rm       | Branch and Exchange           |

We have condition codes. These can be appended to almost any instruction, and the instruction will only execute if the condition is true. The condition is evaluated using the status flags NZCV, which are typically set by a preceding CMP instruction.

| Suffix | Description               | Flags tested          |
|--------|---------------------------|-----------------------|
| EQ     | Equal                     | $Z = 1$               |
| NE     | Not Equal                 | $Z = 0$               |
| CS/HS  | Unsigned Higher or Same   | $C = 1$               |
| CC/LO  | Unsigned Lower            | $C = 0$               |
| MI     | Minus (Negative)          | $N = 1$               |
| PL     | Plus (Positive or Zero)   | $N = 0$               |
| VS     | Overflow Set              | $V = 1$               |
| VC     | Overflow Cleared          | $V = 0$               |
| HI     | Unsigned Higher           | $C = 1 \& Z = 0$      |
| LS     | Unsigned Lower or Same    | $C = 0 \mid Z = 1$    |
| GE     | Signed Greater or Equal   | $N = V$               |
| LT     | Signed Less Than          | $N \neq V$            |
| GT     | Signed Greater Than       | $Z = 0 \& N = V$      |
| LE     | Signed Less than or Equal | $Z = 1 \mid N \neq V$ |
| AL     | Always                    | None                  |

**Ex.** Branch to FOO if r0 is less than 0.

```
CMP r0, #0          ; compare r0 with 0
BLT FOO             ; branch to FOO if LT (N != V)
```

This is similar to some c code doing:

```
if (a < 0) { //assuming a is in r0
    foo(); //or something else, whatever code is located at FOO
}
```

ARM assembly lets us use the IT (If Then) syntax as well.

**Ex.**

```
ITTE NE          ; Two commands will follow with NE
ANDNE r0, r0, r1 ; Then one command with the opposite
ANDNE r2, r2, #1 ; of NE which is EQ
MOVEQ r2, r3     ; 

ITT EQ           ; The IT can be omitted from code
MOVEQ ...        ; and the assembler will add it.
ADDEQ ...
```

# 3 Subroutines

The link register LR contains the return address of the subroutine. This is copied back to the PC when the subroutine is finished.

In ARM, we store any parameters in registers R0 through R3. Any additional parameters need to be put on the stack. Also, if the parameters are larger than 32 bits, they can take up more than one register (a 128 bit parameter would take up R0, R1, R2, R3).

The subroutine passes its return value back in R0.

Registers R0 through R3 can be freely changed by the subroutine, as well as R12 and R14 (LR). In other words, the calling function cannot expect them to keep the same data when the subroutine returns. The opposite is true with registers R4 to R11 where they must be preserved. If the subroutine changes anything in those registers, it must return them to the previous value before returning.

We use BL or BLX to call a subroutine.

## 3.1 Stack

The ARM stack is a full descending stack. Full means that the stack pointer points to the last item pushed (the top piece of data on the stack), not to the next free slot. Descending means that the stack grows toward lower memory addresses as items are pushed to it.

We have the instructions `PUSH{reg_list}` and `POP{reg_list}`.

When we push or pop multiple registers, the highest number register is pushed first, and popped last.

**Ex. `PUSH{r6,r8,r7}`**

This instruction will first push `r8`, followed by `r7` and then finally `r6`.

It would be equivalent to `PUSH{r6,r7,r8}` and to `PUSH{r8,r6,r7}` and so on.

**Ex. `POP{r6,r8,r7}`**

This instruction will first pop `r6`, followed by `r7` and then finally `r8`.

It would be equivalent to `POP{r6,r7,r8}` and to `POP{r8,r6,r7}` and so on.
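The push and pop ordering can be modelled in C with a small array standing in for stack memory. This is only a sketch: the names are made up, the register values are passed in ascending register order, and there are no overflow checks.

```c
#include <stdint.h>

#define STACK_WORDS 16

typedef struct {
    uint32_t mem[STACK_WORDS];
    uint32_t *sp;   /* full stack: points at the last word pushed */
} stack_model;

/* An empty descending stack starts just past the highest address. */
void stack_init(stack_model *s) {
    s->sp = s->mem + STACK_WORDS;
}

/* PUSH {list}: highest numbered register goes first, at the highest address. */
void push_list(stack_model *s, const uint32_t *regs, int n) {
    for (int i = n - 1; i >= 0; i--) {
        s->sp--;            /* descending: move down before storing */
        *s->sp = regs[i];
    }
}

/* POP {list}: lowest numbered register comes off first. */
void pop_list(stack_model *s, uint32_t *regs, int n) {
    for (int i = 0; i < n; i++) {
        regs[i] = *s->sp;
        s->sp++;
    }
}
```

After pushing r6, r7, and r8, the value of r6 sits at the lowest address (where the stack pointer points) and r8 at the highest, so the order written inside the braces does not matter.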

**Ex.**

```
...                                ; main program
    BL foo
...
foo PROC
    PUSH {r4}          ;we are using r4 which must be preserved
    ...
    MOV r4, #1         ;this changes r4, good thing we saved it
    ...
    POP {r4}          ;this restores r4 for the caller function
    BX LR             ;goes back to caller
ENDP
```

We also need to preserve the LR on the stack if we call a subroutine from inside a subroutine, since the inner call overwrites it. Otherwise we would have no way to return to the main program.

Often we have two stack pointers, the MSP (main) and PSP (process). This is toggled by a bit in the CONTROL register.

# 4 C and Assembly

When we have C code, it goes through a lot of steps to get into ARM assembly to be loaded onto the microcontroller.

*Preprocessor → Compiler → Assembler → Linker → Loader → MCU (thru programmer)*  
*→ Debugger*

Typically, C will lay out data so that every word starts at an address that is a multiple of 4 bytes. This is for efficiency. Same idea with half words, but every 2 bytes. C does this by padding extra space. With the ARM compiler, we can use the `__packed` keyword to not pad the extra space. But this can cause unaligned accesses, which are slower and can fault on some cores.
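The padding can be observed with `offsetof` and `sizeof`. This is a sketch with a made-up struct. The exact numbers depend on the compiler and target, but on common 32 bit and 64 bit targets the offset of `value` is 4 and the total size is 12.

```c
#include <stddef.h>
#include <stdint.h>

struct sample {
    uint8_t  flag;    /* 1 byte, normally followed by 3 bytes of padding */
    uint32_t value;   /* word, placed at a multiple of 4 */
    uint16_t count;   /* half word, placed at a multiple of 2 */
};

/* Byte offset of the word member inside the struct. */
size_t sample_value_offset(void) {
    return offsetof(struct sample, value);
}

/* Total size of the struct, including any padding. */
size_t sample_size(void) {
    return sizeof(struct sample);
}
```

The members only need 7 bytes, so anything beyond that in `sample_size()` is padding.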

If we want to mix C and assembly, we can do this. In assembly, we bring in a C function with the `IMPORT` directive before calling it, and make an assembly function visible to C with the `EXPORT` directive. Similarly, in C we can declare something defined in assembly using the `extern` keyword. This can work with data as well as functions.

## 4.1 Volatile Datatypes

The `volatile` keyword tells the compiler that each time we use a variable, it must be read again from memory instead of reusing a copy kept in a register (and each write must actually go to memory). This is useful when an external event may change memory at any point. In this case, we need to ensure that we don't use an old version of the variable.
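A typical use is a flag shared between an interrupt handler and the main loop. In this sketch the handler is an ordinary function called by hand, and all names are made up.

```c
#include <stdint.h>

/* Written by the handler, read by the main loop, so it must be volatile. */
static volatile uint8_t button_flag = 0;

/* Stand-in for an interrupt service routine. */
void button_isr(void) {
    button_flag = 1;
}

/* Called from the main loop: returns 1 once for each recorded event. */
int consume_button_event(void) {
    if (button_flag) {
        button_flag = 0;
        return 1;
    }
    return 0;
}
```

Without `volatile`, a compiler could keep `button_flag` in a register inside a polling loop and never notice the handler's write.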

## 4.2 Interrupts

An interrupt is a signal that tells the controller to stop whatever it is currently doing, save its state on the stack, and then run the interrupt service routine (ISR). Afterwards it restores the state from the stack and goes back to the user program.

The hardware automatically saves xPSR, PC, LR, R12, R3, R2, R1, and R0. Therefore the ISR can freely use R0 to R3 and R12 to store data. Any other registers that are used must be returned to their original state.

# 5 FPU

The Floating Point Unit (FPU) greatly improves efficiency when dealing with floats over software floating point calculations. These FPUs are however quite large and expensive to implement, so they are not found on all systems.

By default, this FPU is disabled due to power usage. It can easily be enabled.

The FPU coprocessor has its own bank of general purpose registers, and special purpose registers labeled as:

|       |     |
|-------|-----|
| s1    | s0  |
| s3    | s2  |
| s5    | s4  |
| ...   | ... |
| s31   | s30 |
| FPCAR |     |
| FPSCR |     |
| FPCCR |     |

If we are operating with doubles instead of floats, then we use two of those registers to hold a double. So d0 would use s1 and s0. We can copy to and from the s registers to the r registers.

|        |        |
|--------|--------|
| d0[0]  | d0[1]  |
| d1[0]  | d1[1]  |
| d2[0]  | d2[1]  |
| ...    | ...    |
| d14[0] | d14[1] |
| d15[0] | d15[1] |

We have similar instructions to the regular CPU, but for the FPU they are prefixed with V. So LDR becomes VLDR. There are even a few advanced functions not present in the standard CPU such as VSQRT.

| Standard | FPU   |
|----------|-------|
| LDR      | VLDR  |
| STR      | VSTR  |
| MOV      | VMOV  |
| ADD      | VADD  |
| MLA      | VMLA  |
| CMP      | VCMP  |
| N/A      | VABS  |
| N/A      | VSQRT |

We often suffix the commands with a data type such as F32/F64 or U32/S32. These signify float/double, and unsigned/signed integer. They are appended to the instruction, like VADD.F32. Two suffixes are used to convert from one format to another: for example, VCVT.F32.S32 converts a signed 32 bit integer to a float.

# 6 GPIO

To access our GPIO devices, we use memory mapped IO. This is where a section of the memory address space is mapped to the IO device, such as an LED or a speaker. All the settings we access through software are found in these memory mapped registers, such as the clock enable, GPIO port enable, GPIO mode, and so on.
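Working with memory mapped registers mostly comes down to setting, clearing, and reading bits through a `volatile` pointer. The sketch below uses made-up helper names and takes the register address as a parameter, so it can be tried on an ordinary variable. Real register addresses and field layouts come from the chip's reference manual and are not assumed here.

```c
#include <stdint.h>

/* Set one bit of a register. */
void reg_set_bit(volatile uint32_t *reg, unsigned bit) {
    *reg |= (1u << bit);
}

/* Clear one bit of a register. */
void reg_clear_bit(volatile uint32_t *reg, unsigned bit) {
    *reg &= ~(1u << bit);
}

/* Read one bit of a register. */
unsigned reg_read_bit(const volatile uint32_t *reg, unsigned bit) {
    return (*reg >> bit) & 1u;
}

/* Write a multi bit field (width < 32), such as a pin mode setting. */
void reg_write_field(volatile uint32_t *reg, unsigned shift,
                     unsigned width, uint32_t value) {
    uint32_t mask = ((1u << width) - 1u) << shift;
    *reg = (*reg & ~mask) | ((value << shift) & mask);
}
```

Each helper is a read, modify, write sequence, which is the load, modify, store pattern at the C level.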

The GPIO speed can be configured where a higher speed is faster, but uses more power and has more noise.

The GPIO pin can be set up in different modes such as input, output, analog, or alternate function.

The GPIO pin can be either in open drain or push pull mode. This changes how the physical transistors are set up. We also have a pull up or pull down resistor that can be enabled/disabled.

We have a certain number of GPIO devices, each device has a certain number of pins. So we could have 8 devices with 16 pins each for a total of 128 inputs/outputs.

## 6.1 Debouncing

Debouncing is needed for any sort of input switch. If we do not have debouncing, then the hardware will detect multiple presses per single button press, which is not what we want. We can either debounce using hardware with an RC circuit (or an SR latch and dual throw switch if we are fancy), or through software with a delay.
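Instead of a plain delay, software debouncing is often done by sampling the pin at a fixed interval and only accepting a new state after several identical samples in a row. This is a sketch of that variant. The names and the threshold of 4 samples are assumptions for the example.

```c
#include <stdint.h>

#define DEBOUNCE_SAMPLES 4   /* assumed threshold, tune for the real switch */

typedef struct {
    uint8_t stable;   /* last accepted (debounced) state */
    uint8_t count;    /* consecutive samples that differ from stable */
} debouncer;

/* Call at a fixed interval with the raw pin reading (0 or 1).
   Returns the debounced state. */
uint8_t debounce_update(debouncer *d, uint8_t raw) {
    if (raw == d->stable) {
        d->count = 0;                 /* a bounce back resets the count */
    } else if (++d->count >= DEBOUNCE_SAMPLES) {
        d->stable = raw;              /* the new state held long enough */
        d->count = 0;
    }
    return d->stable;
}
```

A bouncy sequence such as 1, 0, 1, 1, 1, 1 only reports a press on the last sample.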

## 6.2 Interrupts

An interrupt is an external signal sent into the CPU for processing. So when a button is pressed, or motion is detected on a motion detector, an interrupt will be generated.

# 7 Timers

Timers generate interrupts at a fixed interval. This is what the SYSTICK timer does: it generates an interrupt every, say, 1ms. SYSTICK is a hardware component in ARM CORTEX M.

We have a reload value stored in a register (**ARR**). This is the value that the counter starts from. It then counts down to 0, at which point it creates the interrupt and reloads.

We also have a current value register to get the current value of the counter. This needs to be cleared on startup before running a timer since it holds an undefined value after reset.
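Since the counter counts from the reload value down to 0, one full period takes reload + 1 clock ticks. A small sketch of the calculation, with a made-up function name and an assumed 4 MHz clock in the usage note:

```c
#include <stdint.h>

/* Reload value that gives the requested number of interrupts per second.
   Counting from N-1 down to 0 takes N ticks, hence the minus one.
   Assumes clock_hz divides evenly and the result fits the reload register. */
uint32_t timer_reload_value(uint32_t clock_hz, uint32_t interrupts_per_second) {
    return clock_hz / interrupts_per_second - 1u;
}
```

With a 4 MHz clock, `timer_reload_value(4000000, 1000)` gives 3999 for a 1ms interrupt. The SYSTICK reload register is 24 bits wide, so the value must stay below 2^24.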

We also often have some other timers that are independent of the processor. These are useful for peripheral devices. They can either operate in **capture** mode, or **compare** mode (useful for PWM, 1 if below Comp value, 0 if above). Capture will record the time when events occur, and compare will trigger events at specific times.
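The compare rule for PWM (1 if below the compare value, 0 if above) can be modelled in C for an up counter that runs from 0 to the reload value. The names are made up, and the reload value is assumed to be well below the 32 bit maximum.

```c
#include <stdint.h>

/* Output is 1 while the counter is below the compare value. */
unsigned pwm_output(uint32_t counter, uint32_t compare) {
    return counter < compare ? 1u : 0u;
}

/* Number of ticks the output is high during one period 0..reload. */
uint32_t pwm_high_ticks(uint32_t reload, uint32_t compare) {
    uint32_t high = 0;
    for (uint32_t counter = 0; counter <= reload; counter++) {
        high += pwm_output(counter, compare);
    }
    return high;
}
```

With a reload value of 9 (10 ticks per period) and a compare value of 3, the output is high for 3 ticks, which is a 30% duty cycle.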

When counting, we can either use down counting, up counting, or center counting.



We can also change the repetitions. This allows us to determine the number of reloads between events. This uses the repetition counter register (**RCR**). The rate at which the counter itself ticks is set using an input prescaler, which is basically another counter that divides the clock.



Figure 3:

# 8 Direct Memory Access (DMA)

By default, when we copy from a peripheral to memory, we need to go through the CPU. This is inefficient. DMA lets data go directly from the peripheral to the memory, which is better since it uses much less CPU time. We have a separate DMA controller that can initiate the transfers. The data can either flow through the DMA controller, or fly by directly to the memory without passing through the controller.

# 9 DAC and ADC (Analog Interfacing)

When interfacing with analog devices, we need to either convert a digital signal to analog, or convert an analog signal to digital, depending on whether it is an output or an input.

## 9.1 DAC

For the digital to analog converter, we have a certain number of bits which gives the range of analog values.

The DAC can be implemented in a few ways, such as a string of resistors with switches acting as a voltage divider, or a combination of resistors with switches (same idea, but we use combinations of resistors for more possible outputs with fewer resistors). We can also use PWM with a filter.

## 9.2 ADC

When converting from analog to digital, we need to sample at a certain frequency, and each sample again has a bit depth (resolution). The minimum sampling rate that we need to reconstruct the analog signal is called the Nyquist rate, which is twice the max frequency of the signal.

One way to do this is to use a Successive Approximation Register (SAR). This uses a comparator to test digital values against the raw analog signal using a binary search, deciding one bit per cycle from the MSB down.

**Ex.** We have a 12 bit ADC, where it takes 4 cycles to sample the signal. What is the total conversion time?

For the total conversion time, we need to account for the time to sample the signal, and then the time to determine the actual value using the SAR. Since this is a binary search that decides one bit per cycle, it will take 12 cycles.

$$T = 4 + 12 = 16 \text{ cycles}$$
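The binary search can be simulated in C. In this sketch the comparator is replaced by integer arithmetic, voltages are given in millivolts, and the names are made up for illustration.

```c
#include <stdint.h>

/* Simulated SAR conversion: one comparison per bit, MSB first.
   Returns the digital code and writes the number of comparison steps. */
uint32_t sar_convert(uint32_t vin_mv, uint32_t vref_mv,
                     unsigned bits, unsigned *cycles) {
    uint32_t code = 0;
    unsigned steps = 0;
    for (int b = (int)bits - 1; b >= 0; b--) {
        uint32_t trial = code | (1u << b);
        /* Keep the bit if vin >= vref * trial / 2^bits (comparator decision). */
        if (((uint64_t)vin_mv << bits) >= (uint64_t)vref_mv * trial) {
            code = trial;
        }
        steps++;
    }
    if (cycles) {
        *cycles = steps;
    }
    return code;
}
```

For a 12 bit conversion the loop always runs 12 times, which is where the 12 cycles in the total above come from. With a 3300 mV reference, an input of 1650 mV converts to code 2048.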

# 10 Serial Communications

## 10.1 I2C

I2C is a 2 wire communication standard. It has a data line (SDA), and a clock line (SCL). It is relatively low speed. Both wires use open drain drivers with a pull up resistor (so are logic 1 by default).

- START is when SDA has falling edge when SCL is HI
- STOP is when SDA has rising edge when SCL is HI
- DATA can only change when SCL is LO

We have a target address, which can be 7 or 10 bits, and then we have the data. If we have more than 1 controller, then the arbitration works by each controller checking SDA. If SDA is LO when it is driving SDA HI, then that controller backs off and waits. This works since SDA is pulled up to HI, so any device driving it LO wins.
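The arbitration rule can be modelled in C for two controllers that each try to send one byte, MSB first. The open drain bus acts like an AND of what everyone drives. All names are made up for this sketch.

```c
#include <stdint.h>

/* Open drain line: LO (0) if any controller pulls it low. */
unsigned sda_level(unsigned drive_a, unsigned drive_b) {
    return drive_a & drive_b & 1u;
}

/* A controller loses if it released the line (1) but reads back 0. */
unsigned lost_arbitration(unsigned my_drive, unsigned bus) {
    return my_drive == 1u && bus == 0u;
}

/* Returns 0 if controller A wins, 1 if controller B wins,
   and -1 if both sent the same byte (no conflict detected). */
int arbitrate_byte(uint8_t a, uint8_t b) {
    for (int bit = 7; bit >= 0; bit--) {
        unsigned da = (a >> bit) & 1u;
        unsigned db = (b >> bit) & 1u;
        unsigned bus = sda_level(da, db);
        if (lost_arbitration(da, bus)) {
            return 1;
        }
        if (lost_arbitration(db, bus)) {
            return 0;
        }
    }
    return -1;
}
```

The controller sending the first 0 where the other sends a 1 wins, and the winner's data is never corrupted.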

## 10.2 SPI

## 10.3 UART

## 10.4 USB

# 11 The Bus (Bus Interfaces)

# 12 Appendix

## Cortex-M4F Instructions used in ARM Assembly for Embedded Applications (ISBN 978-1-09254-223-4)

| Function Call/Return              | Operation                                                                              | Notes                                                              | Clock Cycles |
|-----------------------------------|----------------------------------------------------------------------------------------|--------------------------------------------------------------------|--------------|
| BL <i>label</i>                   | $LR \leftarrow \text{return address}; PC \leftarrow \text{address of label}$           | BL is used to call a function<br>BX LR is used as function return  | 2-4          |
| BLX <i>Rn</i>                     | $LR \leftarrow \text{return address}; PC \leftarrow R_n$                               |                                                                    |              |
| BX <i>Rn</i>                      | $PC \leftarrow R_n$                                                                    |                                                                    |              |
| B <i>label</i>                    | $PC \leftarrow \text{address of label}$                                                |                                                                    |              |

| Load Integer Constant             | Operation                                                                              | Flags | Notes                                                               |
|-----------------------------------|----------------------------------------------------------------------------------------|-------|---------------------------------------------------------------------|
| ADR <i>Rd,label</i>               | $R_d \leftarrow \text{address of label}$                                               |       | $PC-4095 \leq \text{address} \leq PC+4095$                          |
| MOV{S} <i>Rd, constant</i>        | $R_d \leftarrow \text{constant}$                                                       | NZ    | $0 \leq \text{constant} \leq 255 (FF_{16}) \text{ \& a few others}$ |
| MVN{S} <i>Rd, constant</i>        | $R_d \leftarrow \sim \text{constant}$                                                  | NZ    | $0 \leq \text{constant} \leq 255 (FF_{16}) \text{ \& a few others}$ |
| MOVW <i>Rd, constant</i>          | $R_d \leftarrow \text{constant}$                                                       |       | $0 \leq \text{constant} \leq 65535 (FFFF_{16})$                     |
| MOVT <i>Rd, constant</i>          | $R_d<31..16> \leftarrow \text{constant}$                                               |       | $0 \leq \text{constant} \leq 65535 (FFFF_{16})$                     |

| Load/Store Memory                 | Operation                                                                              | Bits                                                               | Notes                                                              |              |
|-----------------------------------|----------------------------------------------------------------------------------------|--------------------------------------------------------------------|--------------------------------------------------------------------|--------------|
| LDRB <i>Rd,[address mode]</i>     | $R_d \leftarrow \text{memory}<7..0> \text{ (zero extended)}$                           | 8                                                                  | $R_d<31..8> \leftarrow 24 \text{ 0's}$                             |              |
| LDRSB <i>Rd,[address mode]</i>    | $R_d \leftarrow \text{memory}<7..0> \text{ (sign extended)}$                           | 8                                                                  | $R_d<31..8> \leftarrow 24 \text{ copies of } R_d<7>$               |              |
| LDRH <i>Rd,[address mode]</i>     | $R_d \leftarrow \text{memory}<15..0> \text{ (zero extended)}$                          | 16                                                                 | $R_d<31..16> \leftarrow 16 \text{ 0's}$                            |              |
| LDRSH <i>Rd,[address mode]</i>    | $R_d \leftarrow \text{memory}<15..0> \text{ (sign extended)}$                          | 16                                                                 | $R_d<31..16> \leftarrow 16 \text{ copies of } R_d<16>$             |              |
| LDR <i>Rd,[address mode]</i>      | $R_d \leftarrow \text{memory}<31..0>$                                                  | 32                                                                 |                                                                    |              |
| LDRD <i>Rt,Rt2,[address mode]</i> | $R_{t2}.R_t \leftarrow \text{memory}<63..0>$                                           | 64                                                                 | Can't use register offset adrs mode                                |              |
| STRB <i>Rd,[address mode]</i>     | $R_d \rightarrow \text{memory}<7..0>$                                                  | 8                                                                  |                                                                    |              |
| STRH <i>Rd,[address mode]</i>     | $R_d \rightarrow \text{memory}<15..0>$                                                 | 16                                                                 |                                                                    |              |
| STR <i>Rd,[address mode]</i>      | $R_d \rightarrow \text{memory}<31..0>$                                                 | 32                                                                 |                                                                    |              |
| STRD <i>Rt,Rt2,[address mode]</i> | $R_{t2}.R_t \rightarrow \text{memory}<63..0>$                                          | 64                                                                 | Can't use register offset adrs mode                                |              |
| Load/Store Multiple               | Operation                                                                              | Notes                                                              | Clock Cycles                                                       |              |
| POP <i>{register list}</i>        | $\text{registers} \leftarrow \text{memory}[SP]; SP += 4 \times \# \text{ registers}$   | regs: Not SP; PC/LR, but not both                                  | 1 + #registers                                                     |              |
| PUSH <i>{register list}</i>       | $SP -= 4 \times \# \text{ registers}; \text{ registers} \rightarrow \text{memory}[SP]$ | regs: Neither SP or PC.                                            |                                                                    |              |
| LDMIA <i>Rn!, {register list}</i> | $\text{registers} \leftarrow \text{memory}[R_n]$                                       | if "!" is appended,<br>then $R_n += 4 \times \# \text{ registers}$ |                                                                    |              |
| STMIA <i>Rn!, {register list}</i> | $\text{registers} \rightarrow \text{memory}[R_n]$                                      |                                                                    |                                                                    |              |
| LDMDB <i>Rn!, {register list}</i> | $\text{registers} \leftarrow \text{memory}[R_n - 4 \times \# \text{ registers}]$       | if "!" is appended,<br>then $R_n -= 4 \times \# \text{ registers}$ |                                                                    |              |
| STMDB <i>Rn!, {register list}</i> | $\text{registers} \rightarrow \text{memory}[R_n - 4 \times \# \text{ registers}]$      |                                                                    |                                                                    |              |
| Move / Add / Subtract             | Operation                                                                              | Flags                                                              | operand2 options:                                                  | Clock Cycles |
| MOV{S} <i>Rd,Rn</i>               | $R_d \leftarrow R_n$                                                                   | NZ                                                                 |                                                                    | 1            |
| ADD{S} <i>Rd,Rn,operand2</i>      | $R_d \leftarrow R_n + \text{operand2}$                                                 | NZCV                                                               | 1. constant                                                        |              |
| ADC{S} <i>Rd,Rn,operand2</i>      | $R_d \leftarrow R_n + \text{operand2} + C$                                             | NZCV                                                               | 2. $R_m$ (a register)                                              |              |
| SUB{S} <i>Rd,Rn,operand2</i>      | $R_d \leftarrow R_n - \text{operand2}$                                                 | NZCV                                                               | 3. $R_m, shift$<br>(Any kind of shift)                             |              |
| SBC{S} <i>Rd,Rn,operand2</i>      | $R_d \leftarrow R_n - \text{operand2} + C - 1$                                         | NZCV                                                               |                                                                    |              |
| RSB{S} <i>Rd,Rn,operand2</i>      | $R_d \leftarrow \text{operand2} - R_n$                                                 | NZCV                                                               |                                                                    |              |
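
The ADC and SBC rows are what make multi-word arithmetic possible: the carry flag left by a flag-setting add or subtract on the low words is consumed by the operation on the high words. The following C sketch models that carry chain for a 64-bit value held in two 32-bit registers. The type `pair64_t` and both function names are hypothetical helpers introduced only for illustration.

```c
#include <stdint.h>

/* Hypothetical result type: a 64-bit value held in two 32-bit registers. */
typedef struct { uint32_t lo, hi; } pair64_t;

/* Models  ADDS lo,a_lo,b_lo  followed by  ADC hi,a_hi,b_hi. */
static pair64_t add64_model(uint32_t a_lo, uint32_t a_hi,
                            uint32_t b_lo, uint32_t b_hi)
{
    pair64_t r;
    r.lo = a_lo + b_lo;                    /* ADDS: wraps modulo 2^32 */
    uint32_t c = (r.lo < a_lo) ? 1u : 0u;  /* C = 1 when the add carried out */
    r.hi = a_hi + b_hi + c;                /* ADC: Rn + operand2 + C */
    return r;
}

/* Models  SUBS lo,a_lo,b_lo  followed by  SBC hi,a_hi,b_hi. */
static pair64_t sub64_model(uint32_t a_lo, uint32_t a_hi,
                            uint32_t b_lo, uint32_t b_hi)
{
    pair64_t r;
    r.lo = a_lo - b_lo;                    /* SUBS: wraps modulo 2^32 */
    uint32_t c = (a_lo >= b_lo) ? 1u : 0u; /* C = 1 means "no borrow" */
    r.hi = a_hi - b_hi + c - 1u;           /* SBC: Rn - operand2 + C - 1 */
    return r;
}
```

Note that after a subtraction C is the inverse of a borrow, which is why SBC adds C and then subtracts 1.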

Revised: December 15, 2020


## Cortex-M4F Instructions used in ARM Assembly for Embedded Applications (ISBN 978-1-09254-223-4)

| Multiply / Divide |  | Operation | Flags | Notes | Clock Cycles |
|-------------------|--|-----------|-------|-------|--------------|
| MUL{S} | $R_d, R_n, R_m$ | $R_d \leftarrow (R_n \times R_m)<31..0>$ | NZ | $32 \leftarrow 32 \times 32$; C and V unchanged | 1 |
| MLA | $R_d, R_n, R_m, R_a$ | $R_d \leftarrow R_a + (R_n \times R_m)<31..0>$ | | $32 \leftarrow 32 + 32 \times 32$ | |
| MLS | $R_d, R_n, R_m, R_a$ | $R_d \leftarrow R_a - (R_n \times R_m)<31..0>$ | | $32 \leftarrow 32 - 32 \times 32$ | |
| SMMUL{R} | $R_d, R_n, R_m$ | $R_d \leftarrow (R_n \times R_m)<63..32>$ | | Upper half of signed 64-bit product.<br>Append R to round to nearest, ties towards $+\infty$ (adds 0x80000000 to the 64-bit value before the upper half is taken) | |
| SMMLA{R} | $R_d, R_n, R_m, R_a$ | $R_d \leftarrow R_a + (R_n \times R_m)<63..32>$ | | | |
| SMMLS{R} | $R_d, R_n, R_m, R_a$ | $R_d \leftarrow R_a - (R_n \times R_m)<63..32>$ | | | |
| UMULL<br>SMULL | $R_{dlo}, R_{dhi}, R_n, R_m$ | $R_{dhi}R_{dlo} \leftarrow R_n \times R_m$ | | Unsigned/Signed: $64 \leftarrow 32 \times 32$ | |
| UMLAL<br>SMLAL | $R_{dlo}, R_{dhi}, R_n, R_m$ | $R_{dhi}R_{dlo} \leftarrow R_{dhi}R_{dlo} + R_n \times R_m$ | | Unsigned/Signed: $64 \leftarrow 64 + 32 \times 32$ | |
| UDIV<br>SDIV | $R_d, R_n, R_m$ | $R_d \leftarrow R_n / R_m$ | | Unsigned/Signed: $32 \leftarrow 32 \div 32$ | 2-12 |
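
To make the "upper half of the 64-bit product" rows concrete, here is a minimal C sketch of SMMUL and SMMULR. The function names are hypothetical, and the sketch assumes that `>>` on a negative `int64_t` is an arithmetic shift (true for GCC and Clang, but implementation-defined in ISO C).

```c
#include <stdint.h>

/* Models SMMUL: bits 63..32 of the signed 64-bit product. */
static int32_t smmul_model(int32_t rn, int32_t rm)
{
    int64_t product = (int64_t)rn * (int64_t)rm;
    return (int32_t)(product >> 32);      /* assumes arithmetic shift */
}

/* Models SMMULR: add 0x80000000 before taking the upper half. */
static int32_t smmulr_model(int32_t rn, int32_t rm)
{
    int64_t product = (int64_t)rn * (int64_t)rm;
    return (int32_t)((product + 0x80000000LL) >> 32);
}
```

With `rn = 3` and `rm = 0x40000000` the product is 0xC0000000, so the plain form returns 0 while the rounding form returns 1.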

| Saturating Instructions |  | Operation | Min | Max | operand2 options | Clock Cycles |
|-------------------------|--|-----------|-----|-----|------------------|--------------|
| SSAT | $R_d, n, \text{operand2}$ | $R_d \leftarrow \text{operand2}$ | $-2^{n-1}$ | $2^{n-1}-1$ | 1. $R_m$ (a register)<br>2. $R_m$, ASR constant<br>3. $R_m$, LSL constant | 1 |
| USAT | $R_d, n, \text{operand2}$ | $R_d \leftarrow \text{operand2}$ | 0 | $2^n-1$ | | |
| QADD | $R_d, R_n, R_m$ | $R_d \leftarrow R_n + R_m$ | $-2^{31}$ | $2^{31}-1$ | | |
| QSUB | $R_d, R_n, R_m$ | $R_d \leftarrow R_n - R_m$ | $-2^{31}$ | $2^{31}-1$ | | |

All four instructions set Q $\leftarrow 1$ if the result saturates.
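
The Min and Max columns describe a clamp. The C sketch below models that clamp for SSAT, USAT and QADD under the assumption that `n` is 1-32 for SSAT and 0-31 for USAT. The function names are hypothetical, and the Q flag is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Models SSAT Rd,n,Rm: clamp to the signed n-bit range. */
static int32_t ssat_model(int32_t x, unsigned n)
{
    assert(n >= 1 && n <= 32);
    int64_t max = ((int64_t)1 << (n - 1)) - 1;   /* 2^(n-1) - 1 */
    int64_t min = -((int64_t)1 << (n - 1));      /* -2^(n-1)    */
    if (x > max) return (int32_t)max;
    if (x < min) return (int32_t)min;
    return x;
}

/* Models USAT Rd,n,Rm: clamp a signed input to 0 .. 2^n - 1. */
static uint32_t usat_model(int32_t x, unsigned n)
{
    assert(n <= 31);
    int64_t max = ((int64_t)1 << n) - 1;
    if (x < 0) return 0u;
    if (x > max) return (uint32_t)max;
    return (uint32_t)x;
}

/* Models QADD Rd,Rn,Rm: 32-bit signed saturating add. */
static int32_t qadd_model(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;       /* cannot overflow in 64 bits */
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}
```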

| SIMD Signed Saturating ADD/SUB | Operation                                                         | Min to Max                                            | Notes                                                                       | Clock Cycles |
|--------------------------------|-------------------------------------------------------------------|-------------------------------------------------------|-----------------------------------------------------------------------------|--------------|
| QADD8<br>QADD16                | $R_d[\text{bits}] \leftarrow R_n[\text{bits}] + R_m[\text{bits}]$ | 8: $-2^7$ to $+2^7-1$<br>16: $-2^{15}$ to $+2^{15}-1$ | 8-bit form: bytes 0-3 (bits 7..0, 15..8, 23..16, & 31..24)<br>16-bit form: halfwords 0 and 1 (bits 15..0 & 31..16)<br>(No flags affected) | 1            |
| QSUB8<br>QSUB16                | $R_d[\text{bits}] \leftarrow R_n[\text{bits}] - R_m[\text{bits}]$ |                                                       |                                                                             |              |

| SIMD Unsigned Saturating ADD/SUB | Operation                                                         | Min to Max                             | Notes                                                                | Clock Cycles |
|----------------------------------|-------------------------------------------------------------------|----------------------------------------|----------------------------------------------------------------------|--------------|
| UQADD8<br>UQADD16                | $R_d[\text{bits}] \leftarrow R_n[\text{bits}] + R_m[\text{bits}]$ | 8: 0 to $2^8-1$<br>16: 0 to $2^{16}-1$ | 8-bit form: bytes 0-3 (bits 7..0, 15..8, 23..16, & 31..24)<br>16-bit form: halfwords 0 and 1 (bits 15..0 & 31..16)<br>(No flags affected) | 1            |
| UQSUB8<br>UQSUB16                | $R_d[\text{bits}] \leftarrow R_n[\text{bits}] - R_m[\text{bits}]$ |                                        |                                                                      |              |
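
As an illustration of what "parallel, saturating" means for the 8-bit forms, this C sketch processes the four byte lanes of a register one at a time and clamps each lane independently. The function names and the helper `s8_lane` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: read byte lane 0-3 as a signed 8-bit value. */
static int32_t s8_lane(uint32_t r, int lane)
{
    int32_t v = (int32_t)((r >> (8 * lane)) & 0xFFu);
    return (v > 127) ? v - 256 : v;
}

/* Models UQADD8: four unsigned adds, each clamped to 0 .. 2^8 - 1. */
static uint32_t uqadd8_model(uint32_t rn, uint32_t rm)
{
    uint32_t rd = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint32_t sum = ((rn >> (8 * lane)) & 0xFFu)
                     + ((rm >> (8 * lane)) & 0xFFu);
        if (sum > 0xFFu) sum = 0xFFu;
        rd |= sum << (8 * lane);
    }
    return rd;
}

/* Models QADD8: four signed adds, each clamped to -2^7 .. 2^7 - 1. */
static uint32_t qadd8_model(uint32_t rn, uint32_t rm)
{
    uint32_t rd = 0;
    for (int lane = 0; lane < 4; lane++) {
        int32_t sum = s8_lane(rn, lane) + s8_lane(rm, lane);
        if (sum > 127)  sum = 127;
        if (sum < -128) sum = -128;
        rd |= ((uint32_t)sum & 0xFFu) << (8 * lane);
    }
    return rd;
}
```

A carry out of one lane never reaches its neighbour, which is the difference from an ordinary 32-bit ADD on packed data.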

| SIMD Signed Non-Saturating ADD/SUB | Operation                                                         | GE Flags              | Notes                                                                      | Clock Cycles |
|------------------------------------|-------------------------------------------------------------------|-----------------------|----------------------------------------------------------------------------|--------------|
| SADD8<br>SADD16                    | $R_d[\text{bits}] \leftarrow R_n[\text{bits}] + R_m[\text{bits}]$ | sum $\geq 0$ ? 1 : 0  | Parallel operations:<br>Four 8-bit operations,<br>or two 16-bit operations | 1            |
| SSUB8<br>SSUB16                    | $R_d[\text{bits}] \leftarrow R_n[\text{bits}] - R_m[\text{bits}]$ | diff $\geq 0$ ? 1 : 0 |                                                                            |              |

| SIMD Unsigned Non-Saturating ADD/SUB | Operation                                                         | GE Flags              | Notes                                                                      | Clock Cycles |
|--------------------------------------|-------------------------------------------------------------------|-----------------------|----------------------------------------------------------------------------|--------------|
| UADD8<br>UADD16                      | $R_d[\text{bits}] \leftarrow R_n[\text{bits}] + R_m[\text{bits}]$ | overflow ? 1 : 0      | Parallel operations:<br>Four 8-bit operations,<br>or two 16-bit operations | 1            |
| USUB8<br>USUB16                      | $R_d[\text{bits}] \leftarrow R_n[\text{bits}] - R_m[\text{bits}]$ | diff $\geq 0$ ? 1 : 0 |                                                                            |              |


| Q and GE Flag Instructions |                                                  | Operation                                                                              | Notes                                                    | Clock Cycles |
|----------------------------|--------------------------------------------------|----------------------------------------------------------------------------------------|----------------------------------------------------------|--------------|
| SEL                        | R <sub>d</sub> , R <sub>n</sub> , R <sub>m</sub> | R <sub>d</sub> [bits] ← (GE[byte] = 1) ? R <sub>n</sub> [bits] : R <sub>m</sub> [bits] | For bytes 0-3: bits 7..0, 15..8, 23..16, & 31..24        | 1            |
| MRS                        | R <sub>d</sub> , APSR                            | R <sub>d</sub> <31..27> ← NZCVQ<br>R <sub>d</sub> <19..16> ← GE flags                  | All other bits of R <sub>d</sub> are filled with zeroes. |              |
| MSR                        | APSR_nzcvq, R <sub>n</sub>                       | NZCVQ ← R <sub>n</sub> <31..27>                                                        | Other flags in the PSR are not affected.                 |              |
| MSR                        | APSR_g, R <sub>n</sub>                           | GE flags ← R <sub>n</sub> <19..16>                                                     |                                                          |              |
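
The GE flags exist mainly so that SEL can act on the outcome of a preceding parallel add or subtract. A common idiom is a per-byte unsigned maximum: USUB8 sets GE for each byte where the first operand is at least as large, and SEL then picks that operand's byte. The C sketch below models the pair. The struct and function names are hypothetical, and bit `i` of `ge` stands for the GE flag of byte `i`.

```c
#include <stdint.h>

/* Hypothetical result type: destination register plus the four GE flags. */
typedef struct { uint32_t rd; unsigned ge; } simd_result_t;

/* Models USUB8: per-byte subtract; GE[i] = 1 when the difference is >= 0. */
static simd_result_t usub8_model(uint32_t rn, uint32_t rm)
{
    simd_result_t r = { 0u, 0u };
    for (int lane = 0; lane < 4; lane++) {
        uint32_t a = (rn >> (8 * lane)) & 0xFFu;
        uint32_t b = (rm >> (8 * lane)) & 0xFFu;
        if (a >= b) r.ge |= 1u << lane;
        r.rd |= ((a - b) & 0xFFu) << (8 * lane);
    }
    return r;
}

/* Models SEL: take the byte from Rn where GE[i] = 1, otherwise from Rm. */
static uint32_t sel_model(uint32_t rn, uint32_t rm, unsigned ge)
{
    uint32_t rd = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint32_t mask = 0xFFu << (8 * lane);
        rd |= ((ge >> lane) & 1u) ? (rn & mask) : (rm & mask);
    }
    return rd;
}

/* USUB8 followed by SEL yields the per-byte unsigned maximum. */
static uint32_t max_u8x4_model(uint32_t a, uint32_t b)
{
    return sel_model(a, b, usub8_model(a, b).ge);
}
```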

| SIMD Multiply Instructions |                                                                       | Operation                                                                                                                                    | Notes                                                                   | Clock Cycles |
|----------------------------|-----------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------|--------------|
| SMUAD                      | R <sub>d</sub> , R <sub>n</sub> , R <sub>m</sub>                      | R <sub>d</sub> ← R <sub>n</sub> <15..00> × R <sub>m</sub> <15..00> + R <sub>n</sub> <31..16> × R <sub>m</sub> <31..16>                       | Sets Q flag if an addition or subtraction overflows; does not saturate. | 1            |
| SMUSD                      | R <sub>d</sub> , R <sub>n</sub> , R <sub>m</sub>                      | R <sub>d</sub> ← R <sub>n</sub> <15..00> × R <sub>m</sub> <15..00> - R <sub>n</sub> <31..16> × R <sub>m</sub> <31..16>                       |                                                                         |              |
| SMLAD                      | R <sub>d</sub> , R <sub>n</sub> , R <sub>m</sub> , R <sub>a</sub>     | R <sub>d</sub> ← R <sub>a</sub> + R <sub>n</sub> <15..00> × R <sub>m</sub> <15..00> + R <sub>n</sub> <31..16> × R <sub>m</sub> <31..16>      |                                                                         |              |
| SMLSD                      | R <sub>d</sub> , R <sub>n</sub> , R <sub>m</sub> , R <sub>a</sub>     | R <sub>d</sub> ← R <sub>a</sub> + R <sub>n</sub> <15..00> × R <sub>m</sub> <15..00> - R <sub>n</sub> <31..16> × R <sub>m</sub> <31..16>      |                                                                         |              |
| SMLALD                     | R <sub>dlo</sub> , R <sub>dhi</sub> , R <sub>n</sub> , R <sub>m</sub> | R <sub>dhi</sub> , R <sub>dlo</sub> += R <sub>n</sub> <15..00> × R <sub>m</sub> <15..00> + R <sub>n</sub> <31..16> × R <sub>m</sub> <31..16> |                                                                         |              |
| SMLSLD                     | R <sub>dlo</sub> , R <sub>dhi</sub> , R <sub>n</sub> , R <sub>m</sub> | R <sub>dhi</sub> , R <sub>dlo</sub> += R <sub>n</sub> <15..00> × R <sub>m</sub> <15..00> - R <sub>n</sub> <31..16> × R <sub>m</sub> <31..16> |                                                                         |              |

Appending "X" to the instruction mnemonic (e.g. SMUADX) exchanges the halfwords of R<sub>m</sub>, so the two products become R<sub>n</sub><15..00> × R<sub>m</sub><31..16> and R<sub>n</sub><31..16> × R<sub>m</sub><15..00>.
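
The dual-multiply rows can be sketched in C as follows, using SMUAD and its "X" variant as the example. The struct, the helpers `lo16`/`hi16` and the `exchange` parameter are hypothetical names. The `q` field reports whether the 32-bit sum overflowed, in which case the result wraps rather than saturates.

```c
#include <stdint.h>

/* Hypothetical result type: destination register and whether Q would be set. */
typedef struct { uint32_t rd; int q; } dual_mul_t;

/* Signed value of the bottom / top halfword of a register. */
static int32_t lo16(uint32_t r)
{
    int32_t v = (int32_t)(r & 0xFFFFu);
    return (v >= 0x8000) ? v - 0x10000 : v;
}
static int32_t hi16(uint32_t r) { return lo16(r >> 16); }

/* Models SMUAD (exchange = 0) and SMUADX (exchange = 1). */
static dual_mul_t smuad_model(uint32_t rn, uint32_t rm, int exchange)
{
    int32_t m_lo = exchange ? hi16(rm) : lo16(rm);
    int32_t m_hi = exchange ? lo16(rm) : hi16(rm);
    int64_t sum  = (int64_t)lo16(rn) * m_lo + (int64_t)hi16(rn) * m_hi;

    dual_mul_t r;
    r.q  = (sum > INT32_MAX) || (sum < INT32_MIN);  /* overflow sets Q */
    r.rd = (uint32_t)sum;                           /* wraps, no saturation */
    return r;
}
```

The only input pair that overflows is 0x80008000 with 0x80008000, where both products equal $2^{30}$.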

| Signed Multiply Halfwords |                                                  | Operation                                                          | Notes      | Clock Cycles |
|---------------------------|--------------------------------------------------|--------------------------------------------------------------------|------------|--------------|
| SMULBB                    | R <sub>d</sub> , R <sub>n</sub> , R <sub>m</sub> | R <sub>d</sub> ← R <sub>n</sub> <15..00> × R <sub>m</sub> <15..00> | 32 ← 16×16 | 1            |
| SMULBT                    | R <sub>d</sub> , R <sub>n</sub> , R <sub>m</sub> | R <sub>d</sub> ← R <sub>n</sub> <15..00> × R <sub>m</sub> <31..16> |            |              |
| SMULTB                    | R <sub>d</sub> , R <sub>n</sub> , R <sub>m</sub> | R <sub>d</sub> ← R <sub>n</sub> <31..16> × R <sub>m</sub> <15..00> |            |              |
| SMULTT                    | R <sub>d</sub> , R <sub>n</sub> , R <sub>m</sub> | R <sub>d</sub> ← R <sub>n</sub> <31..16> × R <sub>m</sub> <31..16> |            |              |

| Pack Halfwords |                                            | Operation                                                                                                 | operand2 options:                                                                                      | Notes                                      | Clock Cycles |
|----------------|--------------------------------------------|-----------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------|--------------------------------------------|--------------|
| PKHBT          | R <sub>d</sub> , R <sub>n</sub> , operand2 | Btm: R <sub>d</sub> <15..00> ← R <sub>n</sub> <15..00><br>Top: R <sub>d</sub> <31..16> ← operand2<31..16> | 1. R <sub>m</sub> (a register)<br>2. R <sub>m</sub> , LSL constant<br>3. R <sub>m</sub> , ASR constant | Shift constants:<br>LSL: 1-31<br>ASR: 1-32 | 1            |
| PKHTB          | R <sub>d</sub> , R <sub>n</sub> , operand2 | Top: R <sub>d</sub> <31..16> ← R <sub>n</sub> <31..16><br>Btm: R <sub>d</sub> <15..00> ← operand2<15..00> |                                                                                                        |                                            |              |
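
A short C sketch of the two pack instructions may help in seeing which half comes from where. The function names and the helper `asr32` are hypothetical. A shift amount of 0 stands for the unshifted R<sub>m</sub> option.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: arithmetic shift right by 0-32 bits, done on unsigned bits. */
static uint32_t asr32(uint32_t x, unsigned n)
{
    uint32_t sign = (x & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    if (n == 0)  return x;
    if (n >= 32) return sign;
    return (x >> n) | (sign << (32 - n));
}

/* Models PKHBT Rd,Rn,Rm,LSL #lsl: bottom from Rn, top from operand2. */
static uint32_t pkhbt_model(uint32_t rn, uint32_t rm, unsigned lsl)
{
    assert(lsl <= 31);
    return (rn & 0x0000FFFFu) | ((rm << lsl) & 0xFFFF0000u);
}

/* Models PKHTB Rd,Rn,Rm,ASR #asr: top from Rn, bottom from operand2. */
static uint32_t pkhtb_model(uint32_t rn, uint32_t rm, unsigned asr)
{
    assert(asr <= 32);
    return (rn & 0xFFFF0000u) | (asr32(rm, asr) & 0x0000FFFFu);
}
```

With a shift of 16, PKHBT moves the bottom halfword of R<sub>m</sub> into the top of the result, and PKHTB moves the top halfword of R<sub>m</sub> into the bottom.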

| Compare Instructions |                           | Operation                 | operand2 options:                                                                                 | Notes         | Clock Cycles |
|----------------------|---------------------------|---------------------------|---------------------------------------------------------------------------------------------------|---------------|--------------|
| CMP                  | R <sub>n</sub> , operand2 | R <sub>n</sub> - operand2 | 1. constant<br>2. R <sub>m</sub> (a register)<br>3. R <sub>m</sub> , shift<br>(any kind of shift) | Updates: NZCV | 1            |
| CMN                  | R <sub>n</sub> , operand2 | R <sub>n</sub> + operand2 |                                                                                                   | Updates: NZCV |              |
| TST                  | R <sub>n</sub> , operand2 | R <sub>n</sub> & operand2 |                                                                                                   | Updates: NZC  |              |
| TEQ                  | R <sub>n</sub> , operand2 | R <sub>n</sub> ^ operand2 |                                                                                                   | Updates: NZC  |              |
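
Since CMP is a subtraction whose result is discarded, its flag updates can be written out directly. The C sketch below computes N, Z, C and V for `CMP Rn, operand2`. The struct and function name are hypothetical.

```c
#include <stdint.h>

/* Hypothetical container for the four condition flags. */
typedef struct { unsigned n, z, c, v; } nzcv_t;

/* Models the flag updates of CMP Rn,operand2 (Rn - operand2). */
static nzcv_t cmp_model(uint32_t rn, uint32_t op2)
{
    uint32_t diff = rn - op2;                        /* result is discarded */
    nzcv_t f;
    f.n = diff >> 31;                                /* sign bit of result   */
    f.z = (diff == 0u);                              /* operands are equal   */
    f.c = (rn >= op2);                               /* 1 = no borrow        */
    f.v = ((rn ^ op2) & (rn ^ diff)) >> 31;          /* signed overflow      */
    return f;
}
```

For example, comparing 0x80000000 with 1 gives V = 1: the most negative value minus one does not fit in 32 signed bits.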

| Zero/Sign-Extend Instructions |                           | Operation                                                               | operand2 options:                                                                            | Clock Cycles |
|-------------------------------|---------------------------|-------------------------------------------------------------------------|----------------------------------------------------------------------------------------------|--------------|
| SXTB<br>UXTB                  | R <sub>d</sub> , operand2 | R <sub>d</sub> ← Sign-extend (S) or zero-extend (U) operand2<7..0>      | 1. R <sub>m</sub> (a register)<br>2. R <sub>m</sub> , ROR constant<br>(constant=8, 16 or 24) | 1            |
| SXTH<br>UXTH                  | R <sub>d</sub> , operand2 | R <sub>d</sub> ← Sign-extend (S) or zero-extend (U) operand2<15..0>     |                                                                                              |              |
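
The optional ROR constant lets these instructions extract any byte of the source, not just the lowest one. A minimal C sketch of the byte forms, with hypothetical function names:

```c
#include <assert.h>
#include <stdint.h>

/* Rotate right by 0, 8, 16 or 24 bits (the only amounts these instructions allow). */
static uint32_t ror_bytes(uint32_t x, unsigned n)
{
    assert(n == 0 || n == 8 || n == 16 || n == 24);
    return n ? (x >> n) | (x << (32 - n)) : x;
}

/* Models UXTB Rd,Rm,ROR #ror: zero-extend the selected byte. */
static uint32_t uxtb_model(uint32_t rm, unsigned ror)
{
    return ror_bytes(rm, ror) & 0xFFu;
}

/* Models SXTB Rd,Rm,ROR #ror: sign-extend the selected byte. */
static int32_t sxtb_model(uint32_t rm, unsigned ror)
{
    int32_t v = (int32_t)(ror_bytes(rm, ror) & 0xFFu);
    return (v >= 0x80) ? v - 0x100 : v;
}
```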


| Conditional Branch Instructions                |                        | Operation                                    | Notes                     | Clock Cycles    |
|------------------------------------------------|------------------------|----------------------------------------------|---------------------------|-----------------|
| Bcc                                            | label                  | Branch to <i>label</i> if "cc" is true       | "cc" is a condition code  | 1 (Fail) or 2-4 |
| CBZ                                            | R <sub>n</sub> , label | Branch to <i>label</i> if R <sub>n</sub> =0  | Can't use in an IT block  |                 |
| CBNZ                                           | R <sub>n</sub> , label | Branch to <i>label</i> if R <sub>n</sub> ≠0  | Can't use in an IT block  |                 |
| ITc <sub>1</sub> c <sub>2</sub> c <sub>3</sub> | condition code         | Each c <sub>i</sub> is one of T, E, or empty | Controls 1-4 instructions | 1               |

  

| Shift Instructions |                                          | Operation                                                                               | Flags | operand2 options                                                                                                                    | Notes                                      | Clock Cycles |
|--------------------|------------------------------------------|-----------------------------------------------------------------------------------------|-------|-------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------|--------------|
| ASR{S}             | R <sub>d</sub> ,R <sub>n</sub> ,operand2 | R <sub>d</sub> ← R <sub>n</sub> >> operand2 (arithmetic shift right)                    | NZC   | 1. constant<br>2. R <sub>m</sub> (a register)<br>When operand2 is a constant:<br>ASR, LSR: 1-32 bits;<br>LSL, ROR: 1-31 bits        | Sign extends                               | 1            |
| LSL{S}             | R <sub>d</sub> ,R <sub>n</sub> ,operand2 | R <sub>d</sub> ← R <sub>n</sub> << operand2 (logical shift left)                        | NZC   |                                                                                                                                     | Zero fills                                 |              |
| LSR{S}             | R <sub>d</sub> ,R <sub>n</sub> ,operand2 | R <sub>d</sub> ← R <sub>n</sub> >> operand2 (logical shift right)                       | NZC   |                                                                                                                                     | Zero fills                                 |              |
| ROR{S}             | R <sub>d</sub> ,R <sub>n</sub> ,operand2 | R <sub>d</sub> ← R <sub>n</sub> >> operand2 (rotate right)                              | NZC   |                                                                                                                                     | Right rotate                               |              |
| RRX{S}             | R <sub>d</sub> ,R <sub>n</sub>           | R <sub>d</sub> ← R <sub>n</sub> >> 1; R <sub>d</sub> <31> ← C; C ← R <sub>n</sub> <0>   | NZC   |                                                                                                                                     | Shifts only by 1 bit;<br>33-bit rotate w/C |              |
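
RRX is easiest to understand as the second half of a multi-word shift: a flag-setting shift on the high word drops its last bit into C, and RRX feeds that bit into the top of the low word. The C sketch below models a 64-bit logical shift right by one bit built that way. All type and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical result type: shifted register plus the carry flag it leaves. */
typedef struct { uint32_t rd; unsigned c; } shift_result_t;
typedef struct { uint32_t lo, hi; } shifted64_t;

/* Models LSRS Rd,Rn,#1: zero fills bit 31; C receives bit 0. */
static shift_result_t lsrs1_model(uint32_t rn)
{
    shift_result_t r = { rn >> 1, rn & 1u };
    return r;
}

/* Models RRXS Rd,Rn: bit 31 receives the old C; C receives bit 0. */
static shift_result_t rrxs_model(uint32_t rn, unsigned c_in)
{
    shift_result_t r = { (rn >> 1) | ((uint32_t)(c_in & 1u) << 31), rn & 1u };
    return r;
}

/* LSRS hi,hi,#1 followed by RRX lo,lo shifts the 64-bit pair right by one. */
static shifted64_t lsr64_by1_model(uint32_t hi, uint32_t lo)
{
    shift_result_t h = lsrs1_model(hi);
    shift_result_t l = rrxs_model(lo, h.c);
    shifted64_t r = { l.rd, h.rd };
    return r;
}
```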

  

| Bitwise Instructions |                                          | Operation                                   | Flags | operand2 options                                                                                | Notes | Clock Cycles |
|----------------------|------------------------------------------|---------------------------------------------|-------|-------------------------------------------------------------------------------------------------|-------|--------------|
| AND{S}               | R <sub>d</sub> ,R <sub>n</sub> ,operand2 | R <sub>d</sub> ← R <sub>n</sub> & operand2  | NZC   | 1. constant<br>2. R <sub>m</sub> (a register)<br>3. R <sub>m</sub> , shift<br>(Any kind of shift) |       | 1            |
| ORR{S}               | R <sub>d</sub> ,R <sub>n</sub> ,operand2 | R <sub>d</sub> ← R <sub>n</sub> \| operand2 | NZC   |                                                                                                 |       |              |
| EOR{S}               | R <sub>d</sub> ,R <sub>n</sub> ,operand2 | R <sub>d</sub> ← R <sub>n</sub> ^ operand2  | NZC   |                                                                                                 |       |              |
| BIC{S}               | R <sub>d</sub> ,R <sub>n</sub> ,operand2 | R <sub>d</sub> ← R <sub>n</sub> & ~operand2 | NZC   |                                                                                                 |       |              |
| ORN{S}               | R <sub>d</sub> ,R <sub>n</sub> ,operand2 | R <sub>d</sub> ← R <sub>n</sub> \| ~operand2 | NZC   |                                                                                                 |       |              |
| MVN{S}               | R <sub>d</sub> ,operand2                 | R <sub>d</sub> ← ~operand2                  | NZC   |                                                                                                 |       |              |

  

| Bitfield Instructions |                                           | Operation                                                       | Notes                        |  | Clock Cycles |
|-----------------------|-------------------------------------------|-----------------------------------------------------------------|------------------------------|--|--------------|
| BFC                   | R <sub>d</sub> ,lsb,width                 | SelectedBitfieldOf(R <sub>d</sub> ) ← 0                         |                              |  | 1            |
| BFI                   | R <sub>d</sub> ,R <sub>n</sub> ,lsb,width | SelectedBitfieldOf(R <sub>d</sub> ) ← LSBitsOf(R <sub>n</sub> ) |                              |  |              |
| SBFX                  | R <sub>d</sub> ,R <sub>n</sub> ,lsb,width | R <sub>d</sub> ← SelectedBitfieldOf(R <sub>n</sub> )            | Sign extends                 |  |              |
| UBFX                  | R <sub>d</sub> ,R <sub>n</sub> ,lsb,width | R <sub>d</sub> ← SelectedBitfieldOf(R <sub>n</sub> )            | Zero extends                 |  |              |
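
The bitfield operations in the table above can be modeled in plain C. The following is a minimal sketch, not the hardware definition: the function names (`field_mask`, `bfc`, `bfi`, `ubfx`, `sbfx`) are hypothetical, and it assumes 1 ≤ width and lsb + width ≤ 32.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with `width` ones starting at bit `lsb`. */
static uint32_t field_mask(unsigned lsb, unsigned width)
{
    assert(width >= 1 && lsb + width <= 32);
    uint32_t ones = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return ones << lsb;
}

/* BFC Rd,lsb,width: clear the selected field of rd. */
static uint32_t bfc(uint32_t rd, unsigned lsb, unsigned width)
{
    return rd & ~field_mask(lsb, width);
}

/* BFI Rd,Rn,lsb,width: copy the low `width` bits of rn into the field of rd. */
static uint32_t bfi(uint32_t rd, uint32_t rn, unsigned lsb, unsigned width)
{
    uint32_t mask = field_mask(lsb, width);
    return (rd & ~mask) | ((rn << lsb) & mask);
}

/* UBFX Rd,Rn,lsb,width: extract the field and zero-extend it. */
static uint32_t ubfx(uint32_t rn, unsigned lsb, unsigned width)
{
    return (rn & field_mask(lsb, width)) >> lsb;
}

/* SBFX Rd,Rn,lsb,width: extract the field and sign-extend it. */
static int32_t sbfx(uint32_t rn, unsigned lsb, unsigned width)
{
    uint32_t field = ubfx(rn, lsb, width);
    uint32_t sign = 1u << (width - 1);
    /* (field ^ sign) - sign copies the field's top bit into all higher bits. */
    return (int32_t)((field ^ sign) - sign);
}
```

For example, extracting the 4-bit field at bit 4 of `0x000000F0` gives 15 with `ubfx` but -1 with `sbfx`, which is the only difference between the two instructions.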

  

| Bits / Bytes / Words |                                | Operation                                              | Notes               |  | Clock Cycles |
|----------------------|--------------------------------|--------------------------------------------------------|---------------------|--|--------------|
| CLZ                  | R <sub>d</sub> ,R <sub>n</sub> | R <sub>d</sub> ← CountLeadingZeroesOf(R <sub>n</sub> ) | #leading 0's = 0-32 |  | 1            |
| RBIT                 | R <sub>d</sub> ,R <sub>n</sub> | R <sub>d</sub> ← ReverseBitOrderOf(R <sub>n</sub> )    |                     |  |              |
| REV                  | R <sub>d</sub> ,R <sub>n</sub> | R <sub>d</sub> ← ReverseByteOrderOf(R <sub>n</sub> )   |                     |  |              |
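
As a rough illustration of what CLZ, RBIT, and REV compute, here is a portable C model. The helper names (`clz32`, `rbit32`, `rev32`) are made up for this sketch; on the target each one is a single instruction.

```c
#include <stdint.h>

/* CLZ: number of zero bits above the most significant 1; 32 when x == 0. */
static unsigned clz32(uint32_t x)
{
    unsigned n = 0;
    for (uint32_t bit = 0x80000000u; bit != 0 && (x & bit) == 0; bit >>= 1)
        n++;
    return n;
}

/* RBIT: bit 0 swaps with bit 31, bit 1 with bit 30, and so on. */
static uint32_t rbit32(uint32_t x)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) {
        r = (r << 1) | (x & 1u);  /* shift the lowest bit of x into r */
        x >>= 1;
    }
    return r;
}

/* REV: reverse the order of the four bytes (endianness swap). */
static uint32_t rev32(uint32_t x)
{
    return (x << 24) | ((x & 0x0000FF00u) << 8) |
           ((x >> 8) & 0x0000FF00u) | (x >> 24);
}
```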

  

| Pseudo-Instructions |                                | Operation                        | Flags | Replaced by                            | Clock Cycles |
|---------------------|--------------------------------|----------------------------------|-------|----------------------------------------|--------------|
| LDR                 | R <sub>d</sub> ,=constant      | R <sub>d</sub> ← constant        |       | MOV, MVN, MOVW, or LDR                 | 1            |
| NEG                 | R <sub>d</sub> ,R <sub>n</sub> | R <sub>d</sub> ← -R <sub>n</sub> | NZCV  | RSBS R <sub>d</sub> ,R <sub>n</sub> ,0 |              |
| CPY                 | R <sub>d</sub> ,R <sub>n</sub> | R <sub>d</sub> ← R <sub>n</sub>  |       | MOV R <sub>d</sub> ,R <sub>n</sub>     |              |

## Cortex-M4F Instructions used in ARM Assembly for Embedded Applications (ISBN 978-1-09254-223-4)

| <b>Floating-Point PUSH/POP</b>        |                                                                   | <b>Operation</b>                                                                                                       | <b>Clock Cycles</b> |
|---------------------------------------|-------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------|---------------------|
| VPUSH                                 | {FP register list}                                                | SP -= 4 × # registers, copy registers to memory[SP]                                                                    |                     |
| VPOP                                  | {FP register list}                                                | Copy memory[SP] to registers, SP += 4 × # registers                                                                    | 1 + # registers     |
| <b>Floating-Point Load Constant</b>   |                                                                   | <b>Operation</b>                                                                                                       | <b>Clock Cycles</b> |
| VMOV                                  | S <sub>d</sub> ,fpconstant                                        | S <sub>d</sub> ← fpconstant; fpconstant must be ±m × 2 <sup>-n</sup> (16 ≤ m ≤ 31; 0 ≤ n ≤ 7)                          | 1                   |
| <b>Floating-Point Copy Registers</b>  |                                                                   | <b>Operation</b>                                                                                                       | <b>Clock Cycles</b> |
| VMOV                                  | S <sub>d</sub> ,S <sub>m</sub>                                    | S <sub>d</sub> ← S <sub>m</sub>                                                                                        | 1                   |
| VMOV                                  | R <sub>d</sub> ,S <sub>m</sub>                                    | R <sub>d</sub> ← S <sub>m</sub>                                                                                        |                     |
| VMOV                                  | S <sub>d</sub> ,R <sub>m</sub>                                    | S <sub>d</sub> ← R <sub>m</sub>                                                                                        | 2                   |
| VMOV                                  | R <sub>t</sub> ,R <sub>t2</sub> ,S <sub>m</sub> ,S <sub>m+1</sub> | R <sub>t</sub> ← S <sub>m</sub> ; R <sub>t2</sub> ← S <sub>m+1</sub> (S <sub>m</sub> , S <sub>m+1</sub> adjacent regs) |                     |
| VMOV                                  | S <sub>m</sub> ,S <sub>m+1</sub> ,R <sub>t</sub> ,R <sub>t2</sub> | S <sub>m</sub> ← R <sub>t</sub> ; S <sub>m+1</sub> ← R <sub>t2</sub> (S <sub>m</sub> , S <sub>m+1</sub> adjacent regs) |                     |
| <b>Floating-Point Load Registers</b>  |                                                                   | <b>Operation</b>                                                                                                       | <b>Clock Cycles</b> |
| VLDR                                  | S <sub>d</sub> ,[R <sub>n</sub> ]                                 | S <sub>d</sub> ← memory32[R <sub>n</sub> ]                                                                             | 2                   |
| VLDR                                  | S <sub>d</sub> ,[R <sub>n</sub> ,constant]                        | S <sub>d</sub> ← memory32[R <sub>n</sub> + constant]                                                                   |                     |
| VLDR                                  | S <sub>d</sub> ,label                                             | S <sub>d</sub> ← memory32[Address of label]                                                                            |                     |
| VLDR                                  | D <sub>d</sub> ,[R <sub>n</sub> ]                                 | D <sub>d</sub> ← memory64[R <sub>n</sub> ]                                                                             | 3                   |
| VLDR                                  | D <sub>d</sub> ,[R <sub>n</sub> ,constant]                        | D <sub>d</sub> ← memory64[R <sub>n</sub> + constant]                                                                   |                     |
| VLDR                                  | D <sub>d</sub> ,label                                             | D <sub>d</sub> ← memory64[Address of label]                                                                            |                     |
| VLDMIA                                | R <sub>n</sub> !,{FP register list}                               | FP registers ← memory, R <sub>n</sub> = lowest address;<br>Updates R <sub>n</sub> if write-back flag (!) is included.  |                     |
| VLDMDB                                | R <sub>n</sub> !,{FP register list}                               | FP registers ← memory, R <sub>n</sub> − 4 = highest address;<br>Must append (!) and always updates R <sub>n</sub>      | 1 + # registers     |
| <b>Floating-Point Store Registers</b> |                                                                   | <b>Operation</b>                                                                                                       | <b>Clock Cycles</b> |
| VSTR                                  | S <sub>d</sub> ,[R <sub>n</sub> ]                                 | S <sub>d</sub> → memory32[R <sub>n</sub> ]                                                                             | 2                   |
| VSTR                                  | S <sub>d</sub> ,[R <sub>n</sub> ,constant]                        | S <sub>d</sub> → memory32[R <sub>n</sub> + constant]                                                                   |                     |
| VSTR                                  | D <sub>d</sub> ,[R <sub>n</sub> ]                                 | D <sub>d</sub> → memory64[R <sub>n</sub> ]                                                                             | 3                   |
| VSTR                                  | D <sub>d</sub> ,[R <sub>n</sub> ,constant]                        | D <sub>d</sub> → memory64[R <sub>n</sub> + constant]                                                                   |                     |
| VSTMIA                                | R <sub>n</sub> !,{FP register list}                               | FP registers → memory, R <sub>n</sub> = lowest address;<br>Updates R <sub>n</sub> if write-back flag (!) is included.  | 1 + # registers     |
| VSTMDB                                | R <sub>n</sub> !,{FP register list}                               | FP registers → memory, R <sub>n</sub> − 4 = highest address;<br>Must append (!) and always updates R <sub>n</sub>      |                     |
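
The restriction on `VMOV S<sub>d</sub>,fpconstant` (the constant must equal ±m × 2<sup>-n</sup> with 16 ≤ m ≤ 31 and 0 ≤ n ≤ 7) can be checked by brute force. The sketch below is only an illustration of that rule; `is_vmov_constant` is a hypothetical helper, not something an assembler provides.

```c
#include <stdbool.h>

/* Returns true if |x| == m * 2^-n for some 16 <= m <= 31 and 0 <= n <= 7. */
static bool is_vmov_constant(float x)
{
    float mag = (x < 0.0f) ? -x : x;   /* the sign is encoded separately */
    for (int n = 0; n <= 7; n++)
        for (int m = 16; m <= 31; m++)
            if (mag == (float)m / (float)(1 << n))  /* exact: power-of-two divisor */
                return true;
    return false;
}
```

Under this rule the representable magnitudes run from 0.125 (16 × 2<sup>-7</sup>) to 31.0 (31 × 2<sup>0</sup>), so values such as 1.0, 0.5, and -2.5 qualify, while 0.0, 0.1, and 32.0 do not and must be loaded another way (for example with VLDR from a label).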

| Floating-Point Convert Representation             | Operation                                                                                                                                                          | Clock Cycles |
|---------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------|
| VCVT.F32.U32      S <sub>d</sub> ,S <sub>m</sub>  | S <sub>d</sub> $\leftarrow$ (float) S <sub>m</sub> , where S <sub>m</sub> is an unsigned integer                                                                   | 1            |
| VCVT.F32.S32      S <sub>d</sub> ,S <sub>m</sub>  | S <sub>d</sub> $\leftarrow$ (float) S <sub>m</sub> , where S <sub>m</sub> is a 2's comp integer                                                                    |              |
| VCVT{R}.U32.F32    S <sub>d</sub> ,S <sub>m</sub> | S <sub>d</sub> $\leftarrow$ (uint32_t) S <sub>m</sub>                                                                                                              |              |
| VCVT{R}.S32.F32    S <sub>d</sub> ,S <sub>m</sub> | S <sub>d</sub> $\leftarrow$ (int32_t) S <sub>m</sub><br>For both float-to-integer forms: rounds toward zero without a suffix; if suffix "R" is appended, rounds using the current rounding mode (FPSCR bits 23 and 22, default is nearest even) |              |
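
To make the rounding difference concrete, the following C sketch models `VCVT.S32.F32` (round toward zero) and `VCVTR.S32.F32` under the default round-to-nearest-even mode. It is a simplified model with hypothetical function names: it assumes the input is not NaN and lies within the `int32_t` range, and it does not model the saturation the real instruction performs outside that range.

```c
#include <assert.h>
#include <stdint.h>

/* VCVT.S32.F32: discard the fraction (round toward zero). */
static int32_t vcvt_s32(float x)
{
    assert(x >= -2147483648.0f && x < 2147483648.0f);  /* also rejects NaN */
    return (int32_t)x;
}

/* VCVTR.S32.F32 with FPSCR rounding mode = nearest, ties to even. */
static int32_t vcvtr_s32_nearest_even(float x)
{
    int32_t t = vcvt_s32(x);        /* truncated value */
    float frac = x - (float)t;      /* leftover fraction, same sign as x */
    int odd = (t % 2) != 0;
    if (frac > 0.5f || (frac == 0.5f && odd))
        return t + 1;               /* round up, or tie toward the even neighbor */
    if (frac < -0.5f || (frac == -0.5f && odd))
        return t - 1;               /* round down, or tie toward the even neighbor */
    return t;
}
```

With these definitions 2.5 converts to 2 either way, while 3.5 converts to 3 with VCVT and 4 with VCVTR.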

| Floating-Point Arithmetic                                    | Operation                                                                           | Clock Cycles |
|--------------------------------------------------------------|-------------------------------------------------------------------------------------|--------------|
| VADD.F32      S <sub>d</sub> ,S <sub>n</sub> ,S <sub>m</sub> | S <sub>d</sub> $\leftarrow$ S <sub>n</sub> + S <sub>m</sub>                         | 1            |
| VSUB.F32      S <sub>d</sub> ,S <sub>n</sub> ,S <sub>m</sub> | S <sub>d</sub> $\leftarrow$ S <sub>n</sub> - S <sub>m</sub>                         |              |
| VNEG.F32      S <sub>d</sub> ,S <sub>m</sub>                 | S <sub>d</sub> $\leftarrow$ -S <sub>m</sub>                                         |              |
| VABS.F32      S <sub>d</sub> ,S <sub>m</sub>                 | S <sub>d</sub> $\leftarrow$ \|S <sub>m</sub>\|; (clears FPU sign bit, N)            |              |
| VMUL.F32      S <sub>d</sub> ,S <sub>n</sub> ,S <sub>m</sub> | S <sub>d</sub> $\leftarrow$ S <sub>n</sub> $\times$ S <sub>m</sub>                  |              |
| VDIV.F32      S <sub>d</sub> ,S <sub>n</sub> ,S <sub>m</sub> | S <sub>d</sub> $\leftarrow$ S <sub>n</sub> $\div$ S <sub>m</sub>                    |              |
| VSQRT.F32     S <sub>d</sub> ,S <sub>m</sub>                 | S <sub>d</sub> $\leftarrow$ $\sqrt{S_m}$                                            | 14           |
| VMLA.F32      S <sub>d</sub> ,S <sub>n</sub> ,S <sub>m</sub> | S <sub>d</sub> $\leftarrow$ S <sub>d</sub> + S <sub>n</sub> $\times$ S <sub>m</sub> | 3            |
| VMLS.F32      S <sub>d</sub> ,S <sub>n</sub> ,S <sub>m</sub> | S <sub>d</sub> $\leftarrow$ S <sub>d</sub> - S <sub>n</sub> $\times$ S <sub>m</sub> |              |

| Floating-Point Compare                       | Operation                                                                              | Clock Cycles |
|----------------------------------------------|----------------------------------------------------------------------------------------|--------------|
| VCMP.F32      S <sub>d</sub> ,S <sub>m</sub> | Computes S <sub>d</sub> - S <sub>m</sub> and updates FPU Flags in FPSCR                | 1            |
| VCMP.F32      S <sub>d</sub> ,0.0            | Computes S <sub>d</sub> - 0 and updates FPU Flags in FPSCR                             |              |
| VMRS            APSR_nzcv,FPSCR              | Core CPU Flags $\leftarrow$ FPU Flags (Needed between VCMP.F32 and conditional branch) |              |
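
A floating-point compare has four possible outcomes (less, equal, greater, unordered when either operand is NaN), and each maps to a fixed NZCV pattern that VMRS then copies to the core flags. The C model below sketches that mapping; the `Flags` struct and function names are illustrative only.

```c
#include <stdbool.h>

typedef struct { bool n, z, c, v; } Flags;

/* Flags that VCMP.F32 a,b leaves in FPSCR (then copied by VMRS APSR_nzcv,FPSCR). */
static Flags vcmp_flags(float a, float b)
{
    Flags f = { false, false, false, false };
    if (a != a || b != b) {        /* NaN is the only value not equal to itself */
        f.c = true; f.v = true;    /* unordered: NZCV = 0011 */
    } else if (a == b) {
        f.z = true; f.c = true;    /* equal:     NZCV = 0110 */
    } else if (a < b) {
        f.n = true;                /* less:      NZCV = 1000 */
    } else {
        f.c = true;                /* greater:   NZCV = 0010 */
    }
    return f;
}

/* A few condition tests, written in terms of the flags. */
static bool cond_ge(Flags f) { return f.n == f.v; }
static bool cond_hs(Flags f) { return f.c; }
static bool cond_lt(Flags f) { return f.n != f.v; }
static bool cond_mi(Flags f) { return f.n; }
```

This is why the choice of condition code matters after VCMP: with a NaN operand, GE and MI are false but HS and LT are true, so a branch on LT treats "unordered" as less-than while a branch on MI does not.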

### Addressing Modes for floating-point load and store instructions (VLDR & VSTR):

| Addressing Mode  | Syntax                     | Meaning                             | Example  |
|------------------|----------------------------|-------------------------------------|----------|
| Immediate Offset | [R <sub>n</sub> ]          | address = R <sub>n</sub>            | [R5]     |
|                  | [R <sub>n</sub> ,constant] | address = R <sub>n</sub> + constant | [R5,100] |

### Shift Codes:

Any of these may be applied as the "shift" option of "operand2" in Move / Add / Subtract, Compare, and Bitwise Groups.

| Shift Code   | Meaning                                     | Notes                           |
|--------------|---------------------------------------------|---------------------------------|
| LSL constant | Logical Shift Left by constant bits         | Zero fills; 0 ≤ constant ≤ 31   |
| LSR constant | Logical Shift Right by constant bits        | Zero fills; 1 ≤ constant ≤ 32   |
| ASR constant | Arithmetic Shift Right by constant bits     | Sign extends; 1 ≤ constant ≤ 32 |
| ROR constant | ROtate Right by constant bits               | 1 ≤ constant ≤ 31               |
| RRX          | Rotate Right eXtended (with carry) by 1 bit |                                 |
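
The shift codes can be modeled in C as follows. This is a sketch with hypothetical function names; C leaves shifts by 32 or more undefined, so the 32-bit cases that the instruction set allows are handled explicitly.

```c
#include <assert.h>
#include <stdint.h>

/* LSL: zero fills from the right. */
static uint32_t lsl(uint32_t x, unsigned n)
{
    assert(n <= 31);
    return x << n;
}

/* LSR: zero fills from the left; a shift by 32 leaves nothing. */
static uint32_t lsr(uint32_t x, unsigned n)
{
    assert(n >= 1 && n <= 32);
    return (n == 32) ? 0u : (x >> n);
}

/* ASR: copies of the sign bit fill from the left. */
static uint32_t asr(uint32_t x, unsigned n)
{
    assert(n >= 1 && n <= 32);
    uint32_t fill = (x & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    if (n == 32)
        return fill;
    return (x >> n) | (fill << (32 - n));
}

/* ROR: bits shifted out on the right re-enter on the left. */
static uint32_t ror(uint32_t x, unsigned n)
{
    n &= 31u;                       /* rotation amount is taken modulo 32 here */
    if (n == 0)
        return x;
    return (x >> n) | (x << (32 - n));
}

/* RRX: rotate right by one through carry. The old carry enters bit 31;
   bit 0 is the bit shifted out (written to C only with the S suffix). */
static uint32_t rrx(uint32_t x, unsigned carry_in, unsigned *shifted_out)
{
    *shifted_out = x & 1u;
    return (x >> 1) | ((uint32_t)(carry_in & 1u) << 31);
}
```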

### Addressing Modes for *integer* load and store instructions (LDR, STR, etc.):

Any of these may be used with all variations of LDR/STR except LDRD/STRD, which may not use Register Offset Mode.

| Addressing Mode  | Syntax                                         | Meaning                                                               | Example       |
|------------------|------------------------------------------------|-----------------------------------------------------------------------|---------------|
| Immediate Offset | [R <sub>n</sub> ]                              | address = R <sub>n</sub>                                              | [R5]          |
|                  | [R <sub>n</sub> ,constant]                     | address = R <sub>n</sub> + constant                                   | [R5,100]      |
| Register Offset  | [R <sub>n</sub> ,R <sub>m</sub> ]              | address = R <sub>n</sub> + R <sub>m</sub>                             | [R4,R5]       |
|                  | [R <sub>n</sub> ,R <sub>m</sub> ,LSL constant] | address = R <sub>n</sub> + (R <sub>m</sub> << constant)               | [R4,R5,LSL 3] |
| Pre-Indexed      | [R <sub>n</sub> ,constant]!                    | R <sub>n</sub> ← R <sub>n</sub> + constant; address = R <sub>n</sub>  | [R5,100]!     |
| Post-Indexed     | [R <sub>n</sub> ],constant                     | address = R <sub>n</sub> ; R <sub>n</sub> ← R <sub>n</sub> + constant | [R5],100      |
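
The difference between the addressing modes is when (and whether) R<sub>n</sub> changes. The sketch below models a word load in each mode against a small byte array; `mem`, `load32`, `store32`, and the `ldr_*` functions are all hypothetical names used only for this illustration, and the array stands in for real memory.

```c
#include <stdint.h>
#include <string.h>

/* Tiny stand-in for memory; addresses are indexes into this array. */
static uint8_t mem[256];

static void store32(uint32_t addr, uint32_t v) { memcpy(&mem[addr], &v, sizeof v); }

static uint32_t load32(uint32_t addr)
{
    uint32_t v;
    memcpy(&v, &mem[addr], sizeof v);  /* uses the host's byte order */
    return v;
}

/* LDR Rd,[Rn,constant]: address = Rn + constant, Rn unchanged. */
static uint32_t ldr_offset(const uint32_t *rn, int32_t constant)
{
    return load32(*rn + (uint32_t)constant);
}

/* LDR Rd,[Rn,Rm,LSL shift]: address = Rn + (Rm << shift), Rn unchanged. */
static uint32_t ldr_reg_offset(const uint32_t *rn, uint32_t rm, unsigned shift)
{
    return load32(*rn + (rm << shift));
}

/* LDR Rd,[Rn,constant]!: update Rn first, then load from the new Rn. */
static uint32_t ldr_pre(uint32_t *rn, int32_t constant)
{
    *rn += (uint32_t)constant;
    return load32(*rn);
}

/* LDR Rd,[Rn],constant: load from the old Rn, then update Rn. */
static uint32_t ldr_post(uint32_t *rn, int32_t constant)
{
    uint32_t v = load32(*rn);
    *rn += (uint32_t)constant;
    return v;
}
```

Post-indexed mode is the natural fit for walking an array (load, then advance the pointer), and the register-offset form with `LSL 2` matches indexing an array of 32-bit words.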

### Condition Codes:

If appended to an FPU instruction within an IT block, the condition code precedes any extension. (E.g., VADDGT.F32)

| Condition Code             | CMP Meaning     | VCMP Meaning        | Requirements                                  |
|----------------------------|-----------------|---------------------|-----------------------------------------------|
| EQ (Equal)                 | ==              | ==                  | Z = 1                                         |
| NE (Not Equal)             | !=              | != or unordered     | Z = 0                                         |
| HS (Higher or Same)        | unsigned $\geq$ | $\geq$ or unordered | C = 1 Note: Synonym for "CS" (Carry Set)      |
| LO (Lower)                 | unsigned $<$    | $<$                 | C = 0 Note: Synonym for "CC" (Carry Clear)    |
| HI (Higher)                | unsigned $>$    | $>$ or unordered    | C = 1 && Z = 0                                |
| LS (Lower or Same)         | unsigned $\leq$ | $\leq$              | C = 0 \|\| Z = 1                              |
| GE (Greater Than or Equal) | signed $\geq$   | $\geq$              | N = V                                         |
| LT (Less Than)             | signed $<$      | $<$ or unordered    | N $\neq$ V                                    |
| GT (Greater Than)          | signed $>$      | $>$                 | Z = 0 && N = V                                |
| LE (Less Than or Equal)    | signed $\leq$   | $\leq$ or unordered | Z = 1 \|\| N $\neq$ V                         |
| CS (Carry Set)             | unsigned $\geq$ | $\geq$ or unordered | C = 1 Note: Synonym for "HS" (Higher or Same) |
| CC (Carry Clear)           | unsigned $<$    | $<$                 | C = 0 Note: Synonym for "LO" (Lower)          |
| MI (Minus)                 | negative        | $<$                 | N = 1                                         |
| PL (Plus)                  | non-negative    | $\geq$ or unordered | N = 0                                         |
| VS (Overflow Set)          | overflow        | unordered           | V = 1                                         |
| VC (Overflow Clear)        | no overflow     | not unordered       | V = 0                                         |
| AL (Always)                | unconditional   | unconditional       | Always true                                   |
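
To see why the same `CMP` supports both signed and unsigned comparisons, it helps to compute the flags by hand. The C sketch below models the flags left by `CMP a,b` and evaluates several of the conditions from the table; the `Flags` type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct { bool n, z, c, v; } Flags;

/* Flags produced by CMP a,b, which computes a - b and discards the result. */
static Flags cmp_flags(uint32_t a, uint32_t b)
{
    uint32_t r = a - b;
    Flags f;
    f.n = (r >> 31) != 0;                     /* result looks negative */
    f.z = (r == 0);                           /* operands equal */
    f.c = (a >= b);                           /* carry set means no borrow */
    f.v = (((a ^ b) & (a ^ r)) >> 31) != 0;   /* signed overflow in a - b */
    return f;
}

static bool cond_eq(Flags f) { return f.z; }
static bool cond_hi(Flags f) { return f.c && !f.z; }          /* unsigned >  */
static bool cond_ls(Flags f) { return !f.c || f.z; }          /* unsigned <= */
static bool cond_ge(Flags f) { return f.n == f.v; }           /* signed >=   */
static bool cond_lt(Flags f) { return f.n != f.v; }           /* signed <    */
static bool cond_gt(Flags f) { return !f.z && f.n == f.v; }   /* signed >    */
static bool cond_le(Flags f) { return f.z || f.n != f.v; }    /* signed <=   */
```

Comparing `0xFFFFFFFF` with 1 shows the split: HI is true because the bit pattern is a large unsigned value, while LT is also true because the same pattern is -1 when read as signed.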

Notes:

1. This is only a partial list of the most commonly used ARM Cortex-M4 instructions.
2. Clock cycle counts do not include delays due to stalls when an instruction must wait for the previous instruction to complete.
3. There are magnitude restrictions on immediate constants; see ARM documentation for more information.