

# Kombinatorische Logik

## 1 Bit Halfaddierer

11

AND  $\boxed{\&}$   $Z = A \& B$ Inverte  $\boxed{\ominus}$   $Z = !A$   
 $\boxed{B}$ Buffer  $\boxed{\square}$   $Z = A$ OR  $\boxed{\geq 1}$   $Z = A \# B$ NAND  $= \boxed{\& \ominus}$   $Z = !(A \& B)$ NOR  $= \boxed{\geq 1 \ominus}$   $Z = !(A \# B)$ XOR  $\boxed{\geq 1}$   $Z = A \oplus B$ XNOR  $\boxed{\geq 1}$   $Z = ! (A \oplus B)$ 

## Multiplexer



$$\begin{aligned} S=0 &\Rightarrow Z=B \\ S=1 &\Rightarrow Z=A \end{aligned}$$

## Komb vs sequentielle Logik

seq: memory

output: input + memory  
= komb + clock

kom: no memory

output: input dep.

## Demultiplexer



$$A = (Z \& S)$$

$$B = (Z \& !S)$$

## MO CPU

Core Registers  $16 \times 32\text{bit}$  Registers

ALU Rechenwerk log + arithmetische Operationen

Control Unit (IR) enthält OP code von momentaner ausgetüfteter Aktion

Bus Interface Schnittstelle zu externem System Bus

Flags Status von letzter ALU Rechnung

## Registers

R0 - R7 Low Regs

R14 LR (Link Register Rücksprungadresse)

R8 - R12 High Regs

ASPR Reg: NZCV

R15 PC Program Counter

neg zero carry overflow

R13 SP Stack Pointer

Assembly instruction

Label (optional)

instruction 2.13 M0US

Operands

Comment (optional)

SpeicherbereicheStack  
Data  
Code

Code, readonly RAM/ROM

Maschinennstrukturen, Konstanten

Data read/write → RAM

- globale/static variable + heap in C

Stack read/write → RAM

- Prozeduraufrufe/Parameterübergabe  
lokale VariablenLiteral

global, "DCD

EGV

= literal DCD myvar = 0x55

LDR R1, =myvar  
=0x55

Loads the value of myvar in R1

Negative Numbers

0 = pos

- MSB + Wert von anderen Bits = Value  
anzeigbar  $[-2^{n-1}, 2^{n-1} - 1]$ Zer Komplement: invertieren + 1 → negativ  
darstellung

RSB5 in Assembler

ASPR Add/Sub

Unsigned: C check after operation

- addition C = 1 → falsche Ergebnisse  
sub C = 0

→ ganzer Zahlenkreis addieren/subtrahieren

Signed: V check after operation

V = 1 → not enough digits  
full turn of numbercycle oder

C irrelevant

MULS <Rdm>, <Rn>, <Rdm>

Rdm = Rn \* Rdm

Flags N, Z update  
C, V unchangedMemory mapgraphical map of addresses and sizes of elements  
communicating with CPUEndian

little endian LSB an tiefster Stelle in Address

Big endian MSB an höchster Stelle in Address

little → big: Swap address; value in address  
stay the sameSpeicher

zuoberst 0xFFFFFFF

zuunter 0x0000000

Bereich (Hex) = Top - Bottom + 1

Load / Store

access of memory via load/store

Multi word arithmeticlong1 DCD ... ; low  
DCD ; highlong2 DCD low  
DCD highlong3 DCD low  
DCD high

Load R1 + R2

LDR R0, =Long1

LDM R0, {R1-R4} → Starts reading in R0 and saves  
in R1-R4

ADDS R1, R1, R3/lower

ADC5 R2, R2, R4/higher part → R1 long1 low

with carry R2 long1 high

R3 long2 low

R4 long2 high

store:

LDR R0, =Long3

STM R0, {R1, R2}

ASPR

Addition Unsigned: C = 0 → Borrow

V irrelevant

Sub von small may result in big

signed V = 1 → overflow/underflow

not possible when operands same sign

C irrelevant

MULS von Hand

Unsigned: add left zeros

Signed: add left ones

# Casting

Casting von negativen Zahlen → Fehler  
 small negative → grosse positive  
 high positive → small negative

Shift low reg only

LSLS logical Shift Left leftmost in carry  
 $2^n \cdot R_n$  rightfill with 0's

LSRS Logical Shift Right rightmost in carry  
 $2^{-n} \cdot R_n$  leftfill with 0's

ASRS Arithmetic Shift Right rightmost in carry  
 $2^{-n} \cdot A$  leftfill with MSB

RORs Rotate right  
 carrousel rightmost in carry  
 leftfill with prev. LSB

## Iteration / loop



## Ass if then else

CMP R1, #0x00

BLT false      ? if

MOVS R2, #1      ? else  
 B done

else: MOVS R2, #0x00      then block

done: ...

## Compare

CMP  $x, y \xrightarrow{\text{immediate}}$   $x - y$

CMN  $x + y \xrightarrow{\text{immediate}}$   $x + y$   
 $y = 2\text{er}(x) ?$

TST only changes N,Z,C not V  
 log. col AND  
 $x \wedge y$  is specific bit set

## Ass for loops

translated to while

for (init; test, update)

body

→ init

white (test)

body

update

## Extension

unsigned: left fill with 0's  
 signed: left fill with 1's  
 SXTB  $\$ \rightarrow 32$  unsigned  
 SXTH  $16 \rightarrow 32$  unsigned  
 UXTB  $\$ - 32$  signed  
 UXTH  $16 - 32$  signed

## Logic operations 3/

only <sup>low</sup> Registers  $< R_1 > < R_m >$   
 changes N and Z

Rd=

|       |                         |
|-------|-------------------------|
| ANDS  | bitwise AND A & B       |
| EORS  | bitwise XOR A & B       |
| ANMVN | S & Rn, bitwise NOT ! A |
| ORRS  | bitwise OR A & B        |
| BICS  | bit clear A & B         |

## conditional Branches

CMP = SUBS without just flags  
 CMN = ADDS just flags

CMP R1, R0  
 B... siehe Anhängefolien

## Selection (IF)



## Sequence

flow



## Switch



## Ass do while loop

Loop: ADDS R2,R2,R1

CMP R2, #100

BLT loop

... ? end of while

## While loops

B test

loop

MULS R2,R1,R2

test

## Subtraction C=0 → borrow

$x - y \rightarrow x + (\text{Zero copy})$

$x \geq y \rightarrow \text{carry}$

CMP R2, #100

BLT loop

...

## negativieren

RSBS  $< R_d > < R_m >$

Stack SP <sup>1</sup> points on <sub>untersetzt</sub> unterste Adress nach Branch

grows top → bottom

Stack pointer on current pos

push {R3, R4} →

push LR

pop PC



pop {R3, R4}

= pop first value int > R3

second into R4

## Functions/Subroutines

Progc

prog... endp

function... end func

functions have return value

## ASS EQU

TABLELENGTH EQU 32

→ TABLELENGTH kann als var gebraucht werden

ADD \$ R0, R0, TABLELENGTH

add sub with imm 7 possible  
 from stackpointer  
 BLX store PC in LR

Parameter Passing

RO-R4 arguments/scratch  
 RO-R1 results  
 RO-R4 might be modified by callee  
 R4-R11 callee must preserve content

scratch: hold imm values during calculation  
 usually unnamed

R4-R11 variable register named in code  
 usually local variable

additional args → stack  
 R0-R4 →

Interrupt event

main program flow

↓  
event

↓  
finish exec of current instruction

↓  
save context + PC

↓  
load PC of attention needing code into PC

↓  
do stuff

↓  
BX LR

↓  
restore context

↓  
continue mainflow

return values

kleiner als word → extend to word

Double word → RO R1  
 LSW MSW

128 bit RO-R4  
 LSW MSW

composite: up to 32 Bytes in RO

→ 4 Bytes → stored in data area, address passed as argument at function call

System exceptions

Reset

NMI (z.B. hardware)

Faults (under instruction e.g.)

OS calls

Interrupts IRQ0-IRQ239

signal CPU that event needs attention

async to instruction exec.

ISR = Unterbrechungsroutine

Different ISR's

IRQ239

Exception Nr.

255

...

17

16

15

...

3

2

1

0

Interrupts 0-239

IRQn = Exceptions n + 16

System exceptions

tiefer Nummer = höhere Priorität

Program Status Registers

ABR APSR Flags

IPSR Interrupt Program status register

EPSR Execution program status register

PSR combo of all 3 above

Masking interrupts

CPSIE (enable)

CPSID (disable)

change Processor state Interrupt

CMSIS

Cortex Microcontroller Interface Standard

Vendor independent hardware abstraction layer.

Exception states

inactive: not pending

pending: waiting to be served by CPU

active: being serviced by CPU but not completed

active + pending: being serviced + other exception of same

No waiting

IRQ Interrupt request

IPn Interrupt pending

IAm Interrupt active

schon nach einer gleichen IRQ → 2. wird vergeben

Nested Exceptions

interrupts of exceptions only possible if following exception has higher priority (lower number) compared to current one. Otherwise waiting to completion of current interrupt.

multiple IRQ request pulses before entering ISR  
 signal delivery interrupt

Interrupts adv. to polling

no busy wait / short reaction time to I/O

Architecture

- Complex instruction set Computer CISC      Reduced instruction set Computer RISC
- traditional memory access
  - complex addressing
  - High code density
  - most compilers do not support all operations  $\Rightarrow$  powerful instructions may not be used
  - often require manual optimizing of assembly code for embedded systems
  - program needs to wait for external memory
- load/store architecture
  - simple addressing
  - more lines of code
  - reduced instruction set  $\Rightarrow$  less hardware
  - higher clock rates
  - allows effective compiler optimization

Pipelining

fetch  $\rightarrow$  decode  $\rightarrow$  execute

Reset



Vorteil: massiv perf. gain.

simpler hardware  $\Rightarrow$  higher clock rate  
all stages same length

instructions/sec      sequential

$$\frac{1}{\text{Dauer aller Phasen (instruction delay)}}$$

pipelined

$$\frac{1}{\text{Stage delay}}$$

problem with branches:  $\rightarrow$  outcome of branch for new fetch needed but branch result not known yet  
 $\rightarrow$  remember if branch was taken last time. change prediction after 2 successive mispredictions

worst case: sequential scheduling



general: Number of exec stations is design decisions typ 2 - 12 stages

Pipelining II

instruction use data that is used in different stages

A. after read after write RAW true dependency

B. write after read WAR anti dependency

C. write after write WAW output dependency

ex: in execution stage value of written previously write back instruction is needed. Both stages happen at same time

→ forward solution of pre-current instruction to the next before writing

→ execution stage of next instruction and writeback of current can happen simultaneously

part of processor shared hardware is needed by more than one instruction at same time

→ fetch multiple instruction at once

- fetch instruction before needed

→ data bus move force

Structural hazard: processor hardware stuff

Data hazard: read/write stuff

Control hazards: branching problem

C → Executable

1. Preprocessor

Verarb preproc. Statement's



Textfile mit mod. C Source

#define/include ersetzen

2. Compiler

C Source → Assemblybefehle → Textfile mit Assemblycode

3. Assembler

Assemblycode → Maschinenbefehl → Binäres Objektfile mit Maschinenbefehlen

4. Linker

mehrere .o files → 1 executable

objekt file auflösenReferenzen

→ binäres, executable  
nicht Objektfile

branching poss:  
direkt/indirekt