

# ELECTRONICS AND COMMUNICATION TECHNOLOGIES: ELECTRONICS SYSTEMS

LM Cyber Security – Fall 2024

**Federico Baronti, Luca Crocetti**

Dip. Ing. Informazione

Via G. Caruso, 16 – Stanza B-1-09

050 2217581 – [federico.baronti@unipi.it](mailto:federico.baronti@unipi.it)

Office hours:

Friday 14-16. Please, contact me in advance before showing up.  
We can also arrange an appointment remotely on Microsoft  
Teams.





# RISC INSTRUCTION SET ARCHITECTURE: INTEL NIOS 2 PROC.

what are the instructions executable by a processor.

# Instructions Set Architecture

- **Instruction Set Architecture (ISA)** can be seen as the specifications of a processor
  - ISA affects processor performances [Strong relationships] (RISC, CISC, GPU, ASIP\*) Reduced instruction set computers.
  - Possible different implementations of the same ISA
- Instructions for a computer must support:
  - data transfers to and from the memory
  - arithmetic and logic operations on data
  - program sequencing and control
  - input/output transfers

\* Application-Specific Instruction set Processor (you design your own instruction set)

# RISC and CISC Instruction Sets

- Nature of instructions distinguishes computer
- Two fundamentally different approaches:
  - Reduced Instruction Set Computers (RISC) have one-word instructions<sup>(1)</sup> and require arithmetic operands to be in registers
  - Complex Instruction Set Computers (CISC)  
have multi-word instructions and allow operands directly from memory

 Idea: you have powerful instruction and thus can limit the clock freq.

So to have higher frequency you could sacrifice complex instructions, this is the idea. You have more constraint<sup>(1)</sup>

# RISC Instruction Sets

- Each RISC instruction occupies a single word
- A load/store architecture is used, meaning:
  - Only Load and Store instructions are used to access memory operands
  - Operands for arithmetic/logic instructions must be in registers, or one of them can be provided explicitly in the instruction word (*Immediate* operand)
  - E.g. Load *proc\_register*, *mem\_location*
  - Addressing mode specifies actual memory location



Program counter: every time it's incremented by 4. For some inst. during the ex. of the instruction, after the fetch it's incremented by 4.

# Nios II Main Characteristics

The PC is modified. In general,

- RISC-style architecture (**all instructions are 32-bit long**)
- 32-bit data *word*
- 2x memory interfaces (Harvard architecture)
- **Byte-addressable** memory space:
  - With little-endian addressing scheme  
(lower byte addresses used for less significant bytes)
  - The LOAD and STORE instructions can transfer data in: *word, half-word, and byte*
- 32 general-purpose registers, 32-bit long
- Several additional control registers
- 2 versions: economy (5-stage) and fast (6-stage w. pipelining)

↳ If the memory is fast enough to accommodate an access within 1 clock cycle, you can have 3 stages for an instruction.

Pipelining: The ~~exe~~ time of the instruction doesn't change, but ideally every clock cycle the processor completes an instruction, even if every instruction took 5 cycles.

All instructions go through the stages (for some, they do nothing in one of the stages).

# Nios II registers

It's just the constant 0.

- General-purpose registers (r0-r31)

| Register | Name | Function              | Register | Name | Function                      |
|----------|------|-----------------------|----------|------|-------------------------------|
| r0       | zero | 0x00000000            | r16      |      | Callee-saved register         |
| r1       | at   | Assembler temporary   | r17      |      | Callee-saved register         |
| r2       |      | Return value          | r18      |      | Callee-saved register         |
| r3       |      | Return value          | r19      |      | Callee-saved register         |
| r4       |      | Register arguments    | r20      |      | Callee-saved register         |
| r5       |      | Register arguments    | r21      |      | Callee-saved register         |
| r6       |      | Register arguments    | r22      |      | Callee-saved register         |
| r7       |      | Register arguments    | r23      |      | Callee-saved register         |
| r8       |      | Caller-saved register | r24      | et   | Exception temporary           |
| r9       |      | Caller-saved register | r25      | bt   | Breakpoint temporary (1)      |
| r10      |      | Caller-saved register | r26      | gp   | Global pointer                |
| r11      |      | Caller-saved register | r27      | sp   | Stack pointer                 |
| r12      |      | Caller-saved register | r28      | fp   | Frame pointer                 |
| r13      |      | Caller-saved register | r29      | ea   | Exception return address      |
| r14      |      | Caller-saved register | r30      | ba   | Breakpoint return address (2) |
| r15      |      | Caller-saved register | r31      | ra   | Return address                |

\* Application programming interface. Used to mix C, C++ and assembly. Let's say you have a critical function that you are going to program in assembly. You want to know where the results will be placed by the compiler for example.

Those 8 registers can be used without saving their values. This means that the caller has no warranty that the value of one of those reg. won't change. They will be saved before calling the subroutine. On the contrary, the r16 to r23 are registers that if needed will be saved somewhere. You can either pass parameters through stack or with r6-r7.

NOTA: The role of some registers is fixed by F1W(231, 70). Most of the other roles are determined after a convention.

# Addressing Modes (1)

- How operands are specified in an instruction
- Nios 2 proc. supports **5 addressing modes**:
  - *Immediate mode*: a 16-bit operand is contained in the instruction itself. This value is (sign-)extended to produce a 32-bit operand for (arithmetic) instructions  
*(32-bit instructions!)*
  - *Register mode*: the operand is the content of a register  
*5 bits to specify register*
  - *Register indirect mode*: the effective address of the operand is the content of a register

↳ Displacement mode with displacement is 0

# Addressing Modes (2)

Q: Scanning an array

- Nios 2 proc. supports 5 addressing modes:
  - **Displacement mode**: the effective address of the operand is obtained by adding the content of a register and a 16-bit value contained in the instruction itself. (positive or negative displacement)
  - **Absolute mode**: is a particular case of the **Displacement mode** when the register is r0
- E.g. addi r3, r2, 100  
the content of r2 is added to 100 and the result placed in r3

Store copies a value from register to memory.  
Source 1st, dest. 2nd. Exception to the rule.

# Addressing Modes (3)

Involves appropriate extension to 32 bit

Nios II addressing modes.

| Name              | Assembler syntax                               | Addressing function |
|-------------------|------------------------------------------------|---------------------|
| Immediate         | Value                                          | Operand = Value     |
| Register          | $ri$                                           | $EA = ri$           |
| Register indirect | $(ri)$                                         | $EA = [ri]$         |
| Displacement      | $X(ri)$<br><small>q (R0) not necessary</small> | $EA = [ri] + X$     |
| Absolute          | $LOC(r0)$                                      | $EA = LOC$          |

EA = effective address

Value = a 16-bit signed number

X = a 16-bit signed displacement value

$[ri]$  indicates the content of the register  $ri$

# Instruction formats (1)

- RISC-style instructions (all 32-bit long)
  - Load/store architecture for data transfers
  - Arithmetic/logic instructions use registers
- Three instruction types:

**I-type** OP dst\_reg, src\_reg, immediate

**R-type** OP dst\_reg, src\_reg1, src\_reg2

**J-type** OP label\_or\_address

- label\_or\_address is a 26-bit unsigned immediate value

↓  
6 bits for op code, 26 bits for the address.

numbered  
immediate  
instructions

all the operands are registers

# Instruction formats (2)

- **I-type instructions** include arithmetic and logical operations in which one operand is a constant, such as `addi` and `andi`; branch operations; `load` and `store` operations; and cache management operations.

|    |    |    |    |    |    |    |    |    |    |       |    |    |    |    |    |    |    |    |    |    |    |   |   |   |   |   |   |   |   |   |   |
|----|----|----|----|----|----|----|----|----|----|-------|----|----|----|----|----|----|----|----|----|----|----|---|---|---|---|---|---|---|---|---|---|
| 31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21    | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
| A  |    |    |    |    | B  |    |    |    |    | IMM16 |    |    |    |    |    |    |    |    |    | OP |    |   |   |   |   |   |   |   |   |   |   |

2 newslets

- **R-type instructions** include arithmetic and logical operations such as `add`, `and`, `nor`; comparison operations such as `cmpeq` and `cmplt`

3 reysters

1 6 least sign. bNIS  
for OP code. 6 lines

- **J-type instructions** such as `call` and `jmpi`, transfer execution anywhere within a 256-MB range \*

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0  
IMM26 OP

Y-Type vs R-Type. For all the R-type you have just 10P code and then use the 11bts(CPX) to specify the instruction.

For load, store, branch the offset is an immediate value.

- \*  $256\text{ MB} = 2^{26} = 64\text{ MB}$ ! You need a rule to extend 26 to 32 bts. Note: all the mask bits occupy a word! But since the addresses go with multiples of 4, you can remove the 2 least sig. bts (00) so we remove them. For the 4MSB for the program counter, they are not changed. So you can only move in a page of 256 MB. The 4MSB of the PC are not changed. So overall, 16 pages of 256 MB = 4GB of memory. And a call or a jump remain only in the same page.

# Notation Conventions

| Notation         | Meaning                                                                                                          |
|------------------|------------------------------------------------------------------------------------------------------------------|
| $X \leftarrow Y$ | $X$ is written with $Y$                                                                                          |
| $PC = X$         | The program counter (PC) is written with address $X$ ; the instruction at $X$ is the next instruction to execute |
| $PC$             | The address of the assembly instruction in question                                                              |
| $rA, rB, rC$     | One of the 32-bit general-purpose registers                                                                      |
| $prs.rA$         | General-purpose register $rA$ in the previous register set                                                       |
| $IMMn$           | An $n$ -bit immediate value, embedded in the instruction word                                                    |
| $IMMED$          | An immediate value                                                                                               |
| $X_n$            | The $n$ th bit of $X$ , where $n = 0$ is the LSB                                                                 |
| $X_{n..m}$       | Consecutive bits $n$ through $m$ of $X$                                                                          |
| $0xNNNM$         | Hexadecimal notation                                                                                             |
| $X : Y$          | Bitwise concatenation<br>For example, $(0x12 : 0x34) = 0x1234$                                                   |
| $\sigma(X)$      | The value of $X$ after being sign-extended to a full register-sized signed integer                               |
| $X >> n$         | The value $X$ after being right-shifted $n$ bit positions                                                        |
| $X << n$         | The value $X$ after being left-shifted $n$ bit positions                                                         |
| $X \& Y$         | Bitwise logical AND                                                                                              |
| $X   Y$          | Bitwise logical OR                                                                                               |

continued...

# Notation Conventions (con't)

| Notation      | Meaning                                               |
|---------------|-------------------------------------------------------|
| $X \wedge Y$  | Bitwise logical XOR                                   |
| $\sim X$      | Bitwise logical NOT (one's complement)                |
| Mem8[X]       | The byte located in data memory at byte address X     |
| Mem16[X]      | The halfword located in data memory at byte address X |
| Mem32[X]      | The word located in data memory at byte address X     |
| label         | An address label specified in the assembly file       |
| (signed) rX   | The value of rX treated as a signed number            |
| (unsigned) rX | The value of rX treated as an unsigned number         |

# Load and Store Instructions

- For moving data between memory (or I/O) and general-purpose registers
- Words, half-words, bytes; alignment required<sup>②</sup>
- Variants available for I/O (uncached) access
- Examples:

Multiple f4. ↗ address of the first byte (then you move up)

|                                                                                               |                      |
|-----------------------------------------------------------------------------------------------|----------------------|
| ldw r2, 40(r3)                                                                                | // load word         |
| stb r6, 4(r12)                                                                                | // store byte        |
| ldhio r9, (r20)                                                                               | // load I/O halfword |
| ldb u r2, -100(r3) <sup>1 byte!</sup>                                                         | // signed extended   |
| ldb u r2, -100(r3) <sup>We need to extend based on how the info needs to be interpreted</sup> | // load byte zero    |
| stw r7, 100(r0)                                                                               | // store word        |

ldb, ldh etc., the extension is unsigned. NOTE: you don't have stb, because you are storing a byte.

② When alignment is required the address of var is a multiple of 4; in case of small var, multiple of 2. (Word: multiple of 4), (half Word: multiple of 2)

NOTE: for load and store you have reservation for I/O. This bypasses the cache as processor has a cache. Because we don't want info to be stored in the cache for IO purposes.

# Arithmetic Instructions

- add, addi (16-bit immediate is sign-extended)
- sub, subi, mul, and muli are similar
- Mult. is unsigned, result is truncated to 32 bits
- div (signed values), divu (unsigned values)
- Examples:

```
add r2, r3, r4      // (r2 ← [r3] + [r4])
muli r6, r7, 4096   // (r6 ← [r7] × 4096)
divu r8, r9, r10    // (r8 ← [r9] / [r10])
```

# Logic Instructions

- and, or, xor, nor have **2 register operands**
- andi, ori, xori, nori have **a register operand** and an immediate operand that is **zero-extended** from 16 bits to 32 bits
- Examples:

```
or    r7, r8, r9      // (r7 ← [r8] OR [r9])
andi  r4, r5, 0xFF   // (r4 ← [r5] AND 255)
```
- andhi, orhi, xorhi shift 16-bit immediate left and clear lower 16 bits to zero

# Move Pseudo-Instructions

- Pseudoinstructions provided for convenience:

|                           |    |                                       |
|---------------------------|----|---------------------------------------|
| mov <i>ri</i> , <i>rz</i> | => | add <i>ri</i> , <i>r0</i> , <i>rz</i> |
| movi <i>ri</i> , Val16    | => | addi <i>ri</i> , <i>r0</i> , Val16    |
| moviu <i>ri</i> , Val16   | => | ori <i>ri</i> , <i>r0</i> , Val16     |

↳ Pseudoinstr.

- Move Immediate Address for 32-bit value:

|                         |    |                                       |
|-------------------------|----|---------------------------------------|
| movia <i>ri</i> , LABEL | => | orhi <i>ri</i> , <i>r0</i> , LABEL_HI |
|                         |    | ori <i>ri</i> , <i>ri</i> , LABEL_LO  |

↳ Translated into two inst.

- LABEL\_HI is upper 16 bits of LABEL, and  
LABEL\_LO is lower 16 bits of LABEL

# Branch and Jump Instructions

- Unconditional branch: br LABEL
- Instruction encoding uses **signed 16-bit byte offset**
- Signed/unsigned comparison and branch:

```
blt ri, rj, LABEL // signed [ri]<[rj]
bltu ri, rj, LABEL // unsigned [ri]<[rj]
```
- beq, bne, bge, bgeu, bgt, bgtu, ble, and bleu
- Unconditional branch beyond 16-bit offset:

```
jmp ri // jump to address in ri
```

# Subroutine Linkage Instructions

- Subroutine **call** instruction: `call LABEL`  
*You cannot do nested subroutines.*
- Saves return address (from PC) in r31 (ra)  
*Solution: use r31 slack.*
- Target encoded as 26-bit immediate, Value26
- At execution time, 32-bit address derived as:  
 $\text{PC}_{31-28} : \text{Value26} : 00$
- Call with target in register: `callr ri`  
*here the compiler places the address in r31 and you have no limitation.*
- **Return** instruction: `ret`
  - Branches to address saved in r31 (ra)

# Comparison Instructions

- Result of comparing two operands is placed in destination register: 1 (if true) or 0 (if false)
- Less-than comparisons that set *ri* to 0 or 1:

```
cmplt ri, rj, rk // signed [rj] < [rk]
cmpltu ri, rj, rk // unsigned [rj] < [rk]
cmplti ri, rj, Val16 // signed [rj] < Val16
cmpltui ri, rj, Val16 //unsigned [rj]<Val16
```

- Val16 is sign- or zero-extended based on type
- Similarly for: ...eq.., ...ne.., ...le.., ...ge.., ...gt..

# Shift and Rotate Instructions

- Shift right logical  $rj$ , destination register is  $ri$ :

```
srl  ri, rj, rk    //shift by amount in rk  
srlti ri, rj, Val5 //shift by immediate value
```

- Shift right arithmetic sra, srai: same as above except that sign in bit  $rj_{31}$  is preserved
- Shift left logical sll, slli
- Rotate left rol, roli
- Rotate right ror (no immediate version)

# Pseudoinstructions

- mov, movi, and movia already discussed;  
translated to other instructions by assembler
- Subtract immediate is actually add immediate with negation of constant:

*subi ri, rj, Value16 => addi ri, rj, -Value16*

- Also can swap operands for comparisons:

*bgt ri, rj, LABEL => blt rj, ri, LABEL*

- Awareness of pseudoinstructions is not critical,  
except when examining assembled code

# Carry/Overflow Detection for Add

- Nios II does not have condition codes (flags)
- Arithmetic performed in same manner for signed and unsigned operands
- Detect carry/overflow needs more instructions
- **Carry Detection** (unsigned operands):  
test if unsigned result is less than either one of the operands:

```
add      r4, r2, r3  
cmpltu  r5, r4, r2
```

The result will be less than both of the inputs.

- Carry bit is in r5

# Carry/Overflow Detection for Add

- **Overflow Detection** (signed operands):  
compare signs of operands & result
- Use xor, and to check for same operand signs  
and different sign for result:

```
add  r4, r2, r3
xor  r5, r4, r2
xor  r6, r4, r3
and   r5, r5, r6
blt  r5, r0, OVERFLOW
```

The sign of the result is  
different from the sign  
of both operat.

- Similar checks for subtract carry/overflow

# References

- Intel, “Nios II Processor Reference Guide,”  
*n2cpu-nii5v1gen2-683836-666887-2.pdf*
  - 8. Instruction Set Reference
- C. Hamacher, Z. Vranesic, S. Zaky, N. Manjikian  
"Computer Organization and Embedded Systems,"  
*McGraw-Hill International Edition*
  - Appendix B: from B.1 to B.10