



# Binary Neural Networks & TinyML HW

Intelligent Architectures: 5LIL0

April 1, 2025

Dr. Federico Corradi, Assistant Professor, Neuromorphic Edge Computing Systems Lab

Department of Electrical Engineering, Electronic Systems Group

# Outline

Recap

Introduction

Binary Neural Networks

Architectural choices

RISC-V MCUs & TinyML

Conclusion

# Last few lectures

## GeMM & CNN processors enhancements in ASIC, CPU, GPU

- Sparsity
  - Sparse CNN engine
  - Efficient Inference Engine
  - NVIDIA Ampere GPU
- Quantization
  - Supporting mixed-precision quantization
  - Envision (KU) Leuven

# Last few lectures

## GeMM & CNN processors enhancements in ASIC, CPU, GPU

- Sparsity
  - Sparse CNN engine
  - Efficient Inference Engine
  - NVIDIA Ampere GPU
- Quantization
  - Supporting mixed-precision quantization
  - Envision (KU) Leuven

## Now

Canvas quizzes:

- [Lecture 11: Parallelization & Stationarity in DNN HW](#)
- [Lecture 12: Sparsity & Quantization in DNN HW](#)

# Outline

Recap

Introduction

Binary Neural Networks

Architectural choices

RISC-V MCUs & TinyML

Conclusion

# Lesson Plan

## Extreme of quantization

- Binary neural networks
- Binareye:
  1. digital KU Leuven
  2. mixed-signal (KU Leuven & Stanford)

## Learning objectives

- You will be able to execute BNN inference.
- You will be able to analyze BNN neural network accelerators and describe their architectures.

# The big picture



[[Moons et al, 2018]]

# The big picture



Bert Moons · Daniel Bankman  
Marian Verhelst

# Embedded Deep Learning

Algorithms, Architectures and Circuits  
for Always-on Neural Network  
Processing

Springer

[[Moons et al, 2018]]

# Outline

Recap

Introduction

Binary Neural Networks

Architectural choices

RISC-V MCUs & TinyML

Conclusion

# Binary Neural Networks



## BNN

- push precision to the extreme

[Courbariaux, 2016]

# Binary Neural Networks



## BNN

- push precision to the extreme
- 1-bit weights

[Courbariaux, 2016]

# Binary Neural Networks



BNN

- push precision to the extreme
  - 1-bit weights
  - 1-bit activation

[Courbariaux, 2016]

## Binary Neural Networks



[Courbariaux, 2016]

BNN

- push precision to the extreme
  - 1-bit weights
  - 1-bit activation
  - threshold operation (step function)

# Binary Neural Networks



Binary  
Networks

## BNN

- push precision to the extreme
- 1-bit weights
- 1-bit activation
- threshold operation (step function)
- multiplication → XNOR operation

| A | B | $A \odot B$ (XNOR) |
|---|---|--------------------|
| 0 | 0 | 1                  |
| 0 | 1 | 0                  |
| 1 | 0 | 0                  |
| 1 | 1 | 1                  |

[Courbariaux, 2016]

Table: Truth table for the XNOR operation

# Binary neural networks: extreme quantization

|      |      |      |
|------|------|------|
| -0.4 | -0.4 | 0.9  |
| 0.9  | 0.4  | 0.8  |
| 0.4  | -0.4 | -0.4 |



The popular definition of the binarization function is given as follows:

$$Q_w(w) = \alpha b_w, Q_a(a) = \beta b_a \quad (1)$$

where:

# Binary neural networks: extreme quantization

|      |      |      |
|------|------|------|
| -0.4 | -0.4 | 0.9  |
| 0.9  | 0.4  | 0.8  |
| 0.4  | -0.4 | -0.4 |



The popular definition of the binarization function is given as follows:

$$Q_w(w) = \alpha b_w, Q_a(a) = \beta b_a \quad (1)$$

where:

- $b_w$  are the binary weights (i.e.  $\text{sign}(w)$ )

# Binary neural networks: extreme quantization

|      |      |      |
|------|------|------|
| -0.4 | -0.4 | 0.9  |
| 0.9  | 0.4  | 0.8  |
| 0.4  | -0.4 | -0.4 |



The popular definition of the binarization function is given as follows:

$$Q_w(w) = \alpha b_w, Q_a(a) = \beta b_a \quad (1)$$

where:

- $b_w$  are the binary weights (i.e.  $\text{sign}(w)$ )
- $b_a$  are the binary activations (i.e.,  $\text{sign}(a)$ )

# Binary neural networks: extreme quantization

|      |      |      |
|------|------|------|
| -0.4 | -0.4 | 0.9  |
| 0.9  | 0.4  | 0.8  |
| 0.4  | -0.4 | -0.4 |



The popular definition of the binarization function is given as follows:

$$Q_w(w) = \alpha b_w, Q_a(a) = \beta b_a \quad (1)$$

where:

- $b_w$  are the binary weights (i.e.  $\text{sign}(w)$ )
- $b_a$  are the binary activations (i.e.,  $\text{sign}(a)$ )
- $\alpha = \frac{1}{n} \|w\|_{L1}$ ,  $\beta = \frac{1}{n} \|a\|_{L1}$

# Binary neural networks: extreme quantization

|      |      |      |
|------|------|------|
| -0.4 | -0.4 | 0.9  |
| 0.9  | 0.4  | 0.8  |
| 0.4  | -0.4 | -0.4 |



|     |    |    |    |
|-----|----|----|----|
| 0.2 | -1 | -1 | 1  |
|     | 1  | 1  | 1  |
|     | 1  | -1 | -1 |

The popular definition of the binarization function is given as follows:

$$Q_w(w) = \alpha b_w, Q_a(a) = \beta b_a \quad (1)$$

where:

- $b_w$  are the binary weights (i.e.  $\text{sign}(w)$ )
- $b_a$  are the binary activations (i.e.,  $\text{sign}(a)$ )
- $\alpha = \frac{1}{n} \|w\|_{L1}$ ,  $\beta = \frac{1}{n} \|a\|_{L1}$

The sign function is used for binarization

$$\text{sign}(x) = \begin{cases} +1 & \text{if } x > 0 \\ -1 & \text{otherwise} \end{cases}$$

## Binary neural networks: forward path

In full precision, a convolution operation is expressed as:

$$z = \sigma(w \otimes a) \quad (2)$$

## Binary neural networks: forward path

In full precision, a convolution operation is expressed as:

$$z = \sigma(w \otimes a) \quad (2)$$

With the binarized weights and activations, the vector multiplication in **forward propagation** can be reformulated as:

$$z = \sigma(Qw(w) \otimes Qa(a)) = \sigma(\alpha\beta(b_w \odot b_a)) \quad (3)$$

where  $\odot$  denotes the inner product for vectors with bitwise operation XNOR-Bitcount.

# XNOR Operation in Binary Neural Networks

**Binary Neural Networks** (BNNs) use binary values instead of full-precision weights.

Multiplication between **bipolar values** ( $\pm 1$ ) can be replaced with a bitwise XNOR operation:

| A (bipolar) | B (bipolar) | A (binary) | B (binary) | A XNOR B | Result (bipolar) |
|-------------|-------------|------------|------------|----------|------------------|
| -1          | -1          | 0          | 0          | 1        | +1               |
| -1          | +1          | 0          | 1          | 0        | -1               |
| +1          | -1          | 1          | 0          | 0        | -1               |
| +1          | +1          | 1          | 1          | 1        | +1               |

**Table:** Mapping of bipolar multiplication to binary XNOR logic

**Key idea:** XNOR yields 1 when both binary values match  $\Rightarrow$  equivalent to multiplying equal-sign bipolar values.

# Binary neural networks: concept

$$\begin{array}{c} \begin{matrix} 1 & 1 & -1 \\ 1 & -1 & 1 \\ -1 & 1 & 1 \end{matrix} \otimes \begin{matrix} -1 & -1 & -1 & 1 & 1 & 1 & -1 & 1 \\ 1 & 1 & -1 & 1 & 1 & -1 & 1 & 1 \\ 1 & -1 & 1 & -1 & 1 & 1 & -1 & -1 \\ 1 & 1 & 1 & -1 & 1 & -1 & 1 & 1 \\ -1 & -1 & 1 & 1 & -1 & -1 & -1 & 1 \\ 1 & -1 & -1 & 1 & 1 & -1 & -1 & 1 \\ 1 & -1 & -1 & -1 & 1 & -1 & 1 & 1 \\ -1 & -1 & 1 & 1 & 1 & -1 & 1 & -1 \end{matrix} = \\ \text{1-bit weight} \qquad \qquad \qquad \text{1-bit activation} \end{array}$$

[Qin et al 2020]

stride = 1, padding = 0

# Binary neural networks: concept



[Qin et al 2020]

stride = 1, padding = 0

# Binary neural networks: concept



[Qin et al 2020]

stride = 1, padding = 0

# Binary neural networks: backward path



- step function is not differentiable



[Anderson & Berg, 2017]

# Binary neural networks: backward path



- step function is not differentiable
- straight-through estimator (STE)



[Anderson & Berg, 2017]

# Binary neural networks: backward path



- step function is not differentiable
- straight-through estimator (STE)
- this preserves the gradient's information and cancels large gradients

[Anderson & Berg, 2017]

# Binary neural networks: training algorithm

**Algorithm 1** Training an  $L$ -layers CNN with binary weights:

---

**Input:** A minibatch of inputs and targets ( $\mathbf{I}, \mathbf{Y}$ ), cost function  $C(\mathbf{Y}, \hat{\mathbf{Y}})$ , current weight  $\mathcal{W}^t$  and current learning rate  $\eta^t$ .

**Output:** updated weight  $\mathcal{W}^{t+1}$  and updated learning rate  $\eta^{t+1}$ .

- 1: Binarizing weight filters:
- 2: **for**  $l = 1$  to  $L$  **do**
- 3:   **for**  $k^{\text{th}}$  filter in  $l^{\text{th}}$  layer **do**
- 4:      $\mathcal{A}_{lk} = \frac{1}{n} \|\mathcal{W}_{lk}^t\|_{\ell_1}$
- 5:      $\mathcal{B}_{lk} = \text{sign}(\mathcal{W}_{lk}^t)$
- 6:      $\widetilde{\mathcal{W}}_{lk} = \mathcal{A}_{lk} \mathcal{B}_{lk}$
- 7:     $\hat{\mathbf{Y}} = \text{BinaryForward}(\mathbf{I}, \mathcal{B}, \mathcal{A})$  //
- 8:     $\frac{\partial C}{\partial \widetilde{\mathcal{W}}} = \text{BinaryBackward}(\frac{\partial C}{\partial \hat{\mathbf{Y}}}, \widetilde{\mathcal{W}})$  // standard backward propagation except that gradients are computed using  $\widetilde{\mathcal{W}}$  instead of  $\mathcal{W}^t$
- 9:     $\mathcal{W}^{t+1} = \text{UpdateParameters}(\mathcal{W}^t, \frac{\partial C}{\partial \widetilde{\mathcal{W}}}, \eta_t)$  // Any update rules (e.g., SGD or ADAM)
- 10:    $\eta^{t+1} = \text{UpdateLearningrate}(\eta^t, t)$  // Any learning rate scheduling function

[Rastegari et al, 2016]

# Binary neural networks: Does it work?



|                                            | Network Variations                                                                                                                                | Operations used in Convolution | Memory Saving (Inference) | Computation Saving (Inference) | Accuracy on ImageNet (AlexNet) |
|--------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------|---------------------------|--------------------------------|--------------------------------|
| Standard Convolution                       | Real-Value Inputs<br>0.11 -0.21 ... -0.34 ...<br>-0.25 0.61 ... 0.52 ...<br>Real-Value Weights<br>0.12 -1.2 ... 0.41 ...<br>-0.2 0.5 ... 0.68 ... | + , - , $\times$               | 1x                        | 1x                             | %56.7                          |
| Binary Weight                              | Real-Value Inputs<br>0.11 -0.21 ... -0.34 ...<br>-0.25 0.61 ... 0.52 ...<br>Binary Weights<br>1 -1 1 ... 1<br>-1 1 ... 1                          | + , -                          | ~32x                      | ~2x                            | %56.8                          |
| BinaryWeight<br>Binary Input<br>(XNOR-Net) | Binary Inputs<br>1 -1 ... -1<br>-1 1 ... 1<br>Binary Weights<br>1 -1 1 ... 1<br>-1 1 ... 1                                                        | XNOR , bitcount                | ~32x                      | ~58x                           | %44.2                          |

[Rastegari et al, 2016]

# Binary neural networks: Does it work?



[Rastegari et al, 2016]

# Binary neural networks: Binareye

| Layer | Type | W,H            | K | C   | Stride |
|-------|------|----------------|---|-----|--------|
| 0     | IO   | 32             | 2 | 256 | 1      |
| 1     | CONV | 32             | 2 | 256 | 1      |
| 2     | CONV | 31             | 2 | 256 | 1      |
| 3     | CONV | 30             | 2 | 256 | 1      |
| 4     | CONV | 29             | 2 | 256 | 1      |
| 4p    | MP   | 28             | - | -   | 2      |
| 5     | CONV | 14             | 2 | 256 | 1      |
| 6     | CONV | 13             | 2 | 256 | 1      |
| 6p    | MP   | 12             | - | -   | 2      |
| 7     | CONV | 6              | 2 | 256 | 1      |
| 8     | CONV | 5              | 2 | 256 | 1      |
| 9     | FC   | 4096 – to - 10 |   |     |        |

Can I work with 1 bit and have useful NN?

- yes for simple tasks, with slight accuracy drop, and not for all tasks! [\[Moons et al, 2018\]](#)

| MNIST | CIFAR<br>-10 | Face Det. <sup>d</sup> | Owner Det. <sup>e</sup> | 3 / 7<br>Angles |
|-------|--------------|------------------------|-------------------------|-----------------|
| 98.85 | 86           | 98.1*/95.7+            | 98.2*/83.3+             | 99.1/ 98.9      |

# Thermometer Coding

**Definition:** Thermometer coding is a unary encoding scheme where the number of ones represents the value. It is often used for low-precision quantization in neural networks due to its robustness to bit-flips.

**Example (3-bit resolution):**

| Value | Thermometer Code |
|-------|------------------|
| 0     | 000              |
| 1     | 100              |
| 2     | 110              |
| 3     | 111              |

**Properties:**

- Only one transition from 0 to 1
- Robust to single-bit errors
- Useful for quantized activations and analog-to-digital conversions

# Binareye: architecture



- always-on network for wake-up systems

[Moons et al, 2018]

# Binareye: architecture



- always-on network for wake-up systems
- a small input image (32x32)

[Moons et al, 2018]

# Binareye: architecture



- always-on network for wake-up systems
- a small input image (32x32)
- it has a data path to execute a binary convolutional neural network

[Moons et al, 2018]

# Binareye: architecture



- always-on network for wake-up systems
- a small input image (32x32)
- it has a data path to execute a binary convolutional neural network
- weight in SRAM blocks (top/bottom), + inputs and output SRAM

[Moons et al, 2018]

# Binareye: architecture



- always-on network for wake-up systems
- a small input image (32x32)
- it has a data path to execute a binary convolutional neural network
- weight in SRAM blocks (top/bottom), + inputs and output SRAM
- the controller can only do 3 instructions

[Moons et al, 2018]

# Binareye: architecture



[Moons et al, 2018]

- always-on network for wake-up systems
- a small input image (32x32)
- it has a data path to execute a binary convolutional neural network
- weight in SRAM blocks (top/bottom), + inputs and output SRAM
- the controller can only do 3 instructions
  1. take input

# Binareye: architecture



[Moons et al, 2018]

- always-on network for wake-up systems
- a small input image (32x32)
- it has a data path to execute a binary convolutional neural network
- weight in SRAM blocks (top/bottom), + inputs and output SRAM
- the controller can only do 3 instructions
  1. take input
  2. CONV

# Binareye: architecture



[Moons et al, 2018]

- always-on network for wake-up systems
- a small input image (32x32)
- it has a data path to execute a binary convolutional neural network
- weight in SRAM blocks (top/bottom), + inputs and output SRAM
- the controller can only do 3 instructions
  1. take input
  2. CONV
  3. FC (in a separate data path)

# Binareye: architecture



- Weights stationary (in RF)
- Inputs and output parallel
- 1024 inputs each clock cycles

[Moons et al, 2018]

# Binareye: what kind of stationarity/parallelism?



- Load each column  $C^*FX^*FY$  in local register, i.e., **weight stationary**

[Moons et al, 2018]

# Binareye: what kind of stationarity/parallelism?



- Load each column  $C^*FX^*FY$  in local register, i.e., **weight stationary**
- Load inputs, and inputs are shared across 64 columns (every neuron reuse the same input), i.e., **input reuse**

[Moons et al, 2018]

# Binareye: what kind of stationarity/parallelism?



- Load each column  $C^*FX^*FY$  in local register, i.e., **weight stationary**
- Load inputs, and inputs are shared across 64 columns (every neuron reuse the same input), i.e., **input reuse**
- Across one column, results are accumulated (they are computing products for the same output), i.e., **output reuse**

[Moons et al, 2018]

# Binareye: what kind of stationarity/parallelism?



- Very few memory fetches!  
(**weight stationary**, **input reuse**,  
**output reuse**)

[Moons et al, 2018]

# Binareye: what kind of stationarity/parallelism?



- Very few memory fetches!  
(**weight stationary, input reuse, output reuse**)
- Every cc:

[Moons et al, 2018]

# Binareye: what kind of stationarity/parallelism?



- Very few memory fetches!  
(**weight stationary, input reuse, output reuse**)
- Every cc:
  - fetch  $2 \times 2 \times 256$  (1024) bits input

[Moons et al, 2018]

# Binareye: what kind of stationarity/parallelism?



- Very few memory fetches!  
(**weight stationary, input reuse, output reuse**)
- Every cc:
  - fetch  $2 \times 2 \times 256$  (1024) bits input
  - store 64 bits output

[Moons et al, 2018]

# Binareye: what kind of stationarity/parallelism?



- Very few memory fetches!  
(**weight stationary, input reuse, output reuse**)
- Every cc:
  - fetch  $2 \times 2 \times 256$  (1024) bits input
  - store 64 bits output
  - 65536 MUL (1-bit)

[Moons et al, 2018]

# Binareye: what kind of stationarity/parallelism?

```
for(k2=0 to K/64-1); //for each output channel
  for(x=0 to X-1); //for each in/output column
    for(y=0 to Y-1); //for each in/output row
      parfor(c=0 to 255); //for each input channel
        parfor(fx=0 to 2); //for each kernel row
          parfor(fy=0 to 2); //for each kernel column
            parfor(k1=0 to 63); //for each output channel
              o[b] [k1+64.k2] [x] [y] += i[b] [c] [x+fx] [y+fy] *
                w[k1+64.k2] [c] [fx] [fy]
```

why weight stationary?

# Binareye: what kind of stationarity/parallelism?

```
for(k2=0 to K/64-1); //for each output channel  
for(x=0 to X-1); //for each in/output column  
for(y=0 to Y-1); //for each in/output row  
parfor(c=0 to 255); //for each input channel  
parfor(fx=0 to 2); //for each kernel row  
parfor(fy=0 to 2); //for each kernel column  
parfor(k1=0 to 63); //for each output channel  
    o[b] [k1+64.k2] [x] [y] += i[b] [c] [x+fx] [y+fy] *  
    w[k1+64.k2] [c] [fx] [fy]
```

## why weight stationary?

- because my x,y loop is just before the parfor (x,y reuse the same weight)

# Binareye: KU Leuven



# Binareye: KU Leuven



- 1 op = BINARY operation!

# Binareye: KU Leuven



(a) Breakdown by type



(b) Functional breakdown

- The benefits of this is massive parallelization → massive data reuse



(a) Breakdown by type



(b) Functional breakdown

- The benefits of this is massive parallelization → massive data reuse
- Thus, energy goes into effective compute and less into memory fetches! (75%!)



(a) Breakdown by type



(b) Functional breakdown

- The benefits of this is massive parallelization → massive data reuse
- Thus, energy goes into effective compute and less into memory fetches! (75%!)
- It is possible to do better?

# Binareye: KU Leuven & Stanford



## Stanford Mixed-Signal Design

Design the **same functionality** but using analog and mixed-signal design principles  
[Bankman et al, 2019]

# Binareye: KU Leuven & Stanford



## Stanford Mixed-Signal Design

Design the **same functionality** but using analog and mixed-signal design principles  
[Bankman et al, 2019]

# Binareye: KU Leuven & Stanford



- Wide vector summation in the neuron array is the main energy bottleneck in the digital implementation

# Binareye: KU Leuven & Stanford



- Wide vector summation in the neuron array is the main energy bottleneck in the digital implementation
- Addition in the analog domain with a switched-capacitor (SC) neuron

# Binareye: KU Leuven & Stanford



## Mixed-signal design

Keep XOR digital, and add up in the analog domain → charge distribution, over smaller cap, compared to digital.

# Binareye: KU Leuven & Stanford



## Mixed-signal design

Keep XOR digital, and add up in the analog domain → charge distribution, over smaller cap, compared to digital.

# Binareye: KU Leuven & Stanford



## Downsides?

- Area (from  $1.3mm^2$  to  $2.4mm^2$ )

# Binareye: KU Leuven & Stanford



## Downsides?

- Area (from  $1.3mm^2$  to  $2.4mm^2$ )
- Calibration (bias-voltage, PVT variations)



## Results

- **Left:** all digital binareye, **Right:** mixed-signal binareye.
- Neuron array bottleneck reduced by 13x, system level by 4x!
- CNN, Cifar-10, accuracy=86.05%,  $E_{frame} = 3.79\mu J$ , power=0.899mW, fps=237, 532TOPS/W(1bit)

# Outline

Recap

Introduction

Binary Neural Networks

Architectural choices

RISC-V MCUs & TinyML

Conclusion

# Architectural choices



- Number of memory levels
- Size of memories
- BW of memories
- Number of PEs
- Spatial unrolling
- Precision

Many, many, many degrees of freedom

So many approaches and degrees of freedom → we need **analytical costs models**.

# Optimizing DNN embedded processor design

## Analytical costs models

- Energy, latency, area
- Technology characteristics
- Hardware architecture & constraints
- Neural network workload
- Software compilers/mapping tools

# Optimizing DNN embedded processor design

## Analytical costs models

- Energy, latency, area
- Technology characteristics
- Hardware architecture & constraints
- Neural network workload
- Software compilers/mapping tools

## Models/Tools

- Timeloop (MIT) [paper](#), [code](#).
- Interstellar (Stanford) [paper](#), [code](#).
- ZigZag (KU Leuven) [paper](#),[code](#).
- MAESTRO (Nvidia) [paper](#)

# Summary

## Extreme of quantization

- Binary neural networks
- Binareye:
  1. digital KU Leuven
  2. mixed-signal KU Leuven & Stanford

## Extreme edge-AI processors

- extreme quantization can be useful for always-on systems

# Summary

## Extreme of quantization

- Binary neural networks
- Binareye:
  1. digital KU Leuven
  2. mixed-signal KU Leuven & Stanford

## Extreme edge-AI processors

- extreme quantization can be useful for always-on systems
- it is possible to reach very high  $A_f$  at the cost of accuracy

# Summary

## Extreme of quantization

- Binary neural networks
- Binareye:
  1. digital KU Leuven
  2. mixed-signal KU Leuven & Stanford

## Extreme edge-AI processors

- extreme quantization can be useful for always-on systems
- it is possible to reach very high  $A_f$  at the cost of accuracy
- might be acceptable for some applications, but not for all

## Extreme of quantization

- Binary neural networks
- Binareye:
  1. digital KU Leuven
  2. mixed-signal KU Leuven & Stanford

## Extreme edge-AI processors

- extreme quantization can be useful for always-on systems
- it is possible to reach very high  $A_f$  at the cost of accuracy
- might be acceptable for some applications, but not for all
- mixed signal approach enable to save further energy in compute at the cost of area and tedious calibration

# Reduced Precision in Research

Much more than just binary...

- Reduce number of bits
  - Binary Nets [Courbariaux, 2015]
- Reduce number of unique weights and/or activations
  - Ternary Weight Nets [Li, 2022]
  - XNOR-Net [Rastegari, 2016]
- Non-Linear Quantization
  - LogNet [Lee, 2017]



[Courbariaux, 2016]

# Outline

Recap

Introduction

Binary Neural Networks

Architectural choices

RISC-V MCUs & TinyML

Conclusion

## State-of-the-art tinyML

- motivations for ML on microcontroller (MCU)
- tinyML platforms and applications
- anatomy of a keyword spotting application
- energy/cost analysis

## DNN processors overview



- there is a big range in power and efficiency!

[Reuther et al, 2022]

# DNN processors overview



- there is a big range in power and efficiency!
- high peak performance → very specialized solutions

[Reuther et al, 2022]

## DNN processors overview



- there is a big range in power and efficiency!
  - high peak performance → very specialized solutions
  - how about programmability & very-low-power?

[Reuther et al, 2022]

# TinyML: why?



## Some promising social applications of TinyML

- wildlife conservation
- agriculture
- early detection of respiratory diseases

# TinyML: why?



[Rabaey et al, 2019]

# TinyML: challenges

## TinyML

- provide machine learning capabilities in a microcontroller-like HW (i.e., few mW in average - operational vs stand-by)
- battery lifetime
- cheap (small area, simple system)
- versatile (audio, vision, radar, vibration, ..)



HiMax WE-I



Arduino Nano BLE  
Sense [M. Banzi, 2021]



SparkFun Edge 2



ESP-EYE

# Example of Micro Controller Architecture



# DNN processors overview



- there is a big range in power and efficiency!

[Reuther et al, 2022]

# DNN processors overview



- there is a big range in power and efficiency!
- high peak performance → very specialized solutions

[Reuther et al, 2022]

# DNN processors overview



- there is a big range in power and efficiency!
- high peak performance → very specialized solutions
- MCUs are in the very-low-power region

[Reuther et al, 2022]

# Can we use MCUs for ML tasks?

Example: keyword spotting on ARM MCU

- Always-on systems (e.g., "Hello Google", "Alexa")

# Can we use MCUs for ML tasks?

## Example: keyword spotting on ARM MCU

- Always-on systems (e.g., "Hello Google", "Alexa")
- How do they work?

# Can we use MCUs for ML tasks?

## Example: keyword spotting on ARM MCU

- Always-on systems (e.g., "Hello Google", "Alexa")
- How do they work?



# Can we use MCUs for ML tasks?

## Microcontroller systems

- Memory is limited
- Operation per second is limited
- Low-throughput for compute-intensive workloads
- Some MCUs (e.g., Cortex-M4/M7) have DPS instructions (SIMD, MAC)
- Require low-precision DNNs

| Arm Mbed™ platform | Processor | Frequency | SRAM   | Flash  |
|--------------------|-----------|-----------|--------|--------|
| Mbed LPC11U24      | Cortex-M0 | 48 MHz    | 8 KB   | 32 KB  |
| Nordic nRF51-DK    | Cortex-M0 | 16 MHz    | 32 KB  | 256 KB |
| Mbed LPC1768       | Cortex-M3 | 96 MHz    | 32 KB  | 512 KB |
| Nucleo F103RB      | Cortex-M3 | 72 MHz    | 20 KB  | 128 KB |
| Nucleo L476RG      | Cortex-M4 | 80 MHz    | 128 KB | 1 MB   |
| Nucleo F411RE      | Cortex-M4 | 100 MHz   | 128 KB | 512 KB |
| FRDM-K64F          | Cortex-M4 | 120 MHz   | 256 KB | 1 MB   |
| Nucleo F746ZG      | Cortex M7 | 216 MHz   | 320 KB | 1 MB   |

# Neural Networks Architectures for Keyword Spotting (KWS)

For example, in an ARM Cortex M7:

- What layers requires more throughput/power/energy per op?



# Neural Networks Architectures for Keyword Spotting (KWS)

For example, in an ARM Cortex M7:

- What layers require more throughput/power/energy per op?
- What types of layers have the highest throughput and are less MAC intensive?



# TinyML: microcontrollers start to support ML workloads

## How to achieve tinyML in MCU?

- add specialized instruction (e.g., MAC)
- add efficient data motion instruction (e.g., auto-increment, vectorized data fetch)
- HW support for multiple low-precision instructions in a single clock cycle (single instruction multiple data SIMD)
- this will inevitably increase HW costs → we need a large gain in energy efficiency

# RISC-V architecture and instruction set



## What is an instruction set?

Is a group of commands for a central processing unit (CPU) in machine language. The term can refer to all possible instructions for a CPU or a subset of instructions to enhance its performance in certain situations.

# RISC-V architecture and instruction set



## What is an instruction set?

Is a group of commands for a central processing unit (CPU) in machine language. The term can refer to all possible instructions for a CPU or a subset of instructions to enhance its performance in certain situations.

## RISC-V

- RISC-V is an open instruction set (free and open)
- started in UC Berkeley in 2010
- instruction set can be extended by anybody!

# TinyML: workloads



[Li et al, 2021]

Model zoo → programmable hardware!

- tiny neural networks (many network topologies)
- still useful! (not at the level of larger models)
- costs (memory & complexity)
- computational capabilities (elementary operation needed to run one inference)

# TinyML: workloads



[Li et al, 2021]

Model zoo → some good news!

- lots of operations per data fetch

# TinyML: workloads



[Li et al, 2021]

Model zoo → some good news!

- lots of operations per data fetch
- massive data parallelism

# TinyML: workloads



[Li et al, 2021]

Model zoo → some good news!

- lots of operations per data fetch
- massive data parallelism
- MAC-dominated

# TinyML: workloads



[Li et al, 2021]

Model zoo → some good news!

- lots of operations per data fetch
- massive data parallelism
- MAC-dominated
- they support low-precision  
(8b,4b,2b)

# TinyML: RISC-V architecture and instruction set

## RISC-V

- there is a small core of instruction set  
(common in all RISC-V family processors)



[RISC-V Reference Card]

# TinyML: RISC-V architecture and instruction set

## RISC-V

- there is a small core of instruction set (common in all RISC-V family processors)
- there are a set of extra instruction set that can be added (depending on the functionality)



[RISC-V Reference Card]

# RISC-V architecture and instruction set

Free & Open RISC-V Reference Card

| Base Integer Instructions: RV32I, RV64I, and RV128I |                            |        |           |           |           |         | RV Privileged Instructions |           |      |           |         |           |      | RV mnemonic |         |           |         |         |         |  |
|-----------------------------------------------------|----------------------------|--------|-----------|-----------|-----------|---------|----------------------------|-----------|------|-----------|---------|-----------|------|-------------|---------|-----------|---------|---------|---------|--|
|                                                     |                            |        |           | RV32I/64I |           |         | +RV(64,128)                |           |      |           |         |           |      |             |         |           |         |         |         |  |
| Category                                            | Name                       | Format | Op        | rd        | rs1       | rs2     | imm                        | rd        | rs1  | rs2       | imm     | rd        | rs1  | rs2         | imm     | rd        | rs1     | rs2     | imm     |  |
| Loads                                               | Load Byte                  | I      | LB        | rd        | rs1,imm   |         |                            | L         | rd   | rs1,imm   |         | L         | rd   | rs1,imm     |         | CSEWR     | rd      | car,r1  |         |  |
|                                                     | Load Halfword              | I      | LDH       | rd        | rs1,imm   |         |                            | LHD       | rd   | rs1,imm   |         | LHD       | rd   | rs1,imm     |         | CSERW     | rd      | car,r1  |         |  |
|                                                     | Load Word                  | I      | LD        | rd        | rs1,imm   |         |                            | L         | D    | rd        | rs1,imm | LWD       | rd   | rs1,imm     |         | CSERW     | rd      | car,r1  |         |  |
| Load Byt Unsigned                                   | Load Byte Unsigned         | I      | LBUI      | rd        | rs1,imm   |         |                            | LBU       | rd   | rs1,imm   |         | LBU       | rd   | rs1,imm     |         | ATOMICR   | rd      | car,imm |         |  |
|                                                     | Load Half Unsigned         | I      | LDHUI     | rd        | rs1,imm   |         |                            | LHDU      | rd   | rs1,imm   |         | LHDU      | rd   | rs1,imm     |         | ATOMICR   | rd      | car,imm |         |  |
|                                                     | Load Word Unsigned         | I      | LDUI      | rd        | rs1,imm   |         |                            | LWU       | rd   | rs1,imm   |         | LWU       | rd   | rs1,imm     |         | ATOMICR   | rd      | car,imm |         |  |
| Stores                                              | Store Byte                 | S      | SB        | rd        | rs1,imm   |         |                            | S         | rd   | rs1,imm   |         | S         | rd   | rs1,imm     |         | STCR      | rd      | car,r1  |         |  |
|                                                     | Store Halfword             | S      | SH        | rd        | rs1,imm   |         |                            | SHL       | rd   | rs1,imm   |         | SHL       | rd   | rs1,imm     |         | STCR      | rd      | car,r1  |         |  |
|                                                     | Store Word                 | S      | SW        | rd        | rs1,imm   |         |                            | SLW       | rd   | rs1,imm   |         | SLW       | rd   | rs1,imm     |         | STCR      | rd      | car,r1  |         |  |
| Shifts                                              | Shift Left                 | R      | SL        | rd        | rs1,imm   |         |                            | SL        | rd   | rs1,imm   |         | SL        | rd   | rs1,imm     |         | SHTR      | rd      | car,r1  |         |  |
|                                                     | Shift Left Immediate       | R      | SLI       | rd        | rs1,shamt |         |                            | SLI       | rd   | rs1,shamt |         | SLI       | rd   | rs1,shamt   |         | SHTR      | rd      | car,r1  |         |  |
|                                                     | Shift Right                | R      | SR        | rd        | rs1,imm   |         |                            | SR        | rd   | rs1,imm   |         | SR        | rd   | rs1,imm     |         | SHTR      | rd      | car,r1  |         |  |
| Shift Right Arithmetic                              | Shift Right                | R      | SR        | rd        | rs1,imm   |         |                            | SR        | rd   | rs1,imm   |         | SR        | rd   | rs1,imm     |         | SHTR      | rd      | car,r1  |         |  |
|                                                     | Shift Right Arithmetic     | R      | SRAT      | rd        | rs1,imm   |         |                            | SRAT      | rd   | rs1,imm   |         | SRAT      | rd   | rs1,imm     |         | SHTR      | rd      | car,r1  |         |  |
|                                                     | Shift Right Arithmetic Imm | R      | SRATI     | rd        | rs1,imm   |         |                            | SRATI     | rd   | rs1,imm   |         | SRATI     | rd   | rs1,imm     |         | SHTR      | rd      | car,r1  |         |  |
| Arithmetic                                          | ADD                        | R      | AD        | rd        | rs1,rs2   |         |                            | ADD       | rd   | rs1,rs2   |         | ADD       | rd   | rs1,rs2     |         | ADDS      | rd      | rs1,rs2 |         |  |
|                                                     | ADD Immediate              | R      | ADI       | rd        | rs1,imm   |         |                            | ADI       | rd   | rs1,imm   |         | ADI       | rd   | rs1,imm     |         | ADDS      | rd      | rs1,imm |         |  |
|                                                     | Subtract                   | R      | SD        | rd        | rs1,rs2   |         |                            | SD        | rd   | rs1,rs2   |         | SD        | rd   | rs1,rs2     |         | SDS       | rd      | rs1,rs2 |         |  |
| Load Upper Imm                                      | Load Upper Imm             | U      | LUD       | rd        | imm       |         |                            | LUD       | rd   | rs1,imm   |         | LUD       | rd   | rs1,imm     |         | LDUR      | rd      | rs1,imm |         |  |
|                                                     | Add Upper Imm              | U      | AUD       | rd        | rs1,imm   |         |                            | AUD       | rd   | rs1,imm   |         | AUD       | rd   | rs1,imm     |         | ADUR      | rd      | rs1,imm |         |  |
|                                                     | Subtract Upper Imm         | U      | SUD       | rd        | rs1,imm   |         |                            | SUD       | rd   | rs1,imm   |         | SUD       | rd   | rs1,imm     |         | SDUR      | rd      | rs1,imm |         |  |
| Logical                                             | XOR                        | R      | XOR       | rd        | rs1,rs2   |         |                            | XOR       | rd   | rs1,rs2   |         | XOR       | rd   | rs1,rs2     |         | XOR       | rd      | rs1,rs2 |         |  |
|                                                     | XOR Immediate              | R      | XORI      | rd        | rs1,imm   |         |                            | XORI      | rd   | rs1,imm   |         | XORI      | rd   | rs1,imm     |         | XORI      | rd      | rs1,imm |         |  |
|                                                     | OR                         | R      | OR        | rd        | rs1,rs2   |         |                            | OR        | rd   | rs1,rs2   |         | OR        | rd   | rs1,rs2     |         | OR        | rd      | rs1,rs2 |         |  |
| OR Immediate                                        | OR Immediate               | R      | OPI       | rd        | rs1,imm   |         |                            | OPI       | rd   | rs1,imm   |         | OPI       | rd   | rs1,imm     |         | OPI       | rd      | rs1,imm |         |  |
|                                                     | AND                        | R      | AND       | rd        | rs1,rs2   |         |                            | AND       | rd   | rs1,rs2   |         | AND       | rd   | rs1,rs2     |         | AND       | rd      | rs1,rs2 |         |  |
|                                                     | AND Immediate              | R      | ANDI      | rd        | rs1,imm   |         |                            | ANDI      | rd   | rs1,imm   |         | ANDI      | rd   | rs1,imm     |         | ANDI      | rd      | rs1,imm |         |  |
| Comparison                                          | Set Equal                  | S      | SEQ       | rd        | rs1,rs2   |         |                            | SEQ       | rd   | rs1,rs2   |         | SEQ       | rd   | rs1,rs2     |         | SEQ       | rd      | rs1,rs2 |         |  |
|                                                     | Set Less                   | S      | SEL       | rd        | rs1,rs2   |         |                            | SEL       | rd   | rs1,rs2   |         | SEL       | rd   | rs1,rs2     |         | SEL       | rd      | rs1,rs2 |         |  |
|                                                     | Set < Unsigned             | S      | SELU      | rd        | rs1,rs2   |         |                            | SELU      | rd   | rs1,rs2   |         | SELU      | rd   | rs1,rs2     |         | SELU      | rd      | rs1,rs2 |         |  |
| Branches                                            | Branch                     | B      | BEQ       | rd        | rs1,rs2   |         |                            | BEQ       | rd   | rs1,rs2   |         | BEQ       | rd   | rs1,rs2     |         | BEQ       | rd      | rs1,rs2 |         |  |
|                                                     | Branch                     | B      | BNE       | rd        | rs1,rs2   |         |                            | BNE       | rd   | rs1,rs2   |         | BNE       | rd   | rs1,rs2     |         | BNE       | rd      | rs1,rs2 |         |  |
|                                                     | Branch                     | B      | BGE       | rd        | rs1,rs2   |         |                            | BGE       | rd   | rs1,rs2   |         | BGE       | rd   | rs1,rs2     |         | BGE       | rd      | rs1,rs2 |         |  |
| Branches                                            | Branch & Unsigned          | B      | BGEU      | rd        | rs1,rs2   |         |                            | BGEU      | rd   | rs1,rs2   |         | BGEU      | rd   | rs1,rs2     |         | BGEU      | rd      | rs1,rs2 |         |  |
|                                                     | Branch & Unsigned          | B      | BGEUI     | rd        | rs1,rs2   |         |                            | BGEUI     | rd   | rs1,rs2   |         | BGEUI     | rd   | rs1,rs2     |         | BGEUI     | rd      | rs1,rs2 |         |  |
|                                                     | Branch & Unsigned          | B      | BGEIU     | rd        | rs1,rs2   |         |                            | BGEIU     | rd   | rs1,rs2   |         | BGEIU     | rd   | rs1,rs2     |         | BGEIU     | rd      | rs1,rs2 |         |  |
| Jump & Link                                         | JMP                        | J      | JAL       | rd        | rs1,imm   |         |                            | JAL       | rd   | rs1,imm   |         | JAL       | rd   | rs1,imm     |         | JALR      | rd      | rs1,imm |         |  |
|                                                     | JMP & Link Register        | J      | JALR      | rd        | rs1,imm   |         |                            | JALR      | rd   | rs1,imm   |         | JALR      | rd   | rs1,imm     |         | JALR      | rd      | rs1,imm |         |  |
|                                                     | Sync                       | S      | SYN       | rd        | rs1,imm   |         |                            | SYN       | rd   | rs1,imm   |         | SYN       | rd   | rs1,imm     |         | SYN       | rd      | rs1,imm |         |  |
| System                                              | System Call                | I      | SCALL     | rd        | rs1,imm   |         |                            | SCALL     | rd   | rs1,imm   |         | SCALL     | rd   | rs1,imm     |         | SCALL     | rd      | rs1,imm |         |  |
|                                                     | System Break               | I      | SBRK      | rd        | rs1,imm   |         |                            | SBRK      | rd   | rs1,imm   |         | SBRK      | rd   | rs1,imm     |         | SBRK      | rd      | rs1,imm |         |  |
|                                                     | System Break               | I      | SBRKI     | rd        | rs1,imm   |         |                            | SBRKI     | rd   | rs1,imm   |         | SBRKI     | rd   | rs1,imm     |         | SBRKI     | rd      | rs1,imm |         |  |
| Counters                                            | Read CYCLE                 | R      | RCYCLE    | rd        |           |         |                            | RCYCLE    | rd   |           |         | RCYCLE    | rd   |             |         | RCYCLE    | rd      |         |         |  |
|                                                     | Read CYCLE upper Half      | R      | RCYCLEH   | rd        |           |         |                            | RCYCLEH   | rd   |           |         | RCYCLEH   | rd   |             |         | RCYCLEH   | rd      |         |         |  |
|                                                     | Read TIME                  | R      | RTIME     | rd        |           |         |                            | RTIME     | rd   |           |         | RTIME     | rd   |             |         | RTIME     | rd      |         |         |  |
| Read INSTR                                          | Read INSTR                 | R      | RINSTR    | rd        |           |         |                            | RINSTR    | rd   |           |         | RINSTR    | rd   |             |         | RINSTR    | rd      |         |         |  |
|                                                     | Read INSTR upper Half      | R      | RINSTRH   | rd        |           |         |                            | RINSTRH   | rd   |           |         | RINSTRH   | rd   |             |         | RINSTRH   | rd      |         |         |  |
|                                                     | Read INSTR RET             | R      | RINSTRRET | rd        |           |         |                            | RINSTRRET | rd   |           |         | RINSTRRET | rd   |             |         | RINSTRRET | rd      |         |         |  |
| Shifts                                              | Shift Left Imm             | S      | SLI       | rd        | rs1,imm   |         |                            | SLI       | rd   | rs1,imm   |         | SLI       | rd   | rs1,imm     |         | SLI       | rd      | rs1,imm |         |  |
|                                                     | Shift Left Imm             | S      | SLW       | rd        | rs1,imm   |         |                            | SLW       | rd   | rs1,imm   |         | SLW       | rd   | rs1,imm     |         | SLW       | rd      | rs1,imm |         |  |
|                                                     | Shift Left Imm             | S      | SLD       | rd        | rs1,imm   |         |                            | SLD       | rd   | rs1,imm   |         | SLD       | rd   | rs1,imm     |         | SLD       | rd      | rs1,imm |         |  |
| Stores                                              | Load Word                  | S      | C         | LD        | rd        | rs1,imm |                            |           | C    | LD        | rd      | rs1,imm   |      | C           | LD      | rd        | rs1,imm |         |         |  |
|                                                     | Load Word SP               | S      | CLW       | rd        | rs1,imm   |         |                            | CLW       | rd   | rs1,imm   |         | CLW       | rd   | rs1,imm     |         | CLW       | rd      | rs1,imm |         |  |
|                                                     | Load Double                | S      | CD        | rd        | rs1,imm   |         |                            | CD        | rd   | rs1,imm   |         | CD        | rd   | rs1,imm     |         | CD        | rd      | rs1,imm |         |  |
| Stores                                              | Load Double SP             | S      | CLWD      | rd        | rs1,imm   |         |                            | CLWD      | rd   | rs1,imm   |         | CLWD      | rd   | rs1,imm     |         | CLWD      | rd      | rs1,imm |         |  |
|                                                     | Load Quad                  | S      | CQ        | rd        | rs1,imm   |         |                            | CQ        | rd   | rs1,imm   |         | CQ        | rd   | rs1,imm     |         | CQ        | rd      | rs1,imm |         |  |
|                                                     | Load Quad SP               | S      | CLWDQ     | rd        | rs1,imm   |         |                            | CLWDQ     | rd   | rs1,imm   |         | CLWDQ     | rd   | rs1,imm     |         | CLWDQ     | rd      | rs1,imm |         |  |
| Arithmetics                                         | ADD                        | R      | CR        | rd        | rs1,rs2   |         |                            | CR        | rd   | rs1,rs2   |         | CR        | rd   | rs1,rs2     |         | CR        | rd      | rs1,rs2 |         |  |
|                                                     | ADD Immediate              | R      | CRD       | rd        | rs1,imm   |         |                            | CRD       | rd   | rs1,imm   |         | CRD       | rd   | rs1,imm     |         | CRD       | rd      | rs1,imm |         |  |
|                                                     | ADD Word Imm               | R      | CRDI      | rd        | rs1,imm   |         |                            | CRDI      | rd   | rs1,imm   |         | CRDI      | rd   | rs1,imm     |         | CRDI      | rd      | rs1,imm |         |  |
| Arithmetics                                         | ADD Word                   | R      | CRD       | rd        | rs1,rs2   |         |                            | CRD       | rd   | rs1,rs2   |         | CRD       | rd   | rs1,rs2     |         | CRD       | rd      | rs1,rs2 |         |  |
|                                                     | ADD Word Imm               | R      | CRDIW     | rd        | rs1,imm   |         |                            | CRDIW     | rd   | rs1,imm   |         | CRDIW     | rd   | rs1,imm     |         | CRDIW     | rd      | rs1,imm |         |  |
|                                                     | ADD Word Imm               | R      | CRDIP     | rd        | rs1,imm   |         |                            | CRDIP     | rd   | rs1,imm   |         | CRDIP     | rd   | rs1,imm     |         | CRDIP     | rd      | rs1,imm |         |  |
| Arithmetics                                         | ADD SP Imm                 | R      | CRDIPSP   | rd        | rs1,imm   |         |                            | CRDIPSP   | rd   | rs1,imm   |         | CRDIPSP   | rd   | rs1,imm     |         | CRDIPSP   | rd      | rs1,imm |         |  |
|                                                     | ADD SP Imm                 | R      | CRDIPSPM  | rd        | rs1,imm   |         |                            | CRDIPSPM  | rd   | rs1,imm   |         | CRDIPSPM  | rd   | rs1,imm     |         | CRDIPSPM  | rd      | rs1,imm |         |  |
|                                                     | ADD SP Imm                 | R      | CRDIPSPMI | rd        | rs1,imm   |         |                            | CRDIPSPMI | rd   | rs1,imm   |         | CRDIPSPMI | rd   | rs1,imm     |         | CRDIPSPMI | rd      | rs1,imm |         |  |
| Arithmetics                                         | Load Immediate             | R      | CL        | rd        | rs1,imm   |         |                            | CL        | rd   | rs1,imm   |         | CL        | rd   | rs1,imm     |         | CL        | rd      | rs1,imm |         |  |
|                                                     | Load Immediate             | R      | CLD       | rd        | rs1,imm   |         |                            | CLD       | rd   | rs1,imm   |         | CLD       | rd   | rs1,imm     |         | CLD       | rd      | rs1,imm |         |  |
|                                                     | Load Immediate             | R      | CLDZ      | rd        | rs1,imm   |         |                            | CLDZ      | rd   | rs1,imm   |         | CLDZ      | rd   | rs1,imm     |         | CLDZ      | rd      | rs1,imm |         |  |
| Arithmetics                                         | MUL                        | R      | CM        | rd        | rs1,rs2   |         |                            | CM        | rd   | rs1,rs2   |         | CM        | rd   | rs1,rs2     |         | MUL       | rd      | rs1,rs2 |         |  |
|                                                     | MUL                        | R      | CMV       | rd        | rs1,imm   |         |                            | CMV       | rd   | rs1,imm   |         | CMV       | rd   | rs1,imm     |         | MUL       | rd      | rs1,imm |         |  |
|                                                     | MUL                        | R      | CMS       | rd        | rs1,imm   |         |                            | CMS       | rd   | rs1,imm   |         | CMS       | rd   | rs1,imm     |         | MUL       | rd      | rs1,imm |         |  |
| Shifts                                              | Shift Left Imm             | S      | C         | SLLI      | rd        | rs1,imm |                            |           | SLLI | rd        | rs1,imm |           | SLLI | rd          | rs1,imm |           | SLLI    | rd      | rs1,imm |  |
|                                                     | Shift Left Imm             | S      | C         | BRHS      | rd        | rs1,imm |                            |           | BRHS | rd        | rs1,imm |           | BRHS | rd          | rs1,imm |           | BRHS    | rd      | rs1,imm |  |
|                                                     | Shift Left Imm             | S      | C         | BRS       | rd        | rs1,imm |                            |           | BRS  | rd        | rs1,imm |           | BRS  | rd          | rs1,imm |           | BRS     | rd      | rs1,imm |  |
| Shifts                                              | Jump                       | J      | CR        | rd        | rs1,imm   |         |                            | CR        | rd   | rs1,imm   |         | CR        | rd   | rs1,imm     |         | JAL       | rd      | rs1,imm |         |  |
|                                                     | Jump Register              | J      | CRJ       | rd        | rs1,imm   |         |                            | CRJ       | rd   | rs1,imm   |         | CRJ       | rd   | rs1,imm     |         | JALR      | rd      | rs1,imm |         |  |
|                                                     | Jump & Link                | J      | CJAL      | rd        | rs1,imm   |         |                            | CJAL      | rd   | rs1,imm   |         | CJAL      | rd   | rs1,imm     |         | JAL       | rd      | rs1,imm |         |  |

① Free & Open  RISC-V Reference Card ([riscv.org](http://riscv.org)) ②

| Optional Atomic/Multi-Divide Instruction Extension: RVX               |                           |           |                    |                 |              |       |    |    |             |
|-----------------------------------------------------------------------|---------------------------|-----------|--------------------|-----------------|--------------|-------|----|----|-------------|
| RV32M (Multiply-Divide)                                               |                           |           |                    |                 | RV64 (12B)   |       |    |    |             |
| Category                                                              | Name                      | Fmt       | Op                 | Desc            | Op           | Op    | Op | Op | Desc        |
| Multiply                                                              | MUL                       | R MUL     | rd,rs1,rs2         | MUL(W D)        | rd,rs1,rs2   |       |    |    |             |
|                                                                       | Multiply upper Half       | R MULH    | rd,rs1,rs2         | MULH(W D)       | rd,rs1,rs2   |       |    |    |             |
|                                                                       | Multiply Half Sign-Ext    | R MULHS   | rd,rs1,rs2         | MULHS(W D)      | rd,rs1,rs2   |       |    |    |             |
| Divide                                                                | DIV                       | R DIV     | rd,rs1,rs2         | DIV(W D)        | rd,rs1,rs2   |       |    |    |             |
|                                                                       | UDIV Unsigned             | R DIVU    | rd,rs1,rs2         | UDIV(W D)       | rd,rs1,rs2   |       |    |    |             |
|                                                                       | SDIV Signed               | R DIVS    | rd,rs1,rs2         | SDIV(W D)       | rd,rs1,rs2   |       |    |    |             |
| Remainder                                                             | REMDER                    | R REM     | rd,rs1,rs2         | REM(W D)        | rd,rs1,rs2   |       |    |    |             |
|                                                                       | REMAINDER Unsigned        | R REMU    | rd,rs1,rs2         | REMUS(W D)      | rd,rs1,rs2   |       |    |    |             |
|                                                                       | REMAINDER Signed          | R REMS    | rd,rs1,rs2         | REMWS(W D)      | rd,rs1,rs2   |       |    |    |             |
| Optional Atomic Instruction Extension: RVA                            |                           |           |                    |                 |              |       |    |    |             |
| Category                                                              | Name                      | Fmt       | Op                 | Desc            | Op           | Op    | Op | Op | +RV64 (12B) |
| Load                                                                  | Load Reserved             | R LDR     | rd,rs1             | LDR(B D Q)      | rd,rs1       |       |    |    |             |
| Store                                                                 | Store Condition           | R SCL     | rd,rs1,rs2         | SC(D D Q)       | rd,rs1,rs2   |       |    |    |             |
| Swap                                                                  | SWAP                      | R AMOSWAP | rd,rs1,rs2         | AMOSWAP(D Q)    | rd,rs1,rs2   |       |    |    |             |
| Add                                                                   | ADDS                      | R ADDS    | rd,rs1,rs2         | ADDSD(M D)      | rd,rs1,rs2   |       |    |    |             |
| Logical                                                               | XOR                       | R XOR     | rd,rs1,rs2         | XORSD(M D)      | rd,rs1,rs2   |       |    |    |             |
| XOR                                                                   | XORK                      | R XORK    | rd,rs1,rs2         | XORSK(M D)      | rd,rs1,rs2   |       |    |    |             |
| Min/Max                                                               | MINM                      | R ANORMIN | rd,rs1,rs2         | ANORMIN(D Q)    | rd,rs1,rs2   |       |    |    |             |
| MAXM                                                                  | ANORMAX                   | R ANORMAX | rd,rs1,rs2         | ANORMAX(D Q)    | rd,rs1,rs2   |       |    |    |             |
| MINM Unsigned                                                         | AMINR                     | R AMINR   | rd,rs1,rs2         | AMINR(D Q)      | rd,rs1,rs2   |       |    |    |             |
| MAXM Unsigned                                                         | AMAXR                     | R AMAXR   | rd,rs1,rs2         | AMAXR(D Q)      | rd,rs1,rs2   |       |    |    |             |
| Three Optional Floating-Point Instruction Extensions: RVF, RVD, & RVQ |                           |           |                    |                 |              |       |    |    |             |
| Category                                                              | Name                      | Fmt       | Op                 | Desc            | Op           | Op    | Op | Op | +RV64 (12B) |
| Move                                                                  | New floating-point format | RV FMT    | rd,rs1             | FMT(F D Q)      | rd,rs1       |       |    |    |             |
|                                                                       | Move to Integer           | RV FPI    | (B D Q),rs1        | FPI(B D Q)      | rd,rs1       |       |    |    |             |
| Convert                                                               | Convert from Int          | RV FCVT   | (B D Q),W,rd,rs1   | FCVT(B D Q),W   | (L T),rd,rs1 |       |    |    |             |
|                                                                       | Convert from Int Unsigned | RV FCVTR  | (B D Q),W,rd,rs1   | FCVTR(B D Q),W  | (L T),rd,rs1 |       |    |    |             |
|                                                                       | Convert to Int            | RV FCVTI  | (B D Q),W,rd,rs1   | FCVTI(B D Q),W  | (L T),rd,rs1 |       |    |    |             |
|                                                                       | Convert to Int Unsigned   | RV FCVTIU | (B D Q),W,rd,rs1   | FCVTIU(B D Q),W | (L T),rd,rs1 |       |    |    |             |
| Sign Inject                                                           | Sign Inject               | RV FSIGIN | (B D Q),rd,rs1,rs2 | FSIGIN(B D Q)   | rd,rs1,rs2   |       |    |    |             |
|                                                                       | Negative Sign Source      | RV FSIGNN | (B D Q),rd,rs1,rs2 | FSIGNN(B D Q)   | rd,rs1,rs2   |       |    |    |             |
|                                                                       | Xor Sign Source           | RV FSIGBX | (B D Q),rd,rs1,rs2 | FSIGBX(B D Q)   | rd,rs1,rs2   |       |    |    |             |
| Min/Max                                                               | MINM                      | R FMIN    | (B D Q),rd,rs1,rs2 | FMIN(B D Q)     | rd,rs1,rs2   |       |    |    |             |
| MAXM                                                                  | MAXM                      | R FMAX    | (B D Q),rd,rs1,rs2 | FMAX(B D Q)     | rd,rs1,rs2   |       |    |    |             |
| Compare                                                               | Compare Fwd =             | R FBQ     | (S D Q),rd,rs1,rs2 | FBQ(S D Q)      | rd,rs1,rs2   | a0=1  |    |    |             |
|                                                                       | Compare Fwd =             | R FBQF    | (S D Q),rd,rs1,rs2 | FBQF(S D Q)     | rd,rs1,rs2   | a1=1  |    |    |             |
|                                                                       | Compare Fwd =             | R FBQF    | (S D Q),rd,rs1,rs2 | FBQF(S D Q)     | rd,rs1,rs2   | a10=1 |    |    |             |
| Categorization                                                        | Classify Type             | R FCLASR  | (B D Q),rd,rs1,rs2 | FCLASR(B D Q)   | rd,rs1,rs2   | a11=1 |    |    |             |
| Configuration                                                         | Read Status               | R FCSR    | rd                 | FCSR            | rd           |       |    |    |             |
|                                                                       | Read Rounding Node        | R FRNU    | rd                 | FRNU            | rd           |       |    |    |             |
|                                                                       | Read Flags                | R FRFLAGS | rd                 | FRFLAGS         | rd           |       |    |    |             |
|                                                                       | Swap Status Reg           | R FCSR    | rd,rs1             | FCSR            | rd,rs1       |       |    |    |             |
|                                                                       | Swap Round Mode           | R FPMR    | rd,rs1             | FPMR            | rd,rs1       |       |    |    |             |
|                                                                       | Swap Round Mode Imm       | R FPMRI   | rd,imm             | FPMRI           | rd,imm       |       |    |    |             |
|                                                                       | Swap Flags Imm            | R FPLAHI  | rd,imm             | FPLAHI          | rd,imm       |       |    |    |             |

# RISC-V architecture and instruction set

Free & Open RISC-V Reference Card

| Basic Instructions                                                       |  |  |  |  |                                   |
|--------------------------------------------------------------------------|--|--|--|--|-----------------------------------|
| <b>Category</b> <b>Op</b> <b>Opnd</b> <b>Opnd</b> <b>Opnd</b> <b>RVC</b> |  |  |  |  |                                   |
| <b>Loads</b>                                                             |  |  |  |  | <b>RV Privileged Instructions</b> |
| Load Word                                                                |  |  |  |  | <b>RV(mvmm)</b>                   |
| Load HalfWord                                                            |  |  |  |  |                                   |
| Load Word                                                                |  |  |  |  |                                   |
| Load Byte Unsigned                                                       |  |  |  |  |                                   |
| Stores                                                                   |  |  |  |  |                                   |
| Store HalfWord                                                           |  |  |  |  |                                   |
| Store Word                                                               |  |  |  |  |                                   |
| Shifts                                                                   |  |  |  |  |                                   |
| Shift Left Immediate                                                     |  |  |  |  |                                   |
| Shift Right Immediate                                                    |  |  |  |  |                                   |
| Shift Right Arithmetic                                                   |  |  |  |  |                                   |
| Shift Right Arithmetic Immediate                                         |  |  |  |  |                                   |
| <b>Arithmetic</b>                                                        |  |  |  |  |                                   |
| ADD Immediate                                                            |  |  |  |  |                                   |
| ADD Immediate                                                            |  |  |  |  |                                   |
| Load Upper Immediate                                                     |  |  |  |  |                                   |
| Subtract Immediate                                                       |  |  |  |  |                                   |
| Logical                                                                  |  |  |  |  |                                   |
| XOR Immediate                                                            |  |  |  |  |                                   |
| OR Immediate                                                             |  |  |  |  |                                   |
| AND Immediate                                                            |  |  |  |  |                                   |
| Compare                                                                  |  |  |  |  |                                   |
| Set < Immediate                                                          |  |  |  |  |                                   |
| Set < Unsigned                                                           |  |  |  |  |                                   |
| Set < Imm Unsigned                                                       |  |  |  |  |                                   |
| Branches                                                                 |  |  |  |  |                                   |
| Branch                                                                   |  |  |  |  |                                   |
| Branch                                                                   |  |  |  |  |                                   |
| Branch < Unsigned                                                        |  |  |  |  |                                   |
| Branch < Unsigned                                                        |  |  |  |  |                                   |
| Branch < Unsigned                                                        |  |  |  |  |                                   |
| Jump & Link                                                              |  |  |  |  |                                   |
| Load & Link                                                              |  |  |  |  |                                   |
| Sync Inst & Data                                                         |  |  |  |  |                                   |
| System Upper CALL                                                        |  |  |  |  |                                   |
| System Upper BREAK                                                       |  |  |  |  |                                   |
| Counters                                                                 |  |  |  |  |                                   |
| Reset CYCLES                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLES                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |
| Reset CYCLEL                                                             |  |  |  |  |                                   |
| Reset CYCLEH                                                             |  |  |  |  |                                   |

### **32-bit Instruction Formats**

① Free & Open RISC-V Reference Card (riscv.org) ②

| Optional Multi-Byte Divide Instruction Extension: RVN                 |                              |                     |                          |                |             |      |                    |            |                                  |
|-----------------------------------------------------------------------|------------------------------|---------------------|--------------------------|----------------|-------------|------|--------------------|------------|----------------------------------|
| RV32M (Multiply-Divide)                                               |                              |                     |                          |                | +RV(64,12B) |      |                    |            |                                  |
| Category                                                              | Name                         | Fmt                 | Op                       | Opnd           | Opnd        | Opnd | Opnd               | Opnd       | Opnd                             |
|                                                                       | Multiply                     | R MUL               |                          | rd,rs1,rs2     |             |      | MUL(W D)           | rd,rs1,rs2 |                                  |
|                                                                       | Multiply Half                | R MULH              |                          | rd,rs1,rs2     |             |      | MULH(W D)          | rd,rs1,rs2 |                                  |
| Multiply                                                              | Multiply upper Half          | R MULHU             |                          | rd,rs1,rs2     |             |      | MULHU(W D)         | rd,rs1,rs2 |                                  |
|                                                                       | Divide                       | R DIV               |                          | rd,rs1,rs2     |             |      | DIV(W D)           | rd,rs1,rs2 |                                  |
|                                                                       | Divide Unsigned              | R DIVU              |                          | rd,rs1,rs2     |             |      | DIVU(W D)          | rd,rs1,rs2 |                                  |
| Remainder                                                             | REMD                         | R REM               |                          | rd,rs1,rs2     |             |      | REM(W D)           | rd,rs1,rs2 |                                  |
| RtHander Unsigned                                                     |                              |                     |                          |                |             |      |                    |            |                                  |
| Optional Atomic Instruction Extension: RVA                            |                              |                     |                          |                |             |      |                    |            |                                  |
| Category                                                              | Name                         | Fmt                 | RV32A (Atomic)           |                | +RV(64,12B) |      |                    |            |                                  |
|                                                                       | Load                         | Load Registered     | R LW                     |                | rd,rs1      |      | LW(B D)            | rd,rs1     |                                  |
|                                                                       | Store                        | Store Conditional   | R SC                     |                | rd,rs1,rs2  |      | SC(B D)            | rd,rs1,rs2 |                                  |
| Swap                                                                  | Swap                         | R ANOSHAD_W         |                          | rd,rs1,rs2     |             |      | ANOSHAD_W(D D)     | rd,rs1,rs2 |                                  |
|                                                                       | Add                          | R AMOADD_W          |                          | rd,rs1,rs2     |             |      | AMOADD_W(D D)      | rd,rs1,rs2 |                                  |
|                                                                       | Logical                      | R ANOMIN_W          |                          | rd,rs1,rs2     |             |      | ANOMIN_W(D D)      | rd,rs1,rs2 |                                  |
| Min/Max                                                               | AND                          | R ANOMAX_W          |                          | rd,rs1,rs2     |             |      | ANOMAX_W(D D)      | rd,rs1,rs2 |                                  |
|                                                                       | OR                           | R ANOKH_W           |                          | rd,rs1,rs2     |             |      | ANOKH_W(D D)       | rd,rs1,rs2 |                                  |
|                                                                       | Min/Max                      | R ANOMIN_W          |                          | rd,rs1,rs2     |             |      | ANOMIN_W(D D)      | rd,rs1,rs2 |                                  |
| Min/Max                                                               | MAXIMUM                      | R ANOMAX_W          |                          | rd,rs1,rs2     |             |      | ANOMAX_W(D D)      | rd,rs1,rs2 |                                  |
|                                                                       | MINIMUM                      | R ANOMIN_W          |                          | rd,rs1,rs2     |             |      | ANOMIN_W(D D)      | rd,rs1,rs2 |                                  |
|                                                                       | MINIMUM Unsigned             | R ANOMAX_W          |                          | rd,rs1,rs2     |             |      | ANOMAX_W(D D)      | rd,rs1,rs2 |                                  |
| Three Optional Floating-Point Instruction Extensions: RVF, RVD, & RVQ |                              |                     |                          |                |             |      |                    |            |                                  |
| Category                                                              | Name                         | Fmt                 | RV32F (Single-Precision) |                | +RV(64,12B) |      |                    |            |                                  |
|                                                                       | Move                         | Move from Int       | R FV(.B .D .S .W)        |                | rd,rs1      |      | FMV(.B .D .S .W)   | rd,rs1     |                                  |
|                                                                       | Move                         | Move to Int         | R VF(.B .D .S .W)        |                | rd,rs1      |      | FMV(.X (D S))      | rd,rs1     |                                  |
| Convert                                                               | Convert from Int             | R FCVT(.H (D S))_W  |                          | rd,rs1         |             |      | FCVT(.H (D S))_W   | (L T)      | rd,rs1                           |
|                                                                       | Convert from Int Unsigned    | R FCVT(.H (D S))_U  |                          | rd,rs1         |             |      | FCVT(.H (D S))_U   | (L T)      | rd,rs1                           |
|                                                                       | Convert to Int               | R FCVT(.H (D S))_I  |                          | rd,rs1         |             |      | FCVT(.L (T))_I     | (D B)      | rd,rs1                           |
| Convert                                                               | Convert to Int Unsigned      | R FCVT(.H (D S))_UI |                          | rd,rs1         |             |      | FCVT(.L (T))_UI    | (D B)      | rd,rs1                           |
|                                                                       | Load                         | Load                | I FLD                    |                | rd,rs1      |      |                    |            |                                  |
|                                                                       | Store                        | Store               | I FLD                    |                | rd,rs1,rs2  |      |                    |            |                                  |
| RISC-V Calling Convention                                             |                              |                     |                          |                |             |      |                    |            |                                  |
| Arithmetic                                                            | ADD                          | R FADD(.S (D B))_D  |                          | rd,rs1,rs2     |             |      | REGISTERS ABI Name | Save       | Description                      |
|                                                                       | Subtract                     | R FSUB(.S (D B))_D  |                          | rd,rs1,rs2     |             |      |                    | zero       | Hard-wired zero                  |
|                                                                       | Multiply                     | R FMUL(.S (D B))_D  |                          | rd,rs1,rs2     |             |      |                    | ra         | Return address                   |
| Mul-Add                                                               | Divide                       | R FDIV(.S (D B))_D  |                          | rd,rs1,rs2     |             |      |                    | sp         | Stack pointer                    |
|                                                                       | Multiply-Add                 | R FMA(.S (D B))_D   |                          | rd,rs1,rs2,rs3 |             |      |                    | fp         | Global pointer                   |
|                                                                       | Multiply-Subtract            | R FMS(.S (D B))_D   |                          | rd,rs1,rs2,rs3 |             |      |                    | tp         | Temporary                        |
| Sign Inject                                                           | Negative Multiply-Subtract   | R FMSH(.S (D B))_D  |                          | rd,rs1,rs2,rs3 |             |      |                    | fp_offset  | Saved register/frame pointer     |
|                                                                       | Negative Multiply-Add        | R FMSH(.S (D B))_U  |                          | rd,rs1,rs2,rs3 |             |      |                    | sa         | Saved register                   |
|                                                                       | Negative Sign Source         | R FMSB(.S (D B))_D  |                          | rd,rs1,rs2     |             |      |                    | a1-a11     | Function arguments/return values |
| Sign Inject                                                           | Xor Sign Source              | R FMSB(.S (D B))_U  |                          | rd,rs1,rs2     |             |      |                    | a1-a11     | Function arguments               |
|                                                                       | Negative Sign Source         | R FMSB(.S (D B))_D  |                          | rd,rs1,rs2     |             |      |                    | a1-a11     | Temporary registers              |
|                                                                       | Xor Sign Source              | R FMSB(.S (D B))_U  |                          | rd,rs1,rs2     |             |      |                    | a1-a11     | Temporary registers              |
| Min/Max                                                               | Minimum                      | R FMIN(.B (D S))_D  |                          | rd,rs1,rs2     |             |      |                    |            |                                  |
|                                                                       | Maximum                      | R FMAX(.B (D S))_D  |                          | rd,rs1,rs2     |             |      |                    |            |                                  |
|                                                                       | Compare                      | Compare Float       | R FBQ(.S (D B))_D        |                | rd,rs1,rs2  |      |                    |            |                                  |
| Compare                                                               | Compare Float                | R FBQ(.S (D B))_U   |                          | rd,rs1,rs2     |             |      |                    |            |                                  |
|                                                                       | Compare Float                | R FBQ(.S (D B))_D   |                          | rd,rs1,rs2     |             |      |                    |            |                                  |
|                                                                       | Categorization               | Classify Type       | R PCLASR(.B (D S))_D     |                | rd,rs1,rs2  |      |                    |            |                                  |
| Configuration                                                         | Read Status                  | R FCSR              |                          | rd             |             |      |                    |            |                                  |
|                                                                       | Read Rounding Node           | R FRNU              |                          | rd             |             |      |                    |            |                                  |
|                                                                       | Read Flags                   | R FPLFLAGS          |                          | rd             |             |      |                    |            |                                  |
| Swap                                                                  | Swap Status Reg              | R FCSR              |                          | rd,rs1         |             |      |                    |            |                                  |
|                                                                       | Swap Rounding Node           | R FRNU              |                          | rd,rs1         |             |      |                    |            |                                  |
|                                                                       | Swap Flags                   | R FPLFLAGS          |                          | rd,rs1         |             |      |                    |            |                                  |
| Swap                                                                  | Swap Floating-Point Mode Imm | R FPMUI             |                          | rd,imm         |             |      |                    |            |                                  |
|                                                                       | Swap Frame Imm               | R FPFLAGSI          |                          | rd,imm         |             |      |                    |            |                                  |
|                                                                       | Swap Floating-Point Mode     | R FPMU              |                          | rd,imm         |             |      |                    |            |                                  |

DHV-17, a 16-kDa protein with a molecular mass of 17 kDa, contains 176 amino acids (DBT238.0). It contains 176 amino acids (DBT238.0), 176 amino acids (DBT238.0), and

# RISC-V architecture and instruction set

## Free & Open RISC-V Reference Card

| Basic Instructions |                        | RV32I Base |        | RV32I and RV128I |     | RV32I Base |      | RV32I and RV128I |     |
|--------------------|------------------------|------------|--------|------------------|-----|------------|------|------------------|-----|
| Category           | Name / Form            | Form       | Op     | Op               | Op  | Op         | Op   | Op               | Op  |
| Loads              | Load Imm               | rd,rs1,imm | ld     | rs1              | op  | ld         | rs1  | imm              | op  |
|                    | Load Halfword          | rd,rs1,imm | ld     | rs1              | op  | ld         | rs1  | imm              | op  |
|                    | Load Word              | rd,rs1,imm | ld     | rs1              | op  | ld         | rs1  | imm              | op  |
|                    | Load Byte Unsigned     | rd,rs1,imm | ld     | rs1              | op  | ld         | rs1  | imm              | op  |
|                    | Load Byte Signed       | rd,rs1,imm | ld     | rs1              | op  | ld         | rs1  | imm              | op  |
| Stores             | Store Imm              | rs1,rd,imm | sd     | rs1              | imm | sd         | rs1  | imm              | imm |
|                    | Store Halfword         | rs1,rd,imm | sd     | rs1              | imm | sd         | rs1  | imm              | imm |
|                    | Store Word             | rs1,rd,imm | sd     | rs1              | imm | sd         | rs1  | imm              | imm |
| Shifts             | Shift Left Immediate   | rd,rs1,imm | sl     | rs1              | imm | sl         | rs1  | imm              | imm |
|                    | Shift Right Immediate  | rd,rs1,imm | sr     | rs1              | imm | sr         | rs1  | imm              | imm |
|                    | Shift Left Arithmetic  | rd,rs1,imm | sl     | rs1              | imm | sl         | rs1  | imm              | imm |
|                    | Shift Right Arithmetic | rd,rs1,imm | sr     | rs1              | imm | sr         | rs1  | imm              | imm |
| Arithmetic         | Add Immediate          | rd,rs1,imm | add    | rs1              | imm | add        | rs1  | imm              | imm |
|                    | Subtract Immediate     | rd,rs1,imm | sub    | rs1              | imm | sub        | rs1  | imm              | imm |
|                    | Load Word              | rd,rs1,imm | lw     | rs1              | imm | lw         | rs1  | imm              | imm |
|                    | Load Halfword          | rd,rs1,imm | lh     | rs1              | imm | lh         | rs1  | imm              | imm |
|                    | Load Double            | rd,rs1,imm | ld     | rs1              | imm | ld         | rs1  | imm              | imm |
|                    | Load Quad              | rd,rs1,imm | lq     | rs1              | imm | lq         | rs1  | imm              | imm |
|                    | Load Quad SP           | rd,rs1,imm | ldr    | rs1              | imm | ldr        | rs1  | imm              | imm |
|                    | Store Quad             | rs1,rd,imm | sw     | rs1              | imm | sw         | rs1  | imm              | imm |
|                    | Store Double SP        | rs1,rd,imm | sdsp   | rs1              | imm | sdsp       | rs1  | imm              | imm |
|                    | Branch                 | rs1,rd,imm | br     | rs1              | imm | br         | rs1  | imm              | imm |
|                    | Branch +1              | rs1,rd,imm | br     | rs1              | imm | br         | rs1  | imm              | imm |
|                    | Branch -1              | rs1,rd,imm | br     | rs1              | imm | br         | rs1  | imm              | imm |
|                    | Branch + Unsigned      | rs1,rd,imm | br     | rs1              | imm | br         | rs1  | imm              | imm |
|                    | Branch - Unsigned      | rs1,rd,imm | br     | rs1              | imm | br         | rs1  | imm              | imm |
| Jump & Link        | Jump                   | rd,imm     | jr     | rd               | imm | jr         | rd   | imm              | imm |
|                    | Jump & Link Register   | rd,imm     | jal    | rd               | imm | jal        | rd   | imm              | imm |
|                    | Synch                  | rs1,rd,imm | sync   | rs1              | rd  | imm        | sync | rs1              | rd  |
|                    | System Trap            | rd,imm     | trap   | rd               | imm | trap       | rd   | imm              | imm |
| System Trap        | Call System            | rd,imm     | ecall  | rd               | imm | ecall      | rd   | imm              | imm |
| Counters           | Count CYCLE            | rd,imm     | ecyc   | rd               | imm | ecyc       | rd   | imm              | imm |
|                    | Read CYCLE             | rd,imm     | lcycle | rd               | imm | lcycle     | rd   | imm              | imm |
|                    | Read TIME              | rd,imm     | ltim   | rd               | imm | ltim       | rd   | imm              | imm |
|                    | Read INSTR             | rd,imm     | linstr | rd               | imm | linstr     | rd   | imm              | imm |
| System             | Env. Break             | rd,imm     | ebreak | rd               | imm | ebreak     | rd   | imm              | imm |

32-bit Instruction Formats

| R  | S  | U  | Imm[31] | Imm[21] | Imm[12] | Imm[11] | Imm[10] | Imm[9] | Imm[8] | Imm[7] | Imm[6] | Imm[5] | Imm[4] | Imm[3] | Imm[2] | Imm[1] | Imm[0] | Op |
|----|----|----|---------|---------|---------|---------|---------|--------|--------|--------|--------|--------|--------|--------|--------|--------|--------|----|
| 00 | 00 | 00 | rd      | rs1     | rd      | rs1     | rd      | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | op |
| 01 | 00 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 10 | 00 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 11 | 00 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 00 | 01 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 01 | 01 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 10 | 01 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 11 | 01 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 00 | 10 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 01 | 10 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 10 | 10 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 11 | 10 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 00 | 00 | 01 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 01 | 00 | 01 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 10 | 00 | 01 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 11 | 00 | 01 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 00 | 00 | 10 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 01 | 00 | 10 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 10 | 00 | 10 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 11 | 00 | 10 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 00 | 00 | 11 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 01 | 00 | 11 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 10 | 00 | 11 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 11 | 00 | 11 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |

16-bit (RVC) Instruction Formats

| R  | S  | U  | Imm[31] | Imm[21] | Imm[12] | Imm[11] | Imm[10] | Imm[9] | Imm[8] | Imm[7] | Imm[6] | Imm[5] | Imm[4] | Imm[3] | Imm[2] | Imm[1] | Imm[0] | Op |
|----|----|----|---------|---------|---------|---------|---------|--------|--------|--------|--------|--------|--------|--------|--------|--------|--------|----|
| 00 | 00 | 00 | rd      | rs1     | rd      | rs1     | rd      | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | op |
| 01 | 00 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 10 | 00 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 11 | 00 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 00 | 01 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 01 | 01 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 10 | 01 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 11 | 01 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 00 | 10 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 01 | 10 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 10 | 10 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 11 | 10 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 00 | 00 | 01 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 01 | 00 | 01 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 10 | 00 | 01 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 11 | 00 | 01 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 00 | 00 | 10 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 01 | 00 | 10 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 10 | 00 | 10 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 11 | 00 | 10 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 00 | 00 | 11 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 01 | 00 | 11 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 10 | 00 | 11 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 11 | 00 | 11 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |

32-bit Instruction Formats

| R  | S  | U  | Imm[31] | Imm[21] | Imm[12] | Imm[11] | Imm[10] | Imm[9] | Imm[8] | Imm[7] | Imm[6] | Imm[5] | Imm[4] | Imm[3] | Imm[2] | Imm[1] | Imm[0] | Op |
|----|----|----|---------|---------|---------|---------|---------|--------|--------|--------|--------|--------|--------|--------|--------|--------|--------|----|
| 00 | 00 | 00 | rd      | rs1     | rd      | rs1     | rd      | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | op |
| 01 | 00 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 10 | 00 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 11 | 00 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 00 | 01 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 01 | 01 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 10 | 01 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 11 | 01 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 00 | 10 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | rd     | rs1    | op |
| 01 | 10 | 00 | rs1     | rd      | rs1     | rd      | rs1     | rd     | rs1    | rd     |        |        |        |        |        |        |        |    |

# RISC-V architecture and instruction set

| Free & Open RISC-V Reference Card                                                        |  |  |  |  |                                   |  |  |  |  |
|------------------------------------------------------------------------------------------|--|--|--|--|-----------------------------------|--|--|--|--|
| Basic Instructions                                                                       |  |  |  |  |                                   |  |  |  |  |
| <b>Basic Instructions</b> (32-bit) Instruction Extension: RV32I and RV128I<br>+RV(FP128) |  |  |  |  | <b>Privilege Mode</b>             |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>RV Privileged Instructions</b> |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |
| <b>Category</b>                                                                          |  |  |  |  | <b>Category</b>                   |  |  |  |  |

# What are the performance metrics?

## Performance metrics

- **Area** (num. of simple logic gates) or  $mm^2$
- **Frequency** (depends on num. of gates on longest path)
- **Power**
  - Strongly depends on area & frequency
  - Leakage: dissipated even when not working (Area)
  - Dynamic Power: dissipated on logic transitions (Frequency & Area)

## CPU design

- **IPC** (instruction per cycle)
  - IPC implicitly measured in commonly used benchmarks (Coremark, Dhrystone, SpecInt)
- **Energy Efficiency:** OPs/Joule

## Hardware designer

- Tries to find a good balance
- Considers application (HPC, tinyML, etc..)
- One size unfortunately does not fit all!

# Several RISC-V flavours

## Zero-riscy / Ibex

- Low Cost Core
- 32-bit CPU
- fully open source
- GitHub



lowRISC  
OPEN TO THE CORE  
OpenTitan Quickstart Guide

## DSP

- RI5CY
- 32-bit CPU SIMD
- fully open source
- GitHub



## Ariane

- Linux capable core
- 64-bit CPU
- fully open source
- GitHub



# Zero-riscy / Ibex, small core for control applications

## Ibex

- 2-stage pipeline
- Optimized for area
  - Area:
    - 16kGE (Zero-riscy)
    - 12kGE (Micro-riscy)
  - Critical path:
    - $\sim 30$  logic levels



Two main configurations:

- Zero-riscy: RV32IMC (2,44 Coremark/Mhz)
  - 32 registers, hardware multiplier
- Micro-riscy: RV32EC (0,91 Coremark/Mhz)
  - 16 registers (E), software emulated multiplier

## Applications

- Zero-riscy/ibex is suitable for simple applications (control, book-keeping)
- For number crunching, we need more capable cores
- Tuned for cost and simplicity
- Does not make use of custom extensions

# RI5CY 32-bit core

## RI5CY

- 4-stage pipeline
  - Area: 41kGE
- Includes extensions
  - SIMD
  - Fixed point
  - Bit manipulations
  - HW loops



## Different Options:

- FPU: IEEE 754 single precision
  - including hardware support for FDIV, FSQRT, FMAC, FMUL
- Privilege support
  - Supports privilege mode (M) and (U)

# RISC-V has space for custom instruction

## Instruction Set X

- There is a reserved decoding space for custom instructions
  - allows everyone to add new instructions to the core
  - the address decoding space is reserved, it will not be used by further extensions
  - implementations supporting custom instructions will be compatible with standard ISA
    - code compiled for standard RISC-V will run without issues
- The user has to provide support to take advantage of the additional instructions
  - compiler that generates code for the custom instructions

# RISC-V ISA extension example

## Example of C code

```
for(int i=0; i<128; i++) {  
    c[i] += a[i]+b[i];  
}
```

Baseline 11  
cycles/output

```
mv x5,0  
mv x4, 100  
Lstart:  
lb x2,0(x10)  
lb x3,0(x12)  
addi x10,x10,1  
addi x11,x11,1  
add x2, x3, x2  
sb x2, 0(x12)  
addi x4,x4,-1  
addi x12,x12,1  
bne x4,x5, Lstart
```

Auto-incr, 8  
cycles/output

```
mv x5,0  
mv x4, 100  
Lstart:  
lb x2,0(x10!)  
lb x3,0(x12!)  
addi x4,x4,-1  
add x2, x3, x2  
sb x2, 0(x12!)  
bne x4,x5, Lstart
```

HW loops, 5  
cycles/output

```
lp.setupi 100, Lend  
lb x2,0(x10!)  
lb x3,0(x12!)  
add x2, x3, x2  
Lend sb x2, 0(x12!)
```

SIMD, 1.25  
cycles/output

```
lp.setupi 25, Lend  
lw x2,0(x10!)  
lw x3,0(x12!)  
pv.add.b x2, x3, x2  
Lend sw x2, 0(x12!)
```

# RISC-V architecture and instruction set

## Basic instruction set is not efficient for DNNs

With the **basic instruction set**, we require 4 cycles to do a single MAC operation!

```
addi a0,a0,1  
addi t1,t1,1  
addi t3,t3,1  
addi t4,t4,1  
lbu a7,-1(a0)  
lbu a6,-1(t4)  
lbu a5,-1(t3)  
lbu t5,-1(1)  
mul s1,a7,a6  
mul a7,a7,a5  
add s0,s0,s1  
mul a6,a6,t5  
add t0,t0,a7  
mul a5,a5,t5  
add t2,t2,a6  
add t6,t6,a5  
bne s5,a0,1c000bc
```

- multiply accumulate of two 8 bits of words (dot product of two vectors)
- with the baseline instruction set it is inefficient
- we need instruction extension to make it more efficient!

# RISC-V architecture and instruction set

Many DNN operations are just non-data dependent loops!

- Automatic address update
  1. update base register with computed address after the memory access
  2. save instruction to update address
- post-increment
  1. base address serves as memory address
- offset can be stored in
  - register
  - immediate

```
for(int i=0; i<128; i++) {  
    c += a[i]*b[i];  
}
```

- combine load instruction with increment
- load instruction with post increment

Orginal RISC-V

```
addi x4,x0,64  
Lstart:  
lb x2,0(x10)  
lb x3,0(x12)  
addi x10,x10,1  
addi x12,x12,1  
...  
bne x2,x3, Lstart
```

Auto-incr load/store

```
addi x4,x0,64  
Lstart:  
lb x2,0(x10!)  
lb x3,0(x12!)  
...  
bne x2,x3, Lstart
```

- save 2 instructions

# Outline

Recap

Introduction

Binary Neural Networks

Architectural choices

RISC-V MCUs & TinyML

Conclusion

# Conclusions

- TinyML enables practical ML on low-power MCUs
- RISC-V provides a flexible, efficient architecture
- Custom instruction extensions significantly boost efficiency
- Many real-world applications in various domains