

# RISC-V SoC Microarchitecture Design & Optimization

## Design Review #1

**Group 23**

**Instructor & Sponsor:** Weikang Qian

**Group Members:** Li Shi, Jian Shi, Yichao Yuan, Yiqiu Sun, Zhiyuan Liu



UM-SJTU Joint Institute (交大密西根学院)

# Overview

- Team Roles & Individual Introductions
- Review of Design Problem
- Literature Search & Benchmarking
- Quantification of Design Specifications
- Project Plan
- Conclusion

# 1. Team Roles & Individual Introductions



**Li Shi**  
*ECE*  
*Team Leader*



**Jian Shi**  
*ECE*  
*Programmer*



**Yichao Yuan**  
*ECE*  
*Hardware Support*



**Yiqiu Sun**  
*ECE*  
*Technical Support*



**Zhiyuan Liu**  
*ECE*  
*Programmer*

# 2. Review of Design Problem

## Trend – Moore's Law Is Dying



Figure 1. Growth in processor performance over 40 years.

Source: John L. Hennessy, David A. Patterson. *Computer Architecture: A Quantitative Approach* (Sixth Edition). Morgan Kaufmann, 2017.

## Design Problem – Applications in Embedded Systems



Figure 2. Tesla self-driving cars.

Source: [www.businessinsider.com/tesla-autopilot-full-self-driving-subscription-early-2021-elon-musk-2020-12](http://www.businessinsider.com/tesla-autopilot-full-self-driving-subscription-early-2021-elon-musk-2020-12)



Figure 3. Face recognition.

Source: [www.faceplusplus.com.cn/face-detection](http://www.faceplusplus.com.cn/face-detection)



# 3. Literature Search & Benchmarking

## MIPS processor with classical 5-stage pipeline



Figure 4. MIPS processor with classical 5-stage pipeline.

Source: David A. Patterson, John L. Hennessy. *Computer Organization and Design MIPS Edition: The Hardware/Software Interface* (Sixth Edition). Elsevier Science, 2020.



## Rocket Core



Figure 5. Rocket core.

Source: [www.cl.cam.ac.uk/~jrrk2/docs/tagged-memory-v0.1/rocket-core/](http://www.cl.cam.ac.uk/~jrrk2/docs/tagged-memory-v0.1/rocket-core/)



## The Berkeley Out-of-Order Machine (BOOM)



Figure 6. BOOM core.

Source: [docs.boom-core.org/en/latest/sections/intro-overview/boom-pipeline.html](https://docs.boom-core.org/en/latest/sections/intro-overview/boom-pipeline.html)



# 4. Quantification of Design Specifications

## Customer Requirements (CR)

*General*

- Be compatible with RISC-V apps
- Have detailed docs for reference
- Be inexpensive

*Performance*

- Run fast for normal arithmetic programs
- Run fast for machine learning (ML) apps
- Run fast for memory-bound apps

*Embedded-System*

- Be easy to configure for various parameters
- Have good support for multiple I/O devices
- Save power
- Respond quickly

## Engineering Specifications (ES)

| Specification                                           | Unit  | Target Value |
|---------------------------------------------------------|-------|--------------|
| Support RV32G instruction set architecture (ISA)        | -     | Yes          |
| Core frequency on FPGA test platform                    | MHz   | 100          |
| Number of pipeline stages                               | -     | 9            |
| Instructions executed per clock cycle (IPC)             | -     | 0.5          |
| Support dynamic instruction scheduling                  | -     | Yes          |
| Typical total cache size                                | KB    | 32           |
| Number of functional units                              | -     | 6            |
| Average response time to a request for service          | ms    | 10           |
| Usage of look-up tables (LUT) on FPGA                   | k     | 120          |
| Usage of block RAM (BRAM) on FPGA                       | -     | 50           |
| Usage of digital signal processing (DSP) slices on FPGA | -     | 30           |
| Power consumption on target FPGA test platform          | W     | 5            |
| Operations processed per unit energy                    | MOp/J | 25           |
| Number of flexibly configured modules                   | -     | 10           |
| Number of I/O device types                              | -     | 3            |
| User guide and programmer's manual                      | -     | Yes          |
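The frequency, IPC, power, and energy-efficiency targets in the table can be related with simple arithmetic. The following is a minimal sketch, not part of the original slides: the function names are illustrative, and it assumes that one retired instruction counts as one operation, which the table does not state.

```python
# Illustrative helpers; names and the "1 instruction = 1 operation"
# assumption are NOT from the slides.

def throughput_mips(ipc: float, freq_mhz: float) -> float:
    """Million instructions per second = IPC x clock frequency in MHz."""
    return ipc * freq_mhz


def efficiency_mop_per_joule(throughput_mops: float, power_w: float) -> float:
    """Million operations per joule = (million ops per second) / watts."""
    return throughput_mops / power_w


def power_budget_w(throughput_mops: float, target_mop_per_joule: float) -> float:
    """Power (W) at which a given throughput meets an efficiency target."""
    return throughput_mops / target_mop_per_joule


# Target values copied from the engineering specification table.
IPC_TARGET = 0.5
FREQ_MHZ_TARGET = 100
POWER_W_TARGET = 5
EFFICIENCY_TARGET = 25  # MOp/J

mips = throughput_mips(IPC_TARGET, FREQ_MHZ_TARGET)
efficiency_at_full_power = efficiency_mop_per_joule(mips, POWER_W_TARGET)
power_for_target = power_budget_w(mips, EFFICIENCY_TARGET)

print(f"Throughput: {mips} MIPS")
print(f"Efficiency at {POWER_W_TARGET} W: {efficiency_at_full_power} MOp/J")
print(f"Power meeting {EFFICIENCY_TARGET} MOp/J: {power_for_target} W")
```

Under these assumptions the targets give 50 MIPS, 10 MOp/J at the full 5 W, and 2 W as the power at which 50 MOp/s would reach 25 MOp/J. The slides do not define how operations are counted or whether the power figure is an upper bound, so this is only a rough consistency check between rows.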

## Quality Function Deployment (QFD)



Figure 7. QFD diagram.

Data source: Alexander Dörflinger, et al. 2021. A comparative survey of open-source application-class RISC-V processor implementations. *Proceedings of the 18th ACM International Conference on Computing Frontiers*. ACM, New York, NY, USA, 12–20.

# 5. Project Plan

## Milestones

**Weeks 2-5**  
*Preparation*

Write the main components of the RISC-V out-of-order (O3) pipeline  
Become familiar with the memory blocks on the FPGA development board

**Weeks 5-7**  
*Milestone 1*

Build a naive, simulation-correct, and synthesizable RISC-V O3 core

**Weeks 7-8**  
*Milestone 1.5*

Finish a complete simulation-correct and synthesis-correct RISC-V O3 core with a complete memory workflow

**Weeks 8-10**  
*Milestone 2*

Complete extra components, e.g., approximate units and I/O devices

**Weeks 10-12**  
*Milestone 3*

Run experiments on the architecture to evaluate the performance/power/area trade-off under different scenarios

## Gantt chart



Figure 8. Gantt chart.

# 6. Conclusion



# Q&A


# Thank you!
