



inst.eecs.berkeley.edu/~eecs251b

# EECS251B: Advanced Digital Circuits and Systems

## Lecture 2 – Building SoCs



Borivoje Nikolić  
Vladimir Stojanović  
Sophia Shao



<https://github.com/ucb-bar/chipyard>

EECS251B L02 CHIPYARD

Berkeley 




## Announcements

- Assignment 1/Lab 1 posted this week
- No class on February 22 (ISSCC)


## Projects and Exam

- Done in pairs or alone
- Due dates:
  - Abstract: February 24
    - Title, a paragraph, and 5 references
  - Midterm report: March 18, before Spring break
    - 4 pages, paper study
  - Final report: May 2
    - 6 pages
    - Design
- Final exam is on April 28 (last class)


## Assigned Reading

On an SoC generator:

- A. Amid et al., "Chipyard: Integrated design, simulation, and implementation framework for custom SoCs," IEEE Micro, 2020.

On transistor models (in about 2 weeks):

- R.H. Dennard et al., "Design of ion-implanted MOSFET's with very small physical dimensions," IEEE Journal of Solid-State Circuits, Oct. 1974.
  - Just the scaling principles
- C.G. Sodini, P.-K. Ko, J.L. Moll, "The effect of high fields on MOS device and circuit performance," IEEE Trans. on Electron Devices, vol. 31, no. 10, pp. 1386 - 1393, Oct. 1984.
- K.-Y. Toh, P.-K. Ko, R.G. Meyer, "An engineering model for short-channel MOS devices" IEEE Journal of Solid-State Circuits, vol. 23, no. 4, pp. 950-958, Aug. 1988.
- T. Sakurai, A.R. Newton, "Alpha-power law MOSFET model and its applications to CMOS inverter delay and other formulas," IEEE Journal of Solid-State Circuits, vol. 25, no. 2, pp. 584 - 594, April 1990.


## Outline

- SoC generator: Chipyard
  - Great for class (and other) projects


## RISC-V Processor Core




## Data Cache

- L1 cache




## To Build an SoC

- Processor cores (Rocket, BOOM, ...)
- Memory system (w/ coherence protocol)
- Interconnect (TileLink)
- Custom blocks (e.g. communication, imaging)
- Standard peripheral devices
  - JTAG
  - SPI
  - I2C
  - BootROM


## Gemmini: A Systolic(ish) Generator

- Systolic Array Accelerator
- Fully configurable
  - Dataflow – Output/Weight Stationary
  - Dimensions
  - Bitwidths/Datatypes
  - Pipeline Depth
  - Memory capacity
  - Memory banking
  - Memory Bus Width

<https://www.github.com/ucb-bar/gemmini>
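
As a rough illustration of what "fully configurable" means for a generator, the parameters listed above can be collected into one Scala case class that the hardware generator would consume. This is only a sketch: `AcceleratorParams`, its field names, and the example values are hypothetical and are not Gemmini's actual configuration class.

```scala
// Hypothetical parameter bundle for a systolic-array generator.
// Names and values are illustrative only; this is NOT Gemmini's real API.
object GemminiParamsSketch {
  sealed trait Dataflow
  case object OutputStationary extends Dataflow
  case object WeightStationary extends Dataflow

  final case class AcceleratorParams(
    dataflow: Dataflow,   // output- or weight-stationary
    rows: Int,            // array dimensions
    cols: Int,
    inputBits: Int,       // bitwidth of input operands
    accBits: Int,         // bitwidth of accumulated results
    pipelineDepth: Int,   // pipeline depth
    scratchpadKB: Int,    // memory capacity
    banks: Int,           // memory banking
    busBits: Int          // memory bus width
  ) {
    require(rows > 0 && cols > 0, "array dimensions must be positive")
    require(banks > 0 && scratchpadKB % banks == 0,
      "capacity must divide evenly across banks")

    // One multiply-accumulate per processing element per cycle.
    def macsPerCycle: Int = rows * cols
    def kbPerBank: Int = scratchpadKB / banks
  }

  // Example design point (values chosen arbitrarily for the sketch).
  val example: AcceleratorParams = AcceleratorParams(
    dataflow = WeightStationary, rows = 16, cols = 16,
    inputBits = 8, accBits = 32, pipelineDepth = 1,
    scratchpadKB = 256, banks = 4, busBits = 128)

  def main(args: Array[String]): Unit =
    println(s"${example.macsPerCycle} MACs/cycle, ${example.kbPerBank} KB per bank")
}
```

Changing one field and re-running the generator yields a different hardware instance, which is what makes design-space exploration cheap.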


## Hwacha: A Vector Accelerator

- Configurable access/execute decoupled vector architecture\*

\* Not based on RV-V


## L2 Cache and Memory System

- Multi-bank shared L2
  - SiFive's open-source IP
  - Fully coherent
  - Configurable size, associativity
  - Supports atomics, prefetch hints
- Non-caching L2 Broadcast Hub
  - Coherence w/o caching
  - Bufferless design
- Multi-channel memory system
  - Conversion to AXI4 for compatible DRAM controllers




## Core Complex Devices

- BootROM
  - First-stage bootloader
  - DeviceTree
- PLIC
- CLINT
  - Software interrupts
  - Timer interrupts
- Debug Unit
  - DMI
  - JTAG




## Other Chipyard Blocks

- **Hardfloat:** Parameterized Chisel generators for hardware floating-point units
- **IceNet:** Custom NIC for FireSim simulations
- **SiFive-Blocks:** Open-sourced Chisel peripherals
  - GPIO, SPI, UART, etc.
- **TestchipIP:** Berkeley utilities for chip testing/bringup
  - Tethered serial interface
  - Simulated block device
- **SHA3:** Educational SHA3 RoCC accelerator


## Customization

- Cores and controllers: Intra-core Rocket/BOOM configurations
  - Control core / PMU as an example
- Simple RoCC accelerators
  - SHA3 as an ‘instructional’ demo
- Complex RoCC accelerators
  - Hwacha and Gemmini as examples
- MMIO TileLink accelerators
- Peripherals

Figure: example customized SoC. Rocket and BOOM tiles (a Rocket control core, a Rocket core with a SHA3 RoCC accelerator, and a 3-wide BOOM core with the Hwacha vector accelerator) and a systolic array with accumulators and scratchpad connect through TileLink crossbars to a shared L2 cache, DRAM (SimAXIMem in simulation), and peripherals (UART, GPIO, JTAG, SimBlockDevice).


## Rocket Chip Configuration

```
// Config fragments are chained with ++
class MyCustomConfig extends Config(
    new WithExtMemSize((1<<30) * 2L) ++          // 2 GiB of external memory
    new WithBlockDevice ++
    new WithGPIO ++
    new WithBootROM ++
    new hwacha.DefaultHwachaConfig ++
    new WithInclusiveCache(capacityKB=1024) ++   // 1024 KB L2
    new boom.common.WithLargeBooms ++
    new boom.system.WithNBoomCores(3) ++         // three BOOM cores
    new WithNormalBoomRocketTop ++
    new rocketchip.system.BaseConfig)
```
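
The configuration above is built by chaining fragments, each of which overrides part of what the fragments after it define. A minimal, self-contained sketch of that idea follows. It is a toy model, not the real rocket-chip `Config` class: `Fragment`, the string keys, and the default values are all hypothetical, and only the fragment names are borrowed from the slide.

```scala
// Toy model of config-fragment composition; NOT the real rocket-chip Config API.
object ConfigSketch {
  type Params = Map[String, Any]

  final case class Fragment(update: Params => Params) {
    // In `a ++ b`, b is applied first and a second, so a overrides b.
    def ++(lower: Fragment): Fragment =
      Fragment(params => update(lower.update(params)))

    def build: Params = update(Map.empty[String, Any])
  }

  // Hypothetical fragments and keys, named after the ones on the slide.
  val BaseConfig: Fragment = Fragment(_ =>
    Map[String, Any]("nBoomCores" -> 0, "l2CapacityKB" -> 512, "gpio" -> false))
  def WithNBoomCores(n: Int): Fragment =
    Fragment(_ + ("nBoomCores" -> n))
  def WithInclusiveCache(capacityKB: Int): Fragment =
    Fragment(_ + ("l2CapacityKB" -> capacityKB))
  val WithGPIO: Fragment = Fragment(_ + ("gpio" -> true))

  // Same shape as the slide: the base config comes last, overrides come first.
  val MyCustomConfig: Fragment =
    WithGPIO ++
    WithInclusiveCache(capacityKB = 1024) ++
    WithNBoomCores(3) ++
    BaseConfig

  def main(args: Array[String]): Unit =
    MyCustomConfig.build.toSeq.sortBy(_._1).foreach {
      case (key, value) => println(s"$key = $value")
    }
}
```

In this model the order matters: putting `BaseConfig` first instead of last would let its defaults overwrite every customization, which is why the base configuration sits at the end of the chain.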


## Rocket Chip Configuration

```
class MyCustomConfig extends Config(
    new WithExtMemSize((1<<30) * 2L) ++
    new WithBlockDevice ++
    new WithGPIO ++
    new WithBootROM ++
    new hwacha.DefaultHwachaConfig ++
    new WithInclusiveCache(capacityKB=1024) ++
    new boom.common.WithLargeBooms ++
    new boom.system.WithNBoomCores(2) ++            // two BOOM cores
    new rocketchip.subsystem.WithNBigCores(1) ++    // plus one Rocket core
    new WithNormalBoomRocketTop ++
    new rocketchip.system.BaseConfig)
```


## Rocket Chip Configuration

```
class MyCustomConfig extends Config(
    new WithExtMemSize((1<<30) * 2L) ++
    new WithBlockDevice ++
    new WithGPIO ++
    new WithBootROM ++
    new WithMultiRoCCGemmini(2) ++    // Gemmini on tile 2
    new WithMultiRoCCSha3(1) ++       // SHA3 on tile 1
    new WithMultiRoCCHwacha(0) ++     // Hwacha on tile 0
    new WithInclusiveCache(capacityKB=1024) ++
    new boom.common.WithLargeBooms ++
    new boom.system.WithNBoomCores(2) ++
    new rocketchip.subsystem.WithNBigCores(1) ++
    new WithNormalBoomRocketTop ++
    new rocketchip.system.BaseConfig)
```


## Rocket Chip Configuration

```
class MyCustomConfig extends Config(
    new WithExtMemSize((1<<30) * 2L) ++
    new WithBlockDevice ++
    new WithGPIO ++
    new WithBootROM ++
    new WithMultiRoCCGemmini(2) ++
    new WithMultiRoCCSha3(1) ++
    new WithMultiRoCCHwacha(0) ++
    new WithInclusiveCache(capacityKB=1024) ++
    new boom.common.WithLargeBooms ++
    new boom.system.WithNBoomCores(2) ++
    new rocketchip.subsystem.WithRV32 ++            // make the Rocket core RV32
    new rocketchip.subsystem.WithNBigCores(1) ++
    new WithNormalBoomRocketTop ++
    new rocketchip.system.BaseConfig)
```


## Rocket Chip Configuration

```
class MyCustomConfig extends Config(
    new WithExtMemSize((1<<30) * 2L) ++
    new WithBlockDevice ++
    new WithGPIO ++
    new WithJtagDTM ++                // JTAG debug transport module
    new WithBootROM ++
    new WithMultiRoCCGemmini(2) ++
    new WithMultiRoCCSha3(1) ++
    new WithMultiRoCCHwacha(0) ++
    new WithInclusiveCache(capacityKB=1024) ++
    new boom.common.WithLargeBooms ++
    new boom.system.WithNBoomCores(2) ++
    new rocketchip.subsystem.WithRV32 ++
    new rocketchip.subsystem.WithNBigCores(1) ++
    new WithNormalBoomRocketTop ++
    new rocketchip.system.BaseConfig)
```

Figure: TestHarness generated by this configuration. The Top module contains three tiles: Tile 0 (3-wide BOOM, Hwacha, L1I\$, L1D\$), Tile 1 (3-wide BOOM, SHA3, L1I\$, L1D\$), and Tile 2 (RV32 Rocket, Gemmini, L1I\$, L1D\$). Top also contains the SysBus, MemBus, GPIOs, BootROM, and a JTAG module. SimBlockDevice and SimAXIMem are shown below the tiles.

## Rocket Chip Configuration

```
class MyCustomConfig extends Config(
    new WithExtMemSize((1<<30) * 2L) ++
    new WithBlockDevice ++
    new WithGPIO ++
    new WithJtagDTM ++
    new WithBootROM ++
    new WithRationalBoomTiles ++      // rational clock crossing for BOOM tiles
    new WithRationalRocketTiles ++    // rational clock crossing for Rocket tiles
    new WithMultiRoCCGemmini(2) ++
    new WithMultiRoCCSha3(1) ++
    new WithMultiRoCCHwacha(0) ++
    new WithInclusiveCache(capacityKB=1024) ++
    new boom.common.WithLargeBooms ++
    new boom.system.WithNBoomCores(2) ++
    new rocketchip.subsystem.WithRV32 ++
    new rocketchip.subsystem.WithNBigCores(1) ++
    new WithNormalBoomRocketTop ++
    new rocketchip.system.BaseConfig)
```

Figure: the same TestHarness diagram, with three clock signals highlighted: clk\_0, clk\_1, and clk\_2, associated with the tiles and the Top section.

## Hammer

- Modular VLSI flow
  - Allow reusability
  - Allow for multiple “small” experts instead of a single “super” expert
  - Build abstractions/APIs on top
  - Improve portability
  - Improve hierarchical partitioning
- Three categories of flow input
  - Design-specific
  - Tool/Vendor-specific
  - Technology-specific
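
One way to picture the three categories of flow input is as separate key-value layers that are merged into a single flow configuration, so that swapping the tool or the technology only touches one layer. The sketch below is a toy model under stated assumptions: it is not Hammer's API, the keys and values are invented for illustration, and the precedence order (tool, then technology, then design) is an assumption of this example.

```scala
// Toy model of layered flow inputs; NOT Hammer's actual API or key names.
object FlowInputSketch {
  type Settings = Map[String, String]

  // Tool/vendor-specific: which tool runs a step and how (hypothetical keys).
  val toolSpecific: Settings = Map(
    "synthesis.tool"   -> "example-synth",
    "synthesis.effort" -> "medium")

  // Technology-specific: properties of the process (hypothetical keys).
  val technologySpecific: Settings = Map(
    "technology.name"           -> "example-tech",
    "technology.supply_voltage" -> "0.9 V")

  // Design-specific: what is being built and its constraints (hypothetical keys).
  val designSpecific: Settings = Map(
    "design.top_module"   -> "ExampleTop",
    "design.clock_period" -> "2 ns",
    "synthesis.effort"    -> "high") // the design overrides a tool default

  // Later layers win on key collisions (assumed precedence for this sketch).
  def merge(layers: Settings*): Settings =
    layers.foldLeft(Map.empty[String, String])(_ ++ _)

  val flowInput: Settings =
    merge(toolSpecific, technologySpecific, designSpecific)

  def main(args: Array[String]): Unit =
    flowInput.toSeq.sorted.foreach { case (key, value) => println(s"$key: $value") }
}
```

Keeping the layers separate is what allows the same design-specific input to be reused when the tool or technology layer is replaced.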


## Simulation/Implementation Targets

- Custom hardware design is not just about generated IP blocks!
- Different collateral is needed for different simulation or implementation targets
  - Design-cycle RTL simulation
  - Verification / validation
  - VLSI flow


## Software

- Compatible versions of the standard RISC-V tools
- ESP-Tools: a non-standard equivalent software tools package with custom accelerator extensions (Hwacha, Gemmini)
- Improved bare-metal testing flow
  - Use libgloss and newlib instead of in-house syscalls
- FireMarshal workload management

```
graph TD
    Stack["Core Application Logic<br/>Libraries<br/>User-space distros<br/>OS Kernel<br/>Drivers"] --> Toolchain["RISC-V Toolchain<br/>Standard<br/>Custom"]
    Toolchain --> QEMU[QEMU Functional Emulation]
    Toolchain --> Spike[Spike ISA Simulation]
    Toolchain --> Software[Software RTL Simulation]
    Toolchain --> FireSim[FireSim Simulation]
    Toolchain --> TestChip[Test Chip]
```


## Summary

- We will use Chipyard to generate a minimalist RISC-V SoC for logic design and circuits experiments
- Labs will exercise the design flow
- Think of projects that can:
  - Test a circuit idea in a larger system
  - Design a block to improve the SoC
    - Co-processor/Accelerator
    - Peripheral device
