

# **DESIGNING DIGITAL SIGNAL PROCESSORS WITH ROCKETCHIP**

Paul Rigge, Borivoje Nikolić

{rigge,bora}@eecs.berkeley.edu

UC Berkeley

CARRV 2018

June 2, 2018



**BWRC**





BWRC

# SoCs Combine Programmability + Efficiency





BWRC

# Digital Signal Processing



- SoCs Integrate a lot of signal processing
  - Cellular+WiFi
  - Audio
  - Image Processing
  - GPS
- Strongly benefits from custom hardware
  - Parallelism
  - High locality
  - Trim unneeded bits



BWRC

# Designing SoCs is Hard



- Long development cycle
- High cost of tools, respins
- Reuse limited
- High NRE only justifiable in high volume



BWRC

# Chisel, FIRRTL, Rocketchip



- Chisel: domain specific language (DSL) for writing programs that generate circuits
- FIRRTL: Flexible Intermediate Representation for RTL (LLVM for hardware)
- RocketChip: Open-source RISC-V implementation in Chisel





BWRC

# Growing Infrastructure



- Stable cores
- Compilers
- Software Infrastructure
- Accelerators
  - DMA
  - Hwacha
  - Etc.
- Interconnect Generators



# Outline

- DSP Generators in Chisel
- AXI-4 Stream + Diplomacy
- Memory-mapped DSP Peripherals
  - Useful building blocks
- Verification
- OFDM Baseband Example



BWRC



# DSP Generators in Chisel



BWRC



# dsptools

Screenshot of the GitHub repository page for ucb-bar / dsptools.

Repository details:

- Search or jump to... (input field)
- Pull requests, Issues, Marketplace, Explore (navigation tabs)
- Unwatch (button), 17 (watch count), Star (button), 18 (star count), Fork (button)
- Code, Issues 21, Pull requests 2, Projects 0, Wiki, Insights, Settings (repository navigation tabs)

Description: A Library of Chisel3 Tools for Digital Signal Processing

Add topics

Statistics:

- 221 commits
- 39 branches
- 11 releases
- 5 contributors
- BSD-3-Clause license

Branch: master ▾ New pull request Create new file Upload files Find file Clone or download

Recent activity:

- grebe Merge pull request #126 from ucb-bar/lessVerbose ... Latest commit 0d8c8ce 12 days ago
- project Bump sbt. 3 months ago
- src Merge pull request #126 from ucb-bar/lessVerbose 12 days ago
- .gitignore Ignore annotations a year ago
- .install\_verilator.sh Add travis support (#99) 9 months ago
- .travis.yml Travis should output everything. 26 days ago

<https://github.com/ucb-bar/dsptools>

Paul Rigge, Angie Wang, Stevo Bailey, Chick Markley, Adam Izraelevitz

# Floating Point

- Start implementing hardware without worrying about precision
  - Validate IO, control logic, algorithm
- Validate floating point hardware implementation against floating point golden model
- Uses Verilog “real” types (non-synthesizable)





# Fixed Point

- Fixed point types in Chisel and FIRRTL
- Width inference like UInt, SInt
- Binary point inference

```
val sel = Wire(Bool())
val a = Wire(FixedPoint(width = 10.W,
                         binaryPoint = 9.BP))
val b = Wire(FixedPoint(width = 12.W,
                         binaryPoint = 10.BP))
val reg = Reg(FixedPoint())
when (sel) {
    reg := a
} .otherwise {
    reg := b
}
```



# Complex Numbers

- dsptools defines a Complex type
- Generic as to underlying type, i.e.  
DspComplex[SInt] or DspComplex[FixedPoint]  
or DspComplex[FloatingPoint] OK
- Can choose between 3 or 4 real-multiplies for  
a complex multiply



# DspContext

- Automatic pipeline register insertion for adds, multiplies
- Rounding
- Precision for literals
- Override global defaults with scope, e.g.

```
DspContext.alter(myContext) {  
    val sum = a context_* b // auto-pipelined  
}
```





BWRC

# Numeric Type Classes



- Type classes: support ad hoc polymorphism by adding constraints to type variables
- Support using type-generic generators with user-defined types
- Use type classes from Spire numeric library
  - Add new type classes for hardware constructs, especially Bool
  - Hide expensive operations, e.g. division, sqrt



BWRC

# Numeric Type Classes



- Ring
  - +, -, \*, zero, and one
- Eq
  - === and !=
- Order (extends Eq)
  - <, >, <=, >=, max, min
- Real (extends Ring with Order with Sign)
  - ceil, floor, round, isWhole
- Integer (extends Real)
  - mod



# DspTester

- Verification needs to be as parameterizable as hardware generators
- Type-generic PeekPokeTester
- Assert output is within epsilon (set by type)





BWRC



# AXI-4 Stream + Diplomacy



BWRC

# Diplomacy Background



- Generator runs in two phases
  - Negotiate parameters
  - Elaborate hardware
- Parameters flow both “in” and “out”



# AXI-4 Stream

- AMBA standard for streaming data
- Defines ready/valid handshake semantics
- Most fields optional
  - Even TDATA!





BWRC

# AXI-4 Stream Diplomacy



- Nodes exchange parameters
- Edge resolves parameters, chooses bundle parameters





BWRC



# Memory-mapped DSP Peripherals

- Connect DSP accelerators to Rocket via Periphery Bus



# DspBlock

- Basic building block of DSP functionality
- Streaming inputs and outputs (any number)
- Optional memory interface





BWRC

# Type-generic DSP Blocks



- Define DSP functionality and interconnect separately
- Treat type of memory interface as parameter



# DspChain

- DspBlock composed of many internal DspBlocks
- Generate memory interconnect, connections between blocks
- Add design-for-test (DFT) structures





BWRC



# Synchronous Data Flow



- Represent computation as digraph
- Samples produced/consumed by each node known a priori



BWRC

# DspRegister, DspQueue



- Building blocks for SDF-style design
- DspRegister
  - Register with programmable vector length
  - Stream in and out simultaneously
  - Processor has visibility into contents
- DspQueue
  - Throw interrupt when entries exceeds programmable threshold
  - Support real-time

# Verification

# Unit Tests

- PeekPokeTester
  - Type-generic with DspTester
- FIRRTL Interpreter very fast for small tests



## Scala Test

```
// initialize DFT
bindMaster(dut.io.in).addTransactions(testVector)
val slave = bindSlave(dut.io.out)

// Interact with DUT
val out = tlReadWord(dutAddr0)
if (out != 0) {
  tlWriteWord(dutAddr1, 3)
}
assert(tlReadWord(dutAddr2) != 0)

// Get output
val output = slave.getTransactions()
```



# Integration Tests

- Write C programs to run on Rocket, use Rocket test harness
- Generate design-for-test (DFT) structures
  - Load test vectors, set muxes, store outputs
- Same binary for simulation and bring-up



## C Test

```
// Initialize DFT
pg_load_data(pg_addr, test_vector);
initiate_capture(cap_addr);

// Interact with DUT
out = read_reg(dut_addr0);
if (out != 0) {
    write_reg(dut_addr1, 3);
}
assert(read_reg(dut_addr2) != 0);

// Get output
get_output(cap_addr, output);
```

- XML schema describing circuit metadata
  - Port mappings
  - interface types
  - generator parameters
- Use external tools
  - Python test vector generation + Verification Workbench





BWRC



# OFDM Baseband Example

# OFDM Background

- Frequency domain equalization
- Uses FFT
- Relax time domain synchronization





# OFDM Baseband

- Transmitter and receiver





# Conclusion

- Chisel + FIRRTL + dsptools help building DSP
- RocketChip is not just a processor, library of useful components
  - Diplomacy
  - Interconnect
  - Utilities
- Can build and verify complex SoCs with RocketChip



BWRC

# Thank You



- Collaborators
  - Stevo Bailey, Angie Wang, Adam Izraelevitz, Chick Markley, Colin Schmidt, Timo Joas, and Jim Lawson
  - UCB BAR
- Support
  - NSF GRFP
  - Adept and BWRC