



# OPEN SOURCE HARDWARE / SOFTWARE CO-DESIGN

DAVID DONOFRIO, CHIEF HARDWARE ARCHITECT

JOHN D. LEIDEL, CHIEF SCIENTIST

{DDONOFRIO,JLEIDEL}@TACTCOMPLABS.COM

ANASTASIIA BUTKO, SCIENTIST, LAWRENCE BERKELEY NATIONAL LAB

JEFFERY YOUNG, SCIENTIST GEORGIA TECH

DECEMBER, 2021

# AGENDA AND SPEAKERS

- DAVID DONOFRIO & JOHN LEIDEL - *TACTICAL COMPUTING LABORATORIES*
  - OVERVIEW OF OPEN SOURCE HARDWARE DEVELOPMENT FLOWS
- ANASTASIIA BUTKO - *LAWRENCE BERKELEY NATIONAL LABORATORY*
  - OPEN2C: OPEN-SOURCE GENERATOR FOR COHERENT CACHE MEMORY SUBSYSTEM
- JEFFERY YOUNG - *GEORGIA TECH*
  - NOVEL RISC-V IMPLEMENTATIONS AND ACCELERATORS

# POST MOORE TECHNOLOGY CURVE

GREAT OPPORTUNITIES EXIST FOR INNOVATION THROUGH THE END OF MOORE'S LAW



# SPECIALIZATION AS THE PATH TO PERFORMANCE

- PROCESS SCALING TAPERING OFF
- SPECIALIZED ARCHITECTURES ARE EVERYWHERE
  - GPUs
  - AI ACCELERATORS
  - RE-CONFIGURABLE
  - MATRIX ACCELERATORS
- INCREASING HETEROGENEITY IN SYSTEMS
  - HETEROGENOUS IN BOTH TYPES OF COMPUTATION AS WELL AS WHERE COMPUTING IS PERFORMED
  - AS COMPUTING BECOMES MORE DECENTRALIZED WE WILL NEED A GREATER VARIETY OF ACCELERATORS
  - THIS DRIVES A NEED FOR FRAMEWORKS THAT ALLOW RAPID DESIGN SPACE EXPLORATION

# ACCELERATORS CRITICAL TO PERFORMANCE

- TOP 500



# VENDOR COUNT INCREASING

- DEMOCRATIZATION OF HPC?



# OPEN SOURCE: DRIVING INNOVATION

## Open Source Software

Over 80% of  
companies run  
Open Source  
Software



1991 -  
Linux 1.0

Today

## Open Source Hardware



Today

# OPEN SOURCE CORES QUITE CAPABLE

- OPEN SOURCE IMPLIES LOW COST... NOT NECESSARILY LOW PERFORMANCE



Figure 6: SonicBOOM SPEC17 IPC compared to Intel Skylake and AWS Graviton cores.



Results from SonicBOOM  
Zhao, et. al. UC Berkeley

# OPEN SOURCE HW IS MORE THAN JUST ASICS

- HARDWARE GENERATION METHODOLOGIES
  - ENABLE FPGA COMPUTING
  - ENABLE ARCHITECTURAL COMPARISON
    - COMMON BASELINES
  - FASTER ITERATION
- MORE EFFECTIVE THAN SHARING POINT DESIGNS
  - ROCKET CORE VS. OPENCORES
- Co-GENERATION OF SOFTWARE STACK DESIRABLE



# TRULY OPEN ASICS

- HARDWARE GENERATION TOOLS TYPICALLY SOLVE THE DIGITAL DESIGN AND SIMULATION PROBLEM
- MANY MORE COMPONENTS NEEDED FOR A COMPLETE ASIC
  - PLL, JTAG, SRAM, VREG...
- ASIC PDKs TYPICALLY OBSCURED BEHIND NDAs, ETC.

# TRULY OPEN ASICS

- OPENLANE IS A COMPLETE, OPEN SOURCE, RTL TO GDSII FLOW
  - COMBINATION OF MANY EXISTING OPEN SOURCE FLOWS
    - YOSYS, MAGIC, OPENROAD, NETGEN, AND MANY OTHERS
  - [HTTPS://GITHUB.COM/EFABLESS/OPENLANE](https://github.com/efabless/openlane)
- GOOGLE HAS PARTNERED WITH SKYWATER FAB
  - 130NM PDK OPEN SOURCE
    - APACHE LICENSE
  - SHUTTLE RUNS ARE NOW **FREE** AND OPEN TO **ALL**
    - FINAL DESIGN MUST BE 100% OPEN TO QUALIFY



The Google logo, featuring the word "Google" in its signature multi-colored, rounded font.



```
[bash-4.1$ ./flow.tcl -design behav_counter
[INFO]:
[INFO]: Version: rc2-3-gaf87b79
[INFO]: Using design configuration at /openLANE_flow/designs/behav_counter/config.tcl
[INFO]: couldn't read file "/Users/apple/openlane/sky130A/sky130A/libs.tech/openlane/config.tcl": no such file or directory
      while executing
        "source $pdk_config"
```

# A BIT OF HISTORY

- LBNL & TEXAS TECH WERE FUNDED TO DEVELOP A SPECIAL-PURPOSE, RISC-V BASED SoC
  - FPGA-BASED
  - HMC MEMORIES, CUSTOM ISA EXTENSIONS, SCRATCHPADS
- PHD DISSERTATION
  - GOBLINCORE-64: RISC-V BASED SoC FOR DATA INTENSIVE COMPUTING
  - ISA EXTENSIONS, PIPELINE CHANGES. SMT
  - NO CACHES, MEMORY COALESCING



Lawrence Berkeley  
National Laboratory



# SYSTEM ARCHITECT GEN1

- THIS “RINSE AND REPEAT” DESIGN FLOW CAN BE AUTOMATED!
- SYSTEM ARCHITECT GEN1 WAS BORN AS A TOOLCHAIN FOR DEVELOPING RISC-V EXTENSIONS
- UTILIZED COMPILER-TECHNIQUES TO HANDLE DEPENDENCE ANALYSIS OF DESIGN PARAMETERS TO RISC-V INTERNALS
- **INPUT:** RISC-V CORE & DESIGN PARAMETERS
- **OUTPUT:** CHISEL RTL FOR ENTIRE SOC & LLVM COMPILER



# SYSARCH TOOL CHAIN & FLOW

- OPENSoC TOOL CHAIN & FLOW CONSISTS OF SEVERAL INTEGRAL MOVING PARTS:
  - OPENSoC SYSTEM ARCHITECT FRONTEND
  - COREGEN IR & BACKEND
  - OPENSoC FABRIC NoC INTERCONNECT
  - RISC-V STANDARD CORES (IN CHISEL)
  - LLVM COMPILER INFRASTRUCTURE
- USERS INPUT DESIGN SPECS AND EXTENSIONS VIA FRONTEND
- COREGEN ENSURES DESIGN CORRECTNESS AND CONTINUITY
  - ALSO GENERATES BACKEND CHISEL AND LLVM IMPLEMENTATION
- OPENSoC FABRIC “GLUES” ALL INTERMEDIATE MODULES TOGETHER WITH SCALABLE NoC



# SYSTEM ARCHITECT GEN2

- SEVERAL EARLY USERS POSED AN INTERESTING QUESTION:
- ***WHAT ABOUT DEVELOPING ARBITRARY ISA's?***
- ***GEN2 IS A COMPLETE RE-WRITE OF THE INFRASTRUCTURE TO SUPPORT ARBITRARY ISA DEVELOPMENT***
  - NEW INTERMEDIATE REPRESENTATION (FORMATTED IN YAML)
  - NEW DEPENDENCE ANALYSIS ENGINE (COREGEN)
  - HIGH LEVEL INSTRUCTION DEFINITION LANGUAGE (STONECUTTER)
  - SUPPORT FOR PLUGINS



*Maintain verification performance similar to a modern optimizing compiler!*

# WHAT IS SYSTEM ARCHITECT?

- A FAMILY OF TOOLS, APIs AND ASSOCIATED INFRASTRUCTURE TO PERMIT USERS TO RAPIDLY DEVELOP MULTI-FACETED HARDWARE
- UTILIZES A COMBINATION OF MODULAR HARDWARE DESIGN REUSE PRINCIPLES, OBJECT ORIENTED DEVELOPMENT AND DEPENDENCE ANALYSIS TECHNIQUES (COMPILER THEORY) TO PROVIDE AN INFRASTRUCTURE FOR:
  - **DESIGN & DESIGN EXPERIMENTATION**
  - **HIGH LEVEL VERIFICATION**
- THE ARTIFACTS GENERATED BY A SYSTEM ARCHITECT DESIGN FLOW INCLUDE:
  - **CHISEL HDL AND VERILOG RTL**
  - **C++ CYCLE-BASED SIMULATOR**
  - **LLVM COMPILER**

# WHAT IS SYSTEM ARCHITECT NOT?

- SYSTEM ARCHITECT IS NOT THE LATEST C-TO-GATES TOOL
  - IT PERMITS RAPID DESIGN, VERIFICATION AND REUSE
  - IT DOES NOT AUTO-GENERATE HARDWARE BASED UPON APPLICATION CODE
- SYSTEM ARCHITECT STILL RELIES UPON THE USER TO UTILIZE REASONABLE DESIGN CONCEPTS
  - SYSTEM ARCHITECT WILL **NOT** AUTO-GENERATE OPTIMIZED DESIGNS BASED UPON UNREASONABLE INPUTS
  - USERS NEED TO HAVE A CONCEPT OF THE PHYSICAL PLATFORM (FPGA, ASIC, ETC)
- SYSTEM ARCHITECT DOES **NOT** CURRENTLY HAVE A NOTION OF PHYSICAL LAYOUT
  - THE GENERATED OUTPUT WILL NOT INCLUDE LUT COUNTS, PHYSICAL DESIGN DIMENSIONS OR POWER ESTIMATES
  - EXTERNAL FPGA/ASIC TOOLS ARE REQUIRED FOR THIS LEVEL OF DETAIL

# GEN2 HIGH LEVEL DESIGN



# HOW DOES IT WORK?

- WE INPUT DESIGNS IN THE USER-FACING TOOLS AND GENERATE **COREGEN INTERMEDIATE REPRESENTATION (IR)**
- THE IR PRESERVES THE NATURAL DEPENDENCIES WITHIN THE DESIGN
  - REGISTER CLASSES DEPEND UPON REGISTERS
  - INSTRUCTION FORMATS DEPEND UPON REGISTER CLASSES
  - INSTRUCTION SETS DEPEND UPON INSTRUCTIONS
  - CORES REQUIRE INSTRUCTION SETS
- WE EXECUTE HIGH LEVEL VERIFICATION “PASSES” AGAINST THE IR
  - SIMILAR IN DESIGN TO TRADITIONAL COMPILER PASSES
  - WALKS THE IR DEPENDENCE GRAPH AND DERIVES PROPERTIES OF THE DESIGN
  - REPORTS ISSUES IN THE DESIGN INFRASTRUCTURE, OUTPUTS INTERESTING DATA OR OPTIMIZES THE DESIGN INFRASTRUCTURE
- USERS IMPLEMENT INSTRUCTIONS IN STONECUTTER INSTRUCTION IMPLEMENTATION LANGUAGE
  - C-LIKE INTEGRATED WITH COREGEN IR TO DEFINE A SINGLE INSTRUCTION
  - OPTIMIZED BY A TRADITIONAL COMPILER FLOW TO GENERATE CHISEL HDL FOR A SINGLE INSTRUCTION
- FOLLOWING THE HIGH LEVEL VERIFICATION PHASE, WE EXECUTE GENERATE DOWNSTREAM CODE (CODE GENERATION)
  - GENERATES CHISEL HDL
    - COMPILED DOWN TO VERILOG & C++ CYCLE-BASED SIMULATOR
  - GENERATES LLVM COMPILER FOR THE TARGET DESIGN

Tactical Computing Laboratories



Example  
dependence  
graph from  
CoreGen  
design input

# COREGEN VERIFICATION INFRASTRUCTURE



# SYSTEM ARCHITECT GEN2: RAPID DESIGN GENERATION AND ANALYSIS

- GRAPH-BASED DEPENDENCE ANALYSIS FOR DESIGN VERIFICATION
- HARDWARE MODULES ARE DEFINED AS NODES IN THE DAG INSIDE COREGEN
  - DEPENDENCE GRAPH PROCESSED IN MULTIPLE STAGES TO EXPOSE CONNECTIVITY, COMMUNICATION LINKS, INSTRUCTION AND REGISTER ENCODINGS
- MULTIPLE PASSES PERFORMED ON THE DAG FOR DESIGN VERIFICATION
  - ELIMINATION OF DANGLING NODES / REGIONS
    - EARLY DETECTION TO AVOID ISSUES WITH LATER SYNTHESIS OF THE DESIGN
  - ISA ENCODING COLLISION CHECK
    - EARLY DETECTION WITH DECODE LOGIC AND ASSEMBLY/DISASSEMBLY
  - COMBINATORIAL LOGIC ANALYSIS
  - USER-DEFINED PASSES COMING SOON



[https://en.wikipedia.org/wiki/Component\\_\(graph\\_theory\)](https://en.wikipedia.org/wiki/Component_(graph_theory))

# SAMPLE PERFORMANCE

- HETEROGENEOUS, TILED DESIGN WITH MULTI-LEVEL CACHES AND NOC
  - MULTIPLE ISA's WITH RISC CORES AND ATTACHED ACCELERATORS
- DESIGN SCALED FROM 1 TO 511 TILES
- **FULL VERIFICATION RUN OF 511 CORES < 1 SEC**



# WEB LINKS

- SYSTEM ARCHITECT PUBLIC WEB
  - <HTTP://WWW.SYSTEMARCHITECT.TECH/>
- DOCUMENTATION
  - LATEST IR SPECIFICATION:
    - <HTTP://WWW.SYSTEMARCHITECT.TECH/INDEX.PHP/COREGENIRSPEC/>
- TUTORIALS
  - <HTTP://WWW.SYSTEMARCHITECT.TECH/INDEX.PHP/TUTORIALS/>
  - <HTTPS://GITHUB.COM/OPENSOCSYSARCH/COREGENTUTORIALS>

# SOURCE CODE

- MAIN SOURCE CODE HOSTED ON GITHUB:
  - [HTTPS://GITHUB.COM/OPENSOCSYSARCH](https://github.com/opensocsysarch)
- COREGEN INFRASTRUCTURE
  - [HTTPS://GITHUB.COM/OPENSOCSYSARCH/COREGEN](https://github.com/opensocsysarch/CoreGen)
- COREGENPORTAL GUI
  - [HTTPS://GITHUB.COM/OPENSOCSYSARCH/COREGENPORTAL](https://github.com/opensocsysarch/CoreGenPortal)
- COREGEN IR SPEC
  - [HTTPS://GITHUB.COM/OPENSOCSYSARCH/COREGENIRSPEC](https://github.com/opensocsysarch/CoreGenIRSpec)
- SYSTEM ARCHITECT WEEKLY DEVELOPMENT RELEASES
  - [HTTPS://GITHUB.COM/OPENSOCSYSARCH/SYSTEMARCHITECTRELEASE](https://github.com/opensocsysarch/SystemArchitectRelease)

# LIVE SYSTEM ARCHITECT WALK THROUGH

- *LIVE DEMO / INSTRUCTION TIME*
- GOAL: CREATE YOUR OWN RISC-V CORE USING OPENSOC SYSTEM ARCHITECT
- GENERATE CORE THEN REVIEW AND INSPECT:
  - GRAPH OUTPUT
  - POWER ESTIMATIONS
  - HARDWARE, COMPILER AND OTHER GENERATED OUTPUTS

# LIVE SYSTEM ARCHITECT WALK THROUGH

- *LIVE DEMO / INSTRUCTION TIME*
- EXTEND THE RISC-V RV32I INSTRUCTION BY ADDING A CUSTOM INSTRUCTION DEFINITION
  - EXAMPLE DEFINITION PROVIDED, OR MAKE YOUR OWN
- RE-GENERATE CORE AND COMPILER
  - INSPECT GRAPH OUTPUT AND NOTE NEW INSTRUCTIONS
  - INSPECT AND COMPARE POWER OUTPUT
  - INSPECT AND MODIFY PIPELINE OPTIONS

# SUMMARY

- SPECIALIZATION IS KEY TO CONTINUED PERFORMANCE GAINS
  - CALLS FOR NIMBLE, ACCESSIBLE HARDWARE CO-DESIGN FLOWS
- OPEN-SOURCE FLOWS HAVE MATURED TO ENABLE CUSTOM DESIGNS
- FABRICATION COSTS HAVE DROPPED TO \$0 FOR OPEN DESIGNS
- OPENSoC SYSTEM ARCHITECT ENABLES NEXT-GENERATION CO-DESIGN FLOWS
  - PROVIDES THE NECESSARY HARDWARE AND COMPILER GENERATION INFRASTRUCTURE



DAVID DONOFRIO, CHIEF HARDWARE ARCHITECT

JOHN D. LEIDEL, CHIEF SCIENTIST

{DDONOFRIO,JLEIDEL}@TACTCOMPLABS.COM