

# **Designing Languages for Designing Hardware**

Adrian Sampson

Cornell



How do we harness the power of computing?

What are the secrets of human intelligence?

How do computers work? What is computation, really?

What is a computer?






What is computation?

“The complexity for minimum component costs has increased at a rate of roughly a factor of two per year.”



**Gordon Moore**  
co-founder of Intel  
1965

**The size of a single  
transistor decreases by  
half every 18 months.**




**and cost, and power...**








[Figure: a timeline (not to scale) running from "time immemorial" through 2005 to 2015. Labels, in order: "free lunch: exponential single-threaded performance scaling!" and "we'll scale the number of cores instead".]



[Figure: CPU core frequency (log scale, 1 MHz to 10 GHz) versus year of introduction (1970 to 2014).]









The performance returns from Moore's Law ended in 2015!

The only way forward is to trade off generality for efficiency!

# A New Golden Age of Computing

Domain-Specific Hardware,  
Enhanced Security, Open Instruction Sets,  
Agile Chip Development

John L. Hennessy and David A. Patterson

CATAPULT



DOUG BURGER (MICROSOFT, FORMERLY UT)

# A Reconfigurable Fabric for Accelerating Large-Scale Datacenter Services

Andrew Putnam   Adrian M. Caulfield   Eric S. Chung   Derek Chiou<sup>1</sup>  
Kypros Constantinides<sup>2</sup>   John Demme<sup>3</sup>   Hadi Esmaeilzadeh<sup>4</sup>   Jeremy Fowers  
Gopi Prashanth Gopal   Jan Gray   Michael Haselman   Scott Hauck<sup>5</sup>   Stephen Heil  
Amir Hormati<sup>6</sup>   Joo-Young Kim   Sitaram Lanka   James Larus<sup>7</sup>   Eric Peterson  
Simon Pope   Aaron Smith   Jason Thong   Phillip Yi Xiao   Doug Burger

Microsoft

23  
AUTHORS!

## Abstract

*Datacenter workloads demand high computational capabilities, flexibility, power efficiency, and low cost. It is challenging to improve all of these factors simultaneously. To advance datacenter capabilities beyond what commodity server designs can provide, we have designed and built a composable, reconfigurable fabric to accelerate portions of large-scale software services. Each instantiation of the fabric consists of a 6x8 2-D torus of high-end Stratix V FPGAs embedded into a half-rack of 48 machines. One FPGA is placed into each server, accessible through PCIe, and wired directly to other FPGAs with pairs of 10 Gb SAS cables.*

*In this paper, we describe a medium-scale deployment of this fabric on a bed of 1,632 servers, and measure its efficacy in accelerating the Bing web search engine. We describe the requirements and architecture of the system, detail the*

desirable to reduce management issues and to provide a consistent platform that applications can rely on. Second, datacenter services evolve extremely rapidly, making non-programmable hardware features impractical. Thus, datacenter providers are faced with a conundrum: they need continued improvements in performance and efficiency, but cannot obtain those improvements from general-purpose systems.

Reconfigurable chips, such as Field Programmable Gate Arrays (FPGAs), offer the potential for flexible acceleration of many workloads. However, as of this writing, FPGAs have not been widely deployed as compute accelerators in either datacenter infrastructure or in client devices. One challenge traditionally associated with FPGAs is the need to fit the accelerated function into the available reconfigurable area. One could virtualize the FPGA by reconfiguring it at run-time to support more functions than could fit into a single device. However, current reconfiguration times for standard FPGAs

# **RTL**

register-transfer level

Verilog   VHDL   Bluespec   Chisel



**Actual Silicon**

**C**


# High-Level Synthesis

**RTL**

register-transfer level

image and video processing, financial analytics, biomimetics, and scientific computing applications. Since RTL programming in VHDL or Verilog is unacceptable to most application software developers, it is essential to provide a highly automated compilation/synthesis flow from C/C++ to FPGAs.

As a result, a growing number of FPGA designs are

**Verilog  
is unacceptable**



**we must program  
FPGAs in C**

# HLS

An enormous series of *ad hoc* consistency checks, hacks, and workarounds to compile some C programs to Verilog.

# Seashell

A new language for hardware accelerator design with a **type system** that defines which programs are realizable on FPGAs.

```
int A[10];
int B[10];
for (int i = 0; i < 10; i++) {
    int x = A[i];
    int y = x * 5;
    B[i] = y;
}
```





```
// Ask the tool to split each array into 5 banks.
#pragma HLS ARRAY_PARTITION variable=A factor=5
#pragma HLS ARRAY_PARTITION variable=B factor=5

int A[10];
int B[10];
for (int i = 0; i < 10; i++) {
    // Ask the tool to replicate the loop body 5 times.
    #pragma HLS UNROLL factor=5
    int x = A[i];
    int y = x * 5;
    B[i] = y;
}
```
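To make the intent of these pragmas concrete, here is a hand-written sketch in plain C of the same loop after unrolling by 5, with each array split into 5 banks. The cyclic layout (flat element `i` in bank `i % 5`), the function names, and the `BANKS`/`DEPTH` constants are assumptions for illustration; a real HLS tool emits hardware, not C like this.

```c
#define BANKS 5  /* partition factor; assumed cyclic layout */
#define DEPTH 2  /* elements per bank: 10 / 5 */

/* The original loop over flat 10-element arrays. */
void scale_flat(const int *A, int *B) {
    for (int i = 0; i < BANKS * DEPTH; i++) {
        B[i] = A[i] * 5;
    }
}

/* The same loop unrolled by 5 over banked arrays. Flat element i is
 * assumed to live in bank i % BANKS at offset i / BANKS, so the five
 * statements in one iteration each touch a different bank. */
void scale_banked(int A[BANKS][DEPTH], int B[BANKS][DEPTH]) {
    for (int d = 0; d < DEPTH; d++) {
        B[0][d] = A[0][d] * 5;
        B[1][d] = A[1][d] * 5;
        B[2][d] = A[2][d] * 5;
        B[3][d] = A[3][d] * 5;
        B[4][d] = A[4][d] * 5;
    }
}
```

If the unroll factor and the partition factor disagree, two of the unrolled statements land on the same bank and can no longer proceed in parallel, which is the kind of mismatch the pragmas alone do not rule out.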




# Memory types

```
memory A : int[10];  
for (...) {  
    access A[i];  
    access A[i+1];  
}
```

# Affine memory types

```
memory A : int[10];
for (...) {
    access A[i];
    access A[i+1];  // error: A already used in this context
}
```

**Affine types and linear types**, as made famous recently by **Rust**.
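One way to picture the affine rule is to treat each memory as a token that an access consumes and that only comes back when the context (here, one loop iteration) ends. The sketch below is a small runtime model of that idea in C, not Seashell's type checker, which rejects such programs statically; `mem_port`, `port_access`, and `port_next_context` are hypothetical names.

```c
#include <stdbool.h>

/* One single-ported memory: at most one access per context. */
typedef struct {
    bool used;  /* has the port been consumed in this context? */
} mem_port;

/* Try to consume the port. Returns true on success and false if the
 * memory was already used in the current context. */
bool port_access(mem_port *m) {
    if (m->used) {
        return false;  /* "A already used in this context" */
    }
    m->used = true;
    return true;
}

/* Start a new context (for example, the next loop iteration):
 * the port becomes available again. */
void port_next_context(mem_port *m) {
    m->used = false;
}
```

In this model the first `access A[i]` succeeds, the second access in the same iteration fails, and the next iteration starts with a fresh port.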

[Screenshot: the Rust website, rust-lang.org, with the tagline "Empowering everyone to build reliable and efficient software."]


# Banked memory types

```
memory bank(5) A : int[10];
for (let i in 0..1) {
    // A[STATIC][DYNAMIC]: one access to each A[j] allowed here
    access A[0][i];
    access A[1][i];
    access A[2][i];
    access A[3][i];
    access A[4][i];
}
```



# Hybrid indices for unrolling

```
memory bank(5) A : int[10];  
for (let i in 0..9) unroll 5 {  
    access A[??][??];  
}
```

`i : idx<0..5, 0..2>`

A pair of a **static index** from 0 through 4 and a **dynamic index** that's either 0 or 1; both ranges in the type exclude their upper bound.
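A sketch of how such a hybrid index could be modeled, assuming the loop counter in the unroll-by-5 loop splits as `i = 5 * dynamic + static` and that the access on the slide resolves to `A[static][dynamic]`. The type `hybrid_idx` and both helpers are hypothetical, written in C purely for illustration.

```c
#define UNROLL 5  /* unroll factor, equal to the number of banks */

/* Hybrid index: which unrolled copy of the body (static, fixed at
 * compile time) and which pass over the copies (dynamic, varies at
 * run time). */
typedef struct {
    int stat;  /* 0 through UNROLL - 1 */
    int dyn;   /* 0 or 1 for a 10-iteration loop */
} hybrid_idx;

/* Split a flat loop counter into its static and dynamic parts. */
hybrid_idx hybrid_split(int i) {
    hybrid_idx h = { i % UNROLL, i / UNROLL };
    return h;
}

/* Recombine the two parts into the flat loop counter. */
int hybrid_join(hybrid_idx h) {
    return h.dyn * UNROLL + h.stat;
}
```

Because the static part is fixed per unrolled copy, a checker can see at compile time that each copy uses a different bank, while the dynamic part only selects an offset within that bank.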

# Seashell

**C**




# github.com/cucapra/seashell

[Screenshot: the cucapra/seashell GitHub repository (MIT licence), described as "A typed programming language for safe high-level synthesis", with the project URL <https://capra.cs.cornell.edu/fuse>.]



architecture

programming  
languages

