

# Verifiable Computing Project

“How do we build trusted, open hardware?”

Joyce Ng

# Talk Outline

- Why care about verifiable computing?
- Why use FPGAs?
- Existing Works
- FPGA Design Flow
- Toolchains
- Implementation
- Demo
- Planned Projects

# What will NOT be covered

- Computer Architecture
- Digital Design
- VLSI design

# Why care about verifiable computing?



What do these devices have in common?

# Why care about verifiable computing?



They all are computers!

# Why care about verifiable computing?

- End user:  
"How do I know that the devices I use are trustworthy?"
- Chip designer: "How do I know that the chips I get from the fab are exactly as per design and not tampered with?"

# Why care about verifiable computing?



Real life example – iPhone X compromised!

# Why care about verifiable computing?

- Phone known to be compromised as adversary messages contact with things said on a live Zoom call
- Police forensics unable to find entry point nor traces of stalkerware
- Phone not jailbroken by owner
- Many consumer might as well be black boxes to their owners!



# Why care about verifiable computing?



Second example - Simple CMOS Inverter

# Why care about verifiable computing?



(a) Original



(b) Trojan

Try spotting the difference!

# Why care about verifiable computing?

## Intel's Ivy Bridge RNG design



# Why care about verifiable computing?

## Simplified view of the Rate Matcher



- Rater Matcher uses AES in counter mode
- Stage registers k and c contain truly random numbers
- Stage registers k and c are updated after iteration

# Why care about verifiable computing?

## Trojan Rate Matcher



- Modify registers of k so that they output a constant
- Modify 128-n registers of c in the same way
  - ⇒ The output of the RNG depends only on n random bits!
  - ⇒ For n=32 the RNG still passes NIST random number test suit

Secret keys generated using this Trojan RNG insecure

# Why care about verifiable computing?



RNG with dopant level trojan

# Why care about verifiable computing?

random key  $K$ . The attacker has a chance of  $1/2^{128}$  to correctly guess a random number resulting in an attack complexity of 128-bits. The goal of our Trojan is to reduce the attack complexity to  $n$  bits, while being as stealthy as possible. This is achieved by cleverly applying our dopant-based Trojan idea described in Section 2 to internal flip-flops used in the rate matcher. In the first step we modify the internal flip-flops that store  $K$  in a way that  $K$  is set to a constant. In the second step the flip-flops storing  $c$  are modified in the same way, but  $n$  flip-flops of  $c$  are not manipulated. Hence, only  $(128 - n)$  flip-flops of  $c$  are set to a constant value. This has the effect that a 128-bit random number  $r$  depends only on  $n$  random bits and  $128 + (128 - n)$  constant bits known to the Trojan designer.

The owner of the Trojan can therefore predict a 128-bit random number  $r$  with a probability of  $1/2^n$ . This effectively reduces the attack complexity from 128-bit down to  $n$  bits. On the other hand, for an evaluator who does not know the Trojan constants,  $r$  looks random and legitimate since AES generates outputs with very good random properties, even if the inputs only differ in a few bits.

# Why care about verifiable computing?

In [9] it is stated that “This BIST logic avoids the need for conventional on-chip test mechanisms (e.g., scan and JTAG) that could undermine the security of the DRNG.” This fact is also mentioned in an Intel presentation in which it is argued that for security reasons the RNG circuitry should be free of scan chains and test ports [24]. Therefore, to prevent physical attacks, only the BIST should be used to detect manufacturing defects. From an attacker’s point of view, this means that a hardware Trojan that passes the BIST will also pass functional testing. Although Intel’s BIST is very good at detecting manufacturing and aging defects, it turns out that it cannot prevent our dopant Trojans. One simple approach to overcome the BIST would be to add a dopant Trojan into the BIST itself to constantly disable the error flag. However, it could be very suspicious if the BIST never reports any manufacturing defects.

To pass the BIST, the Trojan rate matcher needs to generate outputs  $r'_1, \dots, r'_4$  during the BIST that have the same 32-bit CRC checksum as the correct outputs  $r_1, \dots, r_4$ . Since the input to the rate matcher during the BIST is known, the Trojan designer can compute the expected 32-bit CRC checksum. He then only needs to find a suitable value for the Trojan constants  $c[1 : 128]$  and  $K[1 : 128 - n]$ , which generate the correct CRC checksum for the inputs provided during the BIST. Since the chance that two outputs have the same 32-bit CRC is  $1/2^{32}$ , the attacker only needs  $2^{32}/2$  tries on average to find values for  $c$  and  $K$  that result in the expected 32-bit CRC. This can easily be done by simulation. By cleverly choosing  $c$  and  $K$  the Trojan now passes the BIST, while the BIST will still detect manufacturing and aging defects and therefore raises no suspicion.

# Why care about verifiable computing?

- Dopant level attacks cannot be found even with Scanning Electron Microscopy!
- Only way to detect is to compare production chips with “golden chip”
- Hard to know if chip is even tampered unless you already are looking out for a specific attack!
- Difficult to inspect fab process; easy for state level actor to carry out said attack

# Why care about verifiable computing?

- Unfortunately, an open source chip fabrication process doesn't yet exist
- Work is being done in that space. (see image!)
- What can we do for now?



# Sam Zeloof DIY Z2 Chip – Photos!



# Sam Zeloof Chip Lab



# Why use FPGAs?

- General Purpose device
- FPGA used in various devices and applications, thus difficult to tamper in silicon
- Allows a user to verify the correct operation of a device by inspecting HDL (Hardware Description Language) code.
- User can customize/hack their devices if desired.

# Why use FPGAs? - FPGA Architecture



General Structure of a FPGA

# Why use FPGAs? - FPGA Architecture



Lattice iCE40 Internal Block Diagram

# Why use FPGAs? - FPGA Architecture



Lattice iCE40 – Programmable Logic Block

# Why use FPGAs? - FPGA Architecture



Lattice iCE40 – I/O Bank and I/O Cell

# Existing Works

## Precursor

Mobile, Open Hardware, RISC-V System-on-Chip (SoC) Development Kit

- FPGA Boards
- Mobile Devices
- Security & Privacy



\$443,557 raised  
of \$220,000 goal

201% Funded!

Order Below

31  
updates      Dec 15 2020  
funded on

636  
backers

In stock. Order now, ships within three business days.

\$590

[View Purchasing Options](#)



## Bunnie Studios Precursor

# Existing Works



Precursor – Block Diagram (Mainboard)

# Existing Works



PRECUSOR™  
SoC FPGA Block Diagram  
October 2020

## Precursor – Block Diagram (SoC)

# FPGA Design Flow

## Xilinx Design Flow



# FPGA Design Flow - Processes

- **Synthesis:**  
Converts HDL code into netlist file describing logical connections between blocks.
- **Place and Route (PnR):**  
Takes synthesized design and turns it into a physical implementation based on the target FPGA device.
- **Bitstream Generation:**  
Generates a bitstream file containing instructions for Configurable Logic Blocks (CLB) on target device.

# Toolchains – Process Tools

- Synthesis - Yosys
- Place and Route - nextpnr
- Bitstream Generation – various projects under F4PGA

# Toolchains - F4PGA

- Aims to be the “GCC” for FPGAs
- Open source FPGA toolchain
- Focused on FPGA architecture definitions
- Supports Xilinx 7-Series, Lattice iCE40 and ECP5, as well as QuickLogic EOS-S3

# Toolchains - F4PGA



# Toolchains - F4PGA

|                                 | Project<br>Icestorm | Project<br>Trellis | Project<br>X-Ray | QuickLogic<br>Database |
|---------------------------------|---------------------|--------------------|------------------|------------------------|
| <b>Basic Tiles:</b><br>- Logic  | ✓                   | ✓                  | ✓                | ✓                      |
| - Block RAM                     | ✓                   | ✓                  | ✗                | ✓                      |
| <b>Advanced Tiles:</b><br>- DSP | ✓                   | ✓                  | ✗                | ✓                      |
| - Hard Blocks                   | ✓                   | ✓                  | ✗                | ✓                      |
| - Clock Tiles                   | ✓                   | ✓                  | ✓                | ✓                      |
| - IO Tiles                      | ✓                   | ✓                  | ✓                | ✓                      |
| <b>Routing:</b><br>- Logic      | ✓                   | ✓                  | ✓                | ✓                      |
| - Clock                         | ✓                   | ✓                  | ✓                | ✓                      |

F4PGA Device Support Status

# Toolchains - LiteX

- Framework for building FPGA SoCs
- Supports building various RISC-V and other soft core CPUs
- Uses Migen, a Python based HDL
- Easily customize your own SoC design with various peripheral cores (serial, PCIe, Ethernet, etc)

# Toolchains - LiteX



# Toolchains - LiteX



Enjoy Digital  
Do

Linux SoC on Acorn CLE215+

Built with



Example SoC design on Acorn CLE-215+ board

# Toolchains - LiteX

```
22     ("cclk200", 0,
23         Subsignal("p", Pins("J19"), IOStandard("DIFF_SSTL15")),
24         Subsignal("n", Pins("H19"), IOStandard("DIFF_SSTL15"))
25     ),
26
27     # Leds.
28     ("user_led", 0, Pins("G3"), IOStandard("LVCMOS33")),
29     ("user_led", 1, Pins("H3"), IOStandard("LVCMOS33")),
30     ("user_led", 2, Pins("G4"), IOStandard("LVCMOS33")),
31     ("user_led", 3, Pins("H4"), IOStandard("LVCMOS33")),
32
33     # SPIFlash.
34     ("flash_cs_n", 0, Pins("T19"), IOStandard("LVCMOS33")),
35     ("flash", 0,
36         Subsignal("mosi", Pins("P22")),
37         Subsignal("miso", Pins("R22")),
38         Subsignal("wp", Pins("P21")),
39         Subsignal("hold", Pins("R21")),
40         IOStandard("LVCMOS33")
41     ),
42
43     # PCIe.
44     ("pcie_clkreq_n", 0, Pins("G1"), IOStandard("LVCMOS33")),
45     ("pcie_x4", 0,
46         Subsignal("rst_n", Pins("J1"), IOStandard("LVCMOS33"), Misc("PULLUP=TRUE")),
47         Subsignal("clk_p", Pins("F6")),
48         Subsignal("clk_n", Pins("E6")),
49         Subsignal("rx_p", Pins("B10 B8 D11 D9")),
50         Subsignal("rx_n", Pins("A10 A8 C11 C9")),
51         Subsignal("tx_p", Pins("B6 B4 D5 D7")),
52         Subsignal("tx_n", Pins("A6 A4 C5 C7")),
53     ),
54
55     # DDR3 SDRAM.
56     ("ddram", 0,
57         Subsignal("a", Pins(
58             "M15 L21 M16 L18 K21 M18 M21 N20",
59             "M20 N19 J21 M22 K22 N18 N22 J22"),
60             IOStandard("TTI_1.2V_DDR3")))
```

Acorn CLE-215+ board definition

# Implementation - Processor

- Why RISC-V?
- Open Source ISA
- <50 instructions for RV32I
- Frozen Base Instructions + Standard Instructions
- Custom extensions possible



# Implementation - Processor

RV32I Base Instruction Set

|               |                       |      |       |             |        |
|---------------|-----------------------|------|-------|-------------|--------|
|               | imm[31:12]            |      | rd    | 0110111     | LUI    |
|               | imm[31:12]            |      | rd    | 0010111     | AUIPC  |
|               | imm[20:10:1 11 19:12] |      | rd    | 1101111     | JAL    |
|               | imm[11:0]             | rs1  | 000   | rd          | JALR   |
| imm[12:10:5]  | rs2                   | rs1  | 000   | imm[4:1 11] | BEQ    |
| imm[12:10:5]  | rs2                   | rs1  | 001   | imm[4:1 11] | BNE    |
| imm[12:10:5]  | rs2                   | rs1  | 100   | imm[4:1 11] | BLT    |
| imm[12:10:5]  | rs2                   | rs1  | 101   | imm[4:1 11] | BGE    |
| imm[12:10:5]  | rs2                   | rs1  | 110   | imm[4:1 11] | BLTU   |
| imm[12:10:5]  | rs2                   | rs1  | 111   | imm[4:1 11] | BGEU   |
| imm[11:0]     |                       | rs1  | 000   | rd          | LB     |
| imm[11:0]     |                       | rs1  | 001   | rd          | LH     |
| imm[11:0]     |                       | rs1  | 010   | rd          | LW     |
| imm[11:0]     |                       | rs1  | 100   | rd          | LBU    |
| imm[11:0]     |                       | rs1  | 101   | rd          | LHU    |
| imm[11:5]     | rs2                   | rs1  | 000   | imm[4:0]    | SB     |
| imm[11:5]     | rs2                   | rs1  | 001   | imm[4:0]    | SH     |
| imm[11:5]     | rs2                   | rs1  | 010   | imm[4:0]    | SW     |
| imm[11:0]     |                       | rs1  | 000   | rd          | ADDI   |
| imm[11:0]     |                       | rs1  | 010   | rd          | SLTI   |
| imm[11:0]     |                       | rs1  | 011   | rd          | SLTIU  |
| imm[11:0]     |                       | rs1  | 100   | rd          | XORI   |
| imm[11:0]     |                       | rs1  | 110   | rd          | ORI    |
| imm[11:0]     |                       | rs1  | 111   | rd          | ANDI   |
| 0000000       | shamt                 | rs1  | 001   | rd          | SLLI   |
| 0000000       | shamt                 | rs1  | 101   | rd          | SRLI   |
| 0100000       | shamt                 | rs1  | 101   | rd          | SRAI   |
| 0000000       | rs2                   | rs1  | 000   | rd          | ADD    |
| 0100000       | rs2                   | rs1  | 000   | rd          | SUB    |
| 0000000       | rs2                   | rs1  | 001   | rd          | SLL    |
| 0000000       | rs2                   | rs1  | 010   | rd          | SLT    |
| 0000000       | rs2                   | rs1  | 011   | rd          | SLTU   |
| 0000000       | rs2                   | rs1  | 100   | rd          | XOR    |
| 0000000       | rs2                   | rs1  | 101   | rd          | SRL    |
| 0100000       | rs2                   | rs1  | 101   | rd          | SRA    |
| 0000000       | rs2                   | rs1  | 110   | rd          | OR     |
| 0000000       | rs2                   | rs1  | 111   | rd          | AND    |
| fm            | pred                  | succ | rs1   | 000         | FENCE  |
| 0000000000000 |                       |      | 00000 | 000         | ECALL  |
| 0000000000001 |                       |      | 00000 | 000         | EBREAK |

# Implementation

The screenshot shows a dark-themed website for SQRL. At the top left is the SQRL logo, which includes a stylized fox head icon and the word "SQRL". At the top right are three navigation links: "ABOUT", "PRODUCTS", and "STORE". Below the navigation is a large image of the Acorn CLE 215 mining accelerator card, which is an M.2 form factor card with a green PCB and a central Xilinx FPGA chip. To the right of the image, the product name "Acorn CLE 215" is displayed in bold, followed by the tagline "Maximize hash rate to power ratio" in a smaller, italicized font. A detailed description follows: "Acorn is the first-ever cryptocurrency mining accelerator card. By using best-in-class Xilinx FPGA chips in an M.2 slot, Acorn CLE-215 does everything the CLE-101 does while providing a boost to core-intensive mining using more than double the FPGA logic. Acorn CLE-215 makes mining algorithms like X16r and Lyra2Rev2 on GPUs more efficient." Below this is a section titled "ALGORITHMS" with a list of supported mining algorithms: ETHASH, CRYPTONIGHT V1, LYRA2REV2, and X16R. The price is listed as "\$274.99 USD". A red button at the bottom right of this section says "ORDER NOW".

## Acorn SQRL CLE-215+

# Implementation



Acorn SQRL CLE-215+ (Block Diagram)

# Implementation

Artix-7 FPGA Product Table

|                          | XC7A12T | XC7A15T | XC7A25T | XC7A35T | XC7A50T | XC7A75T | XC7A100T | XC7A200T |
|--------------------------|---------|---------|---------|---------|---------|---------|----------|----------|
| COMPARE                  |         |         |         |         |         |         |          |          |
| Reset                    |         |         |         |         |         |         |          |          |
| XC7A35T                  |         |         |         |         |         |         |          |          |
| Logic Cells              | 33,280  | 52,160  | 75,520  | 101,440 | 215,360 |         |          |          |
| DSP Slices               | 90      | 120     | 180     | 240     | 740     |         |          |          |
| Memory                   | 1,800   | 2,700   | 3,780   | 4,860   | 13,140  |         |          |          |
| GTP 6.6Gb/s Transceivers | 4       | 4       | 8       | 8       | 16      |         |          |          |
| I/O Pins                 | 250     | 250     | 300     | 300     | 500     |         |          |          |

Xilinx Artix-7 Family

# Implementation - Setup





# Demo

# Future Projects

- RISC-V Laptop:  
Verifiable laptop design using RISC-V on FPGA for processor.
- Airwave Project:  
Investigation into cellular device security; with focus on cellular communications. Possible phone implementation based on Software Defined Radio (SDR) for baseband?

# Future Projects



Messing with postmarketOS for Airwave Project

# References and Resources

- Precursor Wiki: <https://github.com/betrusted-io/betrusted-wiki/wiki>
- F4PGA: <https://f4pga.org/>
- LiteX: <https://github.com/enjoy-digital/litex>
- Bootstrapping a Libre, Self-Hosting RISC-V Computer: <https://open-src-soc.org/2021-03/media/slides/3rd-RISC-V-Meeting-2021-03-31-13h30-Gabriel-Somlo.pdf>

# References and Resources

- Stealthy Dopant-Level Hardware Trojans:  
<https://sharps.org/wp-content/uploads/BECKER-CHES.pdf>

# Where to find me:

- Twitter: @quantumcatgirl  
Mastodon: @sleepyowl@chaos.social

For direct contact:

Email: joyce\_ng@hyantechnologies.com

Discord: Schrödinger's Catgirl (Joyce)#3724



# Questions?