

# Avoiding Leaks in Hardware and Software

## My Journey Towards Full-Stack Security



**Klaus v. Gleissenthall**

Assistant Professor

Computers know our **secrets**,  
but will they keep them **safe**?



**Side-channels leak secrets  
indirectly**



# Timing Leaks



# Timing Leaks



# Timing Leaks



# Cache Leaks



# Cache Leaks



# Cache Leaks



# Side-Channels

Used to break:



RSA



DB



Neural Nets



# Programming Model

(1) div  secret



Great! And now my  
programs are safe?

(2) if  secret { ... }



(3) a [  secret ]



# Past Failures



# Past Failures

if



Already  
4 new ones so far,  
in 2023



a [ ]

secret



# My Main Idea



We need to know if,  
when, and where the  
hardware leaks.



Verify hardware is  
safe under certain  
assumptions.

# My Journey

- [1] Iodine: What's timing in HW. **Usenix Security'19.**
- [2] Xenon: Help non-experts find assumptions. **CCS'21.**
- [3] LeaVe: Assumptions at Software level. **CCS'23.**  
**Distinguished paper.**



# My Journey

- [1] Iodine: What's timing in HW. **Usenix Security'19.**
- [2] Xenon: Help non-experts find assumptions. **CCS'21.**
- [3] LeaVe: Assumptions at Software level. **CCS'23.**  
**Distinguished paper.**



# Iodine: Timing in Hardware



Verilog



Iodine

constant time



fast



fast



fast



slow



# Problem: What does *time* mean in hardware?



# Problem: What does *time* mean in hardware?



Input/Output Stream

*Inputs over Multiple Cycles*

*Outputs over Multiple Cycles*

# Problem: What does *time* mean in hardware?



# Problem: Input/Output Stream



# Problem: Input/Output Stream



Which input belongs to which output?

# Problem: Input/Output Stream & Pipelining



Which input belongs to which output?

# Problem: Input/Output Stream & Pipelining



Which input belongs to which output?

# Problem: What does *time* mean in hardware?



Input/Output Stream

*Inputs over Multiple Cycles*

*Outputs over Multiple Cycles*

# Problem: *Inputs* over Multiple Cycles



Computing Hash: Multiple Inputs

# Input/Output Stream



*Inputs over Multiple Cycles*

*Outputs over Multiple Cycles*

# Problem: *Outputs over Multiple Cycles*

|        |    |     |
|--------|----|-----|
| cycle  | 1  | ... |
| number | 10 | ... |
| output | 0  | ... |

Countdown: Multiple Outputs

# Problem: *Outputs* over Multiple Cycles

| cycle  | 1  | 2  | ... |
|--------|----|----|-----|
| number | 10 | 0  | ... |
| output | 0  | 10 | ... |

Countdown: Multiple Outputs

# Problem: *Outputs* over Multiple Cycles

| cycle  | 1  | 2  | 3 | ... |
|--------|----|----|---|-----|
| number | 10 | 0  | 0 | ... |
| output | 0  | 10 | 9 | ... |

Countdown: Multiple Outputs

# Problem: *Outputs* over Multiple Cycles

| cycle  | 1  | 2  | 3 | 4 | ... |
|--------|----|----|---|---|-----|
| number | 10 | 0  | 0 | 0 | ... |
| output | 0  | 10 | 9 | 8 | ... |

Countdown: Multiple Outputs

# Problem: *Outputs over Multiple Cycles*



Countdown: Multiple Outputs

# Which input belongs to which output?



Input/Output Stream

*Inputs over Multiple Cycles*

*Outputs over Multiple Cycles*

# Idea: IODINE

Assign a **color** to all inputs at the same *cycle*  
... and track the flow of **colors** in the circuit

# Assign a **color** to all inputs at the same **cycle**

| cycle   | 1                                                                                     | 2                                                                                     | 3                                                                                     | ... |
|---------|---------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------|-----|
| message |  |  |  | ... |
| key     |  |  |  | ... |
| output  |  |  |  | ... |

... and track the flow of **colors** in the circuit

# Assign a **color** to all inputs at the same **cycle**



... and track the flow of **colors** in the circuit

# Assign a **color** to all inputs at the same **cycle**

| cycle   | 1 | 2 | 3 | 4 | ... |
|---------|---|---|---|---|-----|
| message |   |   |   |   | ... |
| key     |   |   |   |   | ... |
| output  |   |   |   |   | ... |

... and track the flow of **colors** in the circuit

# Assign a **color** to all inputs at the same **cycle**



... and track the flow of **colors** in the circuit

# Assign **color** to inputs at the same **cycle**

| cycle  | 1  | 2 | 3 | 4 | ... |
|--------|----|---|---|---|-----|
| number | 10 |   |   |   | ... |
| output | 0  |   |   |   | ... |

... and track the flow of **colors** in the circuit

# Assign **color** to inputs at the same **cycle**

| cycle  | 1  | 2  | 3 | 4 | ... |
|--------|----|----|---|---|-----|
| number | 10 | 0  |   |   | ... |
| output | 0  | 10 |   |   | ... |

... and track the flow of **colors** in the circuit

# Assign **color** to inputs at the same **cycle**

| cycle  | 1  | 2  | 3 | 4 | ... |
|--------|----|----|---|---|-----|
| number | 10 | 0  | 0 |   | ... |
| output | 0  | 10 | 9 |   | ... |

... and track the flow of **colors** in the circuit

# Assign **color** to inputs at the same **cycle**

| cycle  | 1  | 2  | 3 | 4 | ... |
|--------|----|----|---|---|-----|
| number | 10 | 0  | 0 | 0 | ... |
| output | 0  | 10 | 9 | 8 | ... |

... and track the flow of **colors** in the circuit

# Constant Time Definition via *Coloring*

Constant time:

if for *any* two executions,  
outputs have the **same colors**



# Catching a Timing Attack via Coloring



# Catching a Timing Attack via Coloring



# Catching a Timing Attack via Coloring

Not constant time!



# Catching a Timing Attack via Coloring

Not constant time!



... fixing the circuit, yields

# Catching a Timing Attack via Coloring



... fixing the circuit, yields

# Constant Time via Coloring



... fixing the circuit, yields

# Constant Time via Coloring



# Constant Time via Coloring



# Constant Time via Coloring



# **IODINE**: What's timing in Hardware?

*Specifying Constant time for Hardware*

*Verifying Constant time for Hardware*

*Applying IODINE*

# **IODINE**: What's timing in Hardware?

*Specifying Constant time for Hardware*

*Verifying Constant time for Hardware*

*Applying IODINE*

# Example: MUL x y



fast-path, if x or y is zero

# Example: MUL x y



if ct is set, don't take fast path

# Example: MUL x y



Implemented in ARM DIT

# Example: MUL x y

```
assign iszero = (x==0 || y==0);

always @(posedge clk) begin
    if (iszero && !ct)
        out <= 0;
    else
        out <= flp_res;
end

always @(posedge clk) begin
    flp_res <= ... //compute x*y
end
```

# Always Blocks

```
assign iszero = (x==0 || y==0);
```

```
always @(posedge clk) begin
    if (iszero && !ct)
        out <= 0;
    else
        out <= flp_res;
end
```

```
always @(posedge clk) begin
    flp_res <= ... //compute x*y
end
```

Run in parallel

# Always Blocks

```
assign iszero = (x==0 || y==0);  
  
always @(posedge clk) begin  
    if (iszero && !ct)  
        out <= 0;  
    else  
        out <= flp_res;  
end  
  
always @(posedge clk) begin  
    flp_res <= ... //compute x*y  
end
```

Run in parallel, shared clock

# Always Blocks

```
assign iszero = (x==0 || y==0);

always @ (posedge clk) begin
    if (iszero && !ct)
        out <= 0;
    else
        out <= flp_res;
end

always @ (posedge clk) begin
    flp_res <= ... //compute x*y
end
```

Write to different variables

# Nonblocking Assignment

```
assign iszero = (x==0 || y==0);

always @ (posedge clk) begin
    if (iszero && !ct)
        out <= 0;
    else
        out <= flp_res;
end

always @ (posedge clk) begin
    flp_res <= ... //compute x*y
end
```

LHS gets update in next cycle

# Continuous Assignment

```
assign iszero = (x==0 || y==0);  
  
always @ (posedge clk) begin  
    if (iszero && !ct)  
        out <= 0;  
    else  
        out <= flp_res;  
end  
  
always @ (posedge clk) begin  
    flp_res <= ... //compute x*y  
end
```

LHS updated as soon as RHS changes

# Take fast path, if an input is zero and not ct

```
assign iszero = (x==0 || y==0);

always @ (posedge clk) begin
    if (iszero && !ct)
        out <= 0;
    else
        out <= flp_res;
end

always @ (posedge clk) begin
    flp_res <= ... //compute x*y
end
```

# Take fast path, if an input is zero and not ct

```
assign iszero = (x==0 || y==0);

always @ (posedge clk) begin
    if (iszero && !ct)
        out <= 0;
    else
        out <= flp_res;
end

always @ (posedge clk) begin
    flp_res <= ... //compute x*y
end
```

else, assign result from slow path

# Compute slow path result

```
assign iszero = (x==0 || y==0);

always @(posedge clk) begin
    if (iszero && !ct)
        out <= 0;
    else
        out <= flp_res;
end

always @(posedge clk) begin
    flp_res <= ... //compute x*y
end
```

# Example: MUL x y with ct=0



Slow path

# Example: MUL x y with ct=0

|     | cycle 1 | 2 | ... | k |
|-----|---------|---|-----|---|
| x   | 1       | 1 |     |   |
| y   | 1       | 1 |     |   |
| out | ⊥       | ⊥ |     |   |

Slow path

# Example: MUL x y with ct=0

|     | cycle 1 | 2 | ... | k |
|-----|---------|---|-----|---|
| x   | 1       | 1 | ... | 1 |
| y   | 1       | 1 | ... | 1 |
| out | 1       | 1 | ... | 1 |

Slow path

# Example: MUL x y with ct=0

|     | cycle 1 | 2 | ... | k |
|-----|---------|---|-----|---|
| x   | 1       | 1 | ... | 1 |
| y   | 1       | 1 | ... | 1 |
| out | ⊥       | ⊥ | ... | 1 |

Slow path

|     | cycle 1 | 2 | ... | k |
|-----|---------|---|-----|---|
| x   | 0       |   |     |   |
| y   | 1       |   |     |   |
| out | ⊥       |   |     |   |

Fast path

# Example: MUL x y with ct=0

|     | cycle 1 | 2 | ... | k |
|-----|---------|---|-----|---|
| x   | 1       | 1 | ... | 1 |
| y   | 1       | 1 | ... | 1 |
| out | ⊥       | ⊥ | ... | 1 |

Slow path

|     | cycle 1 | 2 | ... | k |
|-----|---------|---|-----|---|
| x   | 0       | 0 |     |   |
| y   | 1       | 1 |     |   |
| out | ⊥       | 0 |     |   |

Fast path

# Example: MUL x y with ct=0

|     | cycle 1 | 2 | ... | k |
|-----|---------|---|-----|---|
| x   | 1       | 1 | ... | 1 |
| y   | 1       | 1 | ... | 1 |
| out | ⊥       | ⊥ | ... | 1 |

Slow path

|     | cycle 1 | 2 | ... | k |
|-----|---------|---|-----|---|
| x   | 0       | 0 | ... | 0 |
| y   | 1       | 1 | ... | 1 |
| out | ⊥       | 0 | ... | 0 |

Fast path

# Example: MUL x y with ct=0

Not constant time!

|     | cycle 1 | 2 | ... | k |
|-----|---------|---|-----|---|
| x   | 1       | 1 | ... | 1 |
| y   | 1       | 1 | ... | 1 |
| out | ⊥       | ⊥ | ... | 1 |

|     | cycle 1 | 2 | ... | k |
|-----|---------|---|-----|---|
| x   | 0       | 0 | ... | 0 |
| y   | 1       | 1 | ... | 1 |
| out | ⊥       | 0 | ... | 0 |



# Example: MUL x y with ct=1

|     | cycle 1 | 2 | ... | k |
|-----|---------|---|-----|---|
| x   | 1       | 1 | ... | 1 |
| y   | 1       | 1 | ... | 1 |
| out | ⊥       | ⊥ | ... | 1 |

|     | cycle 1 | 2 | ... | k |
|-----|---------|---|-----|---|
| x   | 0       |   |     |   |
| y   | 1       |   |     |   |
| out | ⊥       |   |     |   |

Slow path

Slow path

# Example: MUL x y with ct=1

|     | cycle 1 | 2 | ... | k |
|-----|---------|---|-----|---|
| x   | 1       | 1 | ... | 1 |
| y   | 1       | 1 | ... | 1 |
| out | ⊥       | ⊥ | ... | 1 |

|     | cycle 1 | 2 | ... | k |
|-----|---------|---|-----|---|
| x   | 0       | 0 |     |   |
| y   | 1       | 1 |     |   |
| out | ⊥       | ⊥ |     |   |

Slow path

Slow path

# Example: MUL x y with ct=1

|     | cycle 1 | 2 | ... | k |
|-----|---------|---|-----|---|
| x   | 1       | 1 | ... | 1 |
| y   | 1       | 1 | ... | 1 |
| out | ⊥       | ⊥ | ... | 1 |

|     | cycle 1 | 2 | ... | k |
|-----|---------|---|-----|---|
| x   | 0       | 0 | 0   | 0 |
| y   | 1       | 1 | 1   | 1 |
| out | ⊥       | ⊥ | ⊥   | 0 |

Slow path

Slow path

# Example: MUL x y with ct=1

|     | cycle 1 | 2 | ... | k |
|-----|---------|---|-----|---|
| x   | 1       | 1 | ... | 1 |
| y   | 1       | 1 | ... | 1 |
| out | ⊥       | ⊥ | ... | 1 |

|     | cycle 1 | 2 | ... | k |
|-----|---------|---|-----|---|
| x   | 0       | 0 | 0   | 0 |
| y   | 1       | 1 | 1   | 1 |
| out | ⊥       | ⊥ | ⊥   | 0 |



# Example: MUL x y with ct=1

|     | cycle 1 | 2 | ... | k |
|-----|---------|---|-----|---|
| x   | 1       | 1 | ... | 1 |
| y   | 1       | 1 | ... | 1 |
| out | ⊥       | ⊥ | ... | 1 |

|     | cycle 1 | 2 | ... | k |
|-----|---------|---|-----|---|
| x   | 0       | 0 | 0   | 0 |
| y   | 1       | 1 | 1   | 1 |
| out | ⊥       | ⊥ | ⊥   | 0 |



Example: MUL x y with ct=1

# Is constant time!

|     | cycle 1 | 2 | ... | k |
|-----|---------|---|-----|---|
| x   | 1       | 1 | ... | 1 |
| y   | 1       | 1 | ... | 1 |
| out | ⊥       | ⊥ | ... | 1 |

|     | cycle 1 | 2 | ... | k |
|-----|---------|---|-----|---|
| x   | 0       | 0 | 0   | 0 |
| y   | 1       | 1 | 1   | 1 |
| out | ⊥       | ⊥ | ⊥   | 0 |



# Verifying Constant Time for Hardware



P

P is **constant time**, if  $ct=1$

# Verifying Constant Time for Hardware

| cyc | 1 | 2 | ... | k |
|-----|---|---|-----|---|
| x   | 1 | 1 | ... | 1 |
| y   | 1 | 1 | ... | 1 |
| out | 1 | 1 | ... | 1 |

| cyc | 1 | 2 | ... | k |
|-----|---|---|-----|---|
| x   | 0 | 0 | ... | 0 |
| y   | 1 | 1 | ... | 1 |
| out | 0 | 1 | ... | 0 |



P

# Verifying Constant Time for Hardware

| cyc | 1 | 2 | ... | k |
|-----|---|---|-----|---|
| x   | 1 | 1 | ... | 1 |
| y   | 1 | 1 | ... | 1 |
| out | 1 | 1 | ... | 1 |

| cyc | 1 | 2 | ... | k |
|-----|---|---|-----|---|
| x   | 0 | 0 | ... | 0 |
| y   | 1 | 1 | ... | 1 |
| out | 1 | 1 | ... | 0 |



$P_L$



$P_R$

# Verifying Constant Time for Hardware

| cyc | 1 | 2 | ... | k |
|-----|---|---|-----|---|
| x   | 1 | 1 | ... | 1 |
| y   | 1 | 1 | ... | 1 |
| out | 1 | 1 | ... | 1 |

| cyc | 1 | 2 | ... | k |
|-----|---|---|-----|---|
| x   | 0 | 0 | ... | 0 |
| y   | 1 | 1 | ... | 1 |
| out | 1 | 1 | ... | 0 |



# Verifying Constant Time for Hardware

| cyc | 1 | 2 | ... | k |
|-----|---|---|-----|---|
| x   | 1 | 1 | ... | 1 |
| y   | 1 | 1 | ... | 1 |
| out | 1 | 1 | ... | 1 |

| cyc | 1 | 2 | ... | k |
|-----|---|---|-----|---|
| x   | 0 | 0 | ... | 0 |
| y   | 1 | 1 | ... | 1 |
| out | 1 | 1 | ... | 0 |



# Verifying Constant Time for Hardware

| cyc | 1 | 2 | ... | k |
|-----|---|---|-----|---|
| x   | 1 | 1 | ... | 1 |
| y   | 1 | 1 | ... | 1 |
| out | 1 | 1 | ... | 1 |

| cyc | 1 | 2 | ... | k |
|-----|---|---|-----|---|
| x   | 0 | 0 | ... | 0 |
| y   | 1 | 1 | ... | 1 |
| out | 0 | 1 | ... | 0 |

$P_L$  | |  $P_R$

assert ( $P_L.out.cols = P_R.out.cols$ )

# Verifying Constant Time for Hardware

| cyc | 1 | 2 | ... | k |
|-----|---|---|-----|---|
| x   | 1 | 1 | ... | 1 |
| y   | 1 | 1 | ... | 1 |
| out | 1 | 1 | ... | 1 |

| cyc | 1 | 2 | ... | k |
|-----|---|---|-----|---|
| x   | 0 | 0 | ... | 0 |
| y   | 1 | 1 | ... | 1 |
| out | 0 | 1 | ... | 0 |

assume (  $P_L^{.ct} = 1 \wedge P_R^{.ct} = 1$  )

$P_L$  ||  $P_R$

assert (  $P_L^{.out.cols} = P_R^{.out.cols}$  )

# Verifying Constant Time for Hardware

assume  $(P_L.ct = 1 \wedge P_R.ct = 1)$

 || 

assert  $(P_L.out.cols = P_R.out.cols)$



$\exists:$   ...

Horn Clauses



# Verifying Constant Time for Hardware



# Verifying Constant Time for Hardware

$\text{init}(P) = \text{true}$

# Verifying Constant Time for Hardware

$$\text{init}(P) = \text{true}$$

$$\text{next}(P, P') = \left( ct=0 \wedge \text{iszzero}=1 \rightarrow \text{out}'=0 \right)$$

# Verifying Constant Time for Hardware

$$\text{init}(P) = \text{true}$$

$$\text{next}(P, P') = \left( \begin{array}{l} \text{ct}=0 \wedge \text{iszzero}=1 \rightarrow \text{out}'=0 \\ \wedge \dots \rightarrow \text{out}'=\text{flp\_res} \end{array} \right)$$

# Verifying Constant Time for Hardware

$$\text{init}(P) = \text{true}$$

$$\begin{aligned} \text{next}(P, P') &= \left( \begin{array}{l} \text{ct}=0 \quad \text{iszzero}=1 \rightarrow \text{out}'=0 \\ \wedge \dots \rightarrow \text{out}'=\text{flp\_res} \end{array} \right) \\ &\wedge \quad \text{flp\_res}' = \dots \end{aligned}$$

# Verifying Constant Time for Hardware

$$\text{init}(P) = \text{true}$$

$$\begin{aligned} \text{next}(P, P') &= \left( \begin{array}{l} \text{ct}=0 \quad \text{iszzero}=1 \rightarrow \text{out}'=0 \\ \wedge \dots \rightarrow \text{out}'=\text{flp\_res} \end{array} \right) \\ &\wedge \quad \text{flp\_res}' = \dots \\ &\wedge \quad \text{iszzero} = (\text{x}=0 \vee \text{y}=0) \end{aligned}$$

# Verifying Constant Time for Hardware

$\exists: \text{inv}(\boxed{P_L}, \boxed{P_R})$

$P_L.\text{ct} = 1 \wedge P_R.\text{ct} = 1 \wedge \text{init}(P_L) \wedge \text{init}(P_R) \rightarrow \text{inv}(\boxed{P_L}, \boxed{P_R})$

$(\frac{P_L.\text{ct} = 1}{P_L.\text{ct} = 1}) \wedge \text{inv}(\boxed{P_L}, \boxed{P_R}) \wedge \text{next}(P_L, P_L') \wedge \text{next}(P_R, P_R') \rightarrow \text{inv}(\boxed{P_L'}, \boxed{P_R'})$

$\text{inv}(\boxed{P_L}, \boxed{P_R}) \rightarrow (P_L.\text{out.cols} = P_R.\text{out.cols})$

Horn Clauses

# Verifying Constant Time for Hardware

$\exists: \text{inv}(\boxed{P_L}, \boxed{P_R})$



Horn Clauses

# Verifying Constant Time for Hardware

$\exists: \text{inv}(\boxed{P_L}, \boxed{P_R})$



# **IODINE**: What's timing in Hardware?

*Specifying Constant time for Hardware*

*Verifying Constant time for Hardware*

*Applying IODINE*

# **IODINE**: What's timing in Hardware?

*Specifying Constant time for Hardware*

*Verifying Constant time for Hardware*

*Applying IODINE*

# Applying IODINE: Benchmarks

|                           |   |                                                                                       |              |
|---------------------------|---|---------------------------------------------------------------------------------------|--------------|
| 472 MIPS                  | ✓ |   | Simple CPUs  |
| Yarvi RISC-V              | ✓ |                                                                                       |              |
| Single precision FPU      | ✓ |                                                                                       |              |
| IEEE 754 FPU              | ✗ |  | ALU/FPU      |
| TIS-CT ALU                | ✓ |                                                                                       |              |
| Opencores Shacore SHA-256 | ✓ |                                                                                       |              |
| Fatestudio RSA 4096 RSA   | ✗ |  | Crypto Cores |

# My Journey

- [1] Iodine: What's timing in HW. **Usenix Security'19.**
- [2] Xenon: Help non-experts find assumptions. **CCS'21.**
- [3] LeaVe: Assumptions at Software level. **CCS'23.**  
**Distinguished paper.**



# My Journey

- [1] Iodine: What's timing in HW. **Usenix Security'19.**
- [2] Xenon: Help non-experts find assumptions. **CCS'21.**
- [3] LeaVe: Assumptions at Software level. **CCS'23.**  
**Distinguished paper.**



# Assumptions in IODINE

## Programming Model

- (1) `div`    
  
- (2) `if`    
  
- (3) `a [`    
  

Processors are  
only safe  
under certain assumptions



# Assumptions in IODINE

if  { ... }   
secret



assume ( $P_L \cdot \text{instruction} = P_R \cdot \text{instruction}$ )

assume ( $P_L \cdot pc = P_R \cdot pc$ )

# Assumptions in IODINE

if  { ... }   
secret



visible ( instruction )  
visible ( pc )

# Problem: How to find Assumptions?

Verilog

**visible(a) visible(b)**

# Problem: How to find Assumptions?



# Problem: How to find Assumptions?



# Problem: How to find Assumptions?



Finding missing assumptions  
makes up most of verification effort

# Xenon: Solver-Aided Assumption Synthesis



# Xenon: Solver-Aided Assumption Synthesis



# Xenon: Solver-Aided Assumption Synthesis



# Xenon: Solver-Aided Assumption Synthesis



# Xenon: Solver-Aided Assumption Synthesis



# Xenon: Solver-Aided Assumption Synthesis





Idea: Trace how variables became *non-ct*



Idea: Trace how variables became *non-ct*



Idea: Trace how variables became *non-ct*



Idea: Trace how variables became *non-ct*



Idea: Trace how variables became *non-ct*



Idea: Trace how variables became *non-ct*



Extract order from failed proof attempt

# Fix by adding Assumption



# Fix by adding Assumption



Find Assumptions via Optimization Problem

# Xenon: Solver-Aided Assumption Synthesis



SCARV-CPU

<https://www.scarv.org/>

- 5 stage, in-order
- side-channel hardened
- RISC-V 32-bit integer base

# Xenon: Solver-Aided Assumption Synthesis



# Xenon: User Study

- 3 Tasks: ALU, FPU, and RISC-V core



% of participants who correctly completed

# Xenon: User Study

- 3 Tasks: ALU, FPU, and RISC-V core



Completed tasks over time

# Xenon: User Study

- 3 Tasks: ALU, FPU, and RISC-V core



# My Journey

- [1] Iodine: What's timing in HW. **Usenix Security'19.**
- [2] Xenon: Help non-experts find assumptions. **CCS'21.**
- [3] LeaVe: Assumptions at Software level. **CCS'23.**  
**Distinguished paper.**



# My Journey

- [1] Iodine: What's timing in HW. **Usenix Security'19.**
- [2] Xenon: Help non-experts find assumptions. **CCS'21.**
- [3] LeaVe: Assumptions at Software level. **CCS'23.**  
**Distinguished paper.**



**Problem:** Assumptions about internal processor state

# Problem: Assumptions about internal processor state



hardware



software

Complicated: How to write software that satisfies assumptions?



# LeaVe: Represent assumptions directly in software



hardware



software

...and formally verify that under the assumptions  
hardware doesn't leak



: If no leakage in software → Guaranteed no leakage in hardware

# Example: Simple Processor

- Single register
- Instructions:

ADD imm

MUL imm

CLR

# Example: Simple Processor

ADD 1    MUL 2    CLR

0

# Example: Simple Processor

ADD 1    MUL 2    CLR

---

0

# Example: Simple Processor

ADD 1    MUL 2    CLR

---



# Example: Simple Processor

ADD 1    MUL 2    CLR



# Example: Simple Processor

ADD 1    MUL 2    CLR



# Example: Simple Processor

ADD 1    MUL 2    CLR



# Example: Simple Processor

ADD 1    MUL 2    CLR





Software level specification is much simpler

# Software



```
1 wire [31:0] instr = imem[pc];
2 assign op = instr[7:0];
3 assign imm = instr[31:8];
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
```

always @ (posedge clk) begin  
 pc <= pc + 1;  
end

always @ (posedge clk) begin  
 case(op)  
 'ADD : register <= register + imm;  
 'MUL : register <= register \* imm;  
 'CLR : register <= 0;  
 end

# Software



```
1 wire [31:0] instr = imem[pc];
2 assign op = instr[7:0];
3 assign imm = instr[31:8];
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
```

always @ (posedge clk) begin  
 pc <= pc + 1;  
end

always @ (posedge clk) begin  
 case(op)  
 'ADD : register <= register + imm;  
 'MUL : register <= register \* imm;  
 'CLR : register <= 0;  
 end

# Software



```
1 wire [31:0] instr = imem[pc];
2 assign op = instr[7:0];
3 assign imm = instr[31:8];
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
```

The code is a Verilog-like script for a simple processor. It starts by defining a wire 'instr' of width 31:0, which is assigned the value of memory at address 'pc'. It then extracts the operation code ('op') from bits 7:0 and the immediate value ('imm') from bits 31:8. The script then contains two always blocks. The first always block, triggered by a positive edge of 'clk', increments the program counter ('pc') by 1. The second always block, also triggered by a positive edge of 'clk', performs a case statement based on the value of 'op'. If 'op' is 'ADD', it adds 'imm' to the current value of 'register'. If 'op' is 'MUL', it multiplies 'register' by 'imm'. If 'op' is 'CLR', it sets 'register' to 0. The case block ends with a default case and a final 'end' keyword.

# Software



```
1 wire [31:0] instr = imem[pc];
2 assign op = instr[7:0];
3 assign imm = instr[31:8];
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
```

```
always @(posedge clk) begin
    pc <= pc + 1;
end

always @ (posedge clk) begin
    case(op)
        'ADD : register <= register + imm;
        'MUL : register <= register * imm;
        'CLR : register <= 0;
    end

```

# Hardware



# Hardware



Fetch

Execute

Write-Back



# Hardware



**ADD 1**   MUL 2   CLR   ADD 3

Fetch

Execute

Write-Back



# Hardware



ADD 1   MUL 2   CLR   ADD 3

Fetch

Execute

Write-Back

ADD 1

No-op

No-op



# Hardware



ADD 1   MUL 2   CLR   ADD 3

Fetch

Execute

Write-Back

MUL 2

ADD 1

No-op



# Hardware



ADD 1   MUL 2   CLR   ADD 3

Fetch                      Execute                      Write-Back



# Hardware



ADD 1    MUL 2    CLR    ADD 3

Fetch

Execute

Write-Back

ADD 3

CLR

MUL 2



# Hardware



ADD 1   MUL 2   CLR   ADD 3

Fetch

Execute

Write-Back



# Hardware



ADD 1   MUL 2   CLR   ADD 3

Fetch

Execute

Write-Back



# Hardware



ADD 1   MUL 2   CLR   ADD 3

Fetch

Execute

Write-Back



# Hardware: Leaky Multiplier



Take multiple cycles

Fast path on 0/1 for imm

Time depends on register



How to describe hardware leakage at software level?

The attacker can see execution time



attacker    processor

Which programs can an attacker distinguish?

# Which programs can an attacker distinguish?

ADD 2    MUL 15

---



Left uses **MUL**

ADD 2    ADD 2

---

 slow, right doesn't  fast

# Which programs can an attacker distinguish?

ADD 2    MUL 1

---



Left takes fast path

ADD 2    MUL 15

---



, right doesn't



# Which programs can an attacker distinguish?

ADD 10    MUL 2



Left register value is higher



ADD 2    MUL 2



than right

*Idea:* Describe leakage via leakage monitor

*Idea:* Describe leakage via leakage monitor



# Idea: Describe leakage via leakage monitor



Make sure the hardware doesn't leak more than specified by the monitor

# LeaVe: Capture leakage via leakage monitor

*Defining the leakage monitor*



*Checking the leakage monitor*

*Verifying the leakage monitor*

*Using the leakage monitor*

} done by LeaVe

# *Defining the leakage monitor*



{

,

,

}

# *Defining the leakage monitor*

- Is the operation MUL?



```
{isMul,  
     ,  
 }
```

# *Defining the leakage monitor*

{isMul, fast,  
}



- Is the operation MUL?
- Did we take the fast-path?

# *Defining the leakage monitor*

{isMul, fast, register}



- Is the operation MUL?
- Did we take the fast-path?
- Register value.

# *Defining the leakage monitor*



{isMul, fast, register}

{F, -, -}

ADD 2

{T, F, 2}

MUL 3

{T, T, 2}

MUL 1

# LeaVe: Capture leakage via leakage monitor

*Defining the leakage monitor*



*Checking the leakage monitor*

*Verifying the leakage monitor*

*Using the leakage monitor*

} done by LeaVe

# Checking the leakage monitor

ADD 2  
MUL 15

ADD 2  
ADD 2



then

ADD 2  
MUL 15

ADD 2  
ADD 2



# Checking the leakage monitor



Left uses MUL



, right doesn't



$\neq$



{F, -, -}

ADD 2

{T, F, 2}

MUL 15

{F, -, -}

ADD 2

{F, -, -}

ADD 2

# Checking the leakage monitor



Left takes fast path



, right doesn't



{F, -, -}

ADD 2

{T, T, 2}

MUL 1

{F, -, -}

ADD 2

{T, F, 2}

MUL 15



# Checking the leakage monitor

Left register value is higher ↓ than right →



≠



{F, -, -}

ADD 10

{T, F, 10}

MUL 2

{F, -, -}

ADD 2

{T, F, 2}

MUL 2

# LeaVe: Capture leakage via leakage monitor

*Defining the leakage monitor*



*Checking the leakage monitor*

*Verifying the leakage monitor*

*Using the leakage monitor*

} done by LeaVe

# Verifying the leakage monitor

ADD 2  
MUL 15

ADD 2  
ADD 2



then

ADD 2  
MUL 15

ADD 2  
ADD 2



# Verifying the leakage monitor

ADD 2  
MUL 15

ADD 2  
ADD 2

ADD 2  
MUL 15

ADD 2  
ADD 2



=



then



=



# *Verifying the leakage monitor*



# *Verifying the leakage monitor*



# Verifying the leakage monitor



# Verifying the leakage monitor

assume (  L =  R )



assert (  L =  R )

2+2 copies. Doesn't scale!

# Verifying the leakage monitor

assume (  $L = R$  when  retires) \*



assert (  $L = R$ )

Only 2 copies. Scales!

\* assuming no functional bugs

# LeaVe: Capture leakage via leakage monitor

*Defining the leakage monitor*



*Checking the leakage monitor*

*Verifying the leakage monitor*

*Using the leakage monitor*

} done by LeaVe

# Using the leakage monitor

ADD 2  
MUL 15

ADD 2  
ADD 2



=



then

ADD 2  
MUL 15

ADD 2  
ADD 2



=



# Evaluating LeaVe

Formally verified how to avoid leakage in software on real processors!

| Processor        | Leakage Description | Checking Time (min) |
|------------------|---------------------|---------------------|
| DarkRISC 2 stage |                     | 7.2                 |
| DarkRISC 3 stage |                     | 11.1                |
| Sodor 2 stage    | +                   | 97.8                |
| Ibex-small       | +                   | 118.7               |
| Ibex-slow-mult   | +  +                | 110.5               |
| Ibex-cache       | +  +                | 139.8               |

: Instructions    : Branch    : Address    : Operand

# My Journey

- [1] Iodine: What's timing in HW. **Usenix Security'19.**
- [2] Xenon: Help non-experts find assumptions. **CCS'21.**
- [3] LeaVe: Assumptions at Software level. **CCS'23.**  
**Distinguished paper.**



# My Journey



# Future Work



# Future Work



# Come join!

