

arm



34<sup>th</sup> Chaos Communication Congress

34c3:huzzah!

# How Can You Trust Formally Verified Software?

Alastair Reid

@alastair\_d\_reid

ARM Research

# Arm Processor Architecture

Widely used in many different areas: phones, tablets, IoT, HDD, ...

Important to understand what they do

Important to be able to analyse malware, security analysis, etc.

April 2011: Started work on formal specifications of ARM processor architectures

April 2017: Public release in machine readable form

<https://developer.arm.com/products/architecture/a-profile/exploration-tools>

Working with REMS @ Cambridge Uni to translate ARM spec to SAIL to HOL/OCaml/...

**3UCB:guuuuuh!**

What can you do with an executable processor specification  
How can you trust formally verified software?

# ARM Machine Readable Architecture Specification

## Instructions

Security features: memory protection, exceptions, privilege checks, TrustZone, ...

## Links

- Official ARM release <https://developer.arm.com/products/architecture/a-profile/exploration-tools>
- HTML files (part of official release) <https://www.meriac.com/archex/>
- Tools to dissect the official release (incl. parser) [https://github.com/alastairreid/mra\\_tools](https://github.com/alastairreid/mra_tools)
- Blog article about release <https://alastairreid.github.io/ARM-v8a-xml-release/>
- Papers
  - “Trustworthy Specifications of the ARM v8-A and v8-M architecture,” [FMCAD 2016](#)
  - “End to End Verification of ARM processors with ISA Formal,” [CAV 2016](#)
  - “Who guards the guards? Formal Validation of ARM v8-M Specifications,” [OOPSLA 2017](#)

# SCTLR, System Control Register

The SCTLR characteristics are:

## Purpose

Provides the top level control of the system, including its memory system.



# SCTLR, System Control Register

The SCTLR characteristics are:

## Purpose

Provides the top level control of the system, including its memory system.

## Field descriptions

The SCTLR bit assignments are:



# SCTLR, System Control Register

The SCTLR characteristics are:

## Purpose

Provides the top level control of the system, including its memory system.

## Field descriptions

The SCTLR bit assignments are:



### WXN, bit [19]

Write permission implies XN (Execute-never). For the PL1&0 translation regime, this bit can force all memory regions that are writable to be treated as XN. The possible values of this bit are:

| WXN | Meaning                                                                                                                         |
|-----|---------------------------------------------------------------------------------------------------------------------------------|
| 0   | This control has no effect on memory access permissions.                                                                        |
| 1   | Any region that is writable in the PL1&0 translation regime is forced to XN for accesses from software executing at PL1 or PL0. |

The WXN bit is permitted to be cached in a TLB.

When this register has an architecturally-defined reset value, this field resets to 0.

MRC p15, 0, R0, c1, c0, 0

ORR R0, R0, #0x80000

MCR p15, 0, R0, c1, c0, 0

MRC p15, 0, R0, c1, c0, 0

ORR R0, R0, #0x80000

MCR p15, 0, R0, c1, c0, 0

MRC R0, SCTRLR

ORR R0, R0, #0x80000

MCR R0, SCTRLR

MRC p15, 0, R0, c1, c0, 0

ORR R0, R0, #0x80000

MCR p15, 0, R0, c1, c0, 0

MRC R0, SCTRLR

ORR R0, R0, #0x80000

MCR R0, SCTRLR

SCTRLR.WXN = 1;

```
MRC p15, 0, R0, c1, c0, 0
```

```
ORR R0, R0, #0x80000
```

```
MCR p15, 0, R0, c1, c0, 0
```

```
MRC R0, SCTRLR
```

```
ORR R0, R0, #0x80000
```

```
MCR R0, SCTRLR
```

```
SCTRLR.WXN = 1;
```

Write permission implies XN (Execute-never). For the PL1&0 translation regime, this bit can force all memory regions that are writable to be treated as XN. The possible values of this bit are:

| WXN | Meaning                                                                                                                         |
|-----|---------------------------------------------------------------------------------------------------------------------------------|
| 0   | This control has no effect on memory access permissions.                                                                        |
| 1   | Any region that is writable in the PL1&0 translation regime is forced to XN for accesses from software executing at PL1 or PL0. |

## ADD (immediate)

### ADD (immediate)

Add (immediate) adds a register value and an optionally-shifted immediate value, and writes the result to the destination register.

This instruction is used by the alias [MOV \(to/from SP\)](#).

|    |    |    |    |    |    |    |    |       |    |       |    |    |    |    |    |    |    |    |    |    |    |   |   |   |   |   |   |   |   |   |   |  |
|----|----|----|----|----|----|----|----|-------|----|-------|----|----|----|----|----|----|----|----|----|----|----|---|---|---|---|---|---|---|---|---|---|--|
| 31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23    | 22 | 21    | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |  |
| sf | 0  | 0  | 1  | 0  | 0  | 0  | 1  | shift |    | imm12 |    | Rn |    | Rd |    |    |    |    |    |    |    |   |   |   |   |   |   |   |   |   |   |  |

32-bit (sf == 0)

```
ADD <Wd|WSP>, <Wn|WSP>, #<imm>{, <shift>}
```

64-bit (sf == 1)

```
ADD <Xd|SP>, <Xn|SP>, #<imm>{, <shift>}

integer d = UInt(Rd);
integer n = UInt(Rn);
integer datasize = if sf == '1' then 64 else 32;
bits(datasize) imm;

case shift of
  when '00' imm = ZeroExtend(imm12, datasize);
  when '01' imm = ZeroExtend(imm12:Zero{12}, datasize);
  when '1x' ReservedValue();
```

### Assembler Symbols

|          |                                                                                                             |
|----------|-------------------------------------------------------------------------------------------------------------|
| <Wd WSP> | Is the 32-bit name of the destination general-purpose register or stack pointer, encoded in the 'Rd' field. |
| <Wn WSP> | Is the 32-bit name of the source general-purpose register or stack pointer, encoded in the 'Rn' field.      |
| <Xd SP>  | Is the 64-bit name of the destination general-purpose register or stack pointer, encoded in the 'Rd' field. |
| <Xn SP>  | Is the 64-bit name of the source general-purpose register or stack pointer, encoded in the 'Rn' field.      |
| imm      | Is an unsigned immediate, in the range 0 to 4095, encoded in the 'imm12' field.                             |
| shift    | Is the optional left shift to apply to the immediate, defaulting to LSL #0 and encoded in 'shift':          |

|       |          |
|-------|----------|
| shift | <shift>  |
| 00    | LSL #0   |
| 01    | LSL #12  |
| 1x    | RESERVED |

### Alias Conditions

|                  |                                                                            |
|------------------|----------------------------------------------------------------------------|
| Alias            | Is preferred when                                                          |
| MOV (to/from SP) | shift == '00' && imm12 == '0000000000' && (Rd == '11111'    Rn == '11111') |

  

|                                                                                                                                                                                                               |  |
|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|
| Operation                                                                                                                                                                                                     |  |
| bits(datasize) result;<br>bits(datasize) operand1 = if n == 31 then SP[] else X[n];<br><br>(result, =) = AddWithCarry(operand1, imm, '0');<br><br>if d == 31 then<br>SP[] = result;<br>else<br>X[d] = result; |  |

This instruction is used by the alias [MOV \(to/from SP\)](#).

|    |    |    |    |    |    |    |    |       |    |       |    |    |    |    |    |    |    |    |    |    |    |   |   |   |   |   |   |   |   |   |   |
|----|----|----|----|----|----|----|----|-------|----|-------|----|----|----|----|----|----|----|----|----|----|----|---|---|---|---|---|---|---|---|---|---|
| 31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23    | 22 | 21    | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
| sf | 0  | 0  | 1  | 0  | 0  | 0  | 1  | shift |    | imm12 |    | Rn |    | Rd |    |    |    |    |    |    |    |   |   |   |   |   |   |   |   |   |   |

### 32-bit (sf == 0)

**ADD <Wd|WSP>, <Wn|WSP>, #<imm>{, <shift>}**

### 64-bit (sf == 1)

**ADD <Xd|SP>, <Xn|SP>, #<imm>{, <shift>}**

## Assembler Symbols

**<Wd|WSP>** Is the 32-bit name of the destination general-purpose register or stack pointer, encoded in the "Rd" field.

**<Wn|WSP>** Is the 32-bit name of the source general-purpose register or stack pointer, encoded in the "Rn" field.

**<Xd|SP>** Is the 64-bit name of the destination general-purpose register or stack pointer, encoded in the "Rd" field.

# Assembler / Disassembler

<https://alastairreid.github.io/bidirectional-assemblers/>



**ADD <Xd|SP>, <Xn|SP>, #<imm>{, <shift>}**

[sf:"1"; op:"0"; S:"0"; "10001"; shift:"xx"; imm12:"xxxxxxxxxxxx"; Rn:"xxxxx"; Rd:"xxxxx"]

<->

"ADD" " " <Xd|SP> "," " " <Xn|SP> "," " " [ "#" ] <imm> " " [ "," " " <shift> ]

where

```
<Xd|SP> = RegXSP(UInt(Rd)) ;
<Xn|SP> = RegXSP(UInt(Rn)) ;
<imm>   = UInt(imm12) ;
<shift> = Optional("LSL #0",
                     case shift {
                       '00' <-> "LSL #0";
                       '01' <-> "LSL #12";
                       '1x' <-> RESERVED();
                     });

```

© 2017 Arm Limited

arm

See also: <https://github.com/agustingianni/retools> and <https://github.com/nspin/hs-arm>

Add (immediate) adds a register value and an optionally-shifted immediate value, and writes the result to the destination register.

This instruction is used by the alias [MOV \(to/from SP\)](#).

|    |    |    |    |    |    |    |    |       |    |    |    |    |    |    |       |    |    |    |    |    |    |   |    |   |   |   |    |   |   |   |   |
|----|----|----|----|----|----|----|----|-------|----|----|----|----|----|----|-------|----|----|----|----|----|----|---|----|---|---|---|----|---|---|---|---|
| 31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23    | 22 | 21 | 20 | 19 | 18 | 17 | 16    | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8  | 7 | 6 | 5 | 4  | 3 | 2 | 1 | 0 |
| sf | 0  | 0  | 1  | 0  | 0  | 0  | 1  | shift |    |    |    |    |    |    | imm12 |    |    |    |    |    |    |   | Rn |   |   |   | Rd |   |   |   |   |

```
integer d = UInt(Rd);
integer n = UInt(Rn);
integer datasize = if sf == '1' then 64 else 32;
bits(datasize) imm;

case shift of
    when '00' imm = ZeroExtend(imm12, datasize);
    when '01' imm = ZeroExtend(imm12:Zeros(12), datasize);
    when '1x' ReservedValue();
bits(datasize) result;
bits(datasize) operand1 = if n == 31 then SP[] else X[n];

(result, -) = AddWithCarry(operand1, imm, '0');

if d == 31 then
    SP[] = result;
else
    X[d] = result;
```

|    |    |    |    |    |    |    |    |       |    |    |    |    |    |    |    |    |    |    |    |    |    |   |   |   |   |   |   |   |   |   |   |  |
|----|----|----|----|----|----|----|----|-------|----|----|----|----|----|----|----|----|----|----|----|----|----|---|---|---|---|---|---|---|---|---|---|--|
| 31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23    | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |  |
| sf | 0  | 0  | 1  | 0  | 0  | 0  | 1  | shift |    |    |    |    |    |    |    |    |    |    |    |    |    |   |   |   |   |   |   |   |   |   |   |  |
| op | S  |    |    |    |    |    |    |       |    |    |    |    |    |    |    |    |    |    |    |    |    |   |   |   |   |   |   |   |   |   |   |  |

```

integer d = UInt(Rd);
integer n = UInt(Rn);
integer datasize = if sf == '1' then 64 else 32;
bits(datasize) imm;

case shift of
    when '00' imm = ZeroExtend(imm12, datasize);
    when '01' imm = ZeroExtend(imm12:Zeros(12), datasize);
    when '1x' ReservedValue();
bits(datasize) result;
bits(datasize) operand1 = if n == 31 then SP[] else X[n];

(result, -) = AddWithCarry(operand1, imm, '0');

if d == 31 then
    SP[] = result;
else
    X[d] = result;

```



sf = '0'  
 imm12 = 0x02a  
 shift = '01'  
 Rd = '00101'  
 Rn = '00011'

```

integer d = UInt(Rd);
integer n = UInt(Rn);
integer datasize = if sf == '1' then 64 else 32;
bits(datasize) imm;

case shift of
  when '00' imm = ZeroExtend(imm12, datasize);
  when '01' imm = ZeroExtend(imm12:Zeros(12), datasize);
  when '1x' ReservedValue();
bits(datasize) result;
bits(datasize) operand1 = if n == 31 then SP[] else X[n];

(result, -) = AddWithCarry(operand1, imm, '0');

if d == 31 then
  SP[] = result;
else
  X[d] = result;
  
```



sf = '0'  
 imm12 = 0x02a  
 shift = '01'  
 Rd = '00101'  
 Rn = '00011'

```

integer d = UInt(Rd);
integer n = UInt(Rn);
integer datasize = if sf == '1' then 64 else 32;
bits(datasize) imm;

case shift of
  when '00' imm = ZeroExtend(imm12, datasize);
  when '01' imm = ZeroExtend(imm12:Zeros(12), datasize); imm = 0x0002a000
  when '1x' ReservedValue();
bits(datasize) result;
bits(datasize) operand1 = if n == 31 then SP[] else X[n];

(result, -) = AddWithCarry(operand1, imm, '0');

if d == 31 then
  SP[] = result;
else
  X[d] = result;
  
```



sf = '0'  
 imm12 = 0x02a  
 shift = '01'  
 Rd = '00101'  
 Rn = '00011'

```

integer d = UInt(Rd);
integer n = UInt(Rn);
integer datasize = if sf == '1' then 64 else 32;
bits(datasize) imm;

case shift of
  when '00' imm = ZeroExtend(imm12, datasize);
  when '01' imm = ZeroExtend(imm12:Zeros(12), datasize); imm = 0x0002a000
  when '1x' ReservedValue();

bits(datasize) result;
bits(datasize) operand1 = if n == 31 then SP[] else X[n];
(result, -) = AddWithCarry(operand1, imm, '0');

if d == 31 then
  SP[] = result;
else
  X[d] = result;
  
```

d = 5  
 n = 3  
 datasize = 32

operand1 = 0x00000045  
 result = 0x0002a045

X[5] = 0x0002a045

|       |   |         |    |    |       |
|-------|---|---------|----|----|-------|
| sf    | = | '0'     |    |    |       |
| imm12 | = | 0x02a   | Rd | Rn | imm12 |
| shift | = | '01'    |    |    |       |
| Rd    | = | '00101' |    |    |       |
| Rn    | = | '00011' |    |    |       |

```

integer d = UInt(Rd);
integer n = UInt(Rn);
integer datasize = if sf == '1' then 64 else 32;
bits(datasize) imm;

case shift of
  when '00' imm = ZeroExtend(imm12, datasize);
  when '01' imm = ZeroExtend(imm12:Zeros(12), datasize);
  when '1x' ReservedValue();
bits(datasize) result;
bits(datasize) operand1 = if n == 31 then SP[] else X[n];

(result, -) = AddWithCarry(operand1, imm, '0');

if d == 31 then
  SP[] = result;
else
  X[d] = result;

```



```

integer d = UInt(Rd);
integer n = UInt(Rn);
integer datasize = if sf == '1' then 64 else 32;
bits(datasize) imm;

case shift of
  when '00' imm = ZeroExtend(imm12, datasize);
  when '01' imm = ZeroExtend(imm12:zeros(12), datasize); imm
  when '1x' ReservedValue();

bits(datasize) result;
bits(datasize) operand1 = if n == 31 then SP[] else X[n];

(result, -) = AddWithCarry(operand1, imm, '0');

if d == 31 then
  SP[] = result;
else
  X[d] = result;

```

|          |   |         |
|----------|---|---------|
| sf       | = | '0'     |
| imm12    | = | 0x02a   |
| shift    | = | '01'    |
| Rd       | = | '00101' |
| Rn       | = | '00011' |
| <br>     |   |         |
| d        | = | 5       |
| n        | = | 3       |
| datasize | = | 32      |

```

integer d = UInt(Rd);
integer n = UInt(Rn);
integer datasize = if sf == '1' then 64 else 32;
bits(datasize) imm;

case shift of
  when '00' imm = ZeroExtend(imm12, datasize);
  when '01' imm = ZeroExtend(imm12:Zeros(12), datasize); imm
  when '1x' ReservedValue();

bits(datasize) result;
bits(datasize) operand1 = if n == 31 then SP[] else X[n];
(result, -) = AddWithCarry(operand1, imm, '0');

if d == 31 then
  SP[] = result;
else
  X[d] = result;

```



# Symbolic Representation

Feed to constraint solver (e.g., [Z3 SMT Solver](#))

- What is the output given input Y?
- What input X produces output Y?
- What input X produces intermediate value Y?
- Generate a test input that shows X happening
  - Cf. [KLEE](#) LLVM symbolic execution







Full graph for one path through the ADD instruction: 80-90 nodes

Graph for all paths through entire v8-M specification: 0.5M nodes



# From instructions to programs...



# Architectural Conformance Suite

[https://alastairreid.github.io/papers/FMCAD\\_16/](https://alastairreid.github.io/papers/FMCAD_16/)

## Processor architectural compliance sign-off

### Large

- v8-A 11,000 test programs, > 2 billion instructions
- v8-M 3,500 test programs, > 250 million instructions

### Thorough

- Tests dark corners of specification

# Progress in testing Arm specification



- Does not parse, does not typecheck
- Can't get out of reset
- Can't execute first instruction
- Can't execute first 100 instructions
- ...
- Passes 90% of tests
- Passes 99% of tests
- ...

# Fuzz testing Arm binaries

## *External fuzzing*

- Branches in Arm binary used to guide fuzz tester's choice of inputs
- Finds explicit control flow

## *Internal fuzzing*

- Branches in Arm specification used to guide fuzz tester's choice of inputs
- Finds implicit control flow

(Symbolic execution to escape plateaus)



“End to End Verification of ARM processors with ISA Formal,” [CAV 2016](#)

cf “End-to-end formal ISA verification of RISC-V processors with riscv-formal”, Saal Clarke, 1pm 27<sup>th</sup> December



“End to End Verification of ARM processors with ISA Formal,” [CAV 2016](#)

cf “End-to-end formal ISA verification of RISC-V processors with riscv-formal”, Saal Clarke, 1pm 27<sup>th</sup> December



“End to End Verification of ARM processors with ISA Formal,” [CAV 2016](#)

cf “End-to-end formal ISA verification of RISC-V processors with riscv-formal”, Saal Clarke, 1pm 27<sup>th</sup> December

# Do something awesome!

## Known to work

- Assembler/disassembler
- Interpreter
- Symbolic evaluation
- Generate testcases
- Fuzzing with internal feedback
- Formally validate processor design

## “Should” work

- System register plugin
- Fuzzing with symbolic execution
- (Information flow analysis)
- (Test LLVM IR → ARM backend)
- (Superoptimizer  
<http://www.eecs.qmul.ac.uk/~gretay/papers/onward2017.pdf>)
- (Convert to Coq/HOL/ACL2)

# How can you trust formally verified software?

## Program

# How can you trust formally verified software?

Program Specification

Program

More formal despair: [Denning, Fonseca et al.](#)

More formal hope: [Hyperkernel](#), [Yggdrasil](#), [Milawa](#), [Fiat](#)

# How can you trust formally verified software?

Program Specification

Program

Linux  
specification

More formal despair: [Denning, Fonseca et al.](#)

More formal hope: [Hyperkernel](#), [Yggdrasil](#), [Milawa](#), [Fiat](#)

# How can you trust formally verified software?

Program Specification

Program

Linux  
specification

glibc  
specification

More formal despair: [Denning, Fonseca et al.](#)

More formal hope: [Hyperkernel](#), [Yggdrasil](#), [Milawa](#), [Fiat](#)

# How can you trust formally verified software?

Program Specification

Program

Linux  
specification

glibc  
specification

ISO-C  
specification

More formal despair: [Denning, Fonseca et al.](#)

More formal hope: [Hyperkernel](#), [Yggdrasil](#), [Milawa](#), [Fiat](#)



Do something awesome with the spec

Ask me questions [alastair.reid@arm.com](mailto:alastair.reid@arm.com) [@alastair\\_reid](https://github.com/alastair_reid) <https://alastairreid.github.io>

Talk to me or Milosch Meriac ([@FoolsDelight](https://twitter.com/FoolsDelight)) about [white hacker jobs](#) at ARM

Thanks to those who helped get here

Alasdair Armstrong (Cambridge U.)  
Curtis Dunham (ARM)  
Alex Chadwick (ARM)  
David Gilday (ARM)  
Ali Zaidi (ARM)  
David Hoyes (ARM)  
Anastasios Deligiannis (ARM)  
David Seal (ARM)  
Anthony Fox (Cambridge U.)  
Daniel Bailey (ARM)  
Ashan Pathirane (ARM)  
Erin Shepherd (ARM)  
Belaji Venu (ARM)  
Francois Botman (ARM)  
Bradley Smith (ARM)  
George Hawes (ARM)  
Brian Foley (ARM)  
Graeme Barnes (ARM)

Isobel Hooper (ARM)  
Jack Andrews (ARM)  
Jacob Eapen (ARM)  
Jon French (Cambridge U.)  
Kathy Gray (Cambridge U.)  
Krassy Gochev (ARM)  
Lewis Russell (ARM)  
Matthew Leach (ARM)  
Meenu Gupta (ARM)

Michele Riga (ARM)  
Milosch Meriac (ARM)  
Nigel Stephens (ARM)  
Niyas Sait (ARM)  
Peng Wang (ARM)  
Peter Sewell (Cambridge U.)  
Peter Vrabel (ARM)  
Richard Grisenthwaite (ARM)  
Rick Chen (ARM)

Simon Bellew (ARM)  
Thomas Grocott (ARM)  
Will Deacon (ARM)  
Will Keen (ARM)  
Wojciech Meyer (ARM)  
(and others)