

# Robustness

# Robustness

- We assume the programmer didn't want relaxed behaviors
- Verification problem:
  - Specification: SC version of the program
  - Implementation: TSO version of the program
- The question of robustness can be asked for any two memory models (PSO -> SC), (PSO -> TSO), etc.
- The only practical techniques are for the simpler ones: TSO, PSO

# TSO Robustness

A program P is robust if all of its TSO behaviors coincide with its SC behaviors



# TSO Robustness

A program P is robust if all of its TSO behaviors coincide with its SC behaviors



TSO  
SC

# Shasha & Snir 1988



# Shasha & Snir 1988



P is robust if all its traces contain no happens-before cycles

# Checking TSO Robustness

Bouajanni, Derevenetc, Meyer 2013

- Characterize minimal SC violations under TSO
- Any program containing non-SC computations contains minimal SC violations
- Instrument the program to detect minimal SC-violations (only)
- Model check the program under SC

# Minimal SC-violations

# Minimal SC-violations

- A TSO execution that is non-SC

# Minimal SC-violations

- A TSO execution that is non-SC
- We use the operational model with write buffers

$$\begin{array}{lll}
 (S, T \parallel (B, \mathbf{E}[(\lambda x \ ev)])) & \xrightarrow{\beta} & (S, T \parallel (B, \mathbf{E}[\{x/v\}e])) \\
 (S, T \parallel (B, \mathbf{E}[(\text{ref } v)])) & \xrightarrow{\nu_{p,v}} & (S \cup \{p \mapsto v\}, T \parallel (B, \mathbf{E}[p])) \quad p \notin \text{dom}(S) \\
 (S, T \parallel (B, \mathbf{E}[(p := v)])) & \xrightarrow{\text{wr}_{p,v}} & (S, T \parallel (B \triangleleft [p \mapsto v], \mathbf{E}[\emptyset])) \\
 (S, T \parallel (B, \mathbf{E}[(!p)])) & \xrightarrow{\text{rd}_{p,v}} & (S, T \parallel (B, \mathbf{E}[v])) \quad B(p) = \epsilon \text{ \& } S(p) = v \\
 (S, T \parallel (B, \mathbf{E}[(!p)])) & \xrightarrow{\text{rd}_{p,v}} & (S, T \parallel (B, \mathbf{E}[v])) \quad B(p) = ls :: v \\
 (S, T \parallel (B, \mathbf{E}[\langle \text{wr} | \text{rd} \rangle])) & \xrightarrow{\text{wr}} & (S, T \parallel (B, \mathbf{E}[\emptyset])) \quad \forall p, B(p) = \epsilon \\
 \\ 
 (S, T \parallel ([p \mapsto v] \triangleright B, e)) & \xrightarrow{\text{bu}_{p,v}} & (S[p \mapsto v], T \parallel (B, e)) \quad \textcolor{red}{TSO}
 \end{array}$$

# Minimal SC-violations

- A TSO execution that is non-SC
- We use the operational model with write buffers

|                                                                           |                                 |                                                                                         |
|---------------------------------------------------------------------------|---------------------------------|-----------------------------------------------------------------------------------------|
| $(S, T \parallel (B, \mathbf{E}[(\lambda x \ ev)]))$                      | $\xrightarrow{\beta}$           | $(S, T \parallel (B, \mathbf{E}[\{x/v\}e]))$                                            |
| $(S, T \parallel (B, \mathbf{E}[(\text{ref } v)]))$                       | $\xrightarrow{\nu_{p,v}}$       | $(S \cup \{p \mapsto v\}, T \parallel (B, \mathbf{E}[p])) \quad p \notin \text{dom}(S)$ |
| $(S, T \parallel (B, \mathbf{E}[(p := v)]))$                              | $\xrightarrow{\text{wr}_{p,v}}$ | $(S, T \parallel (B \triangleleft [p \mapsto v], \mathbf{E}[\emptyset]))$               |
| $(S, T \parallel (B, \mathbf{E}[(!p)]))$                                  | $\xrightarrow{\text{rd}_{p,v}}$ | $B(p) = \epsilon \ \& \ S(p) = v$                                                       |
| $(S, T \parallel (B, \mathbf{E}[(!p)]))$                                  | $\xrightarrow{\text{rd}_{p,v}}$ | $B(p) = ls :: v$                                                                        |
| $(S, T \parallel (B, \mathbf{E}[\langle \text{wr}   \text{rd} \rangle]))$ | $\xrightarrow{\text{wr}}$       | $\forall p, B(p) = \epsilon$                                                            |
| $(S, T \parallel ([p \mapsto v] \triangleright B, e))$                    | $\xrightarrow{\text{bu}_{p,v}}$ | $(S[p \mapsto v], T \parallel (B, e)) \quad \textcolor{red}{TSO}$                       |

# Minimal SC-violations

- A TSO execution that is non-SC
- We use the operational model with write buffers
- We consider only the labels

$$(t_0, wr_{x,1}) \cdot (t_0, bu_{x,1}) \cdot (t_0, rd_{y,0}) \cdot (t_1, wr_{y,1}) \cdot (t_1, bu_{y,1}) \cdot (t_1, rd_{x,1})$$

# Minimal SC-violations

- A TSO execution that is non-SC
- We use the operational model with write buffers
- We consider only the labels

$$(t_0, wr_{x,1}) \cdot (t_0, bu_{x,1}) \cdot (t_0, rd_{y,0}) \cdot (t_1, wr_{y,1}) \cdot (t_1, bu_{y,1}) \cdot (t_1, rd_{x,1})$$

No SC-violation

# Minimal SC-violations

- A TSO execution that is non-SC
- We use the operational model with write buffers
- We consider only the labels
- It contains the minimum number of “delays” possible

# Minimal SC-violations

- A TSO execution that is non-SC
- We use the operational model with write buffers
- We consider only the labels
- It contains the minimum number of “delays” possible

$$(t_0, wr_{x,1}) \cdot (t_0, rd_{y,0}) \cdot (t_1, wr_{y,1}) \cdot (t_1, rd_{x,0}) \cdot (t_0, bu_{x,1}) \cdot (t_1, bu_{y,1})$$

# Minimal SC-violations

- A TSO execution that is non-SC
- We use the operational model with write buffers
- We consider only the labels
- It contains the minimum number of “delays” possible

$(t_0, wr_{x,1}) \cdot (t_0, rd_{y,0}) \cdot (t_1, wr_{y,1}) \cdot (t_1, rd_{x,0}) \cdot (t_0, bu_{x,1}) \cdot (t_1, bu_{y,1})$   
SC-violation, but not minimal

# Minimal SC-violations

- A TSO execution that is non-SC
- We use the operational model with write buffers
- We consider only the labels
- It contains the minimum number of “delays” possible

$(t_0, wr_{x,1}) \cdot (t_0, rd_{y,0}) \cdot (t_1, wr_{y,1}) \cdot (t_1, rd_{x,0}) \cdot (t_0, bu_{x,1}) \cdot (t_1, bu_{y,1})$   
SC-violation, but not minimal

$(t_0, wr_{x,1}) \cdot (t_0, rd_{y,0}) \cdot (t_1, wr_{y,1}) \cdot (t_1, bu_{y,1}) \cdot (t_1, rd_{x,0}) \cdot (t_0, bu_{x,1})$

# Minimal SC-violations

- A TSO execution that is non-SC
- We use the operational model with write buffers
- We consider only the labels
- It contains the minimum number of “delays” possible

$(t_0, wr_{x,1}) \cdot (t_0, rd_{y,0}) \cdot (t_1, wr_{y,1}) \cdot (t_1, rd_{x,0}) \cdot (t_0, bu_{x,1}) \cdot (t_1, bu_{y,1})$   
SC-violation, but not minimal

$(t_0, wr_{x,1}) \cdot (t_0, rd_{y,0}) \cdot (t_1, wr_{y,1}) \cdot (t_1, bu_{y,1}) \cdot (t_1, rd_{x,0}) \cdot (t_0, bu_{x,1})$   
SC-violation, minimal

# Minimal SC-violations

- A TSO execution that is non-SC
- We use the operational model with write buffers
- We consider only the labels
- It contains the minimum number of “delays” possible

Theorem: In every minimal violation there is at most one thread that delays stores

# Taxonomy of a TSO violation

We will call the delaying thread the attacker

$$\sigma = \sigma_1 \cdot (t_A, wr_{x,v}) \cdot \sigma_2 \cdot (t_A, rd_{y,v'}) \cdot \color{blue}{\sigma_3} \cdot (t_A, bu_{x,v}) \cdot \sigma_4$$

# Taxonomy of a TSO violation

We will call the delaying thread the attacker



# Taxonomy of a TSO violation

We will call the delaying thread the attacker



# Taxonomy of a TSO violation

We will call the delaying thread the attacker



# Taxonomy of a TSO violation

We will call the delaying thread the attacker



# Taxonomy of a TSO violation

We will call the delaying thread the attacker



# Taxonomy of a TSO violation

We will call the delaying thread the attacker



# In Dekker

( $t_0, wr_{x,1}$ ) • ( $t_0, rd_{y,0}$ ) • ( $t_1, wr_{y,1}$ ) • ( $t_1, bu_{y,1}$ ) • ( $t_1, rd_{x,0}$ ) • ( $t_0, bu_{x,1}$ )

# In Dekker

First delayed write



$(t_0, wr_{x,1}) \cdot (t_0, rd_{y,0}) \cdot (t_1, wr_{y,1}) \cdot (t_1, bu_{y,1}) \cdot (t_1, rd_{x,0}) \cdot (t_0, bu_{x,1})$

# In Dekker

First delayed write



$(t_0, wr_{x,1}) \cdot (t_0, rd_{y,0}) \cdot (t_1, wr_{y,1}) \cdot (t_1, bu_{y,1}) \cdot (t_1, rd_{x,0}) \cdot (t_0, bu_{x,1})$



Last read of t0 before  
buffer update

# In Dekker

First delayed write

subtract with no  
attacker actions

$$(t_0, wr_{x,1}) \cdot (t_0, rd_{y,0}) \cdot (t_1, wr_{y,1}) \cdot (t_1, bu_{y,1}) \cdot (t_1, rd_{x,0}) \cdot (t_0, bu_{x,1})$$

Last read of t0 before  
buffer update



# In Dekker

First delayed write

subtract with no  
attacker actions

$$(t_0, wr_{x,1}) \cdot (t_0, rd_{y,0}) \cdot (t_1, wr_{y,1}) \cdot (t_1, bu_{y,1}) \cdot (t_1, rd_{x,0}) \cdot (t_0, bu_{x,1})$$

Last read of t0 before  
buffer update

update of the  
delayed write

# In Dekker

First delayed write

subtract with no  
attacker actions

$(t_0, wr_{x,1}) \cdot (t_0, rd_{y,0}) \cdot (t_1, wr_{y,1}) \cdot (t_1, bu_{y,1}) \cdot (t_1, rd_{x,0}) \cdot (t_0, bu_{x,1})$

Last read of t0 before  
buffer update

update of the  
delayed write

Now we just need to check if a TSO  
program can be attacked

# In Dekker

First delayed write

subtract with no  
attacker actions

$(t_0, wr_{x,1}) \cdot (t_0, rd_{y,0}) \cdot (t_1, wr_{y,1}) \cdot (t_1, bu_{y,1}) \cdot (t_1, rd_{x,0}) \cdot (t_0, bu_{x,1})$

Last read of t0 before  
buffer update

update of the  
delayed write

Now we just need to check if a TSO  
program can be attacked

We will call the tuple  $A = (t_A, rd, bu)$  an attack,  
and check if it can happen

# Instrumentation

1. We instrument the program
  - We pick an attack  $A = (t_A, rd, bu)$
  - At the point of the write  $(t_A, wr\_,\_)$  involved in the attack we instrument the program to not write in the memory but record it's own writes
  - We check that the read  $(t_A, rd\_,\_)$  is reachable before the action  $(t_A, bu\_,\_)$  happens
2. We use an *SC model checker* to validate the feasibility of the attack in the instrumented program
3. Query all the possible attacks!

Effectively reducing TSO Robustness to  
an SC reachability query!

# Complexity

- finite data domain
  - finite memory
  - finite threads
- 
- PSpace-Complete

# Enforcing Robustness

- Compute all possible attacks
- For each feasible attack compute the set of fences that prevent it
- Find optimal set of fences using Integer Linear Programming (ILP)

# Program Logics for Relaxed Memory

# A Premier on Hoare-Logics

- Also known as (Floyd-)Hoare Logic
- We need a language of assertions describing property of the state at different program points
- Program variables can be used in assertions to relate the program term and the state
- Invariants: Assertions that are true of all the states in a piece of code (e.g. loop invariant, global invariant, etc.)

# Assertions

|                   |                                              |
|-------------------|----------------------------------------------|
| $P ::= True$      | Any possible state                           |
| $\neg P$          |                                              |
| $P \wedge Q$      | All states that satisfy P and Q              |
| $P \vee Q$        |                                              |
| $P \Rightarrow Q$ |                                              |
| $x = v   \dots$   | Logical variables relate values in one state |
| $\exists X, P$    | $\{\exists X, x = X \wedge y = X + 1\}$      |
| $\forall X, P$    |                                              |

## Substitutions

$$\{\exists X, x = X \wedge y = X + 1\}[x \leftarrow 8] = \{\exists X, 8 = X \wedge y = X + 1\} \\ \equiv \{y = 9\}$$

Notation:  $P[x \leftarrow 8] \approx [x/8]P$

# The meaning of Triples



Triples can be composed to prove complex programs

$$\{Pre\} c_0; \{P\} c; \{Q\} c_1; \{Post\}$$

# Check these examples

$$\{x = 3\} \quad x := x + 1 \quad \{x \leq 0\}$$
$$\{x = 3\} \quad x := x + 1 \quad \{x \geq 0\}$$
$$\{x = 3\} \quad x := x + 1; y := x \quad \{y \geq 0\}$$
$$\{\exists X, x = X \wedge y = X + 1\} \quad x := x + 1; y := x \quad \{x = y\}$$
$$\{x = 0\} \text{ if } \textit{true} \text{ then } x := x + 1 \text{ else } x := x - 1 \text{ fi } \{x = 1\}$$
$$\{x = 0\} \text{ if } x = 5 \text{ then } x := x + 1 \text{ else } x := x - 1 \text{ fi } \{x = 1\}$$
$$\{x = 0\} \text{ while } \textit{true} \text{ do } x := x + 1 \text{ od } \{\textit{false}\}$$

# Hoare Logic Rules

# Hoare Logic Rules

$\{P\} \ skip \ {P}$

# Hoare Logic Rules

$$\{P\} \text{ skip } \{P\}$$
$$\{P[x \leftarrow e]\} \ x := e \ \{P\}$$

# Hoare Logic Rules

$$\{P\} \text{ skip } \{P\}$$

$$\{P[x \leftarrow e]\} \ x := e \ \{P\}$$

$$\frac{\{P\} \ c_0 \ \{R\} \quad \{R\} \ c_1 \ \{Q\}}{\{P\} \ c_0; c_1 \ \{Q\}}$$

# Hoare Logic Rules

$$\{P\} \text{ skip } \{P\}$$

$$\{P[x \leftarrow e]\} \ x := e \ \{P\}$$

$$\frac{\{P\} \ c_0 \ \{R\} \quad \{R\} \ c_1 \ \{Q\}}{\{P\} \ c_0; c_1 \ \{Q\}}$$

$$\frac{\{P \wedge b\} \ c_0 \ \{Q_0\} \quad \{P \wedge \neg b\} \ c_1 \ \{Q_1\}}{\{P\} \text{ if } b \text{ then } c_0 \text{ else } c_1 \text{ fi } \{b \Rightarrow Q_0 \wedge \neg b \Rightarrow Q_1\}}$$

# Hoare Logic Rules

$$\{P\} \text{ skip } \{P\}$$

$$\{P[x \leftarrow e]\} \ x := e \ \{P\}$$

$$\frac{\{P\} \ c_0 \ \{R\} \quad \{R\} \ c_1 \ \{Q\}}{\{P\} \ c_0; c_1 \ \{Q\}}$$

$$\frac{\{P \wedge b\} \ c_0 \ \{Q_0\} \quad \{P \wedge \neg b\} \ c_1 \ \{Q_1\}}{\{P\} \text{ if } b \text{ then } c_0 \text{ else } c_1 \text{ fi } \{b \Rightarrow Q_0 \wedge \neg b \Rightarrow Q_1\}}$$

$$\frac{\{P \wedge b\} \ c \ \{P\}}{\{P\} \text{ while } b \text{ do } c \text{ od } \{\neg b \wedge P\}}$$

# Hoare Logic Rules

$$\{P\} \text{ skip } \{P\}$$

$$\{P[x \leftarrow e]\} \ x := e \ \{P\}$$

$$\frac{\{P\} \ c_0 \ \{R\} \quad \{R\} \ c_1 \ \{Q\}}{\{P\} \ c_0; c_1 \ \{Q\}}$$

$$\frac{\{P \wedge b\} \ c_0 \ \{Q_0\} \quad \{P \wedge \neg b\} \ c_1 \ \{Q_1\}}{\{P\} \text{ if } b \text{ then } c_0 \text{ else } c_1 \text{ fi } \{b \Rightarrow Q_0 \wedge \neg b \Rightarrow Q_1\}}$$

$$\frac{\{P \wedge b\} \ c \ \{P\}}{\{P\} \text{ while } b \text{ do } c \text{ od } \{\neg b \wedge P\}}$$

$$\frac{\{P\} \ c \ \{Q\} \quad P' \Rightarrow P \quad Q \Rightarrow Q'}{\{P'\} \ c \ \{Q'\}}$$

# A simple Hoare Logic Proof

$\{x = X \wedge y = Y\}$

`aux = x;`

`x = y;`

`y = aux;`

$\{x = Y \wedge y = X\}$

# A simple Hoare Logic Proof

$\{x = X \wedge y = Y\}$

**aux** = **x**;

$\{x = X \wedge y = Y \wedge \text{aux} = X\}$

**x** = **y**;

**y** = **aux**;

$\{x = Y \wedge y = X\}$

# A simple Hoare Logic Proof

$\{x = X \wedge y = Y\}$

**aux** = **x**;

$\{x = X \wedge y = Y \wedge \text{aux} = X\}$

**x** = **y**;

$\{x = Y \wedge y = Y \wedge \text{aux} = X\}$

**y** = **aux**;

$\{x = Y \wedge y = X\}$

# The Owicky/Gries Method

- Hoare logics is insufficient to prove concurrent programs
- In 1976 Owicky and Gries provide a method to verify concurrent programs (using the cobegin/coend parallel construct)
- First method to be able to prove properties of multiprgrams

# Quick intro to OG (R/G)

$$\frac{\{P_0\} \ C_0 \ \{Q_0\} \quad \{P_1\} \ C_1 \ \{Q_1\} \quad \text{non-interference}}{\{P_0 \wedge P_1\} \ C_0 \| C_1 \ \{Q_0 \wedge Q_1\}}$$

# Quick intro to OG (R/G)

$$\frac{\{P_0\} \ C_0 \ \{Q_0\} \quad \{P_1\} \ C_1 \ \{Q_1\} \quad \text{non-interference}}{\{P_0 \wedge P_1\} \ C_0 \| C_1 \ \{Q_0 \wedge Q_1\}}$$

Non-Interference

- For each triple  $\{p_i\} \ c_i \ \{q_i\}$  occurring in  $\{P_0\} \ C_0 \ \{Q_0\}$
- For each pair  $\{p_j\} \ c_j$  occurring in  $\{P_1\} \ C_1 \ \{Q_1\}$ , where  $c_j$  is an atomic command (write, read, ...)
- Show that:  $p_i$  and  $q_i$  are stable w.r.t.  $c_j$

$$\{p_i \wedge p_j\} \ c_j \ \{p_i\}$$

# Quick intro to OG (R/G)

$$\frac{\{P_0\} \ C_0 \ \{Q_0\} \quad \{P_1\} \ C_1 \ \{Q_1\} \quad \text{non-interference}}{\{P_0 \wedge P_1\} \ C_0 \| C_1 \ \{Q_0 \wedge Q_1\}}$$

Non-Interference

- For each triple  $\{p_i\} \ c_i \ \{q_i\}$  occurring in  $\{P_0\} \ C_0 \ \{Q_0\}$
- For each pair  $\{p_j\} \ c_j$  occurring in  $\{P_1\} \ C_1 \ \{Q_1\}$ , where  $c_j$  is an atomic command (write, read, ...)
- Show that:  $p_i$  and  $q_i$  are stable w.r.t.  $c_j$

$$\{p_i \wedge p_j\} \ c_j \ \{p_i\}$$

Exponential proof obligations  
to number of threads

# To the Board!

$$\{x = 0 \wedge \text{flag} = \text{false}\}$$

$x = 100$                    ||            $\text{if}(\text{flag})$   
 $\text{flag} = \text{true}$            ||            $x = x - 50$

$$\{x \geq 50 \wedge \text{flag} = \text{true}\}$$

# To the Board!

$$\{x = 0 \wedge \text{flag} = \text{false}\}$$

`x = 100`      ||      `if(flag)`  
`flag = true`      ||      `x = x - 50`

$$\{x \geq 50 \wedge \text{flag} = \text{true}\}$$

$$\{x = 0\}$$

**Atomic** →  $x = x + 1 \quad || \quad x = x + 1$   
 $\{x = 2\}$

# To the Board!

$$\{x = 0 \wedge \text{flag} = \text{false}\}$$
$$\begin{array}{ll} x = 100 & \text{if(flag)} \\ \text{flag} = \text{true} & \| \\ & x = x - 50 \end{array}$$
$$\{x \geq 50 \wedge \text{flag} = \text{true}\}$$
$$\{x = 0\}$$

**Atomic**  $\longrightarrow$   $x = x + 1 \parallel x = x + 1$

$$\{x = 2\}$$

Ghost Variables

# OGRA

Owicki-Gries Reasoning for Weak Memory Models  
Lahav, Vafeiadis'15

# OG is unsound for TSO

Lets prove that Dekker cannot happen in SC

# OG is unsound for TSO

Lets prove that Dekker cannot happen in SC

|                              |                                             |
|------------------------------|---------------------------------------------|
| $\{a \neq 0\}$               | $\{a \neq 0\}$                              |
| $x := 1;$                    | $\{\top\}$                                  |
| $\{x \neq 0\}$               | $y := 1;$                                   |
| $a := y$                     | $\{y \neq 0\}$                              |
| $\{x \neq 0\}$               | $b := x$                                    |
| $\{a \neq 0 \vee b \neq 0\}$ | $\{y \neq 0 \wedge (a \neq 0 \vee b = x)\}$ |

# Owicky/Gries for Release

## Acquire

- Release/Acquire is a memory model loosely based on the release/acquire fragment of C++11
- Relelase/Acquire includes TSO behaviors
- For all practical purposes we will think of TSO

# OG à la Rely/Guarantee

$R; G \models \{P\} c \{Q\}$

Guarantees

Set of pairs:  $\{P\} c$

- This thread affects the state by executing  $c$  when  $P$  is true



Relies

Set of pairs of the form:  $R \nearrow Q$

- Other threads should not invalidate  $R$
- while assuming values for variables which do not contradict  $Q$

# Stability Check

$R \nearrow Q$  is stable w.r.t.  $\{P\}$   $x := v$  if

$$R \wedge P \vdash [v/x]R$$

# Stability Check

$R \nearrow Q$  is stable w.r.t.  $\{P\}$   $x := v$  if

$$R \wedge P \vdash [v/x]R$$

$R \nearrow Q$  is stable w.r.t.  $\{P\}$   $x := y$  if

$$R \wedge P \vdash [v_y/x]R$$

for all  $v_y$  with  $Q \wedge P \not\vdash y \neq v_y$

# The Logic

$$(\text{SKIP}) \quad \frac{\{P \uparrow P\} \leq \mathcal{R}}{\mathcal{R}; \emptyset \Vdash \{P\} \text{ skip } \{P\}}$$

$$(\text{CONSEQ}) \quad \frac{P' \vdash P \quad Q \vdash Q' \quad \mathcal{R} \leq \mathcal{R}' \quad \mathcal{G} \leq \mathcal{G}'}{\mathcal{R}'; \mathcal{G}' \Vdash \{P'\} c \{Q'\}}$$

$$(\text{SEQ}) \quad \frac{\mathcal{R}_1; \mathcal{G}_1 \Vdash \{P\} c_1 \{R\} \quad \mathcal{R}_2; \mathcal{G}_2 \Vdash \{R\} c_2 \{Q\} \quad \mathcal{R}_1^R \vdash \mathcal{R}_2^C}{\mathcal{R}_1 \cup \mathcal{R}_2; \mathcal{G}_1 \cup \mathcal{G}_2 \Vdash \{P\} c_1; c_2 \{Q\}}$$

$$(\text{ASSN}_0) \quad \frac{P \vdash Q[v/x] \quad \{P \uparrow P, Q \uparrow (P \vee Q)\} \leq \mathcal{R}}{\mathcal{R}; \{\{P\}x := v\} \Vdash \{P\} x := v \{Q\}}$$

$$(\text{ASSN}_1) \quad \frac{P \vdash Q[e(y)/x] \quad \{P \uparrow P, Q \uparrow (P \vee Q)\} \leq \mathcal{R}}{\mathcal{R}; \{\{P\}x := e(y)\} \Vdash \{P\} x := e(y) \{Q\}}$$

$$(\text{PAR}) \quad \frac{\mathcal{R}_1; \mathcal{G}_1 \Vdash \{P_1\} c_1 \{Q_1\} \quad \mathcal{R}_2; \mathcal{G}_2 \Vdash \{P_2\} c_2 \{Q_2\} \quad Q_1 \wedge Q_2 \vdash Q \quad \mathcal{R}_1; \mathcal{G}_1 \text{ and } \mathcal{R}_2; \mathcal{G}_2 \text{ are non-interfering}}{\mathcal{R}_1 \cup \mathcal{R}_2 \cup \{Q \uparrow (\mathcal{R}_1^R \vee \mathcal{R}_2^R \vee Q)\}; \mathcal{G}_1 \cup \mathcal{G}_2 \Vdash \{P_1 \wedge P_2\} c_1 \parallel c_2 \{Q\}}$$

# Examples

$$\begin{array}{c}
 \{x = 0\} \\
 \{T\} \quad m := 42; \quad \{m = 42\} \\
 \{m = 42\} \\
 x := 1 \quad \{T\} \\
 \{T\} \quad \{a = 42\} \\
 \{a = 42\}
 \end{array}
 \parallel
 \begin{array}{l}
 \{x \neq 0 \rightarrow m = 42\} \\
 \text{while } x = 0 \text{ do skip;} \\
 \{m = 42\} \\
 a := m \\
 \{a = 42\}
 \end{array}$$

$$\begin{array}{c}
 \{(x \neq 1 \wedge a \neq 1) \wedge x \neq 1\} \\
 x := 1 \\
 \{T\}
 \end{array}
 \parallel
 \begin{array}{c}
 \{(x \neq 2 \wedge c \neq 2) \wedge x \neq 2\} \\
 x := 2 \\
 \{T\}
 \end{array}
 \parallel
 \begin{array}{c}
 \{x = a = c = 0\} \\
 \{T\} \\
 a := x; \\
 \{T\} \\
 b := x \\
 \{a = 1 \wedge b = 2 \rightarrow x = 2\}
 \end{array}
 \parallel
 \begin{array}{c}
 \{T\} \\
 c := x; \\
 \{T\} \\
 d := x \\
 \{c = 2 \wedge d = 1 \rightarrow x = 1\}
 \end{array}$$

# Soundness

**Theorem:**

If  $R; G \models \{P\} c \{Q\}$  is derivable, then  $\{P\} c \{Q\}$  is a safe Hoare triple.

# Other formal tools

# Other formal tools

## Counter-Example Guided Fence Insertion under Weak Memory Models

Parosh Aziz Abdulla

Uppsala University  
parosh@it.uu.se

Mohamed Faouzi Atig

Uppsala University  
mohamed\_faouzi.atig@it.uu.se

Yu-Fang Chen

Academia Sinica  
yfc@iis.sinica.edu.tw

Carl Leonardsson

Uppsala University  
carl.leonardsson@it.uu.se

Ahmed Rezine

Uppsala University  
rezine.ahmed@it.uu.se

### Abstract

We give a *sound* and *complete* procedure for fence insertion for concurrent finite-state programs running under the classical TSO memory model. This model allows “write to read” relaxation corresponding to the addition of an unbounded store buffer between each processor and the main memory. We introduce a novel machine model, called the *Single-Buffer* (SB) semantics, and show that the reachability problem for a program under TSO can be reduced to the reachability problem under SB. We present a simple and effective backward reachability analysis algorithm for the latter, and propose a counter-example guided fence insertion procedure. The procedure is augmented by a *placement constraint*, that allows the user to choose the places inside the program where fences may be inserted. For a given placement constraint, the method infers automatically all minimal sets of fences that ensure correctness of the program. We have implemented a prototype and run it successfully on all standard benchmarks, together with several challenging examples that are beyond the applicability of existing methods.

**Categories and Subject Descriptors** D.2.4 [Software Engineering]: Software/Program Verification.

**General Terms** Verification, Theory, Reliability.

**Keywords** Program verification, Relaxed memory models, Infinite-state systems.

### 1. Introduction

**Background** The *Out-of-Order Execution* (*OoOE*) paradigm lets the scheduling of CPU instructions be governed by the availability of their operands rather than by the order in which they are issued [39]. This leads to an efficient use of clock cycles and hence an

work under the Sequential Consistency (SC) model [26] in which the program behaves according to the classical interleaving semantics. However, this is not true any more once we consider concurrent processes that share the memory. In fact, several algorithms that are designed for the synchronization of concurrent processes, such as mutual exclusion and producer-consumer protocols, are not correct in the *OoOE* setting. The inadequacy of the interleaving semantics in the presence of *OoOE* has prompted researchers to introduce *weak* (*or relaxed*) *memory models* by allowing permutations between certain types of memory operations [2, 3, 14]. Weak memory models are used in all major CPU designs including Intel x86 [22, 38], SPARC [40], and PowerPC [21]. Since the weak memory semantics adds more behavior to a program, the program may violate its specification even if it runs correctly under the SC semantics. One way to eliminate the non-desired behaviors resulting from the use of weak memory models is to insert memory *fence* instructions in the program code. A fence instruction, executed by a process, implies that no reordering is allowed between instructions issued before and after the fence instruction.

The most common relaxation corresponds to TSO (for Total Store Ordering) that is adopted by Sun’s SPARC multiprocessors [40]. TSO is the kernel of many common weak memory models and is the latest formalization of the x86-tso memory model [35, 38].

**Challenge** Processor vendors do not provide formal definitions for the memory models of their products, but rather informal documents describing allowed/forbidden behaviors. This is not sufficient for the development of verification tools. Therefore, a substantial research effort has recently been devoted to this issue, resulting in formal (weak) memory models (both axiomatic and operational) for several different types of processors [35, 38]. In this paper, we describe how to insert sets of fences that are sufficient

# Other formal tools

## Counter-Example Guided Fence Insertion under Weak Memory Models

Parosh A

Uppsal  
paros

## Stateless Model Checking for TSO and PSO

Parosh Abdulla, Stavros Aronis, Mohammed Faouzi Atig,  
Bengt Jonsson, Carl Leonardsson, and Konstantinos Sagonas

Dept. of Information Technology, Uppsala University, Sweden

### Abstract

We give a *sound* and *complete* verification procedure for programs under a concurrent finite-state memory model. The procedure is based on a counterexample-guided fence insertion technique. It responds to the user's requests for fences by inserting them at specific locations in the program. The user can choose the number of fences to be inserted. For a given number of fences, the procedure finds all minimal counterexamples. We have implemented the procedure and evaluated it on all standard benchmarks. Our experiments show that our technique reduces the verification effort for relaxed memory models to be almost that for the standard model of sequential consistency. In many cases, our implementation significantly outperforms other comparable tools.

**Categories and Subject Terms:** Software/Programming Languages

**General Terms:** Verification

**Keywords:** Program verification, infinite-state systems

### 1. Introduction

**Background** The verification of concurrent programs is a challenging task due to the scheduling of threads and the interleaving of their operations. This leads to a large number of possible execution sequences, making it difficult to verify the correctness of the program. One approach to address this challenge is to use a state-space search-based verification method, such as model checking or simulation. However, these methods can be computationally expensive, especially for programs with complex memory access patterns.

**Abstract.** We present a technique for efficient stateless model checking of programs that execute under the relaxed memory models TSO and PSO. The basis for our technique is a novel representation of executions under TSO and PSO, called *chronological traces*. Chronological traces induce a partial order relation on relaxed memory executions, capturing dependencies that are needed to represent the interaction via shared variables. They are optimal in the sense that they only distinguish computations that are inequivalent under the widely-used representation by Shasha and Snir. This allows an optimal dynamic partial order reduction algorithm to explore a minimal number of executions while still guaranteeing full coverage. We apply our techniques to check, under the TSO and PSO memory models, LLVM assembly produced for C/pthreads programs. Our experiments show that our technique reduces the verification effort for relaxed memory models to be almost that for the standard model of sequential consistency. In many cases, our implementation significantly outperforms other comparable tools.

### 1 Introduction

Verification and testing of concurrent programs is difficult, since one must consider all the different ways in which instructions of different threads can be interleaved. To make matters worse, most architectures implement *relaxed memory models*, such as TSO and PSO [31,3], which make threads interact in even more and subtler ways than by standard interleaving. For example, a processor may reorder loads and stores by the same thread if they target different addresses, or it may buffer stores in a local queue.

A successful technique for finding concurrency bugs (i.e., defects that arise only under some thread scheduling), and for verifying their absence, is *stateless model checking* (SMC) [15], also known as *systematic concurrency testing* [20,34]. Starting from a test, i.e., a way to run a program and obtain some expected result, which is terminat-

# Other formal tools

## Counter-Example Guided Fence Insertion under Weak Memory Models

Parosh A  
Uppsal  
paros

## Stateless Model Checking for TSO and PSO

Parosh Al  
Bengt Jons  
Dept. of I

### Abstract

We give a *sound* and *complete* verification procedure for programs running under a concurrent finite-state memory model. The verification procedure is based on the reachability analysis of each processor and each thread. We propose a chronological trace of the interaction via distinguishing computation by Shasha and Zohar [21]. Our technique is a combination of effective backward reachability analysis and a counterexample-guided fence insertion procedure. The user can choose the number of fences inserted. For a given number of fences, our technique automatically finds all minimal counterexamples. We have implemented our technique on all standard benchmarks and show that our technique is competitive with respect to coverage. We apply our technique to programs written in C and LLVM as well as to assembly programs. We have also implemented our technique for the POWER memory model.

**Categories and Subject Terms** Software/Program Verification; Verification of parallel programs; Verification of concurrent programs; Verification of distributed systems

**General Terms** Verification

**Keywords** Program verification; Verification of parallel programs; Verification of concurrent programs; Verification of distributed systems; Infinite-state systems

### 1 Introduction

**Background** The verification of programs running under a concurrent shared-memory model is difficult, since one must consider all the different ways in which parallel threads can interact. To make matters worse, current shared-memory multicore processors, such as Intel’s x86, IBM’s POWER, and ARM, [28, 44, 27, 8], achieve higher performance by implementing *relaxed memory models* that allow threads to interact in even subtler ways than by interleaving of their instructions, as would be the case in the model of *sequential consistency* (SC) [31]. Under the relaxed memory model of POWER, loads and stores to different memory locations can be interleaved in arbitrary order, as long as the final state is consistent with the original sequence of operations.

### 1 Introduction

Verification and testing of concurrent programs is difficult, since one must consider all the different ways in which parallel threads can interact. To make matters worse, current shared-memory multicore processors, such as Intel’s x86, IBM’s POWER, and ARM, [28, 44, 27, 8], achieve higher performance by implementing *relaxed memory models* that allow threads to interact in even subtler ways than by interleaving of their instructions, as would be the case in the model of *sequential consistency* (SC) [31]. Under the relaxed memory model of POWER, loads and stores to different memory locations can be interleaved in arbitrary order, as long as the final state is consistent with the original sequence of operations.

A successful technique for verifying programs under some thread scheduling is *stateless model checking* (SMC) [15], also known as *stateless verification*, i.e., a way to run a program under a given memory model without maintaining a state transition graph. Instead, the program is executed under a given memory model, and the verifier monitors the program’s behavior. If the program exhibits a bug, the verifier finds a counterexample, which is a sequence of memory operations that leads to an error. The verifier can then use this counterexample to refine its memory model and repeat the process until no more bugs are found.

## Stateless Model Checking for POWER

Parosh Aziz Abdulla, Mohamed Faouzi Atig, Bengt Jonsson, and Carl Leonardsson

Dept. of Information Technology, Uppsala University, Sweden

**Abstract.** We present the first framework for efficient application of stateless model checking (SMC) to programs running under the relaxed memory model of POWER. The framework combines several contributions. The first contribution is that we develop a scheme for systematically deriving operational execution models from existing axiomatic ones. The scheme is such that the derived execution models are well suited for efficient SMC. We apply our scheme to the axiomatic model of POWER from [7]. Our main contribution is a technique for efficient SMC, called *Relaxed Stateless Model Checking* (RSMC), which systematically explores the possible inequivalent executions of a program. RSMC is suitable for execution models obtained using our scheme. We prove that RSMC is sound and optimal for the POWER memory model, in the sense that each complete program behavior is explored exactly once. We show the feasibility of our technique by providing an implementation for programs written in C/pthreads.

### 1 Introduction

Verification and testing of concurrent programs is difficult, since one must consider all the different ways in which parallel threads can interact. To make matters worse, current shared-memory multicore processors, such as Intel’s x86, IBM’s POWER, and ARM, [28, 44, 27, 8], achieve higher performance by implementing *relaxed memory models* that allow threads to interact in even subtler ways than by interleaving of their instructions, as would be the case in the model of *sequential consistency* (SC) [31]. Under the relaxed memory model of POWER, loads and stores to different memory locations can be interleaved in arbitrary order, as long as the final state is consistent with the original sequence of operations.

# Other formal tools

# Counter-Example Guided Fence Insertion under Weak Memory Models

# Parosh A

# Stateless Model Checking for TSO and PSO

Parosh Al  
Bengt Jons  
Dept. of I

## Abstract

We give a *sound* and *complete* finite-state model of memory. This model is based on responding to the memory access requests of each processor and each machine model, called the reachability problem. We show that the reachability problem is effectively backward recursive. To solve it, we propose a counterexample-directed procedure that augments the user to choose the memory access requests that are inserted. For a given program, our procedure automatically finds all minimal memory access sequences that violate the program. We have implemented our procedure and tested it on all standard benchmarks. The results show that our procedure is able to find memory access sequences that are beyond the reachability of existing tools.

*Categories and Sub-categories*: Software/Programs

## *General Terms* V *Keywords* Program

## 1 Introduction

Verification and testing the different ways in which matters worse, most notably PSO [31,3], which make interleaving. For example, if they target different areas

A successful technique under some thread scheduling (SMC) [15], also known as test, i.e., a way to run a

# Stateless Model Checking for POWER

Parosh Aziz Abdul

Dept.

**Abstract.** We present programs that execute our technique is a *chronological trace*. We relax memory execution to distinguish computation by Shasha and algorithm to explore coverage. We apply LLVM as models, as show that our techniques to be almost the same cases, our implementation is faster than the original LLVM.

**Abstract.** We model checkin POWER. The 1 that we develop els from existin models are we model of POW SMC, called  $\mathcal{R}$  explores the pc execution mod optimal for the behavior is ex providing an ir

## 1 Introduction

Verification and test all the different way current shared-mem and ARM, [28,44,2' ory models that allo their instructions, as Under the relaxed m

# GPS: Navigating Weak Memory with Ghosts, Protocols, and Separation

Aaron Turon    Viktor Vafeiadis    Derek Dreyer  
Max Planck Institute for Software Systems (MPI-SWS)  
`{turon,viktor,dreyer}@mpi-sws.org`

## Abstract

Weak memory models formalize the inconsistent behaviors that one can expect to observe in multithreaded programs running on modern hardware. In so doing, however, they complicate the already-difficult task of reasoning about correctness of concurrent code. Worse, they render impotent the sophisticated formal methods that have been developed to tame concurrency, which almost universally assume a strong (*i.e.*, sequentially consistent) memory model.

This paper introduces **GPS**, the first program logic to provide a full-fledged suite of modern verification techniques—including ghost state, protocols, *and* separation logic—for high-level, structured reasoning about weak memory. We

RAM, threads take turns interacting with it, and any update performed by one thread is immediate to all other threads. Even assuming sequential concurrent program verification is a highly challenging problem, since one must account for the myriad interacting threads. But fortunately there has been tremendous progress in recent years on advanced program logics and verification tools to help tame the complexity of interleaving [9, 11, 13, 15, 24, 29, 30, 36, 37, 39].

Unfortunately, the assumption of sequentiality is unrealistically “strong”: the synchronization overhead it imposes on modern architectures precludes compiler optimizations that reorder memory operations, thus considered by many to be too expensive in terms of performance.

# A word on Decidability

## What's Decidable about Weak Memory Models? (Extended Version)

M. F. Atig<sup>1</sup>, A. Bouajjani<sup>2</sup>, S. Burckhardt<sup>3</sup>, and M. Musuvathi<sup>3</sup>

<sup>1</sup> Uppsala University, Sweden, [mohamed\\_faouzi.atig@it.uu.se](mailto:mohamed_faouzi.atig@it.uu.se)

<sup>2</sup> LIAFA, Paris Diderot Univ. & CNRS, France, [{atig,abou}@liafa.jussieu.fr](mailto:{atig,abou}@liafa.jussieu.fr)

<sup>3</sup> Microsoft Research Redmond, USA, [{sburckha,madanm}@microsoft.com](mailto:{sburckha,madanm}@microsoft.com)

| Memory Model                        | Name | Reach. Problem    |
|-------------------------------------|------|-------------------|
| $\{w \rightarrow r, RLWE\}$         | TSO  | decidable [3]     |
| $TSO \cup \{w \rightarrow w\}$      | -    | decidable [3]     |
| $TSO \cup \{w \rightarrow w, RWF\}$ | PSO  | decidable [new]   |
| $PSO \cup \{r \rightarrow r\}$      | NSW  | decidable [new]   |
| $TSO \cup \{r \rightarrow r/w\}$    | -    | undecidable [3]   |
| $TSO \cup \{r \rightarrow w\}$      | -    | undecidable [new] |
| $NSW \cup \{RRWE\}$                 | -    | undecidable [new] |

**Abstract.** We investigate the decidability of the state reachability problem in finite-state programs running under weak memory models. In [3], we have shown that this problem is decidable for TSO and its extension with the write-to-write order relaxation, but beyond these models nothing is known to be decidable. Moreover, we have shown that relaxing the program order by allowing reads or writes to overtake reads leads to undecidability.

In this paper, we refine these results by sharpening the (un)decidability frontiers on both sides. On the positive side, we introduce a new memory model NSW (for non-speculative writes) that extends TSO with the write-to-write relaxation, the read-to-read relaxation, and support for partial fences. We present a backtrack-free operational model for NSW, and prove that it does not allow causal cycles (thus barring pathological out-of-thin-air effects). On the negative side, we show that adding the read-to-write relaxation to TSO causes undecidability, and that adding non-atomic writes to NSW also causes undecidability.

Our results establish that NSW is the first known hardware-centric memory model that is relaxed enough to permit both delayed execution of writes and early execution of reads for which the reachability problem is decidable.

## 1 Introduction

The memory consistency model (or simply, the memory model) of a shared-memory multiprocessor is a low-level programming abstraction that defines when and in what order writes performed by one processor become visible to other processors. The simplest memory model, sequential consistency [16], requires that the operations performed by the processors should appear as if these operations are interleaved in a consistent global order. Despite its simplicity and appeal, most contemporary hardware platforms support Weak (relaxed) Memory Models for performance reasons [2, 13].

The effects of weak memory models can be counterintuitive and difficult to understand even for very small programs. Not surprisingly, relaxed memory models are an active research area today. Much progress has been made to aid programmers in the

**End of day 3**