# Lecture 3: Verification of Weak Memory Models Part 1: State Reachability Problem

Ahmed Bouajjani

LIAFA, University Paris Diderot - Paris 7

[Atig, B., Burckhardt, Musuvathi, POPL'10, ESOP'12] [Atig, B., Parlato, 2011]

VTSA, MPI-Saarbrücken, September 2012

# Sequential Consistency (SC) model

- Parallel processes with shared memory
- Interleaving (Sequentially Consistent) semantics:
  - Computations of different processes are shuffled
  - Program order is preserved for each process.

## Total Store Ordering (TSO)

- Reads can overtake writes on different variables.
- Writes are stored in FIFO buffers to be executed later.
- A read takes its value from the main memory if the buffer of its process holds no write to the same variable. Otherwise it takes the value of the last write to that variable in the buffer.

## Write-to-Read Relaxation

$$\begin{array}{rclcl} P_1 & : & \mathsf{write}(\mathsf{x},1) & ; & \mathsf{read}(\mathsf{y},0) \\ P_2 & : & \mathsf{read}(\mathsf{x},0) \end{array}$$

A scheduling for SC semantics: 3 steps

$$\begin{array}{rclcl} P_1 & : & \mathsf{write}(\mathsf{x},1)_{(2)} & ; & \mathsf{read}(\mathsf{y},0)_{(3)} \\ P_2 & : & \mathsf{read}(\mathsf{x},0)_{(1)} \end{array}$$

Allowing reordering of actions on different variables: 2 steps!

$$\begin{array}{rclcl} P_1 & : & \mathsf{read}(\mathsf{y},0)_{(1)} & ; & \mathsf{write}(\mathsf{x},1)_{(2)} \\ P_2 & : & \mathsf{read}(\mathsf{x},0)_{(1)} \end{array}$$

# Relaxed Models

- Read Local Write Early: write(x,d) ; read(x,d) $\mapsto$ write(x,d)
- (+) $W \rightarrow R$: Write to Read: write(x,d) ; read(y,d') $\mapsto$ read(y,d') ; write(x,d)
  - $\Rightarrow$ TSO model (Total Store Ordering)
- (+) $W \rightarrow W$: Write to Write: write(x,d) ; write(y,d') $\mapsto$ write(y,d') ; write(x,d)
  - $\Rightarrow$ PSO model (Partial Store Ordering)
- (+) $R \rightarrow R/W$: Read to Read/Write
  - $\Rightarrow$ $\sim$ RMO model (Relaxed Memory Ordering)
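The Write to Read and Write to Write rules can be read as swaps of adjacent operations on different variables within one process. The sketch below computes the closure of a single process's operation sequence under a chosen set of swap rules. The names `swap_allowed`, `closure`, `TSO` and `PSO` are illustrative and not from the lecture, and the Read Local Write Early rule is deliberately left out.

```python
from collections import deque

# An operation is a tuple (kind, variable, value); kind is "w" or "r".
TSO = {("w", "r")}               # Write to Read relaxation only
PSO = {("w", "r"), ("w", "w")}   # plus Write to Write


def swap_allowed(first, second, rules):
    """May `second` overtake the adjacent earlier operation `first`?"""
    (kind1, var1, _), (kind2, var2, _) = first, second
    if var1 == var2:
        return False  # operations on the same variable keep their order
    return (kind1, kind2) in rules


def closure(trace, rules):
    """All orderings reachable from `trace` by repeated allowed swaps."""
    start = tuple(trace)
    seen = {start}
    todo = deque([start])
    while todo:
        t = todo.popleft()
        for i in range(len(t) - 1):
            if swap_allowed(t[i], t[i + 1], rules):
                swapped = t[:i] + (t[i + 1], t[i]) + t[i + 2:]
                if swapped not in seen:
                    seen.add(swapped)
                    todo.append(swapped)
    return seen


p1 = [("w", "x", 1), ("r", "y", 0)]
for ordering in sorted(closure(p1, TSO)):
    print(ordering)
```

For `p1` the closure under `TSO` contains both the program order and the order in which the read of y comes first, which is the reordering used in the 2-step scheduling above. Two writes to different variables stay in order under `TSO` and may be swapped under `PSO`.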

Dekker's mutual exclusion protocol. Fails under Write to Read relaxation.

Initially x = y = 0.

| thread 1               | thread 2               |
|------------------------|------------------------|
| a: $y = 1$             | p: $x = 1$             |
| b: $r_1 = x$           | q: $r_2 = y$           |
| c: if ($r_1 == 0$) {   | s: if ($r_2 == 0$) {   |
| d:                     | t:                     |
| }                      | }                      |

| Step                           | $pc_1$ | $pc_2$ | $r_1$ | $r_2$ | buffer of thread 1 | buffer of thread 2 | shared memory |
|--------------------------------|--------|--------|-------|-------|--------------------|--------------------|---------------|
| 1- Initial state               | a      | p      | ?     | ?     | empty              | empty              | x = 0, y = 0  |
| 2- Writes are postponed        | b      | q      | ?     | ?     | w(y, 1)            | w(x, 1)            | x = 0, y = 0  |
| 3- Reading from memory         | c      | s      | 0     | 0     | w(y, 1)            | w(x, 1)            | x = 0, y = 0  |
| 4- Accessing critical sections | d      | t      | 0     | 0     | w(y, 1)            | w(x, 1)            | x = 0, y = 0  |
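The four steps can be replayed mechanically. The following is a minimal sketch, not the lecture's own tool: it explores all interleavings of the two threads, either with immediate writes (SC) or with one FIFO store buffer per thread (Write to Read relaxation), and reports whether both threads can be at d and t at the same time. The state encoding and the names `put` and `both_in_critical_section` are assumptions made for illustration.

```python
def put(tup, i, value):
    """Return a copy of tuple `tup` with position `i` replaced by `value`."""
    return tup[:i] + (value,) + tup[i + 1:]


def both_in_critical_section(use_buffers):
    """Exhaustively explore the protocol; True if (d, t) is reachable."""
    own, other = ("y", "x"), ("x", "y")  # flag written / flag read by each thread
    # pc: 0 = a/p, 1 = b/q, 2 = c/s, 3 = d/t, 4 = critical section skipped
    init = ((0, 0), (None, None), ((), ()), (("x", 0), ("y", 0)))
    seen, todo = {init}, [init]
    while todo:
        pcs, regs, bufs, mem = todo.pop()
        if pcs == (3, 3):
            return True
        memory = dict(mem)
        succs = []
        for t in (0, 1):
            if pcs[t] == 0:  # write own flag
                if use_buffers:
                    buf = bufs[t] + ((own[t], 1),)
                    succs.append((put(pcs, t, 1), regs, put(bufs, t, buf), mem))
                else:
                    new_mem = tuple((v, 1 if v == own[t] else d) for v, d in mem)
                    succs.append((put(pcs, t, 1), regs, bufs, new_mem))
            elif pcs[t] == 1:  # read the other flag (own buffer first)
                pending = [d for v, d in bufs[t] if v == other[t]]
                value = pending[-1] if pending else memory[other[t]]
                succs.append((put(pcs, t, 2), put(regs, t, value), bufs, mem))
            elif pcs[t] == 2:  # test the register
                nxt = 3 if regs[t] == 0 else 4
                succs.append((put(pcs, t, nxt), regs, bufs, mem))
            if bufs[t]:  # memory update: commit the oldest buffered write
                var, val = bufs[t][0]
                new_mem = tuple((v, val if v == var else d) for v, d in mem)
                succs.append((pcs, regs, put(bufs, t, bufs[t][1:]), new_mem))
        for s in succs:
            if s not in seen:
                seen.add(s)
                todo.append(s)
    return False


print(both_in_critical_section(use_buffers=False))  # False
print(both_in_critical_section(use_buffers=True))   # True
```

With buffers enabled, the search finds exactly the run shown in the table: both writes stay in the buffers while both reads return 0 from memory.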

# Memory Reordering Fences

- Write-Write Fences (wfence): prevent reordering between writes.
- Read-Read Fences (rfence): prevent reordering between reads.
- Fences (fence): prevent reordering between any two memory operations.

### Program Syntax

- Finite number of shared variables  $\{x, y, x_1...\}$
- Finite data domain  $\{d, d_1, d_2, ...\}$
- Finite number of finite-control processes  $P_1, \ldots, P_n$  with operations:

 $Write(x, d), Wfence, Read(x, d), Rfence, AtomicRW(x, d_1, d_2)$ 

# Safety Verification Problem

For a memory model $\mu$, a program P, and a (control + memory) state s:

- State Reachability Problem (Safety)
  - Is s reachable in P under $\mu$?
- Decidability / Complexity?
  - Each process is finite-state
  - For the SC memory model, this problem is PSPACE-complete
  - Nontrivial for weak memory models: $Paths_{\mu}(P) = Closure_{\mu}(Paths_{SC}(P))$ is nonregular

- The state reachability problem is decidable for TSO.
- ... but highly complex: Nonprimitive recursive
- The repeated state reachability problem is undecidable for TSO
- $\rightarrow$ Store buffers can simulate lossy channels, and vice versa.

Decidability Frontier [Atig, B., Burckhardt, Musuvathi, 2012]

• The state reachability problem is undecidable for TSO + R2W

• The state reachability problem is decidable for NSW = TSO + W2W + R2R

- When is it possible to reduce TSO verification to SC verification?
- Find restrictions on the explored behaviors such that, given a concurrent program P, it is possible to build a concurrent program P' for which running P with TSO semantics under these restrictions is equivalent to running P' with SC semantics.
- A notion of Context-Bounded Analysis for TSO
- Unbounded number of context switches: bound the age of each write in the buffer in terms of context switches.
- $\Rightarrow$ Transfer decidability/complexity results from SC to TSO.
- $\Rightarrow$ Use existing tools for concurrent programs under SC.

#### The rest of the lecture

- Decidability and complexity for TSO: *Simulations by/of Lossy Channel Systems*
- Decidability and complexity beyond TSO:
  - Speculative writes lead to undecidability
  - Decidability: deal with reordered reads
- From TSO to SC under bounded analysis
  - 2 notions of bounds
  - Store buffers $\leadsto$ 2K copies of the globals per thread
### An operational model for TSO

- Each process has a FIFO buffer
- Configuration = control states + memory state + buffers contents
- Write(x,d) is sent to the buffer
- Memory update = execution of a Write taken from some buffer
- Read(x,d) can be executed if either
  - the last Write to x in the buffer is Write(x,d) (Read Own Write), or
  - the buffer does not contain a Write to x, and Memory(x) = d
- AtomicRW(x,  $d_1$ ,  $d_2$ ) requires that the buffer is empty (~ fence)

Example:

$$\text{Thread 1: } p_0 \xrightarrow{w(x,1)} p_1 \xrightarrow{w(y,1)} p_2 \xrightarrow{w(x,2)} p_3 \xrightarrow{w(y,2)} p_4 \xrightarrow{w(y,3)} p_5$$

$$\text{Thread 2: } q_0 \xrightarrow{r(x,2)} q_1 \xrightarrow{r(y,0)} q_2$$

Model: The store buffers are considered as perfect FIFO channels

[Animation frames omitted: the store buffer of Thread 1 holds its pending writes, newest on the left; for instance it contains w(y,3) w(y,2) w(x,2) when the memory holds x = 1.]

#### Deadlock

Thread 2 can execute r(x,2) only after w(x,2) has reached the memory. By FIFO order, w(y,1) has reached the memory earlier, so r(y,0) is never enabled and Thread 2 is stuck at $q_1$.
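The operational model can be turned into a small explicit-state search. The sketch below is an illustration under simplifying assumptions: all variables start at 0, threads are straight-line sequences of writes and reads, and AtomicRW and fences are not modelled. The function name `reachable_controls` and the encoding of control states as operation counters are hypothetical. Run on the two-thread example, it confirms that $q_1$ is reachable while $q_2$ is not.

```python
def reachable_controls(threads, variables):
    """Set of reachable program-counter tuples under the buffer model.

    threads: list of operation lists; an operation is ("w", var, val)
    or ("r", var, val). A pc counts the operations already executed.
    """
    n = len(threads)
    init = ((0,) * n, ((),) * n, tuple((v, 0) for v in variables))
    seen, todo = {init}, [init]
    while todo:
        pcs, bufs, mem = todo.pop()
        memory = dict(mem)
        succs = []
        for t in range(n):
            if pcs[t] < len(threads[t]):
                kind, var, val = threads[t][pcs[t]]
                new_pcs = pcs[:t] + (pcs[t] + 1,) + pcs[t + 1:]
                if kind == "w":
                    # Write: append to the thread's own FIFO buffer
                    buf = bufs[t] + ((var, val),)
                    succs.append((new_pcs, bufs[:t] + (buf,) + bufs[t + 1:], mem))
                else:
                    # Read: last buffered write to var, else main memory
                    pending = [d for v, d in bufs[t] if v == var]
                    current = pending[-1] if pending else memory[var]
                    if current == val:
                        succs.append((new_pcs, bufs, mem))
            if bufs[t]:
                # Memory update: execute the oldest write of this buffer
                bvar, bval = bufs[t][0]
                new_mem = tuple((v, bval if v == bvar else d) for v, d in mem)
                rest = bufs[t][1:]
                succs.append((pcs, bufs[:t] + (rest,) + bufs[t + 1:], new_mem))
        for s in succs:
            if s not in seen:
                seen.add(s)
                todo.append(s)
    return {pcs for pcs, _, _ in seen}


thread1 = [("w", "x", 1), ("w", "y", 1), ("w", "x", 2), ("w", "y", 2), ("w", "y", 3)]
thread2 = [("r", "x", 2), ("r", "y", 0)]
controls = reachable_controls([thread1, thread2], ["x", "y"])
print(any(pc2 == 1 for _, pc2 in controls))  # True: q1 is reachable
print(any(pc2 == 2 for _, pc2 in controls))  # False: q2 is not
```

The search terminates here only because the threads are loop-free, so the buffers stay bounded; for programs with loops the buffers are unbounded, which is why the lecture goes through lossy channel systems instead.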













## From $\mathsf{W}\to\mathsf{R}$ systems to Lossy Channel Systems

- Buffer = perfect FIFO channel, e.g. w(y,3) w(y,2) w(x,2) w(y,1) w(x,1)
- Channel = sequence of memory states + lossiness, e.g. (x = 2, y = 3) (x = 2, y = 2) (x = 2, y = 1)
- Lossiness = unobservable memory states

[Animation frames omitted: the buffered writes of Thread 1 are shown next to the corresponding sequence of memory states in the channel.]


- *Write:* Compute a new memory state; send it to the channel
- *Read:* Check the channel/memory
- *Memory update:* Receive a state; copy it to the memory
- Problem: interference between processes?
  - $\Rightarrow$ Each process guesses occurrences of writes by other processes
  - *Guessed write:* Send the guessed state to the channel
  - $\Rightarrow$ Check that all processes agree on the sequence of states: synchronization of the lossy channel machines over send actions
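To make the channel encoding concrete for a single process, the sketch below turns a sequence of buffered writes into the sequence of memory states they produce, and checks the lossy-channel condition that what is received is a subsequence of what was sent. The function names `memory_states` and `is_lossy_image` are hypothetical, and the guessing of other processes' writes is not modelled.

```python
def memory_states(initial, writes):
    """Memory states obtained by applying `writes` oldest-first to `initial`."""
    states, current = [], dict(initial)
    for var, val in writes:
        current = {**current, var: val}  # new state; earlier states are unchanged
        states.append(current)
    return states


def is_lossy_image(received, sent):
    """True if `received` is a subsequence of `sent`.

    A lossy channel may drop messages but never reorders or invents them.
    """
    remaining = iter(sent)
    return all(any(r == s for s in remaining) for r in received)


writes = [("x", 1), ("y", 1), ("x", 2), ("y", 2), ("y", 3)]
sent = memory_states({"x": 0, "y": 0}, writes)
for state in sent:
    print(state)

# Dropping intermediate states is allowed: they were simply never observed.
print(is_lossy_image([sent[1], sent[4]], sent))  # True
# Reordering states is not allowed.
print(is_lossy_image([sent[4], sent[1]], sent))  # False
```

The last three states of `sent` are the ones in which x = 2, with y taking the values 1, 2 and 3 in that order. A lost state corresponds to a memory update that was immediately followed by another update before any process looked at the memory.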

Decidability for the State Reachability Problem

• Thm: The state reachability problem for TSO programs is reducible to the control-state reachability problem for LCS.

• Thm ([Abdulla, Jonsson, 1993]): The control-state reachability problem for LCS is decidable.

• Corollary: The state reachability problem for TSO systems is decidable.

# From Lossy Channel Systems to $\mathsf{W}\to\mathsf{R}$ systems



- $T_1$ simulates the lossy channel machine:
  - Send operation: Write operation of $T_1$ to the variable x
  - Read operation: Read operation of $T_1$ from the variable y
- $T_2$ transfers the successive values of the variable x to the variable y
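The roles of $T_1$ and $T_2$ can be illustrated with a scripted run on a toy TSO machine. This is only a sketch under assumptions: `TSOMachine`, `demo`, the messages `a`, `b`, `c`, and the chosen schedule are hypothetical, and the sketch does not address how a value copied twice by $T_2$ would be told apart from a message sent twice.

```python
from collections import deque


class TSOMachine:
    """Toy TSO machine: shared memory plus one FIFO store buffer per thread."""

    def __init__(self, variables, threads):
        self.memory = {v: None for v in variables}
        self.buffers = {t: deque() for t in threads}

    def write(self, thread, var, val):
        self.buffers[thread].append((var, val))

    def read(self, thread, var):
        # Own buffered writes are visible first, then main memory.
        for v, val in reversed(self.buffers[thread]):
            if v == var:
                return val
        return self.memory[var]

    def flush(self, thread):
        var, val = self.buffers[thread].popleft()
        self.memory[var] = val


def demo():
    m = TSOMachine(["x", "y"], ["T1", "T2"])
    received = []
    for msg in ("a", "b", "c"):
        m.write("T1", "x", msg)            # T1 sends a, b, c by writing x
    m.flush("T1")                          # memory: x = a
    m.flush("T1")                          # memory: x = b, nobody read a
    m.write("T2", "y", m.read("T2", "x"))  # T2 copies b from x to y
    m.flush("T2")
    received.append(m.read("T1", "y"))     # T1 receives b
    m.flush("T1")                          # memory: x = c
    m.write("T2", "y", m.read("T2", "x"))  # T2 copies c from x to y
    m.flush("T2")
    received.append(m.read("T1", "y"))     # T1 receives c
    return received
```

In this schedule the value `a` is overwritten in memory before $T_2$ reads x, which plays the role of a lost message.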

# Complexity

• Thm

Every LCS can be simulated by a TSO program.

• Thm ([Schnoebelen, 2001])

The control-state reachability problem for LCS is non-primitive recursive.

$\Rightarrow$ Non-primitive recursive lower bound for the state reachability problem under TSO.

# TSO + R2W: Causality cycles



- This behavior is possible since writes can overtake reads:
   (2), (3), (4), (1)
- Speculative writes  $\Rightarrow$  causality cycles
  - (2) is executed assuming that (1) will be executed in the future
  - (1) is indeed executed, but it is based on a write that depends on (2)
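Since the figure is not part of the text, the following sketch assumes a two-process program that matches the description: P1 runs (1) read(x, 1) then (2) write(y, 1), and P2 runs (3) read(y, 1) then (4) write(x, 1), starting from x = y = 0. The program, the names `OPS`, `executable` and `respects_program_order`, and the simplification that writes reach memory immediately are all assumptions for illustration.

```python
from itertools import permutations

# Assumed program (hypothetical reconstruction of the missing figure):
#   P1: (1) read(x, 1) ; (2) write(y, 1)
#   P2: (3) read(y, 1) ; (4) write(x, 1)
OPS = {
    1: ("P1", "read", "x", 1),
    2: ("P1", "write", "y", 1),
    3: ("P2", "read", "y", 1),
    4: ("P2", "write", "x", 1),
}


def executable(order):
    """Run the operations in the given order from x = y = 0."""
    memory = {"x": 0, "y": 0}
    for label in order:
        _, kind, var, val = OPS[label]
        if kind == "write":
            memory[var] = val
        elif memory[var] != val:
            return False  # the read does not see the expected value
    return True


def respects_program_order(order):
    return order.index(1) < order.index(2) and order.index(3) < order.index(4)


all_orders = list(permutations(OPS))
# Program order preserved: no run lets both reads succeed.
sc_runs = [o for o in all_orders if respects_program_order(o) and executable(o)]
# Writes may overtake reads on different variables: some runs succeed.
r2w_runs = [o for o in all_orders if executable(o)]
```

Under these assumptions `sc_runs` is empty, while `r2w_runs` contains the order (2), (3), (4), (1) from the slide.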

# TSO + R2W: Undecidability



Assume that: $u_{i_1}u_{i_2}\cdots u_{i_n} = v_{j_1}v_{j_2}\cdots v_{j_m}$ and $i_1i_2\cdots i_n = j_1j_2\cdots j_m$

$$\begin{array}{rcl} T_1 & : & r(y_2, i_n)\ w(y_1, i_n)\ r(x_2, u_{i_n})\ w(x_1, u_{i_n}) \cdots r(y_2, i_1)\ w(y_1, i_1)\ r(x_2, u_{i_1})\ w(x_1, u_{i_1}) \\ T_2 & : & r(y_1, j_n)\ w(y_2, j_n)\ r(x_1, v_{j_n})\ w(x_2, v_{j_n}) \cdots r(y_1, j_1)\ w(y_2, j_1)\ r(x_1, v_{j_1})\ w(x_2, v_{j_1}) \end{array}$$

# TSO + R2W: Undecidability



Assume that: $u_{i_1}u_{i_2}\cdots u_{i_n} = v_{j_1}v_{j_2}\cdots v_{j_m}$ and $i_1i_2\cdots i_n = j_1j_2\cdots j_m$

$$\begin{array}{rcl} T_1 & : & r(y_2, i_n)\ r(x_2, u_{i_n}) \cdots r(y_2, i_1)\ r(x_2, u_{i_1}) \cdots w(y_1, i_n)\ w(x_1, u_{i_n}) \cdots w(y_1, i_1)\ w(x_1, u_{i_1}) \\ T_2 & : & w(y_2, j_n)\ w(x_2, v_{j_n}) \cdots w(y_2, j_1)\ w(x_2, v_{j_1}) \cdots r(y_1, j_n)\ r(x_1, v_{j_n}) \cdots r(y_1, j_1)\ r(x_1, v_{j_1}) \end{array}$$

 $\Rightarrow$  Reachability TSO + R2W
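The assumption on the words $u_i$, $v_j$ and on the index sequences has the shape of a solution to a Post Correspondence Problem instance. The check below is a small sketch of that condition only; the function name `is_solution` and the sample instance are assumptions chosen for illustration, not taken from the slides.

```python
def is_solution(us, vs, i_seq, j_seq):
    """Check u_{i_1}...u_{i_n} == v_{j_1}...v_{j_m} and i_1...i_n == j_1...j_m."""
    if not i_seq or list(i_seq) != list(j_seq):
        return False
    top = "".join(us[i] for i in i_seq)
    bottom = "".join(vs[j] for j in j_seq)
    return top == bottom


# Hypothetical instance: three pairs (u_i, v_i) over the alphabet {a, b}.
us = {1: "a", 2: "ab", 3: "bba"}
vs = {1: "baa", 2: "aa", 3: "bb"}
```

For this instance, `is_solution(us, vs, [3, 2, 3, 1], [3, 2, 3, 1])` holds because both sides spell the same word, whereas a single index such as `[1]` does not.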


# NSW: Non Speculative Writes

- TSO = Read-Local-Write-Early + W2R
- PSO = TSO + W2W
- NSW = PSO + R2R
- Simulation of TSO under PSO: add a write-write fence (wfence) before each write
- Simulation of PSO under NSW: add a read-read fence (rfence) before each read
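The two simulations are simple program transformations. A minimal sketch, assuming a program is a list of instruction tuples such as `("write", "x", 1)` or `("read", "y", 0)`; this representation and the function names are hypothetical.

```python
def add_fences(program, before, fence):
    """Insert a (fence,) instruction in front of every instruction of kind `before`."""
    out = []
    for instr in program:
        if instr[0] == before:
            out.append((fence,))
        out.append(instr)
    return out


def tso_under_pso(program):
    # Simulation of TSO under PSO: wfence before each write.
    return add_fences(program, "write", "wfence")


def pso_under_nsw(program):
    # Simulation of PSO under NSW: rfence before each read.
    return add_fences(program, "read", "rfence")
```

Applying `pso_under_nsw(tso_under_pso(p))` composes the two steps, which is one plausible way to run a TSO program `p` under NSW.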





Configuration = control states + memory state + event structures

- Writes on x are inserted after the last reads, wfences, and writes on x (likewise for y).
- Wfences are inserted after the last writes.
- Reads on x are inserted after the last writes/reads on x (likewise for y).
- Fences are performed by a process only when its event structure is empty.
- Writes update the memory when they are minimal.
- Reads are validated w.r.t. the memory when they are minimal.
- Rfences are performed by a process only if there are no pending reads.
- Reads on x are validated immediately with the last write on x (if possible).
- Wfences are removed if they are minimal.



## From Event Structures to Buffers

#### Elimination of Reads

Configuration = control states + event structures + memory history buffer

#### Elimination of Write Fences

Configuration = control states + variable/serial buffers + history buffer

## The State Reachability Problem for NSW

## Decidability of State Reachability

Approach: Well Structured Systems [Abdulla et al., Finkel et al.]

- Well-quasi ordering $\leq$ on configurations: for every infinite sequence $c_0, c_1, c_2, \ldots$, $\exists i < j.\ c_i \leq c_j$
- Monotonicity:
  - $\leq$  is a simulation relation w.r.t. transition relation of the model
- $\Rightarrow$ Backward reachability analysis terminates
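As a concrete instance of such an ordering, the subword ordering on channel contents (a well-quasi ordering over a finite alphabet, by Higman's lemma) can be sketched as follows. Backward analysis manipulates upward-closed sets through their finitely many minimal elements; the helper names `is_subword` and `minimal_basis` are assumptions for this illustration.

```python
def is_subword(small, big):
    """True if `small` is obtained from `big` by deleting symbols."""
    it = iter(big)
    # `symbol in it` advances the iterator, so the order of symbols is respected.
    return all(symbol in it for symbol in small)


def minimal_basis(words):
    """Minimal elements of a finite set of words w.r.t. the subword ordering.

    The result generates the same upward-closed set as `words`.
    """
    basis = []
    # Shorter words first: a proper subword is always strictly shorter.
    for w in sorted(set(words), key=len):
        if not any(is_subword(b, w) for b in basis):
            basis.append(w)
    return basis
```

For example, the basis of `["ab", "ba", "aba"]` keeps only `"ab"` and `"ba"`, since `"aba"` is above both.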

#### Problem: NSW?

- Sub-word ordering on buffers? NSW systems are not monotonic!
- Hard to apply the WSS framework to NSW



## $NSW^+$ systems

- NSW $\equiv$ NSW<sup>+</sup>
- NSW<sup>+</sup>: WSS wrt $\preceq$
- Single serial buffer
- Each message in the serial buffer contains a snapshot of memory
- Unbounded buffers but lossy
- Processes have different views of memory (the use of pointers)

| Control states | Variable Buffers | Single Serial Buffer | Memory History Buffer |
|----------------|------------------|----------------------|-----------------------|
| $p_0$, $q_0$ | w(x, 2), w(y, 0) | (x = 1, y = 1) P2: y; (x = 1, y = 0) P1: x | (x = 0, y = 0) P1, P2: x, y |

### State Reachability: Under-approximate analysis

- What is a suitable bounding notion?
- Should allow a compositional reduction to SC
- Should avoid representing the contents of store buffers
### K-round Reachability



### Compositional Reasoning



#### Encoding Store Buffers: The View of a Process



Mask : Var  $\rightarrow \{0,1\}$

Queue : Var  $\rightarrow \mathbb{D} \cup \{1\}$

### Simulating Round 1



### Simulating Round 2




### Bounding Store Ages



Translation: *Mask<sub>j</sub>* and *Queue<sub>j</sub>* are used circularly (modulo K + 1).

#### Consequences

- *K*-round reachability is decidable for boolean concurrent programs with recursive procedure calls.
- *K*-store-age reachability is decidable for boolean concurrent programs with finite-state threads (without recursion).
- These results also hold for programs with a parametric/dynamic number of threads (reduction to coverability in Petri nets, using [Atig, B., Qadeer, 2009] for programs with recursion).
- It is possible to use existing tools for the analysis/verification/testing of concurrent programs under SC.

### State Reachability: Conclusion

- State reachability: decidable for TSO and beyond; undecidable when speculative writes are allowed.
- But it is a hard problem (non-primitive recursive when decidable)!
- However, it is possible to have efficient analysis techniques
- Reduction to SC is a promising idea, can be generalized beyond TSO
- Abstraction-based techniques: e.g., [Kuperstein, Vechev, Yahav, PLDI'11]
- Symbolic techniques: [Abdulla, Atig, Chen, Leonardsson, Rezine, TACAS'12], [Linden, Wolper, SPIN'10-11]
- Other important models: PowerPC, ARM (hardware), C++