

# EE271 - Assignment 4 Write-up

## Catching Difficult Bugs with A-QED

Khushal Sethi

Gongqi Li

### 1 Introduction

In this part of the project, we build an A-QED module and connect it to our rasterizer to perform formal verification. The A-QED module basically sends two inputs, one original and one duplicate, to the rasterizer and monitor the output signal. To ensure that the code is bug-free, the necessary condition is that for the same inputs, the rasterizer should always give out the same outputs.

So we run the BMC together with our A-QED module on the CAD tool called Jasper, which can easily visualize the waveform of all the signals and help us trace back to find the origin of any bugs. In this project, **we found two bugs, one associated with the 'tree\_hash.sv' module and the other with the 'hash\_jtree.sv' module.**

Details are shown in section 2.

### 2 Catching Bugs

#### 1. Bug 1: 'hit\_valid\_R18H' violation of the Functional Consistency



- From the waveform above, we see that a bug arises at clk period 15.
- The first output subsequence, which corresponds to the original input, is at clk 10, while the second output subsequence, corresponding to the duplicate input, is at clk 12 -- matching the region where ‘halt\_RnnnnL\_d5’ is low.

However, we also observe that the ‘hit\_valid\_R18H’ is 0 at clk 12, meaning that there is no valid hit for the second subsequence, this is a direct violation of the Functional Consistency.

Then we check if the generation of ‘hit\_valid\_R18H’ contains any bugs.

- After tracing back following the route, ‘hit\_valid\_R18H’ -> ‘sampletest.hit\_valid\_R16H’ -> ‘sampletest.validSamp\_R16H’ -> ‘hash\_jtree.validSamp\_R16H’ -> ‘hash\_jtree.validSamp\_R16H\_retime’ -> ‘hash\_jtree.validSamp\_R14H’, We found a mismatch between ‘validSamp\_R14H’ and the translated ‘validSamp\_R16H’ as shown in the waveform above.
- A detailed investigation of the code showed that the error is within the module ‘hash\_jtree.sv’, where a **very suspicious signal ‘gate\_RnnH’** being the input of the d-flip flop from ‘sample\_R16S\_retime’ to ‘sample\_R16S’, and ‘validSamp\_R16H\_retime’ to ‘validSamp\_R16H’.
- Since the enable **‘gate\_RnnH’ is not always ‘1b1’, there are cases that ‘validSamp\_R16H’ and ‘sample\_R16S’ are not updated as required**, just as what is being **red-circled in the wave form above**.
- A-QED is able to find bugs in this while formal verification cannot, because A-QED uses BMC which is significantly more thorough.
- A-QED found a short counterexample to this error, which was not found by running test vectors in verification earlier or in formal verification, **no formal verification properties were checking ‘validSamp\_R16H\_retime’ to ‘validSamp\_R16H’ and test vectors do not cover this case when inputs arrive concurrently under stress**.

**Code fix :**

```

  ... ...
  ... @@ -248,7 +248,8 @@ module hash_jtree
  248 248      (
  249 249          .clk    (clk),
  250 250          .reset (rst),
  251 251      - .en    (!(!gate_RnnH & Gate_RnnH)),
  251 251      + // .en    (!(!gate_RnnH & Gate_RnnH)),
  252 252      + .en    (1'b1),
  253 253          .in    (sample_R16S_retime),
  254 254          .out   (sample_R16S)
  255 255      );
  ... ...
  ... @@ -262,7 +263,8 @@ module hash_jtree
  262 263      (
  263 264          .clk    (clk),
  264 265          .reset (rst),
  265 265      - .en    (!(!gate_RnnH & Gate_RnnH)),
  266 266      + // .en    (!(!gate_RnnH & Gate_RnnH)),
  267 267      + .en    (1'b1),
  268 268          .in    (validSamp_R16H_retime),
  269 269          .out   (validSamp_R16H)
  270 270      );
  ... ...

```

## 2. Bug 2: Extrinsic signals ‘Mask\_RnnH’ and ‘mask\_R1H’



- After fixing the previous bug, we reran the BMC and did see that **the mismatch between ‘validSamp\_R14H’ and ‘validSamp\_R16H’ disappeared as the waveform shown above.** However, in this case, ‘hit\_valid\_R18H’ did not return a high for the first subsequence, while only returning high for the second subsequence.

- As a result, the 'hit\_R18S\_orig\_sequence' is always 0. We then try to figure out why the first peak of 'hit\_valid\_R18H' disappeared.
- We trace back following the route:**  
 'hit\_valid\_R18H' -> 'sampletest.hit\_valid\_R16H' -> 'sampletest.edge\_chk1\_R16' ->  
 'sampletest.dist\_lg\_R16S' -> 'sampletest.edge\_R16S' -> 'sampletest.tri\_shift\_R16S' ->  
 'sampletest.sample\_R16S' -> 'hash\_jtree.sample\_R16S' ->hash\_jtree.sample\_R16S\_retime'  
 -> 'hash\_jtree.sample\_jitted\_R14S' -> 'hash\_jtree.jitt\_val\_R14H' -> 'tree\_hash.out\_RnnH'
- As illustrated in the waveform bellowed, a **mismatch arises traveling from 'sample\_R14S' to 'sample\_jitted\_R14S'**. The reason is that the module for the **jittering process does not give out the same output for the same input**.
- Looking into the **'tree\_hash.sv'** module, we found that the **output is dependent on extrinsic signals 'Mask\_RnnH' and 'mask\_R1H'. We thus removed these dependencies**.
- A-QED found a short counter example where **'Mask\_RnnH' and 'mask\_R1H' cause the output to change from 'sample\_R14S' to 'sample\_jitted\_R14S'. After some clock cycles 'mask\_R1H' changes the output bits**, when both are issued concurrently.
- This bug was not found in formal verification or in test\_vector verification because the properties of 'sample\_R14S' to 'sample\_jitted\_R14S' are not formally checked under concurrent execution. The test vectors also don't cover cases of scenarios where 'Mask\_RnnH' and 'mask\_R1H' combine to differ the output.



## Code fix :

rtl/tree\_hash.sv

View file @ 99ab7d05

```
@@ -89,8 +89,8 @@ module tree_hash
...
89   89     assign arr16_RnnH[7:0] = arr32_RnnH[7:0] ^ arr32_RnnH[23:16] ; // 0 = 0 ^ 2
90   90     assign arr16_RnnH[15:8] = arr32_RnnH[15:8] ^ arr32_RnnH[31:24] ; // 1 ^ 3
91   91
92   92     assign out_RnnH[OUT_WIDTH-1:0] = (( arr16_RnnH[7:0] ^ arr16_RnnH[15:8] ) & mask_RnnH[7:0]) | (Mask_RnnH
93   93       & mask_R1H);
+     assign out_RnnH[OUT_WIDTH-1:0] = (( arr16_RnnH[7:0] ^ arr16_RnnH[15:8] ) & mask_RnnH[7:0]);
+     // assign out_RnnH[OUT_WIDTH-1:0] = (( arr16_RnnH[7:0] ^ arr16_RnnH[15:8] ) & mask_RnnH[7:0]) |
94   94       (Mask_RnnH & mask_R1H);
95   95       dff #(
96   96         .WIDTH(1),
...
...
```