

# Hardware/Software Co-verification Using Path-based Symbolic Execution

Rajdeep Mukherjee  
Cadence Design Systems, USA  
mrajdeep@cadence.com

Saurabh Joshi  
IIT, Hyderabad  
sbjoshi@iith.ac.in

John O’Leary  
Intel Corporation, USA  
john.w.oleary@intel.com

Daniel Kroening  
University of Oxford, UK  
kroening@cs.ox.ac.uk

Tom Melham  
University of Oxford, UK  
tom.melham@cs.ox.ac.uk

## ABSTRACT

Conventional tools for formal hardware/software co-verification use bounded model checking techniques to construct a single monolithic propositional formula. Formulas generated in this way are extremely complex and contain a great deal of irrelevant logic, hence are difficult to solve even by the state-of-the-art Satisfiability (SAT) solvers. In a typical hardware/software co-design the firmware only exercises a fraction of the hardware state-space, and we can use this observation to generate simpler and more concise formulas. In this paper, we present a novel verification algorithm for hardware/software co-designs that identify partitions of the firmware and the hardware logic pertaining to the feasible execution paths by means of path-based symbolic simulation with custom path-pruning, property-guided slicing and incremental SAT solving. We have implemented this approach in our tool COVERIF. We have experimentally compared COVERIF with HW-CBMC, a monolithic BMC based co-verification tool, and observed an average speed-up of 5 $\times$  over HW-CBMC for proving safety properties as well as detecting critical co-design bugs in an open-source Universal Asynchronous Receiver Transmitter design and a large SoC design.

## 1 INTRODUCTION

In modern embedded system development, software and hardware components are designed and implemented in parallel. Hardware/software *co-verification* is performed throughout the design cycle to ensure that both components work correctly together.

Before *Register Transfer Level* (RTL) code exists for the hardware components, engineers write abstract models of the proposed hardware; such models are commonly known as *Transaction Level Models* (TLM) [2]. TLMs are typically implemented using the SystemC TLM library [20] or as plain C programs. TLMs capture enough functionality of the hardware (HW) to enable executing and debugging of the software (SW) or firmware (FW) before the RTL is available [2, 3], but TLMs are always incomplete. Co-verification of the TLM and the SW is typically performed by testing [1, 2]. We use the term FW and SW interchangeably in this paper.

Once RTL coding for the hardware components is complete (that is, post-RTL), hardware/software co-verification becomes more complex. Unlike the TLM, which only captures limited design functionality, the RTL code describes the cycle-accurate behavior of the final HW, and contains many extra-functional artefacts related to

power, area, and timing. Because of the RTL’s detail and complexity, the effectiveness of testing is severely limited in post-RTL co-verification. Formal verification is mandatory to ensure correctness. Note that whenever we refer to “hardware” from this point onwards, we mean an RTL implementation and not a TLM.

The verification of SW written in C/C++ together with RTL coded in Verilog/VHDL is very challenging. First, there is a timing mismatch between the synchronous clock-driven HW model and asynchronous event-driven SW model. For example, the FW running on a processor could be much faster or slower than the HW model it interacts with. Second, there are no standard languages or techniques for specifying properties of HW/SW co-designs. Third, the hardware is highly concurrent and the software are frequently multi-threaded; leading to a large number of event interleavings to analyze. Finally, there are few automated formal HW/SW co-verification tools that support co-designs implemented in C/C++ and Verilog.

Recently, Mukherjee et al. [15] presented a formal HW/SW co-verification tool, called HW-CBMC, that constructs a combined HW-SW model through in-tandem symbolic execution of the SW and the RTL code. SAT-based Bounded Model Checking (BMC) [6] is used to prove safety properties of the combined model. We will refer to the combined HW-SW model as the *co-verification model*.

In this paper, we build on the observation that monolithic BMC of the co-verification model leads to propositional SAT formulas containing much irrelevant logic. The size and complexity of the formulas pose difficulties for SAT solvers, making the approach ineffective for practical co-verification problems. An effective and practical co-verification solution must reason only about “relevant” interactions between the SW and HW. The notion of relevance stems from a few sources: 1) the property or *co-specification* to be proved, 2) the behavior of the software, and 3) environmental assumptions. First, the co-specification in a HW/SW co-design captures the design intent that is to be verified. A scalable HW/SW co-verification tool need only check those parts of the co-verification model that pertain to the given co-specification model. Second, the SW in a typical HW/SW co-design only exercises a fragment of the HW state-space [8, 18, 19]. Formal tools may use this fact to verify only the HW functionality exercised by the SW, ignoring or abstracting the rest. This approach can generate much simpler and concise formulas than those arising in monolithic BMC – most importantly, formulas that are more readily solved by state-of-the-art SAT solvers. Finally, assumptions about the environment of a HW/SW co-design may be exploited to further constrain the verification state-space.

In this paper, we present a novel verification algorithm for HW/SW co-designs that identifies partitions of the SW and the HW logic pertaining to the feasible execution paths by means of path-based symbolic execution with custom path-pruning, property-driven slicing, and incremental SAT solving (see Section 3). We employ these techniques in our tool COVERIF to demonstrate that the domain-specific optimizations in COVERIF lead to more scalable reasoning for HW/SW co-designs compared to HW-CBMC.

## 1.1 Contributions

In this paper, we present a novel verification algorithm for HW/SW co-designs called, COVERIF, using path-based symbolic simulation with custom path-pruning, property-guided slicing, and incremental SAT solving techniques. COVERIF supports HW designs in Verilog RTL (IEEE SystemVerilog 2005 standards) and SW in ANSI-C (C89, C99 standards). We experimentally compare two approaches for formal HW/SW co-verification – 1) the *monolithic* approach used in HW-CBMC, and 2) the *path-based* approach of COVERIF. We study an open-source Universal Asynchronous Receiver Transmitter (UART) design and a large SoC design, and we find that COVERIF is 5× faster than HW-CBMC for proving safety properties as well as for detecting critical co-design bugs.

## 2 WORKING EXAMPLE

Figure 1 gives a fragment of a SW driver for a UART design. The main module of the UART SW, shown on the right side of Figure 1, begins by resetting the UART HW which is followed by a *wb\_idle()* function (explained next). The SW implements Linux style *inb()* and *outb()* functions which further invoke the *wb()* class of functions to communicate with the UART HW. The SW then configures the UART in *loopback* mode using *outb()* function calls. The *wb()* class of *interface functions*, shown on the left side of Figure 1, communicate with the *wishbone* bus interface to set/reset (wiggle) the UART input ports and read/write data through the bus interface. The calls to the top-level UART module is represented by *UART\_top()* function. We verify that the transmitted data is the same as the received data in the *loopback* mode. To do so, we place an assertion given by the *assert()* statement (marked in red) inside the main logic of the UART SW, on the right-side of Figure 1.

## 3 PROPOSED METHODOLOGY

Figure 2 shows our proposed verification methodology, as implemented in COVERIF. We now describe each step in detail.

**Step 1: Generating Software Netlist from HW** A HW circuit, given in Verilog RTL, is automatically synthesized into a cycle-accurate and bit-precise *software netlist* [14, 16, 17] model following synthesis semantics. The software netlist model is represented as a C program<sup>1</sup> which retains the word-level structure as well as module hierarchy of the Verilog RTL. In contrast with conventional RTL synthesis into a netlist, our software netlist exists solely to facilitate hardware/software co-verification. Figure 3 shows an example of Verilog RTL circuit containing both sequential and combinational elements. Column 2,3 in Figure 3 shows the formal semantics of the Verilog RTL and the synthesized HW respectively. The equivalent software netlist model is shown in column 4.

<sup>1</sup><http://www.cprover.org/hardware/v2c/>

| Wishbone Interface                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | Main Module                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <pre> <b>typedef unsigned char u8;</b> <b>unsigned char inb</b> <b>(unsigned long port)</b> {     <b>return wb_read(port);</b> } <b>void outb</b> (<b>u8 value,</b> <b>unsigned long port</b>) {     <b>wb_write(port, value);</b> } <b>void wb_reset</b>(<b>void</b>) {     <b>rst_i = 1;</b>     <b>UART_top();</b>     <b>rst_i = 0;</b>     <b>stb_i = 0;</b>     <b>cyc_i = 0;</b> } <b>void wb_idle</b>() {     <b>UART_top();</b> } <b>void wb_write</b>(<b>_u32 addr,</b>     <b>_u8 b</b>) {     <b>addr_i = addr;</b>     <b>dat_i = b;</b>     <b>we_i = 1;</b>     <b>cyc_i = 1;</b>     <b>stb_i = 1;</b>     <b>UART_top();</b>     <b>we_i = 0;</b>     <b>cyc_i = 0;</b>     <b>stb_i = 0;</b> } </pre> | <pre> <b>int main() {</b>     <b>wb_reset();</b>     <b>wb_idle();</b>     // Configure the uart     <b>outb(0x13, UART_MC);</b>     <b>outb(0x80, UART_CM3);</b>     <b>outb(0x00, UART_CM2);</b>     <b>outb(0x00, UART_CM1);</b>     <b>outb(0x00, UART_CR);</b>     <b>outb(0x03, UART_IE);</b>     <b>char tx_b[] = "Hello_world";</b>     <b>_u8 status = 0;</b>     <b>char rx_b[100];</b>     <b>int i=0,c=0,d=0;</b>     // data transfer in loopback     <b>for (i=0; i&lt;1990; i++) {</b>         <b>if (irq_o){</b>             <b>status=inb(UART_IS)&amp;0x0c;</b>             <b>if (istatus==0xc){</b>                 //it was a tx_empty interrupt                 <b>outb(*(&gt;tx_b+c),UART_TR); c++;</b>             <b>}</b>         <b>else{ //status==0x4</b>             //it was an rx_data interrupt             <b>rx_b[d] = inb(UART_TR); d++;</b>         <b>}</b>     <b>}</b>     <b>// no interrupt.</b>     <b>wb_idle();</b>     <b>wb_idle();</b> } // property <b>for(i=0; i&lt;=10; i++)</b> <b>assert(rx_b[i] == tx_b[i]);</b> } </pre> |

Figure 1: Software driver of UART



Figure 2: Path-based HW/SW Co-verification Flow in COVERIF

**Step 2: Sequentializing Interactions of Firmware and Software Netlist** Concurrency is a key problem for co-verification. Testing concurrent threads requires exploring all possible interleavings between HW and SW threads. The number of interleavings could potentially be exponential. However, we observe a specific interaction pattern, which resembles a producer-consumer relationship. That is, a FW thread is mostly independent of the HW thread it interacts with [1, 2]. Specifically, a FW thread is only responsible for configuring the HW by writing to memory mapped registers,

| Verilog                                                                                                                                                                                                                                 | Formal Semantics                                                                                                                                                                                                                                                                                                                | Synthesized Hardware | Software netlist                                                                                                                                                                                                                                                                                                                      |
|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <pre> module top(clk, a); input clk, a; reg b,d,e; wire c,cond; assign c = e ? 1'b0:d; assign cond = a; always @(posedge clk) begin   b&lt;=a;   if(cond &amp;&amp; b)     e&lt;=b;   else     e&lt;=0;   d&lt;=c; end endmodule </pre> | <p><b>Combinational Logic</b></p> $\forall t, c(t) = \text{if } e(t) \text{ then } 0 \text{ else } d(t)$ $\forall t, \text{cond}(t) = a(t)$ <p><b>Sequential Logic</b></p> $\forall t, b(t+1) = a(t)$ $\forall t, e(t+1) = \text{if}(\text{cond}(t) \wedge b(t)) \text{ then } b(t) \text{ else } 0$ $\forall t, d(t+1) = c(t)$ |                      | <pre> struct state_elements_top {   unsigned int b, d, e; }; struct state_elements_top u1; void top(_Bool clk, unsigned a) {   _Bool c,cond;   _Bool b_old=u1.b, d_old=u1.d;   _Bool e_old=u1.e;   cond = a;   c = (u1.e)?0:u1.d;   u1.b = a;   if(cond &amp;&amp; b_old)     u1.e = b_old;   else     u1.e = 0;   u1.d = c; } </pre> |

Figure 3: Circuit to Software Netlist

or polling the interrupt status register for data transmission, or receiving incoming data packets. Furthermore, we observe producer-consumer interaction patterns in many practical industrial co-designs from IBM [8, 18, 19], RockBox Media Player [1], and others co-designs. In this paper, we verify co-designs that exhibit producer-consumer interaction behavior. A *co-verification model*,  $M_{seq}$ , is constructed through sequential composition of the FW and its interacting software netlist.

**Step 4: Property Driven Slicing of Co-verification Model** A property-driven slicing is performed on the unified co-verification model,  $M_{seq}$ . This step is purely syntactic, meaning that we perform a backward dependency analysis starting from the property which only preserve those program fragments that are relevant to a given property. The sliced program is then passed to the symbolic execution engine for path-based exploration.

**Step 5: Co-verification Using Path-based Symbolic Execution** Given a co-verification model,  $M_{seq}$ , a scenario,  $S$  typically represented by assume properties, and a co-design property expressed as  $assert(c)$  (where  $c$  is a condition stated in terms of variables in  $M_{seq}$ ) as input, COVERIF performs path-based exploration of  $M_{seq}$  to automatically check its validity using backend solvers. If the condition  $c$  does not hold, then  $M_{seq}$  is said to have violated the property.

A typical path-based symbolic execution engine might explore a path until it come to an  $assert(c)$  statement. This whole path can then be posed as a query to a SAT solver to see if the assertion is violated at that point. If the path is infeasible, the assertion holds trivially. If a large number of paths are infeasible, symbolic execution may waste time exploring them. COVERIF employs an eager infeasibility check to prune infeasible paths, as well as incremental encoding that makes it easier for the underlying SAT solver.

Alg. 1 shows the overall algorithm of COVERIF. States mentioned in the algorithm are all symbolic states, which are quantifier-free predicates characterizing a set of program states. Symbolic execution starts with an initial symbolic state  $I(x)$ , a quantifier-free predicate over program variables  $x$ , and the first statement  $stmt$  to be executed. Note that we assume all program variables have finite bit-width and thus can be represented as bit-vectors. Every statement acts as a state transformer during the symbolic execution.  $worklist$  maintains the set of symbolic states, along with the corresponding  $stmt$  that should execute next. Assumptions can be

Algorithm 1: Co-verification Using Path-based Symbolic Execution

---

```

input :Co-verification Model  $M_{seq}$  with properties specified with
        $assert(c)$ , scenario specified with  $assume(c)$ 
output :The status (Safe or Unsafe) and a counterexample if
       Unsafe
/* The initial state
1  $S_0 \leftarrow I(x)$ 
2  $stmt \leftarrow$  first statement
3  $worklist.put(S_0, stmt)$ 
4 while not  $worklist.empty()$  do
5    $(S, stmt) \leftarrow worklist.get()$ 
6   if  $stmt$  is an  $assume(c)$  then
7      $stmt \leftarrow$  statement after  $assume(c)$ 
8     if  $isFeasible(S \wedge c)$  then  $worklist.put(S \wedge c, stmt)$ 
9   else if  $stmt$  is a branch with condition  $c$  then
10     $stmt_f \leftarrow$  first statement after  $stmt$  if branch is not taken
11     $stmt_t \leftarrow$  first statement after  $stmt$  if branch is taken
12    if  $isFeasible(S \wedge c)$  then  $worklist.put(S \wedge c, stmt_t)$ 
13    if  $isFeasible(S \wedge \neg c)$  then
14       $worklist.put(S \wedge \neg c, stmt_f)$ 
15   else if  $stmt$  is an  $assert(c)$  then
16      $stmt \leftarrow$  statement after  $assert(c)$ 
17     if  $isFeasible(S \wedge \neg c)$  then
18       print Unsafe
19       return Counterexample
20   else  $worklist.put(S \wedge c, stmt)$ 
21 else
22    $S \leftarrow symex(S, stmt)$ 
23    $stmt \leftarrow$  the next statement in control flow after  $stmt$ 
24   if  $stmt \neq \perp$  then  $worklist.put(S, stmt)$ 
25
26 end
27 return Safe
28 end

```

---

specified in the program using  $assume(c)$  statements, where  $c$  is the condition expressed in terms of program variables. Assumptions restrict the search to only those states for which the condition  $c$  holds

at the program point where *assume(c)* is encountered. For example, suppose  $(x! = 0)$  characterizes the set of states that has been discovered to be reachable so far by a verification tool. Here,  $x$  is a program variable. Upon encountering *assume( $x > 0$ )*, the set of states reachable at the point of assumption is shrunk to only those that satisfy  $(x > 0)$ . A user can specify assumptions to restrict the verification to only certain regions of the program’s state space.

COVERIF performs a feasibility check at an *assume* statement or a branch (Line 6 and 9) to ensure that only feasible symbolic states are kept in the *worklist*. This ensures that the infeasibility is detected as early as possible. If an assertion is violated, then a counterexample is detected and Alg. 1 terminates (Line 14). In all other cases, *symex(S, stmt)* performs one step of symbolic execution by executing *stmt* from the symbolic state *S* (Line 22). If no further statement remains to be executed along the path that is being explored, then *stmt* is assigned the value  $\perp$ . The symbolic state is put in the *worklist* only if there are further statements remaining (Line 24).

The feasibility checks shown as *isFeasible* pose a query to the underlying SAT solver. Note that Alg. 1 does not refer to how the methods *worklist.put* and *worklist.get* work. In principle, one can use any search heuristic to select which symbolic state to explore further from the *worklist*. In the current version COVERIF employs a depth first strategy of exploration.

Apart from the eager infeasibility check, another crucial optimization is the use of incremental SAT solving. During the symbolic execution, only one solver instance is maintained while going down a single path. Thus, when making a feasibility check from one branch  $b_1$  to another branch  $b_2$  along a single path, only the program segment from  $b_1$  to  $b_2$  is encoded as a constraint and added to the existing solver instance. This results in speeding up the process of feasibility check of the symbolic state at  $b_2$  as the feasibility check at  $b_1$  was *true*. A new solver instance is used to explore a different path, after the current path is detected as infeasible.

The eager infeasibility check restricts the search to explore only those SW/HW interactions which are feasible under a given scenario. In our experiments, we find this optimization has a large effect on runtimes. Though COVERIF poses many queries to the SAT solver, each query is relatively simple due to two reasons: the resultant formula encodes only a single path, and exploration along a path only needs to encode and solve for the path segment (along with the existing constraints) from the last point of query.

## 4 HW-CBMC: MONOLITHIC HW/SW CO-VERIFICATION

We briefly describe the working of the HW-SW co-verification tool, HW-CBMC. In contrast to the path-based approach, in HW-CBMC, the symbolic execution of HW and SW models are clearly separate and the two flows meet only at the solver phase, where a complex monolithic formula is generated in Conjunctive Normal Form (CNF) from the HW and SW designs. This complex formula is passed on to the solver for verification purpose. Furthermore, HW-CBMC provides specific handshake primitives such as *next\_timeframe()* and *set\_inputs()* to model FW-HW communications.

## 5 PROPERTIES

Lack of support for property specifications in a HW/SW co-design is one of the stumbling blocks for the application of formal techniques in co-verification. We express a co-design property in C language, which is discussed next.

Figure 4 shows an example of a temporal property for the UART HW. The System Verilog Assertion (SVA) is shown on the left and the equivalent property in C semantics is shown on the right. The temporal delay (###2) of the SVA on the left is simulated by the calls to the top-level UART module in the software netlist, which is represented by *UART\_top()* function in the right column. Figure 5

| System Verilog Assertions                                                                      | Assertion (in C)                                                                                      |
|------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------|
| P1: assert property<br>(@(#(posedge clk))<br>ack == 1-><br>(valid == 1 &&<br>##2 empty == 0)); | bool Property_P1() {<br>assert (!ack    valid);<br>UART_top();<br>UART_top();<br>assert (empty==0); } |

Figure 4: Sample property of UART HW

shows few examples of co-design properties that specify the interaction between the FW and HW components of the UART design. In *Property\_P2()*, the SW event *outb(UART\_TR, 0x0c)* triggers the HW event *ack\_o* (marked in bold) after one clock cycle. Whereas in *Property\_P3()*, the antecedent *tx\_empty()* is a HW event and *send\_data* is a SW event (marked in bold).

| HW/SW Co-specification                                                                          |                                                                                          |
|-------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------|
| bool Property_P2() {<br>if (outh(UART_TR,0x0c)){<br>UART_top();<br>assert( <b>ack_o==1</b> );}} | bool Property_P3() {<br>assert(!tx_empty()   <br>(( <b>send_data&amp;0x1==1</b> ));<br>} |

Figure 5: Properties capturing FW/HW interactions

## 6 EXPERIMENTAL RESULTS

We report experimental results for SW-HW co-verification of a UART and a SoC design. All our experiments were performed on an Intel Xeon 3.0 GHz machine with 48 GB RAM. All times reported are in seconds. The timeout for all our experiments was set to 2 hours. The performance of bit-level and word-level flow in HW-CBMC are similar. So, we only report bit-level results for HW-CBMC. MiniSAT-2.2.0 [7] was used as underlying SAT solver with HW-CBMC and COVERIF. The focus of our experiments is to compare the performance of COVERIF against HW-CBMC for verification of an UART design and a SoC design. We distribute our tool COVERIF, along with the HW/SW co-design benchmarks here <sup>2</sup>.

**Comparison with Other HW/SW Co-verification Tools:** Despite extensive use of model checking and other formal methods in SW verification or HW verification domain, building automated HW/SW co-verification tools using formal methods has received little attention in the past. Other than HW-CBMC [15], we are not aware of any other automated formal co-verification tool in the literature that can readily accept co-designs written in C/C++ and Verilog RTL. Hence, we only compare our results with HW-CBMC in this paper.

<sup>2</sup><https://drive.google.com/open?id=1Y2HkfIWwf6YkJgl24OXrA-dG2rX-92bm>

| Circuit                                     | Verilog LOC | Latches(L)/ FF | Input Ports | Output Ports | GATE Count | Firmware (LOC) |
|---------------------------------------------|-------------|----------------|-------------|--------------|------------|----------------|
| Universal Asynchronous Receiver Transmitter |             |                |             |              |            |                |
| UART                                        | 1200        | 356            | 12          | 9            | 413        | 528            |
| System-On-a-Chip                            |             |                |             |              |            |                |
| SoC                                         | 3567        | 840            | 14          | 11           | 945        | 734            |

Table 1: Design statistics for UART and SoC Design

## 6.1 HW/SW co-verification of UART

**About UART:** A UART core is used for asynchronous transmission and reception of data which provides serial communication capabilities with a modem or other external devices. The UART is compliant with industry standards for UART and interfaces with the wishbone bus. The design statistics for UART is shown in Table 1. The UART core is configured in 3 different operating modes, namely—*transmission without interrupt enabled* (Scenario A), *transmission with interrupt enabled* (Scenario B) which transmit non-deterministic data through the serial output while the receiver module is inactive. In Scenario C, the UART is configured in *loopback mode with interrupt enabled* in which both the transmitter and the receiver are active. The data-width varies in each mode, ranging from 8 bits to 64 bits.

**Discussion of Result:** Table 2 reports the run times for bounded safety proofs of co-design properties in UART core. Column 1 in Table 2 gives the name of the scenario, Column 2 gives the maximum loop unroll bound of the firmware, column 3-7 present the runtime using HW-CBMC, total/feasible path counts, COVERIF, respectively. Table 2 shows that COVERIF dominates HW-CBMC in all scenarios (marked in bold). COVERIF is on average 8× faster than HW-CBMC, both for proving safety as well as detecting bugs. Thus, COVERIF outperforms HW-CBMC in all scenarios.

We verified a total of 39 properties of the UART design. Table 2 reports some of the representative properties. In Scenario A and scenario B, we verified whether the transmitted data (32-bit or 64-bit) is available through the serial output port after a pre-determined number of clock cycles. In both of the configurations, COVERIF is able to prune the receiver logic since the SW only exercises the transmitter module by appropriately configuring the memory-mapped registers. In Scenario C, we verified whether the transmitted data matches the data received when the UART is configured in loopback mode. We found several bugs in the open source UART obtained from <http://www.opencores.org>. The bottom part of Table 2 reports the runtimes for detecting data-path and control-path bugs.

## 6.2 HW/SW Co-verification of System-on-Chip

**About the SoC:** We obtained an open source System-on-Chip design from [21]. It consists of an 8051 micro-controller, a memory arbiter, an external memory (XRAM) and cryptographic accelerators, as shown in Figure 6. The design statistics for SoC is given in Table 1. The accelerator implements encryption/decryption using the Advanced Encryption Standard (AES). A separate module that interfaces the AES to the 8051 micro-controller using a memory-mapped I/O interface. The micro-controller communicates with the

| Scenario                                                          | Bound | Monolithic |                      | Path-based    |              |                     | Verification Results |  |
|-------------------------------------------------------------------|-------|------------|----------------------|---------------|--------------|---------------------|----------------------|--|
|                                                                   |       | HW-CBMC    |                      | COVERIF       |              | $M_{seq}$           |                      |  |
|                                                                   |       | Bit-level  | Total/Feasible Paths | %-age Pruning | Time         |                     |                      |  |
| non-deterministic data but deterministic control (Scenario A)     |       |            |                      |               |              |                     |                      |  |
| transmit (32)                                                     | 250   | 15.02      | 247104/224           | 99.90         | <b>1.13</b>  | Safe ( $\psi_c$ )   |                      |  |
| transmit (64)                                                     | 500   | 23.87      | 247104/324           | 99.86         | <b>1.61</b>  | Safe ( $\psi_t$ )   |                      |  |
| non-deterministic data and non-deterministic control (Scenario B) |       |            |                      |               |              |                     |                      |  |
| trans_intr (32)                                                   | 250   | 14.86      | 247104/295           | 99.88         | <b>1.49</b>  | Safe ( $\psi_t$ )   |                      |  |
| trans_intr (64)                                                   | 500   | 24.14      | 247104/362           | 99.85         | <b>1.81</b>  | Safe ( $\psi_t$ )   |                      |  |
| non-deterministic data and non-deterministic control (scenario C) |       |            |                      |               |              |                     |                      |  |
| loopback (8)                                                      | 230   | 52.06      | 247104/354           | 99.85         | <b>3.95</b>  | Safe ( $\psi_t$ )   |                      |  |
| loopback (16)                                                     | 500   | 122.12     | 247104/690           | 99.72         | <b>12.89</b> | Safe ( $\psi_c$ )   |                      |  |
| loopback (32)                                                     | 650   | 170.62     | 247104/1282          | 99.48         | <b>21.85</b> | Safe ( $\psi_c$ )   |                      |  |
| loopback (64)                                                     | 1300  | 409.71     | 247104/2566          | 98.96         | <b>62.31</b> | Safe ( $\psi_t$ )   |                      |  |
| detecting data-path bugs in transmission mode w/o interrupt       |       |            |                      |               |              |                     |                      |  |
| transmit (64)                                                     | 520   | 28.43      | 247104/324           | 99.86         | <b>1.12</b>  | Unsafe ( $\psi_t$ ) |                      |  |
| detecting control bugs with interrupt enabled                     |       |            |                      |               |              |                     |                      |  |
| transmit (64)                                                     | 520   | 31.35      | 247104/362           | 99.85         | <b>1.05</b>  | Unsafe ( $\psi_c$ ) |                      |  |
| detecting control bugs in loopback mode                           |       |            |                      |               |              |                     |                      |  |
| loopback (64)                                                     | 1300  | 443.15     | 247104/2566          | 98.96         | <b>62.34</b> | Unsafe ( $\psi_t$ ) |                      |  |

Table 2: Verification of UART (All time in seconds)



Figure 6: SoC Design obtained from [21]

| Scenario                                             | Bound | Monolithic |                      | Path-based   |                     | Verification Result |  |
|------------------------------------------------------|-------|------------|----------------------|--------------|---------------------|---------------------|--|
|                                                      |       | HW-CBMC    |                      | COVERIF      |                     |                     |  |
|                                                      |       | Bit-level  | Total/Feasible Paths | %-age        | Time                |                     |  |
| non-deterministic data and non-deterministic control |       |            |                      |              |                     |                     |  |
| data_transfer                                        | 20    | 86.18      |                      | <b>17.42</b> | Safe ( $\psi_t$ )   |                     |  |
| AES_feedback                                         | 30    | 102.92     |                      | <b>56.29</b> | Safe ( $\psi_c$ )   |                     |  |
| non-deterministic data and non-deterministic control |       |            |                      |              |                     |                     |  |
| write_XRAM                                           | 20    | 92.63      |                      | <b>14.78</b> | Unsafe ( $\psi_c$ ) |                     |  |
| DMA                                                  | 32    | 128.63     |                      | <b>68.19</b> | Safe ( $\psi_t$ )   |                     |  |

Table 3: Verification times for SoC Design (All time in Seconds)

accelerators and the XRAM by reading or writing to XRAM addresses. The arbitration of these module is done by the memory arbiter module. The FW initiates the operation in the SoC by first writing to an initial memory-mapped register. The FW implements Linux-style *inb()* and *outb()* functions calls, which are used to communicate with the HW ports. The FW writes a sequence of non-deterministic data to the XRAM port and then reads the data from the same port. The cryptographic accelerators use direct memory access to fetch the data from the external memory. The completion of the operation is determined by polling the appropriate memory-mapped registers in the FW.

**Discussion of Result:** Table 3 gives the runtimes for the bounded safety proofs of the SoC design. Column 1 gives the name of the scenario, column 2 report the maximum loop unroll bound of the FW. Column 3-5 present the verification runtimes using HW-CBMC, COVERIF and the verification outcome respectively. The result in Table 3 shows that COVERIF is approximately 2× faster than HW-CBMC for proving safety. For detecting bugs, the speedup is 6× for

COVERIF over HW-CBMC. In the case of SoC scenario (data\_transfer), the SW exercises only the micro-controller and transfer sequence of bytes to the XRAM port bypassing peripherals connected to other ports such as hardware accelerator. This scenario allows path-based symbolic execution engine in COVERIF to prune the logic for the accelerator and generate only relevant verification conditions for the micro-controller and XRAM. It is important to note that forward symbolic execution without these optimizations timeout for all the benchmarks.

We verifies a total of 19 properties for the SoC co-design. Due to space limitations, we report 4 representative properties in Table 3. We check whether the acknowledgement for data transmission and data reception arrives from the micro-controller in the correct cycle. We verify whether the non-deterministic data transmitted through *outb()* is the same as the data received through *inb()*. We also verify that reading/writing to the appropriate memory-mapped registers produce the correct result during the data transmission phase. We found one critical *control bug* in the SoC design. The bug is manifested when memory arbiter hardware wrongly arbitrates the port selection thereby forcing the write strobe for the external RAM to be LOW. This violates the data transfer protocol in the SoC design.

## 7 LIMITATIONS OF PROPOSED APPROACH

The primary motivation for constructing a sequential single threaded unified co-verification model is to avoid enumerating exponential number of interleavings between the HW and SW threads. We have shown that this is extremely beneficial for co-designs that exhibits producer-consumer relationship. Such interaction pattern is prevalent in many practical co-designs [1, 8, 18, 19]. However, the proposed verification approach is not applicable for co-designs that exhibits true concurrency [5], that is, when a SW and its interacting HW threads do not exhibit producer-consumer relationship. In this case, it is imperative to consider all possible interleavings between participating threads in an efficient manner.

## 8 RELATED WORK

Previous work [2, 3, 9, 22] for co-verification have addressed the problem at the pre-RTL phase. However, we address the co-verification problem at the post-RTL phase [11, 13] where a key risk is divergence of the HW RTL from the behavior expected by the SW. Generating a unified co-verification model is a well-known technique in HW/SW co-verification. Notably, Kurshan et al. modeled HW and SW using finite state machines [10], Monniaux in [12] modeled HW and SW as C programs which are formally pushdown systems (PDS), Li et al. in [11] used Buchi Automata to abstractly model a hardware and PDS to abstractly model a software to generate a unified SW-HW model, called Buchi Pushdown System (*BPDS*). In this paper, we construct a unified sequential co-verification model in C language.

Common practice in industry for system-level co-verification is to either use emulators/accelerators or Instruction Set Simulators (ISS) [4]. However, no rigorous formal verification effort is performed at the post-RTL phase to ensure the validity of the SW-HW interactions.

## 9 CONCLUDING REMARKS

In this paper, we presented a formal HW/SW co-verification tool called COVERIF. In a typical HW/SW co-design, the software only exercises a fragment of the HW state-space. This renders many interactions between HW and SW modules infeasible. Our general observation is that the bounded model checking technique in HW-CBMC cannot prune irrelevant logic, and hence generates formulas that are extremely difficult to solve with a SAT/SMT solver. In contrast, the path-based exploration strategy in COVERIF is able to automatically prune design logic with respect to a given configuration (scenario), owing to domain-specific optimizations such as eager path pruning combined with incremental SAT solving and property-guided slicing. Our experiments show that COVERIF is on average 5× faster than HW-CBMC for proving safety as well as for finding critical bugs. In the future, we plan to extend COVERIF to support HW/SW co-designs that exhibit further interaction patterns as well as implement efficient domain-specific path-merging techniques.

## REFERENCES

- [1] S. Ahn and S. Malik. Modeling firmware as service functions and its application to test generation. In *HVC*, pages 61–77, 2013.
- [2] S. Ahn and S. Malik. Automated firmware testing using firmware-hardware interaction patterns. In *CODES+ISSS*, pages 1–25, 2014.
- [3] S. Ahn, S. Malik, and A. Gupta. Completeness bounds and sequentialization for model checking of interacting firmware and hardware. In *CODES+ISSS*, pages 202–211, 2015.
- [4] J. Andrews. *Co-verification of Hardware and Software for ARM SoC Design*. Elsevier, 2004.
- [5] J.-M. Bergé, O. Levia, and J. Rouillard. *Hardware/Software Co-design and Co-verification*. Springer, 1997.
- [6] A. Biere, A. Cimatti, E. M. Clarke, O. Strichman, and Y. Zhu. Bounded model checking. *Advances in Computers*, 58:117–148, 2003.
- [7] N. Eén and A. Biere. Effective preprocessing in SAT through variable and clause elimination. In *SAT*, pages 61–75, 2005.
- [8] H. Giefer, R. Polig, and C. Hagleitner. Accelerating arithmetic kernels with coherent attached FPGA coprocessors. In *DATE*, pages 1072–1077, 2015.
- [9] A. Horn, M. Tautschnig, C. G. Val, L. Liang, T. Melham, J. Grundy, and D. Kroening. Formal co-validation of low-level hardware/software interfaces. In *FMCAD*, pages 121–128, 2013.
- [10] R. P. Kurshan, V. Levin, M. Minea, D. A. Peled, and H. Yenigün. Combining software and hardware verification techniques. *FMSD*, 21(3):251–280, 2002.
- [11] J. Li, F. Xie, T. Ball, V. Levin, and C. McGarvey. An automata-theoretic approach to hardware/software co-verification. In *FASE*, pages 248–262, 2010.
- [12] D. Monniaux. Verification of device drivers and intelligent controllers: a case study. In *EMSOFT*, pages 30–36, 2007.
- [13] R. Mukherjee, P. Dasgupta, A. Pal, and S. Mukherjee. Formal verification of hardware/software power management strategies. In *VLSI*, pages 326–331, 2013.
- [14] R. Mukherjee, D. Kroening, and T. Melham. Hardware verification using software analyzers. In *ISVLSI*, 2015.
- [15] R. Mukherjee, M. Purandare, R. Polig, and D. Kroening. Formal techniques for effective co-verification of hardware/software co-designs. In *Proceedings of the 54th Annual Design Automation Conference, DAC 2017, Austin, TX, USA, June 18-22, 2017*, pages 35:1–35:6, 2017.
- [16] R. Mukherjee, P. Schrammel, D. Kroening, and T. Melham. Unbounded safety verification for hardware using software analyzers. In *DATE*, 2016.
- [17] R. Mukherjee, M. Tautschnig, and D. Kroening. v2c – a Verilog to C translator. In *TACAS*, LNCS. Springer, 2016.
- [18] R. Polig, K. Atasu, L. Chiticariu, C. Hagleitner, H. P. Hofstee, F. R. Reiss, H. Zhu, and E. Sitaridi. Giving text analytics a boost. *Micro*, 34(4):6–14, 2014.
- [19] R. Polig, K. Atasu, H. Giefer, and L. Chiticariu. Compiling text analytics queries to FPGAs. In *FPL*, pages 1–6, 2014.
- [20] A. Rose, S. Swan, J. Pierce, and J. Fernandez. Transaction level modeling in SystemC. In *Open SystemC Initiative*, 2005.
- [21] P. Subramanyan, Y. Vizel, S. Ray, and S. Malik. Template-based synthesis of instruction-level abstractions for SoC verification. In *FMCAD*, pages 160–167, 2015.
- [22] F. Xie, G. Yang, and X. Song. Component-based hardware/software co-verification. In *MEMOCODE*, pages 27–36, 2006.