



**Department of Electrical and Computer Engineering**

**San Francisco State University**

**Fall 2024**

**Final Project Report on ASIC Implementation of Motion Estimator  
in 14nm FinFET Technology Presented by**

| Name                                   | SFSUID    |
|----------------------------------------|-----------|
| SeyedParsa Mirfasihi                   | 923636204 |
| Vyshnavi Shekhar Byrapatna Somashekhar | 923840668 |

Under the guidance of  
Dr. Hamid Mahmoodi  
Department of ECE  
San Francisco State University



## Table of Contents

- Problem Analysis
- Hardware Design
- Verification
- Synthesis
- Physical Design
- Sign-Off
- Discussion
- Conclusion



- **Objective:** The objective of this project is the RTL-to-GDSII implementation of a motion estimator in 14nm FinFET technology. Motion estimation is used in video compression to reduce data size while retaining video quality.

- **Design Specifications:**

- 16x16 reference block
- 31x31 search window
- Grayscale-coded pixels
- Clock available: 1 GHz
- 14nm FinFET library
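
As a software point of reference for these specifications, a full-search block matcher can be sketched in a few lines of Python. This is a hypothetical golden model, not the project's RTL: the function names, the list-of-lists pixel layout, the centering of motion vectors on the range -8 to +7, and the tie-breaking rule are all assumptions made for illustration. The distortion is a sum of absolute differences clamped at 255, which mirrors an 8-bit saturating accumulator.

```python
def saturating_sad(ref, search, dx, dy, limit=0xFF):
    """SAD between ref and the search-window block at offset (dx, dy), clamped to limit."""
    total = 0
    for y, row in enumerate(ref):
        for x, pixel in enumerate(row):
            total = min(total + abs(pixel - search[y + dy][x + dx]), limit)
    return total


def full_search(ref, search):
    """Return (best_distortion, motion_x, motion_y) over every candidate offset."""
    n = len(ref)                  # 16 for a 16x16 reference block
    span = len(search) - n + 1    # 31 - 16 + 1 = 16 candidate offsets per axis
    half = span // 2              # assumed centering: vectors run from -8 to +7
    best = None
    for dy in range(span):
        for dx in range(span):
            dist = saturating_sad(ref, search, dx, dy)
            if best is None or dist < best[0]:  # strict '<': the first minimum wins
                best = (dist, dx - half, dy - half)
    return best
```

With a 31x31 window and a 16x16 block there are 16 x 16 = 256 candidate positions, which is why a 4-bit value per axis is enough to encode the motion vector in this model.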



Figure 1 - Motion Estimator, Black Box Representation



## Memory Elements:

```
/* Module For Reference Block (Memory) */
module ROM_R #(parameter RMEM_WIDTH = 255 , parameter SMEM_WIDTH = 1023) (clock, AddressR, R);
    input clock;
    input [7:0] AddressR;
    output [7:0] R;

    reg [7:0] R;
    reg [7:0] Rmem[0:255];

    //always @(posedge clock) R <= Rmem[AddressR];
    always @(*) R = Rmem[AddressR];

endmodule
```

### Module ROM\_R:

This module models the read-only memory (ROM) that holds the reference block. It declares two parameters, two inputs, and one output:

#### Parameters:

1. **RMEM\_WIDTH** (default = 255): The highest address of the reference memory, so the ROM has RMEM\_WIDTH + 1 = 256 locations. The array bounds in this listing are hard-coded as `[0:255]`, so the parameter is declared but not referenced inside the module.
2. **SMEM\_WIDTH** (default = 1023): The highest address of the search memory. It is declared here but unused in this module.

#### Ports:

1. **clock** (input): Unused by the active code. The clocked read is commented out in favor of a combinational read, so this port currently has no effect.
2. **AddressR** (input [7:0]): An 8-bit input address for accessing specific memory locations within the ROM.
3. **R** (output [7:0]): An 8-bit output that provides the data stored in the ROM at the specified address.

#### Internal Registers:

1. **R** (reg [7:0]): The output is declared as `reg` only because it is assigned inside an always block. Since that block is combinational, R stores nothing and simply follows `Rmem[AddressR]`.
2. **Rmem** (reg [7:0] Rmem [0:255]): An 8-bit wide, 256-location ROM array that holds the memory contents. Each element in this array corresponds to a memory address, allowing 256 different 8-bit data values to be stored.



#### Functionality:

The always block is sensitive to any changes (indicated by `@(*)`) in the inputs, meaning it will execute whenever AddressR changes. In this block:

- The output R is assigned the value stored in Rmem at the address specified by AddressR.  
This enables read-only access to the ROM data.

```
/* Module For Search Block (Memory) */
module ROM_S #(parameter RMEM_WIDTH = 255 , parameter SMEM_WIDTH = 1023) (clock, AddressS1, AddressS2, S1, S2);
    input clock;
    input [9:0] AddressS1, AddressS2;
    output [7:0] S1, S2;

    reg [7:0] S1, S2;
    reg [7:0] Smem[0:1023];

    /*always @(posedge clock)
    begin
        S1 <= Smem[AddressS1];
        S2 <= Smem[AddressS2];
    end*/
    
    always @(*)
    begin
        S1 = Smem[AddressS1];
        S2 = Smem[AddressS2];
    end

endmodule
```

## Module ROM\_S:

This module models the ROM that holds the search window. It declares two parameters, three inputs, two outputs, and an internal memory array, and it provides two read ports so that two search values can be read at the same time.

### Parameters:

1. **RMEM\_WIDTH** (default = 255): The highest address of the reference memory. It is declared here but unused in this module.
2. **SMEM\_WIDTH** (default = 1023): The highest address of the search memory, so the ROM has SMEM\_WIDTH + 1 = 1024 locations. The array bounds in this listing are hard-coded as `[0:1023]`, so the parameter is declared but not referenced inside the module.

### Ports:

1. **clock** (input): Although the clock signal is defined, it is not currently used within the code. It might be intended for future development to synchronize read operations.
2. **AddressS1, AddressS2** (input [9:0]): 10-bit input addresses used to access specific memory locations. With 10-bit addressing, this module can theoretically access up to 1024 locations.



3. **S1, S2** (output [7:0]): These 8-bit outputs represent the data retrieved from memory locations specified by AddressS1 and AddressS2.

### Internal Registers:

1. **S1, S2** (reg [7:0]): The outputs are declared as `reg` only because they are assigned inside an always block. Since that block is combinational, they hold no state and follow the addressed memory contents directly.
2. **Smem** (reg [7:0] Smem [0:1023]): An 8-bit wide, 1024-location ROM array to store the memory contents. Each element in Smem represents a memory location, allowing storage of 1024 distinct 8-bit data values.

### Functionality:

The always block is sensitive to any changes in the inputs (indicated by `@(*)`), meaning it will execute whenever AddressS1 or AddressS2 changes. Within this block:

- S1 is assigned the value stored in Smem at the address specified by AddressS1.
- S2 is assigned the value stored in Smem at the address specified by AddressS2.

This enables the module to perform two independent read operations simultaneously, allowing S1 and S2 to output data from different memory locations.
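
The dual-read behavior can be mirrored in a small Python model, which is handy when checking expected S1/S2 values by hand. This is a sketch under assumptions: the class name `DualReadRom` is hypothetical, unloaded locations are filled with 0 (a Verilog simulator would report them as x), and addresses are wrapped to the memory depth the way a 10-bit address bus would.

```python
class DualReadRom:
    """Behavioral stand-in for ROM_S: one memory array, two independent read ports."""

    def __init__(self, contents, depth=1024):
        data = [value & 0xFF for value in contents]   # keep each entry to 8 bits
        if len(data) > depth:
            raise ValueError("contents exceed ROM depth")
        # Assumption: locations that were never loaded read back as 0.
        self.mem = data + [0] * (depth - len(data))
        self.depth = depth

    def read(self, address_s1, address_s2):
        """Combinational read: both outputs follow their addresses immediately."""
        return self.mem[address_s1 % self.depth], self.mem[address_s2 % self.depth]
```

Calling `read` with two different addresses returns both values in a single call, which corresponds to S1 and S2 updating together in the always block.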

---

```
// First initialize all registers
$readmemh("ref.txt", memR_u.Rmem);
$readmemh("search.txt", memS_u.Smem);
clock = 1'b0;
start = 1'b0;

@(posedge clock); #10;
start = 1'b1;

$display("##### Reference Memory #####");
for (i=0; i<(RMEM_WIDTH+1); i=i+1) begin
    if (i==0)
        $write("%h", memR_u.Rmem[i]);
    else if (i%16==0)
        $write("\n%h", memR_u.Rmem[i]);
    else
        $write(" %h", memR_u.Rmem[i]);
end

$display("\n##### Search Memory #####");
for (i=0; i<(SMEM_WIDTH+1); i=i+1) begin
    if (i==0)
        $write("%h", memS_u.Smem[i]);
    else if (i%32==0)
        $write("\n%h", memS_u.Smem[i]);
    else
        $write(" %h", memS_u.Smem[i]);
end
```
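
The contents of `ref.txt` and `search.txt` are not shown in this report. Since `$readmemh` reads whitespace-separated hexadecimal values, a file of that kind could be produced with a helper like the one below. The function name `format_readmemh` is hypothetical, and the row widths of 16 and 32 are chosen only to match the row breaks used by the dump loops above.

```python
def format_readmemh(values, per_row):
    """Format byte values as two-digit hex words, per_row words per line."""
    lines = []
    for start in range(0, len(values), per_row):
        row = values[start:start + per_row]
        lines.append(" ".join(f"{value & 0xFF:02x}" for value in row))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example: a 256-entry reference image and a 1024-entry search image.
    reference = [0xFF if i % 16 == 0 else 0x00 for i in range(256)]
    search = [0xFF if i % 32 == 0 else 0x00 for i in range(1024)]
    print(format_readmemh(reference, 16), end="")
    print(format_readmemh(search, 32), end="")
```

Writing the returned strings to `ref.txt` and `search.txt` would give files in the format the testbench expects; the pixel pattern in the example is arbitrary.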



## Memory Elements:

- Result is imported from MobaXterm

Screenshot of the MobaXterm session on mahmoodi.rcc.sfsu.edu: the terminal prints the "Reference Memory" and "Search Memory" dumps as grids of hexadecimal bytes (ff, 00, 00, 00, ...), followed by "Simulation completed successfully." and `$finish called from file "./top_tb_direct.v", line 91`, with `$finish` at simulation time 82240.






## Processing Elements:

```
/* Module For Processing Element (PE) */
module PE (clock, R, S1, S2, S1S2mux, newDist, Accumulate, Rpipe);
    input clock;
    input [7:0] R, S1, S2; // memory inputs
    input S1S2mux, newDist; // control input
    output [7:0] Accumulate, Rpipe;
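    reg [7:0] Accumulate, AccumulateIn, difference, difference_temp, Rpipe;
    reg Carry;

    // Pipeline register: hand R to the next PE one clock later
    always @(posedge clock) Rpipe <= R;
    // Accumulator register
    always @(posedge clock) Accumulate <= AccumulateIn;

    // Combinational next-value logic for the accumulator
    always @(R or S1 or S2 or S1S2mux or newDist or Accumulate)
        begin
            difference = R - (S1S2mux? S1:S2);
            difference_temp = - difference;
            // difference is an unsigned reg, so this comparison is never true
            // and a negative result stays as its wrapped 8-bit value
            if (difference<0)
                begin
                    difference = difference_temp;
                end
            {Carry,AccumulateIn} = Accumulate + difference;
            if (Carry == 1) AccumulateIn = 8'hFF; // saturated
            if (newDist == 1) AccumulateIn = difference;
        end
endmodule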
```

### Module PE (Processing Element):

This module defines a Processing Element (PE), the basic arithmetic unit of the design. On each clock cycle it subtracts a selected search-memory value (S1 or S2) from the reference value R and adds the difference to an 8-bit accumulator that saturates at 8'hFF. Control signals choose the search input and start a new accumulation.

#### Ports

##### 1. Inputs:

- **clock**: Clock signal for synchronous operations within the module.
- **R** (input [7:0]): An 8-bit value from the reference memory.
- **S1, S2** (input [7:0]): Two 8-bit values from the search memory.
- **S1S2mux**: Control signal that selects S1 (when 1) or S2 (when 0).
- **newDist**: Control signal that starts a new accumulation by loading the accumulator with the current difference.



##### 2. Outputs:

- **Accumulate** (output [7:0]): Holds the accumulated result after processing.
- **Rpipe** (output [7:0]): R delayed by one clock cycle, passed on to the next PE.

#### Internal Registers

1. **Accumulate** (reg [7:0]): Register that stores the accumulated result.
2. **AccumulateIn** (reg [7:0]): Combinational next value of Accumulate; it is not a storage element.
3. **difference** and **difference\_temp** (reg [7:0]): Hold the 8-bit difference between R and the selected input (S1 or S2) and its two's-complement negation.
4. **Carry**: Ninth bit of the accumulation sum, used for overflow detection.

#### Functionality

The module contains three always blocks that control the flow of data and operations:

##### 1. Pipeline Register Update:

- Rpipe is updated with the value of R on each positive edge of clock. This delays R by one clock cycle so that it can be handed to the next PE.

##### 2. Accumulation Register Update:

- Accumulate is updated with AccumulateIn on each positive edge of clock, so accumulation is synchronized with the clock signal.

##### 3. Combinational Logic for Processing:

- This block is evaluated whenever any of its inputs changes. It performs the following steps:

- **Difference Calculation**: The module calculates the difference between R and either S1 or S2, based on the value of S1S2mux. If S1S2mux is 1, S1 is selected; otherwise, S2 is selected.
- **Absolute Value**: The code negates the difference into difference\_temp and substitutes it when `difference<0`, with the intent of accumulating only the absolute difference. Because difference is an unsigned 8-bit reg, that comparison is never true, so a negative result is accumulated as its wrapped 8-bit value. This is visible in the PE simulation log, where R = 2 and S2 = 4 yield 254 rather than 2.
- **Accumulation**: The 9-bit sum of Accumulate and difference is assigned to {Carry, AccumulateIn}, so Carry captures any overflow out of the 8-bit result.
- **Saturation**: If Carry is 1, indicating overflow, AccumulateIn is set to 8'hFF, representing the maximum value (saturated result).
- **Reset Condition**: If newDist is 1, AccumulateIn is loaded with the current difference, starting a new accumulation.
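
The combinational steps above can be checked against a small Python model of the accumulator's next value. This is an illustrative sketch, not the project's code: the function name `pe_next` and the flag `wrap_like_listing` are hypothetical. By default the model takes a true absolute difference. With the flag set it reproduces the listing as written, where `difference` is an unsigned 8-bit reg, the `difference<0` test can never be true, and a negative result therefore wraps modulo 256.

```python
def pe_next(accumulate, r, s1, s2, s1s2mux, new_dist, wrap_like_listing=False):
    """Return the value Accumulate takes on the next clock edge."""
    selected = s1 if s1s2mux else s2
    if wrap_like_listing:
        difference = (r - selected) & 0xFF   # 8-bit wraparound, no absolute value
    else:
        difference = abs(r - selected)       # intended absolute difference
    total = accumulate + difference          # 9-bit sum in hardware
    accumulate_in = 0xFF if total > 0xFF else total   # saturate when Carry is set
    if new_dist:
        accumulate_in = difference           # start a new accumulation
    return accumulate_in
```

For R = 10, S1 = 0, S2 = 4 the model steps through 6, 16, 22, ... exactly as the accumulate column of the PE log does. For R = 2 and S2 = 4 with `new_dist` set, it returns 2 by default and 254 with `wrap_like_listing=True`.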



```
initial
begin
    $vcdpluson;
    // First setup up to monitor all inputs and outputs
    clock = 1'b0;
    start = 1'b0;

    @(posedge clock); #10;
    start = 1'b1;

    R = 10; S1 = 0; S2 = 4; S1S2mux = 0; newDist = 1; // To check S1S2mux Functionality
    @(posedge clock);
    S1S2mux = 1; newDist = 0; // To check S1S2mux Functionality
    @(posedge clock);
    S1S2mux = 0; newDist = 0;

    repeat(5) @(posedge clock);
    newDist = 1; // Accumulation
    repeat(2) @(posedge clock); // Check Accumulation Sent out
    start = 1'b0;

    R = 2;
    repeat(5) @(posedge clock); // Carry Operation

    $display("All tests completed\n\n");
    $finish;
end
```

- Result is imported from MobaXterm

```
VCD+ Writer W-2024.09-1_Full64 Copyright (c) 1991-2024 by Synopsys Inc.
At time          0, R =  x, S1 =  x, S2 =  x, S1S2mux = x, newDist = x, Accumulate =  x, Rpipe =  x
At time         15, R =  10, S1 =  0, S2 =  4, S1S2mux = 1, newDist = 0, Accumulate =  6, Rpipe =  10
At time         25, R =  10, S1 =  0, S2 =  4, S1S2mux = 0, newDist = 0, Accumulate =  16, Rpipe =  10
At time         35, R =  10, S1 =  0, S2 =  4, S1S2mux = 0, newDist = 0, Accumulate =  22, Rpipe =  10
At time         45, R =  10, S1 =  0, S2 =  4, S1S2mux = 0, newDist = 0, Accumulate =  28, Rpipe =  10
At time         55, R =  10, S1 =  0, S2 =  4, S1S2mux = 0, newDist = 0, Accumulate =  34, Rpipe =  10
At time         65, R =  10, S1 =  0, S2 =  4, S1S2mux = 0, newDist = 0, Accumulate =  40, Rpipe =  10
At time         75, R =  10, S1 =  0, S2 =  4, S1S2mux = 0, newDist = 1, Accumulate =  46, Rpipe =  10
At time         85, R =  10, S1 =  0, S2 =  4, S1S2mux = 0, newDist = 1, Accumulate =  6, Rpipe =  10
At time         95, R =   2, S1 =  0, S2 =  4, S1S2mux = 0, newDist = 1, Accumulate =  6, Rpipe =  2
At time        105, R =   2, S1 =  0, S2 =  4, S1S2mux = 0, newDist = 1, Accumulate = 254, Rpipe =  2
All tests completed

$finish called from file "./PE_tb.v", line 57.
$finish at simulation time           145
                           V C S   S i m u l a t i o n   R e p o r t
Time: 145
CPU Time:      0.130 seconds;      Data structure size:  0.0Mb
Tue Nov  5 14:42:49 2024
```



## Processing Element Total (PETotal):

```
/* Module For Total 16 Processing Elements (PETotal)*/
module PETotal (clock, R, S1, S2, S1S2mux, newDist, Accumulate);
    input clock;
    input [7:0] R, S1, S2; // memory inputs
    input [15:0] S1S2mux, newDist; // control input
    output [127:0] Accumulate;

    wire [7:0] Rpipe0, Rpipe1, Rpipe2, Rpipe3, Rpipe4, Rpipe5, Rpipe6, Rpipe7, Rpipe8, Rpipe9, Rpipe10, Rpipe11, Rpipe12, Rpipe13, Rpipe14;
    PE pe0 (clock, R, S1, S2, S1S2mux[0], newDist[0], Accumulate[7:0], Rpipe0);
    PE pe1 (clock, Rpipe0, S1, S2, S1S2mux[1], newDist[1], Accumulate[15:8], Rpipe1);
    PE pe2 (clock, Rpipe1, S1, S2, S1S2mux[2], newDist[2], Accumulate[23:16], Rpipe2);
    PE pe3 (clock, Rpipe2, S1, S2, S1S2mux[3], newDist[3], Accumulate[31:24], Rpipe3);
    PE pe4 (clock, Rpipe3, S1, S2, S1S2mux[4], newDist[4], Accumulate[39:32], Rpipe4);
    PE pe5 (clock, Rpipe4, S1, S2, S1S2mux[5], newDist[5], Accumulate[47:40], Rpipe5);
    PE pe6 (clock, Rpipe5, S1, S2, S1S2mux[6], newDist[6], Accumulate[55:48], Rpipe6);
    PE pe7 (clock, Rpipe6, S1, S2, S1S2mux[7], newDist[7], Accumulate[63:56], Rpipe7);
    PE pe8 (clock, Rpipe7, S1, S2, S1S2mux[8], newDist[8], Accumulate[71:64], Rpipe8);
    PE pe9 (clock, Rpipe8, S1, S2, S1S2mux[9], newDist[9], Accumulate[79:72], Rpipe9);
    PE pe10 (clock, Rpipe9, S1, S2, S1S2mux[10], newDist[10], Accumulate[87:80], Rpipe10);
    PE pe11 (clock, Rpipe10, S1, S2, S1S2mux[11], newDist[11], Accumulate[95:88], Rpipe11);
    PE pe12 (clock, Rpipe11, S1, S2, S1S2mux[12], newDist[12], Accumulate[103:96], Rpipe12);
    PE pe13 (clock, Rpipe12, S1, S2, S1S2mux[13], newDist[13], Accumulate[111:104], Rpipe13);
    PE pe14 (clock, Rpipe13, S1, S2, S1S2mux[14], newDist[14], Accumulate[119:112], Rpipe14);
    PEend pe15 (clock, Rpipe14, S1, S2, S1S2mux[15], newDist[15], Accumulate[127:120]);
endmodule
```



### **Module PETOTAL:**

The PETotal module defines a system with 16 Processing Elements (PEs). Each PE is instantiated with specific inputs and outputs, which together form the complete PETotal structure.

### **Key Components:**

#### **Inputs:**

- clock: A clock signal for synchronization.
- R, S1, S2: 8-bit data inputs from the reference and search memories.
- S1S2mux and newDist: 16-bit control vectors; bit n drives PE n.

#### **Outputs:**

- Accumulate: A 128-bit wide output that collects results from each of the 16 PEs.

#### **Internal Wiring:**

- Rpipe0 to Rpipe14: Fifteen 8-bit wires that carry R from each PE to the next, forming a pipeline in which every stage delays R by one clock cycle.

#### **PE Instantiation:**

Each PE (Processing Element) is instantiated with specific arguments:

- Every PE receives clock, S1, and S2. Only pe0 receives R directly; each later PE receives the Rpipe output of the PE before it.
- Each PE instance connects to its own bit of S1S2mux and newDist (e.g., S1S2mux[0], newDist[0] for PE pe0).
- The Accumulate output is split into 8-bit chunks, with each PE writing to a distinct segment of this 128-bit output.
- The last element, pe15, is an instance of PEend, which has no Rpipe output because no PE follows it.

This configuration forms a chain of 16 PEs that work in parallel: S1 and S2 are broadcast to all PEs, R moves down the chain one PE per clock cycle, and each PE accumulates its own result into its slice of Accumulate.
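
The chained structure can be illustrated with a cycle-level Python model of the 16 PEs. This is a sketch under assumptions, not the project's RTL: the function name `pe_total_step` is hypothetical, all registers are taken to start at 0 (they start as x in simulation), each PE uses a true absolute difference, and PEend is treated as a PE without a pipeline output.

```python
def pe_total_step(accumulate, rpipe, r, s1, s2, s1s2mux, new_dist):
    """Advance the 16-PE chain by one clock edge.

    accumulate: list of 16 accumulator values (pe0 first)
    rpipe:      list of 15 pipeline registers (Rpipe0 .. Rpipe14)
    s1s2mux, new_dist: 16-bit control words, bit n drives PE n
    """
    r_in = [r] + rpipe   # R seen by pe0 .. pe15 before the clock edge
    next_accumulate = []
    for k in range(16):
        selected = s1 if (s1s2mux >> k) & 1 else s2
        difference = abs(r_in[k] - selected)
        if (new_dist >> k) & 1:
            next_accumulate.append(difference)   # start a new accumulation
        else:
            next_accumulate.append(min(accumulate[k] + difference, 0xFF))
    next_rpipe = r_in[:15]   # every stage passes its R to the next PE
    return next_accumulate, next_rpipe
```

Starting from cleared registers with R held at 10, the value reaches pe15 only on the 16th clock edge, which is the one-stage-per-cycle delay that the Rpipe wires introduce.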



```
initial
begin
    $vcdpluson;
    // First setup up to monitor all inputs and outputs
    clock = 1'b0;
    start = 1'b0;
    @(posedge clock);
    start = 1'b1;

    R = 10; S1 = 0; S2 = 4;

    S1S2mux = 16'hFFE1; // 1111111111110001
    newDist = 16'hFFFF;

    repeat(16)
        @(posedge clock);

    S1S2mux = 16'hE2E1; // 1110001011100001
    newDist = 16'hEEEE;

    repeat(16)
        @(posedge clock);
    start = 1'b0;
    repeat(5)
        @(posedge clock);

    $display("All tests completed\n\n");
    $finish;
end
```

- Result is imported from MobaXterm

```
VCD+ Writer W-2024.09-1_Full64 Copyright (c) 1991-2024 by Synopsys Inc.
At time          0, R = xx, S1 = xx, S2 = xx, S1S2mux = xxxx, newDist = xxxx, Accumulate =xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
At time         15, R = 0a, S1 = 00, S2 = 04, S1S2mux = ffe1, newDist = ffff, Accumulate =xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx060a
At time         25, R = 0a, S1 = 00, S2 = 04, S1S2mux = ffe1, newDist = ffff, Accumulate =xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx060a
At time         35, R = 0a, S1 = 00, S2 = 04, S1S2mux = ffe1, newDist = ffff, Accumulate =xxxxxxxxxxxxxxxxxxxxxxxx06060a
At time         45, R = 0a, S1 = 00, S2 = 04, S1S2mux = ffe1, newDist = ffff, Accumulate =xxxxxxxxxxxxxxxxxxxxxx0606060a
At time         55, R = 0a, S1 = 00, S2 = 04, S1S2mux = ffe1, newDist = ffff, Accumulate =xxxxxxxxxxxxxxxxxxxxxx0606060a
At time         65, R = 0a, S1 = 00, S2 = 04, S1S2mux = ffe1, newDist = ffff, Accumulate =xxxxxxxxxxxxxxxxxxxxxx0a0606060a
At time         75, R = 0a, S1 = 00, S2 = 04, S1S2mux = ffe1, newDist = ffff, Accumulate =xxxxxxxxxxxxxxxx0a0606060a
At time         85, R = 0a, S1 = 00, S2 = 04, S1S2mux = ffe1, newDist = ffff, Accumulate =xxxxxxxxxxxxxxxx0a0a0a0606060a
At time         95, R = 0a, S1 = 00, S2 = 04, S1S2mux = ffe1, newDist = ffff, Accumulate =xxxxxxxxxxxxxx0a0a0a0606060a
At time        105, R = 0a, S1 = 00, S2 = 04, S1S2mux = ffe1, newDist = ffff, Accumulate =xxxxxxxxxxxxxx0a0a0a0a0606060a
At time        115, R = 0a, S1 = 00, S2 = 04, S1S2mux = ffe1, newDist = ffff, Accumulate =xxxxxxxx0a0a0a0a0a0a0606060a
At time        125, R = 0a, S1 = 00, S2 = 04, S1S2mux = ffe1, newDist = ffff, Accumulate =xxxxxxxx0a0a0a0a0a0a0606060a
At time        135, R = 0a, S1 = 00, S2 = 04, S1S2mux = ffe1, newDist = ffff, Accumulate =xxxxxxxx0a0a0a0a0a0a0606060a
At time        145, R = 0a, S1 = 00, S2 = 04, S1S2mux = ffe1, newDist = ffff, Accumulate =xxxxxxxx0a0a0a0a0a0a0606060a
At time        155, R = 0a, S1 = 00, S2 = 04, S1S2mux = ffe1, newDist = ffff, Accumulate =xx0a0a0a0a0a0a0a0a0a0606060a
At time        165, R = 0a, S1 = 00, S2 = 04, S1S2mux = e2e1, newDist = eeee, Accumulate =0a0a0a0a0a0a0a0a0a0606060a
At time        175, R = 0a, S1 = 00, S2 = 04, S1S2mux = e2e1, newDist = eeee, Accumulate =0a0a0a1006060a100a0a0a06060614
At time        185, R = 0a, S1 = 00, S2 = 04, S1S2mux = e2e1, newDist = eeee, Accumulate =0a0a0a1606060a160a0a0a120606061e
At time        195, R = 0a, S1 = 00, S2 = 04, S1S2mux = e2e1, newDist = eeee, Accumulate =0a0a0a1c06060a1c0a0a1806060628
At time        205, R = 0a, S1 = 00, S2 = 04, S1S2mux = e2e1, newDist = eeee, Accumulate =0a0a0a2206060a220a0a0a1e06060632
At time        215, R = 0a, S1 = 00, S2 = 04, S1S2mux = e2e1, newDist = eeee, Accumulate =0a0a0a2806060a280a0a0a240606063c
At time        225, R = 0a, S1 = 00, S2 = 04, S1S2mux = e2e1, newDist = eeee, Accumulate =0a0a0a2e06060a2e0a0a0a2a06060646
At time        235, R = 0a, S1 = 00, S2 = 04, S1S2mux = e2e1, newDist = eeee, Accumulate =0a0a0a3406060a340a0a0a3006060650
At time        245, R = 0a, S1 = 00, S2 = 04, S1S2mux = e2e1, newDist = eeee, Accumulate =0a0a0a3a06060a3a0a0a0360606065a
At time        255, R = 0a, S1 = 00, S2 = 04, S1S2mux = e2e1, newDist = eeee, Accumulate =0a0a0a4006060a400a0a0a03c06060664
At time        265, R = 0a, S1 = 00, S2 = 04, S1S2mux = e2e1, newDist = eeee, Accumulate =0a0a0a4606060a460a0a0a420606066e
At time        275, R = 0a, S1 = 00, S2 = 04, S1S2mux = e2e1, newDist = eeee, Accumulate =0a0a0a4c06060a4c0a0a0a4806060678
At time        285, R = 0a, S1 = 00, S2 = 04, S1S2mux = e2e1, newDist = eeee, Accumulate =0a0a0a5206060a520a0a0a4e06060682
At time        295, R = 0a, S1 = 00, S2 = 04, S1S2mux = e2e1, newDist = eeee, Accumulate =0a0a0a5806060a580a0a0a540606068c
At time        305, R = 0a, S1 = 00, S2 = 04, S1S2mux = e2e1, newDist = eeee, Accumulate =0a0a0a5e06060a5e0a0a0a5a06060696
At time        315, R = 0a, S1 = 00, S2 = 04, S1S2mux = e2e1, newDist = eeee, Accumulate =0a0a0a6406060a640a0a0a60060606a0
All tests completed

$finish called from file "./PE_TOTAL_tb.v", line 48.
$finish at simulation time           325
V C S   S i m u l a t i o n   R e p o r t
Time: 325
CPU Time:      0.130 seconds;      Data structure size:  0.0Mb
Tue Nov  5 21:30:18 2024
```



## Comparator Unit:

```
/* Module For Comparator Unit */
module Comparator (clock, CompStart, PEout, PReady, vectorX, vectorY, BestDist, motionX, motionY);
    input clock;
    input CompStart; // goes high when distortion calculation start
    input [8*16-1:0] PEout; // outputs of PEs as one long vector
    input [15:0] PReady; // goes high when that PE has a new distortion
    input [3:0] vectorX, vectorY; // motion vector being evaluated
    output [7:0] BestDist; // best distortion vector so far
    output [3:0] motionX, motionY; // best motion vector so far
    reg [7:0] BestDist, newDist;
    reg [3:0] motionX, motionY;
    reg newBest;
    integer n;

    always @ (posedge clock)
        if (CompStart == 0) BestDist <= 8'hFF; // initialize to highest value
        else if (newBest == 1)
            begin
                BestDist <= newDist;
                motionX <= vectorX;
                motionY <= vectorY;
            end
    always @ (BestDist or PEout or PReady or CompStart)
    begin
        newDist = 8'hFF;
        for (n = 0; n <= 15; n = n+1)
        begin
            if (PReady[n] == 1)
                case (n)
                    4'b0000: newDist = PEout[7:0];
                    4'b0001: newDist = PEout[15:8];
                    4'b0010: newDist = PEout[23:16];
                    4'b0011: newDist = PEout[31:24];
                    4'b0100: newDist = PEout[39:32];
                    4'b0101: newDist = PEout[47:40];
                    4'b0110: newDist = PEout[55:48];
                    4'b0111: newDist = PEout[63:56];
                    4'b1000: newDist = PEout[71:64];
                    4'b1001: newDist = PEout[79:72];
                    4'b1010: newDist = PEout[87:80];
                    4'b1011: newDist = PEout[95:88];
                    4'b1100: newDist = PEout[103:96];
                    4'b1101: newDist = PEout[111:104];
                    4'b1110: newDist = PEout[119:112];
                    4'b1111: newDist = PEout[127:120];
                    default: newDist = 8'hFF;
                endcase
            end
            if ((|PReady == 0) || (CompStart == 0)) newBest = 0; // no PE is ready
            else if (newDist < BestDist) newBest = 1;
            else newBest = 0;
        end
    endmodule
```
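
The behavior of the Comparator listing can be summarized with a Python model of one clock edge. This is a hedged sketch: the function name `comparator_step` and the tuple-based state are hypothetical, and PEout is represented as a list of 16 bytes instead of a 128-bit vector. As in the for loop of the listing, when several PReady bits are set the highest-numbered ready PE supplies newDist.

```python
def comparator_step(state, comp_start, pe_out, p_ready, vector_x, vector_y):
    """Return the next (BestDist, motionX, motionY) after one clock edge.

    state:   current (BestDist, motionX, motionY)
    pe_out:  list of 16 distortion bytes, pe_out[n] is the output of PE n
    p_ready: 16-bit word, bit n is high when PE n has a new distortion
    """
    best_dist, motion_x, motion_y = state

    # Combinational part: pick the distortion of the last ready PE.
    new_dist = 0xFF
    for n in range(16):
        if (p_ready >> n) & 1:
            new_dist = pe_out[n]
    new_best = bool(comp_start) and p_ready != 0 and new_dist < best_dist

    # Clocked part: reset while CompStart is low, otherwise keep the minimum.
    if not comp_start:
        return (0xFF, motion_x, motion_y)
    if new_best:
        return (new_dist, vector_x, vector_y)
    return state
```

Note that CompStart low only reinitializes BestDist to 8'hFF; the motion vector registers keep their previous values, as in the listing.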



```

// Test sequence
initial begin
    $vcdpluson;
    // First setup up to monitor all inputs and outputs
    clock = 1'b0;
    CompStart = 1'b0;

    @(posedge clock);
    CompStart = 1'b1;

    // Setup test vectors and output readiness
    vectorX = 0; vectorY = 0; PEready = 16'h0001; PEout = 128'h0000000000000000000000000000FFFF;
    @(posedge clock);
    @(posedge clock);

    // Change conditions to simulate different PE responses
    vectorX = 2; vectorY = 3;
    @(posedge clock);
    @(posedge clock);
    vectorX = 4; vectorY = 5; PEready = 16'h0002;
    @(posedge clock);
    @(posedge clock);
    vectorX = 8; vectorY = 7; PEready = 16'h0003;
    @(posedge clock);
    @(posedge clock);
    vectorX = 2; vectorY = 6; PEready = 16'h0004;
    @(posedge clock);
    @(posedge clock);
    CompStart = 0;
    repeat(5) @(posedge clock);
    CompStart = 1'b0;

    $display("All tests completed\n\n");
    $finish;
end

```

```

VCD+ Writer W-2024.09-1_Full64 Copyright (c) 1991-2024 by Synopsys Inc.
At time      0, CompStart = 0, vectorX = x, vectorY = x, PEready = xxxx, PEout =xxxxxxxxxxxxxxxxxxxxxxxxxxxxx, BestDist = xx, motionX = x, motionY = x
At time      5, CompStart = 1, vectorX = 0, vectorY = 0, PEready = 0001, PEout = 0000000000000000000000000000ffff, BestDist = xx, motionX = x, motionY = x
At time     25, CompStart = 1, vectorX = 2, vectorY = 3, PEready = 0001, PEout = 0000000000000000000000000000ffff, BestDist = xx, motionX = x, motionY = x
At time     45, CompStart = 1, vectorX = 4, vectorY = 5, PEready = 0002, PEout = 0000000000000000000000000000ffff, BestDist = xx, motionX = x, motionY = x
At time     65, CompStart = 1, vectorX = 8, vectorY = 7, PEready = 0003, PEout = 0000000000000000000000000000ffff, BestDist = xx, motionX = x, motionY = x
At time    85, CompStart = 1, vectorX = 2, vectorY = 6, PEready = 0004, PEout = 0000000000000000000000000000ffff, BestDist = xx, motionX = x, motionY = x
At time   105, CompStart = 0, vectorX = 2, vectorY = 6, PEready = 0004, PEout = 0000000000000000000000000000ffff, BestDist = ff, motionX = x, motionY = x
All tests completed

$finish called from file "./Comparator_tb.v", line 64.
$finish at simulation time           155
          V C S   S i m u l a t i o n   R e p o r t
Time: 155
CPU Time:    0.130 seconds;      Data structure size:  0.0Mb
Tue Nov  5 22:21:51 2024

```





# TOP MODULE

```

/* Module For Top Level Hierarchy */
module top #(parameter RMem_WIDTH = 255, parameter SMem_WIDTH = 1023 )(clock, start, BestDist, motionX, motionY, AddressR, AddressS1, AddressS2, R, S1, S2, completed);
    // ... (module body not shown)

/* Testbench For Top Level Hierarchy */
`timescale 1ns/10ps

module top_testbench ();
    // Wires for all DUT outputs
    wire [7:0] BestDist;
    wire [3:0] motionX, motionY;   // 4 bits, matching the Comparator outputs
    wire [7:0] AddressR;
    wire [9:0] AddressS1, AddressS2;
    wire completed;

    // Data read from the reference and search memories
    wire [7:0] R, S1, S2;

    // Regs for all DUT inputs
    reg clock;
    reg start;

    integer i;

    // (dut means device under test)
    top dut (
        .clock(clock),
        .start(start),
        .BestDist(BestDist),
        .motionX(motionX),
        .motionY(motionY),
        .AddressR(AddressR),
        .AddressS1(AddressS1),
        .AddressS2(AddressS2),
        .R(R),
        .S1(S1),
        .S2(S2),
        .completed(completed)
    );

    // Reference and search memories
    ROM_R memR_u(.clock(clock), .AddressR(AddressR), .R(R));
    ROM_S memS_u(.clock(clock), .AddressS1(AddressS1), .AddressS2(AddressS2), .S1(S1), .S2(S2));

    // Setup clock to automatically strobe with a period of 20.
    always #10 clock = ~clock;

    initial
    begin
        $vcdpluson;
        // First set up monitoring of all inputs and outputs
        $monitor("time=%5d ns, clock=%b, start=%b, BestDist=%d, motionX=%d, motionY=%d, count=%d",
                 $time, clock, start, BestDist, motionX, motionY, dut.ctrl_u.count[11:0]);

        // Then initialize the memories and all registers
        $readmemh("ref.txt", memR_u.Rmem);
        $readmemh("search.txt", memS_u.Smem);
        clock = 1'b0;
        start = 1'b0;
        @(posedge clock); #10;
        start = 1'b1;
    end
endmodule

```

## Timing Analysis:

### SETUP

| Des/Clust/Port | Wire Load Model | Library               |
|----------------|-----------------|-----------------------|
| top            | 8000            | saed14hvt_ss0p72v125c |

| Point                                                | Incr (ns) | Path (ns) |
|------------------------------------------------------|-----------|-----------|
| clock ideal_clock1 (rise edge)                       | 0.00      | 0.00      |
| ctl_u/count_reg[4]/CK (SAEDSLVT14_FDP_CBO_V2LP_1)    | 0.00      | 0.00      |
| ctl_u/count_reg[4]/Q (SAEDSLVT14_FDP_CBO_V2LP_1)     | 0.00      | 0.00 r    |
| ctl_u/U86/X (SAEDSLVT14_INV_1)                       | 0.02      | 0.02 f    |
| ctl_u/U78/X (SAEDSLVT14_ND2B_U_0P5)                  | 0.01      | 0.05 f    |
| ctl_u/U77/X (SAEDSLVT14_OR2_1)                       | 0.02      | 0.07 f    |
| ctl_u/U92/X (SAEDSLVT14_NR2_MM_1)                    | 0.02      | 0.09 r    |
| ctl_u/U40/X (SAEDSLVT14_AN3_0P75)                    | 0.03      | 0.12 r    |
| ctl_u/U50/X (SAEDSLVT14_ND2_CDC_1)                   | 0.01      | 0.15 r    |
| ctl_u/U23/X (SAEDSLVT14_ND2_MM_1)                    | 0.03      | 0.16 f    |
| ctl_u/PReady[6] (control)                            | 0.00      | 0.16 r    |
| comp_u/PReady[6] (Comparator)                        | 0.00      | 0.16 r    |
| comp_u/U120/X (SAEDSLVT14_NR3_0P75)                  | 0.03      | 0.19 f    |
| comp_u/U119/X (SAEDSLVT14_ND2B_U_0P5)                | 0.01      | 0.20 r    |
| comp_u/U110/X (SAEDSLVT14_AN2B_MM_1)                 | 0.03      | 0.23 f    |
| comp_u/U56/X (SAEDRV14_AO222_1)                      | 0.04      | 0.28 f    |
| comp_u/U77/X (SAEDSLVT14_OA121_0P75)                 | 0.02      | 0.33 r    |
| comp_u/U17/X (SAEDSLVT14_NR2_CDC_1)                  | 0.02      | 0.34 f    |
| comp_u/U74/X (SAEDSLVT14_AO221_0P5)                  | 0.02      | 0.36 r    |
| comp_u/U149/X (SAEDSLVT14_OA221_1)                   | 0.02      | 0.38 r    |
| comp_u/U47/X (SAEDSLVT14_AO221_U_0P5)                | 0.02      | 0.39 r    |
| comp_u/U24/X (SAEDSLVT14_OA221_0P5)                  | 0.02      | 0.41 r    |
| comp_u/U21/X (SAEDSLVT14_AO221_0P5)                  | 0.02      | 0.43 r    |
| comp_u/U53/X (SAEDSLVT14_OA221_U_0P5)                | 0.02      | 0.45 r    |
| comp_u/U39/X (SAEDSLVT14_OA221_0P5)                  | 0.01      | 0.46 f    |
| comp_u/U50/X (SAEDSLVT14_OA221_0P5)                  | 0.03      | 0.49 f    |
| comp_u/U15/X (SAEDSLVT14_INV_1)                      | 0.02      | 0.51 r    |
| comp_u/U14/X (SAEDSLVT14_INV_1)                      | 0.01      | 0.52 f    |
| comp_u/U9/X (SAEDSLVT14_INV_1)                       | 0.02      | 0.53 r    |
| comp_u/U47/X (SAEDHVT14_ND2_CDC_1)                   | 0.02      | 0.55 f    |
| comp_u/U46/X (SAEDSLVT14_OA121_0P75)                 | 0.01      | 0.56 r    |
| comp_u/motionY_reg[1]/D (SAEDSLVT14_FDP_V2_1)        | 0.01      | 0.57 r    |
| data arrival time                                    |           | 0.57      |
| clock ideal_clock1 (rise edge)                       | 1.00      | 1.00      |
| clock network delay (ideal)                          | 0.00      | 1.00      |
| comp_u/motionY_reg[1]/CK (SAEDSLVT14_FDP_V2_1)       | 0.00      | 1.00 r    |
| library setup time                                   | -0.02     | 0.98      |
| data required time                                   |           | 0.98      |
| data required time                                   |           | 0.98      |
| data arrival time                                    |           | -0.57     |
| slack (MET)                                          |           | 0.41      |

### HOLD

| *****                                                            |                 |                       |
|------------------------------------------------------------------|-----------------|-----------------------|
| Report : timing                                                  | -path full      |                       |
|                                                                  | -delay min      |                       |
|                                                                  | -max_paths 1    |                       |
| Design : top                                                     |                 |                       |
| Version: O-2018.06-SP4                                           |                 |                       |
| Date : Fri Nov 8 13:34:41 2024                                   |                 |                       |
| *****                                                            |                 |                       |
| Operating Conditions: ss0p72v125c Library: saed14hvt_ss0p72v125c |                 |                       |
| Wire Load Model Mode: top                                        |                 |                       |
| Startpoint: pe_u/pe0/Rpipe_reg[0]                                |                 |                       |
| (rising edge-triggered flip-flop clocked by ideal_clock1)        |                 |                       |
| Endpoint: pe_u/pe1/Rpipe_reg[0]                                  |                 |                       |
| (rising edge-triggered flip-flop clocked by ideal_clock1)        |                 |                       |
| Path Group: ideal_clock1                                         |                 |                       |
| Path Type: min                                                   |                 |                       |
| Des/Clust/Port                                                   | Wire Load Model | Library               |
| top                                                              | 8000            | saed14hvt_ss0p72v125c |
| Point                                                            | Incr            | Path                  |
| clock ideal_clock1 (rise edge)                                   | 0.00            | 0.00                  |
| clock network delay (ideal)                                      | 0.00            | 0.00                  |
| pe_u/pe0/Rpipe_reg[0]/CK (SAEDSLVT14_FDP_V2_1)                   | 0.00            | 0.00 r                |
| pe_u/pe0/Rpipe_reg[0]/Q (SAEDSLVT14_FDP_V2_1)                    | 0.02            | 0.02 r                |
| pe_u/pe0/Rpipe[0] (PE_0)                                         | 0.00            | 0.02 r                |
| pe_u/pe1/R[0] (PE_14)                                            | 0.00            | 0.02 r                |
| pe_u/pe1/Rpipe_reg[0]/D (SAEDSLVT14_FDP_V2_1)                    | 0.01            | 0.02 r                |
| clock ideal_clock1 (rise edge)                                   | 0.00            | 0.00                  |
| clock network delay (ideal)                                      | 0.00            | 0.00                  |
| pe_u/pe1/Rpipe_reg[0]/CK (SAEDSLVT14_FDP_V2_1)                   | 0.00            | 0.00 r                |
| library hold time                                                | 0.00            | 0.00                  |
| data required time                                               |                 | 0.00                  |
| -----                                                            |                 |                       |
| data required time                                               |                 | 0.00                  |
| data arrival time                                                |                 | -0.02                 |
| -----                                                            |                 |                       |
| slack (MET)                                                      | 0.02            |                       |
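The slack values in the two synthesis reports above follow from simple arithmetic on the reported times. The sketch below reproduces that arithmetic, using the values (in ns) copied from the reports. The function names are illustrative and are not part of any Synopsys tool.

```python
def setup_slack(clock_edge, clock_delay, setup_time, arrival):
    """Setup slack = data required time - data arrival time.

    For a setup (max-delay) check, the required time is the capturing
    clock edge plus clock network delay, minus the library setup time.
    """
    required = clock_edge + clock_delay - setup_time
    return round(required - arrival, 2)


def hold_slack(clock_edge, clock_delay, hold_time, arrival):
    """Hold slack = data arrival time - data required time.

    For a hold (min-delay) check, the required time is the capturing
    clock edge plus clock network delay, plus the library hold time.
    """
    required = clock_edge + clock_delay + hold_time
    return round(arrival - required, 2)


# Setup path ending at comp_u/motionY_reg[1]: 1.00 ns clock, 0.02 ns setup time.
print(setup_slack(1.00, 0.00, 0.02, 0.57))  # 0.41

# Hold path pe0/Rpipe_reg[0] -> pe1/Rpipe_reg[0]: 0.00 ns hold time.
print(hold_slack(0.00, 0.00, 0.00, 0.02))   # 0.02
```

Both results match the `slack (MET)` rows, so each path meets its check at the 1 ns clock period with an ideal clock network.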



### Schematic View

```

///////////////////////////////////////////////////////////////////
// Created by: Synopsys DC Expert(TM) in wire load mode
// Version   : O-2018.06-SP4
// Date     : Mon Nov  4 14:20:30 2024
///////////////////////////////////////////////////////////////////

module control_DW01_inc_0 ( A, SUM );
  input [12:0] A;
  output [12:0] SUM;

  wire [12:2] carry;

  SAEDSLVT14_E02_V1_0P75 U1 ( .A1(carry[12]), .A2(A[12]), .X(SUM[12]) );
  SAEDSLVT14_ADDH_0P5 U1_1_6 ( .A(A[6]), .B(carry[6]), .CO(carry[7]), .S(
    SUM[6]) );
  SAEDSLVT14_ADDH_0P5 U1_1_2 ( .A(A[2]), .B(carry[2]), .CO(carry[3]), .S(
    SUM[2]) );
  SAEDSLVT14_ADDH_0P5 U1_1_1 ( .A(A[1]), .B(A[0]), .CO(carry[2]), .S(SUM[1]) );
  SAEDSLVT14_ADDH_0P5 U1_1_5 ( .A(A[5]), .B(carry[5]), .CO(carry[6]), .S(
    SUM[5]) );
  SAEDSLVT14_ADDH_0P5 U1_1_7 ( .A(A[7]), .B(carry[7]), .CO(carry[8]), .S(
    SUM[7]) );
  SAEDSLVT14_ADDH_0P5 U1_1_10 ( .A(A[10]), .B(carry[10]), .CO(carry[11]), .S(
    SUM[10]) );
  SAEDSLVT14_ADDH_0P5 U1_1_9 ( .A(A[9]), .B(carry[9]), .CO(carry[10]), .S(
    SUM[9]) );
  SAEDSLVT14_ADDH_0P5 U1_1_4 ( .A(A[4]), .B(carry[4]), .CO(carry[5]), .S(
    SUM[4]) );
  SAEDSLVT14_ADDH_0P5 U1_1_11 ( .A(A[11]), .B(carry[11]), .CO(carry[12]), .S(
    SUM[11]) );
  SAEDSLVT14_ADDH_0P5 U1_1_8 ( .A(A[8]), .B(carry[8]), .CO(carry[9]), .S(
    SUM[8]) );
  SAEDSLVT14_ADDH_0P5 U1_1_3 ( .A(A[3]), .B(carry[3]), .CO(carry[4]), .S(
    SUM[3]) );
  SAEDSLVT14_INV_1 U2 ( .A(A[0]), .X(SUM[0]) );
endmodule

```

```

module top ( clock, start, BestDist, motionX, motionY, AddressR, AddressS1,
             AddressS2, R, S1, S2, completed );
    output [7:0] BestDist;
    output [3:0] motionX;
    output [3:0] motionY;
    output [7:0] AddressR;
    output [9:0] AddressS1;
    output [9:0] AddressS2;
    input [7:0] R;
    input [7:0] S1;
    input [7:0] S2;
    input clock, start;
    output completed;
    wire CompStart;
    wire [15:0] S1S2mux;
    wire [15:0] newDist;
    wire [15:0] PReady;
    wire [3:0] VectorX;
    wire [3:0] VectorY;
    wire [127:0] Accumulate;
    wire SYNOPSYS_UNCONNECTED_0, SYNOPSYS_UNCONNECTED_1,
         SYNOPSYS_UNCONNECTED_2;

control ctl_u (.clock(clock), .start(start), .S1S2mux({S1S2mux[15:1],
    SYNOPSYS_UNCONNECTED_0}), .newDist(newDist), .CompStart(CompStart),
    .PReady(PReady), .VectorX(VectorX), .VectorY(VectorY), .AddressR(
        AddressR),
    .AddressS1({AddressS1[9:5], SYNOPSYS_UNCONNECTED_1,
        AddressS1[3:0]}), .AddressS2({AddressS2[9:5], SYNOPSYS_UNCONNECTED_2,
        AddressS2[3:0]}), .completed(completed) );
PEtotal pe_u (.clock(clock), .R(R), .S1(S1), .S2(S2), .S1S2mux({
    S1S2mux[15:1], AddressS2[4]}), .newDist(newDist), .Accumulate(
    Accumulate) );
Comparator comp_u (.clock(clock), .CompStart(CompStart), .PEnat(Accumulate),
    .PReady(PReady), .vectorX(VectorX), .vectorY(VectorY), .BestDist(
    BestDist),
    .motionX(motionX), .motionY(motionY));
SAEQLVLT14_TIE1_4 U3 ( .X(AddressS2[4]) );
SAEQLVLT14_TIE0_VI_2 U4 ( .X(AddressS1[4]) );
endmodule

```

Gate-level Netlist

## SDC Constraints

```
#####
# Created by write_sdc on Fri Nov  8 13:40:15 2024
#####
set sdc_version 2.1

set_units -time ns -resistance MOhm -capacitance fF -voltage V -current uA
create_clock [get_ports clock] -name ideal_clock1 -period 1 -waveform {0 0.5}
```



## Area Report

\*\*\*\*\*  
Report : area  
Design : top  
Version: O-2018.06-SP4  
Date : Fri Nov 8 13:48:25 2024  
\*\*\*\*\*

Library(s) Used:

saed14slvt\_ss0p72v125c (File: /packages/process\_kit/generic/generic\_14nm/stdcell\_slvt/db\_ccs/saed14slvt\_ss0p72v125c.db)  
saed14lvt\_ss0p72v125c (File: /packages/process\_kit/generic/generic\_14nm/stdcell\_lvt/db\_ccs/saed14lvt\_ss0p72v125c.db)  
saed14rvt\_ss0p72v125c (File: /packages/process\_kit/generic/generic\_14nm/stdcell\_rvt/db\_ccs/saed14rvt\_ss0p72v125c.db)  
saed14hvt\_ss0p72v125c (File: /packages/process\_kit/generic/generic\_14nm/stdcell\_hvt/db\_ccs/saed14hvt\_ss0p72v125c.db)

Number of ports: 2100  
Number of nets: 3430  
Number of cells: 1290  
Number of combinational cells: 961  
Number of sequential cells: 277  
Number of macros/black boxes: 0  
Number of buf/inv: 297  
Number of references: 5

Combinational area: 476.589590  
Buf/Inv area: 52.924799  
Noncombinational area: 310.666801  
Macro/Black Box area: 0.000000  
Net Interconnect area: 849.952362

Total cell area: 787.256391  
Total area: 1637.208753
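The totals in the synthesis area report above can be cross-checked from its component lines. The short sketch below does this, assuming (as the numbers suggest) that total cell area is the sum of combinational and noncombinational area, and that total area adds the wire-load estimate of net interconnect area. The function names are illustrative only.

```python
def total_cell_area(combinational, noncombinational, macro=0.0):
    """Cell area as the sum of combinational, sequential, and macro area."""
    return combinational + noncombinational + macro


def total_area(cell_area, net_interconnect):
    """Total area as cell area plus the estimated net interconnect area."""
    return cell_area + net_interconnect


# Values copied from the synthesis area report (library area units).
cells = total_cell_area(476.589590, 310.666801)
total = total_area(cells, 849.952362)

print(f"{cells:.6f}")  # 787.256391
print(f"{total:.6f}")  # 1637.208753
```

Note that the buffer/inverter area is already included in the combinational area, so it is not added again.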

## Power Report

\*\*\*\*\*  
Report : power  
-analysis\_effort low  
Design : top  
Version: O-2018.06-SP4  
Date : Fri Nov 8 13:56:11 2024  
\*\*\*\*\*

| Design | Wire Load Model | Library               |
|--------|-----------------|-----------------------|
| top    | 8000            | saed14hvt_ss0p72v125c |

Global Operating Voltage = 0.72  
Power-specific unit information :  
Voltage Units = 1V  
Capacitance Units = 1.000000fF  
Time Units = 1ns  
Dynamic Power Units = 1uW (derived from V,C,T units)  
Leakage Power Units = 1pW

Cell Internal Power = 206.8719 uW (83%)  
Net Switching Power = 43.2645 uW (17%)

Total Dynamic Power = 250.1364 uW (100%)

Cell Leakage Power = 73.3639 uW

| Power Group   | Internal Power (uW) | Switching Power (uW) | Leakage Power (pW) | Total Power (uW) | ( % )     | Attrs |
|---------------|---------------------|----------------------|--------------------|------------------|-----------|-------|
| io_pad        | 0.0000              | 0.0000               | 0.0000             | 0.0000           | ( 0.00%)  |       |
| memory        | 0.0000              | 0.0000               | 0.0000             | 0.0000           | ( 0.00%)  |       |
| black_box     | 0.0000              | 0.0000               | 0.0000             | 0.0000           | ( 0.00%)  |       |
| clock_network | 0.0000              | 0.0000               | 0.0000             | 0.0000           | ( 0.00%)  |       |
| register      | 170.5364            | 6.8656               | 2.8989e+07         | 206.3906         | ( 63.80%) |       |
| sequential    | 0.0000              | 0.0000               | 0.0000             | 0.0000           | ( 0.00%)  |       |
| combinational | 36.3354             | 36.3989              | 4.4375e+07         | 117.1097         | ( 36.20%) |       |
| Total         | 206.8718            | 43.2645              | 7.3364e+07         | 323.5002         |           |       |
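The synthesis power report above mixes units: internal and switching power are in uW, while leakage is in pW. The sketch below shows how the per-group totals are obtained once leakage is converted to uW. It uses the rounded values printed in the report, so the results agree only to about three decimal places. The helper name is illustrative.

```python
PW_PER_UW = 1e6  # 1 uW = 1,000,000 pW


def total_power_uw(internal_uw, switching_uw, leakage_pw):
    """Total power in uW from internal (uW), switching (uW), and leakage (pW)."""
    return internal_uw + switching_uw + leakage_pw / PW_PER_UW


# Rows copied from the power-group table of the synthesis report.
register = total_power_uw(170.5364, 6.8656, 2.8989e07)
combinational = total_power_uw(36.3354, 36.3989, 4.4375e07)
total = register + combinational

print(f"register      = {register:.2f} uW")
print(f"combinational = {combinational:.2f} uW")
print(f"total         = {total:.2f} uW")
print(f"register share = {100 * register / total:.1f} %")
```

The computed totals agree with the reported 206.39 uW, 117.11 uW, and 323.50 uW, and the register share matches the reported 63.80%.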



## Physical Design:

### Layout View

The layout view shows the design after placement and routing: the arrangement of standard cells, routed nets, and the clock tree on the die. The view supports three checks:

- **Clock Tree Synthesis (CTS):** checking how the clock signal propagates through the levels of the clock tree.
- **Routing and Congestion Analysis:** identifying high-density areas where routing congestion or added delay might occur.
- **Timing Closure and Power Analysis:** confirming that timing constraints and low-power requirements are met.



## Complete Physical Design:

The flow consists of:

- Partitioning
- Floor Planning
- Placement
- Routing

Clock Tree



## Physical Design Timing Report:

```

Report : timing
  -path_type full
  -delay_type max
  -max_paths 1
  -report_by design
Design : top
Version: S-2021.06-SP5-1
Date  : Sat Nov 23 01:52:32 2024
*****+
Startpoint: ctl_u/count_reg[3] (rising edge-triggered flip-flop clocked by ideal_clock1)
Endpoint: comp_u/motionY_reg[2] (rising edge-triggered flip-flop clocked by ideal_clock1)
Mode: func
Corner: slow
Scenario: func_slow
Path Group: ideal_clock1
Path Type: max
*****+
Point           Incr   Path
clock ideal_clock1 (rise edge)    0.00  0.00
clock network delay (propagated)  0.01  0.01
ctl_u/count_reg[3]/CK (SAEDRV14_FSDPQ_V2LP_0P5)  0.00  0.01 r
ctl_u/count_reg[3]/Q (SAEDRV14_FSDPQ_V2LP_0P5)  0.05  0.08 r
ctl_u/u78/X (SAEDRV14_INV_S_BPS)  0.05  0.17 r
ctl_u/u78/X (SAEDRV14_INV2_MM_BPS)  0.06  0.18 r
ctl_u/u26/X (SAEDRV14_INV_S_BPS)  0.02  0.20 r
ctl_u/u26/X (SAEDRV14_INV2_MM_BPS)  0.03  0.23 r
ctl_u/u74/X (SAEDRV14_INV_S_BPS)  0.02  0.25 r
ctl_u/u74/X (SAEDRV14_INV2_MM_BPS)  0.03  0.27 r
ctl_u/u33/X (SAEDRV14_INV2_MM_BPS)  0.04  0.29 r
ctl_u/u17/X (SAEDRV14_NRC_1)  0.04  0.33 f
comp_u/u39/X (SAEDRV14_INV_S_BPS)  0.05  0.35 f
comp_u/u39/X (SAEDRV14_ND3_BPS)  0.05  0.41 r
comp_u/u45/X (SAEDRV14_ND3_BPS)  0.03  0.44 f
comp_u/u45/X (SAEDRV14_INV_S_BPS)  0.04  0.47 r
comp_u/u42/X (SAEDRV14_DA122_BPS)  0.01  0.47 r
comp_u/u48/X (SAEDRV14_DA221_U_BPS)  0.02  0.49 r
comp_u/u48/X (SAEDRV14_DA221_U_BPS)  0.01  0.50 r
comp_u/u25/X (SAEDRV14_DA221_U_BPS)  0.01  0.51 r
comp_u/u24/X (SAEDRV14_DA221_U_BPS)  0.01  0.52 r
comp_u/u44/X (SAEDRV14_DA221_U_BPS)  0.01  0.54 f
comp_u/u44/X (SAEDRV14_A0121_BPS)  0.01  0.54 f
comp_u/u43/X (SAEDRV14_DA221_U_BPS)  0.03  0.58 f
comp_u/u43/X (SAEDRV14_INV_S_BPS)  0.06  0.63 r
comp_u/u48/X (SAEDRV14_A0122_BPS)  0.01  0.69 f
comp_u/u47/X (SAEDRV14_A0121_BPS)  0.01  0.71 r
comp_u/motionY_reg[2]/D (SAEDRV14_FDP_V2LP_1)  0.00  0.71 r
data arrival time                 0.71
*****+
clock ideal_clock1 (rise edge)    3.80  3.80
clock network delay (propagated)  0.01  3.81
comp_u/motionY_reg[2]/CK (SAEDRV14_FDP_V2LP_1)  0.00  3.81 r
library setup time               -0.01  3.80
data required time               3.80
data required time               3.80
data arrival time                -0.71
slack (MET)                      3.10
*****+
1
```

Setup

```

*****
Report : timing
  -path_type full
  -delay_type min
  -max_paths 1
  -report_by design
Design : top
Version: S-2021.06-SP5-1
Date  : Sat Nov 23 01:53:27 2024
*****+
Startpoint: pe_u/pe5/Rpipe_reg[5] (rising edge-triggered flip-flop clocked by ideal_clock1)
Endpoint: pe_u/pe6/Rpipe_reg[5] (rising edge-triggered flip-flop clocked by ideal_clock1)
Mode: func
Corner: slow
Scenario: func_slow
Path Group: ideal_clock1
Path Type: min
*****+
Point           Incr   Path
clock ideal_clock1 (rise edge)    0.00  0.00
clock network delay (propagated)  0.00  0.00
pe_u/pe5/Rpipe_reg[5]/CK (SAEDRV14_FDP_V2LP_0P5)  0.00  0.00 r
pe_u/pe5/Rpipe_reg[5]/Q (SAEDRV14_FDP_V2LP_0P5)  0.02  0.03 f
pe_u/pe6/Rpipe_reg[5]/D (SAEDRV14_FDP_V2LP_0P5)  0.00  0.03 f
data arrival time                 0.03
*****+
clock ideal_clock1 (rise edge)    0.00  0.00
clock network delay (propagated)  0.02  0.02
pe_u/pe6/Rpipe_reg[5]/CK (SAEDRV14_FDP_V2LP_0P5)  0.00  0.02 r
library hold time                 0.01  0.02
data required time                0.02
data arrival time                 -0.03
slack (MET)                      0.00
*****+
1
```

Hold



## QoR and Utilization Report:

```
*****
Report : qor
Design : top
Version: S-2021.06-SP5-1
Date   : Sat Nov 23 01:51:33 2024
*****



Scenario      'func_fast'
Timing Path Group 'ideal_clock1'
-----
Levels of Logic:    22
Critical Path Length: 0.65
Critical Path Slack: 3.14
Critical Path Clk Period: 3.80
Total Negative Slack: 0.00
No. of Violating Paths: 0
Worst Hold Violation: 0.00
Total Hold Violation: 0.00
No. of Hold Violations: 0
-----
Scenario      'func_slow'
Timing Path Group 'ideal_clock1'
-----
Levels of Logic:    22
Critical Path Length: 0.69
Critical Path Slack: 3.10
Critical Path Clk Period: 3.80
Total Negative Slack: 0.00
No. of Violating Paths: 0
Worst Hold Violation: 0.00
Total Hold Violation: 0.00
No. of Hold Violations: 0
-----



Cell Count
-----
Hierarchical Cell Count: 52
Hierarchical Port Count: 2164
Leaf Cell Count: 1243
Buf/Inv Cell Count: 302
Buf Cell Count: 5
Inv Cell Count: 297
CT Buf/Inv Cell Count: 0
Combinational Cell Count: 966
Single-bit Isolation Cell Count: 0
Multi-bit Isolation Cell Count: 0
Isolation Cell Banking Ratio: 0.00%
Single-bit Level Shifter Cell Count: 0
Multi-bit Level Shifter Cell Count: 0
Level Shifter Cell Banking Ratio: 0.00%
Single-bit ELS Cell Count: 0
Multi-bit ELS Cell Count: 0
ELS Cell Banking Ratio: 0.00%
Sequential Cell Count: 277
Integrated Clock-Gating Cell Count: 0
Sequential Macro Cell Count: 0
Single-bit Sequential Cell Count: 277
Multi-bit Sequential Cell Count: 0
Sequential Cell Banking Ratio: 0.00%
BitsPer flop: 1.00
Macro Count: 0
-----



Area
-----
Combinational Area: 483.21
Noncombinational Area: 318.57
Buf/Inv Area: 55.41
Total Buffer Area: 2.66
Total Inverter Area: 52.75
Macro/Black Box Area: 0.00
Net Area: 0
Net XLength: 7845.41
Net YLength: 7570.95
Cell Area (netlist): 793.87
Cell Area (netlist and physical only): 793.87
Net Length: 15416.36
-----



Design Rules
-----
Total Number of Nets: 1489
Nets with Violations: 0
Max Trans Violations: 0
Max Cap Violations: 0
-----



1
```

```
*****
Report : report_utilization
Design : top
Version: S-2021.06-SP5-1
Date   : Sat Nov 23 01:52:02 2024
*****



Utilization Ratio: 0.7098
Utilization options:
- Area calculation based on: site_row of block_top
- Categories of objects excluded: hard_macros macro_keepouts soft_macros io_cells hard_blockages
Total Area: 1106.2260
Total Capacity Area: 1106.2260
Total Area of cells: 785.1696
Area of excluded objects:
- hard_macros : 0.0000
- macro_keepouts : 0.0000
- soft_macros : 0.0000
- io_cells : 0.0000
- hard_blockages : 0.0000

Utilization of site-rows with:
- Site 'unit': 0.7098
0.7098
```

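## 1. QoR Analysis:

Both the func\_fast and func\_slow scenarios meet timing, with no setup or hold violations. At a clock period of 3.80 ns, the critical path leaves 3.14 ns of slack in func\_fast and 3.10 ns in func\_slow. No nets violate the max-transition or max-capacitance design rules.

## 2. Utilization Report:

The design uses approximately 71% of the available site-row area (785.17 of 1106.23 area units), which leaves room for optimization or expansion.

The design contains no hard macros, soft macros, I/O cells, or hard blockages, so no area is excluded from the calculation. This simplifies routing and helps keep congestion low.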
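The figures quoted in the analysis above can be reproduced directly from the QoR and utilization reports. The sketch below recomputes the utilization ratio and expresses the critical-path slack as a fraction of the clock period. The function names are illustrative, and the slack fraction is an added convenience metric that does not appear in the tool reports.

```python
def utilization(cell_area, capacity_area):
    """Utilization ratio: placed cell area divided by available site-row area."""
    return cell_area / capacity_area


def slack_fraction(slack_ns, period_ns):
    """Critical-path slack as a fraction of the clock period."""
    return slack_ns / period_ns


# Values copied from report_utilization.
ratio = utilization(785.1696, 1106.2260)
print(round(ratio, 4))  # 0.7098

# Values copied from the QoR report (clock period 3.80 ns).
for scenario, slack in (("func_fast", 3.14), ("func_slow", 3.10)):
    print(scenario, f"{100 * slack_fraction(slack, 3.80):.1f}% of the period is slack")
```

The recomputed ratio matches the reported `Utilization Ratio: 0.7098`.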



## Physical Design Power and Area:

```
*****
Report : area
Design : top
Version: O-2018.06-SP4
Date   : Tue Nov 19 20:55:33 2024
*****  
  
Library(s) Used:  
  
saed14slvt_ff0p88v125c (File: /packages/process_kit/generic/generic_14nm/stdcell_slvt/db_ccs/saed14slvt_ff0p88v125c.db)
saed14lvt_ff0p88v125c (File: /packages/process_kit/generic/generic_14nm/stdcell_lvt/db_ccs/saed14lvt_ff0p88v125c.db)
saed14rvt_ff0p88v125c (File: /packages/process_kit/generic/generic_14nm/stdcell_rvt/db_ccs/saed14rvt_ff0p88v125c.db)
saed14hvt_ff0p88v125c (File: /packages/process_kit/generic/generic_14nm/stdcell_hvt/db_ccs/saed14hvt_ff0p88v125c.db)  
  
Number of ports:          2100
Number of nets:           3432
Number of cells:          1292
Number of combinational cells: 963
Number of sequential cells: 277
Number of macros/black boxes: 0
Number of buf/inv:          300
Number of references:      5  
  
Combinational area:      477.433190
Buf/Inv area:             53.546399
Noncombinational area:    310.666801
Macro/Black Box area:     0.000000
Net Interconnect area:    851.032465  
  
Total cell area:          788.099991
Total area:                1639.132456
1
```

```
Loading db file '/packages/process_kit/generic/generic_14nm/stdcell_slvt/db_ccs/saed14slvt_ff0p88v125c.db'
Loading db file '/packages/process_kit/generic/generic_14nm/stdcell_lvt/db_ccs/saed14lvt_ff0p88v125c.db'
Loading db file '/packages/process_kit/generic/generic_14nm/stdcell_hvt/db_ccs/saed14hvt_ff0p88v125c.db'
Loading db file '/packages/process_kit/generic/generic_14nm/stdcell_rvt/db_ccs/saed14rvt_ff0p88v125c.db'
Information: Propagating switching activity (low effort zero delay simulation). (PWR-6)
Warning: Design has unannotated primary inputs. (PWR-414)
Warning: Design has unannotated sequential cell outputs. (PWR-415)
*****  
  
Report : power
        -analysis_effort low
Design : top
Version: O-2018.06-SP4
Date   : Tue Nov 19 20:56:24 2024
*****  
  
Library(s) Used:  
  
saed14slvt_ff0p88v125c (File: /packages/process_kit/generic/generic_14nm/stdcell_slvt/db_ccs/saed14slvt_ff0p88v125c.db)
saed14lvt_ff0p88v125c (File: /packages/process_kit/generic/generic_14nm/stdcell_lvt/db_ccs/saed14lvt_ff0p88v125c.db)
saed14rvt_ff0p88v125c (File: /packages/process_kit/generic/generic_14nm/stdcell_rvt/db_ccs/saed14rvt_ff0p88v125c.db)
saed14hvt_ff0p88v125c (File: /packages/process_kit/generic/generic_14nm/stdcell_hvt/db_ccs/saed14hvt_ff0p88v125c.db)  
  
Operating Conditions: ff0p88v125c Library: saed14hvt_ff0p88v125c
Wire Load Model Mode: top
Design      Wire Load Model      Library
top          8000                 saed14hvt_ff0p88v125c  
  
Global Operating Voltage = 0.88
Power-specific unit information :
  Voltage Units = 1V
  Capacitance Units = 1.000000fF
  Time Units = 1ns
  Dynamic Power Units = 1uW   (derived from V,C,T units)
  Leakage Power Units = 1pW  
  
Cell Internal Power = 89.2658 uW (82%)
Net Switching Power = 20.1140 uW (18%)
Total Dynamic Power = 109.3797 uW (100%)
Cell Leakage Power = 699.1082 uW  
  
Power Group      Internal Power      Switching Power      Leakage Power      Total Power ( % ) Attrs
io_pad            0.0000              0.0000              0.0000              0.0000 ( 0.00% )
memory           0.0000              0.0000              0.0000              0.0000 ( 0.00% )
black_box         0.0000              0.0000              0.0000              0.0000 ( 0.00% )
clock_network    0.0000              0.0000              0.0000              0.0000 ( 0.00% )
register          73.7606             3.3155             1.6030e+08            237.3800 ( 29.36% )
sequential         0.0000              0.0000              0.0000              0.0000 ( 0.00% )
combinational    15.5952             16.7984             5.3880e+08            571.1080 ( 70.64% )
Total             89.2658 uW          20.1140 uW          6.9911e+08 pW          808.4880 uW
1
```



## Physical Design Parasitic and Netlist:

```

*SPEF "1481-1998"
*DESIGN "top"
*DATE "Sat Nov 23 02:06:39 2024"
*VENDOR "Synopsys, Inc."
*PROGRAM "icc2 -packages/synopsis/icc2/S-2021.06-SP5-1/linux64/nwtn/bin/icc2_exec"
*VERSION "S-2021.06-SP5-1 Oct 15, 2022"
*DESIGN_FLOW "ICC2 SPEF DR"
*DIVIDER /
*DELIMITER :
*BUS_DELIMITER [ ]
*T_UNIT 1 NS
*C_UNIT 1 FF
*R_UNIT 1 OHM
*L_UNIT 1 HENRY

// XY_UNIT 1 UM
// PARASITIC_TECH tlu_max at 125.000 degree

*NAME_MAP
*1 clock
*2 start
*3 BestDist[7]
*4 BestDist[6]
*5 BestDist[5]
*6 BestDist[4]
*7 BestDist[3]
*8 BestDist[2]
*9 BestDist[1]
*10 BestDist[0]
*11 motionX[3]
*12 motionX[2]
*13 motionX[1]
*14 motionX[0]
*15 motionY[3]
*16 motionY[2]
*17 motionY[1]
*18 motionY[0]
*19 AddressR[7]
*20 AddressR[6]
*21 AddressR[5]
*22 AddressR[4]
*23 AddressR[3]
*24 AddressR[2]
*25 AddressR[1]
*26 AddressR[0]
*27 AddressS1[9]
*28 AddressS1[8]
*29 AddressS1[7]
*30 AddressS1[6]

// IC Compiler II Version S-2021.06-SP5-1 Verilog Writer
// Generated on 11/23/2024 at 2:10:25
// Library Name: libraryname
// Block Name: top
// User Label:
// Write Command: write_verilog -top_module first ./outputs/top.v
module top ( clock , start , BestDist , motionX , motionY , AddressR ,
             AddressS1 , AddressS2 , R , S1 , S2 , completed );
  input clock ;
  input start ;
  output [7:0] BestDist ;
  output [3:0] motionX ;
  output [3:0] motionY ;
  output [7:0] AddressR ;
  output [9:0] AddressS1 ;
  output [9:0] AddressS2 ;
  input [7:0] R ;
  input [7:0] S1 ;
  input [7:0] S2 ;
  output completed ;

  wire [15:1] S1S2mux ;
  wire [15:0] newDist ;
  wire [15:0] PReady ;
  wire [3:0] VectorX ;
  wire [3:0] VectorY ;
  wire [127:0] Accumulate ;

control ctl_u (.clock ( ZCTSNET_2 ) , .start ( start ) ,
               .S1S2mux ( { S1S2mux[15] , S1S2mux[14] , S1S2mux[13] , S1S2mux[12] ,
                           S1S2mux[11] , S1S2mux[10] , S1S2mux[9] , S1S2mux[8] , S1S2mux[7] ,
                           S1S2mux[6] , S1S2mux[5] , S1S2mux[4] , S1S2mux[3] , S1S2mux[2] ,
                           S1S2mux[1] , SYNOPSYS_UNCONNECTED_1 } ) ,
               .newDist ( newDist ) , .CompStart ( CompStart ) , .PReady ( PReady ) ,
               .AddressS1 ( { AddressS1[9] , AddressS1[8] , AddressS1[7] , AddressS1[6] ,
                             AddressS1[5] , SYNOPSYS_UNCONNECTED_2 , AddressS1[3] , AddressS1[2] ,
                             AddressS1[1] , AddressS1[0] } ) ,
               .AddressS2 ( { AddressS2[9] , AddressS2[8] , AddressS2[7] , AddressS2[6] ,
                             AddressS2[5] , SYNOPSYS_UNCONNECTED_3 , AddressS2[3] , AddressS2[2] ,
                             AddressS2[1] , AddressS2[0] } ) ,
               .completed ( completed ) , .ZCTSNET_0 ( ZCTSNET_3 ) ) ;
PEtotal pe_u (.clock ( ZCTSNET_0 ) , .R ( R ) , .S1 ( S1 ) , .S2 ( S2 ) ,
              .S1S2mux ( { S1S2mux[15] , S1S2mux[14] , S1S2mux[13] , S1S2mux[12] ,
                           S1S2mux[11] , S1S2mux[10] , S1S2mux[9] , S1S2mux[8] , S1S2mux[7] ,
                           S1S2mux[6] , S1S2mux[5] , S1S2mux[4] , S1S2mux[3] , S1S2mux[2] ,
                           S1S2mux[1] , SYNOPSYS_UNCONNECTED_1 } ) ,
              .PEtotal_pe_u ( .clock ( ZCTSNET_0 ) , .R ( R ) , .S1 ( S1 ) , .S2 ( S2 ) ,
                             .S1S2mux ( { S1S2mux[15] , S1S2mux[14] , S1S2mux[13] , S1S2mux[12] ,
                                           S1S2mux[11] , S1S2mux[10] , S1S2mux[9] , S1S2mux[8] , S1S2mux[7] ,
                                           S1S2mux[6] , S1S2mux[5] , S1S2mux[4] , S1S2mux[3] , S1S2mux[2] ,
                                           S1S2mux[1] , SYNOPSYS_UNCONNECTED_1 } ) ,
                             .completed ( completed ) , .ZCTSNET_0 ( ZCTSNET_3 ) ) ;

```

Parasitic File

Netlist

## Post-Layout Timing Verification Using PrimeTime:

```

*****
Report: timing
-path_type full
-delay_type max
-max_paths 1
-sort_by slack
Design : top
Version: S-2021.06-SP4
Date   : Wed Dec 4 13:52:37 2024
*****



Startpoint: ctl_u/count_reg[3]
            (rising edge-triggered flip-flop clocked by ideal_clock1)
Endpoint: comp_u/motionX[2]
            (rising edge-triggered flip-flop clocked by ideal_clock1)
Last common pin: clock
Path Group: ideal_clock1
Path Type: max

Point                                     incr      Path
clock ideal_clock1 (rise edge)           0.00      0.00
clock network delay (propagated)        0.01      0.01
ctl_u/count_reg[3]/CK (SAEDRV14_FSDPO_V2LP_0P5) 0.00      0.01 r
ctl_u/count_reg[3]/O (SAEDRV14_FSDPO_V2LP_0P5) 0.07 & 0.08 r
ctl_u/U02/X (SAEDRV14_INV_S_0P5)       0.05 & 0.14 f
ctl_u/U03/X (SAEDRV14_INV_S_0P5)       0.05 & 0.17 f
ctl_u/U06/X (SAEDRV14_INV_S_0P5)       0.02 & 0.22 f
ctl_u/U11/X (SAEDRV14_NR2_MM_0P5)      0.03 & 0.26 r
ctl_u/U13/X (SAEDRV14_NR2_MM_0P5)      0.04 & 0.32 r
ctl_u/U17/X (SAEDRV14_NR2_1)           0.04 & 0.35 f
comp_u/U128/X (SAEDRV14_NR2_1)          0.03 & 0.39 f
comp_u/U140/X (SAEDRV14_NR2_1)          0.03 & 0.41 r
comp_u/U145/X (SAEDRV14_NR2_1)          0.03 & 0.46 f
comp_u/U144/X (SAEDRV14_A0221_0P5)     0.02 & 0.49 f
comp_u/U146/X (SAEDRV14_A0221_0P5)     0.01 & 0.57 r
comp_u/U48/X (SAEDRV14_A0221_U_0P5)    0.02 & 0.51 r
comp_u/U27/X (SAEDRV14_A0221_U_0P5)    0.01 & 0.52 r
comp_u/U28/X (SAEDRV14_A0221_U_0P5)    0.01 & 0.55 r
comp_u/U24/X (SAEDRV14_A0221_U_0P5)    0.01 & 0.55 r
comp_u/U45/X (SAEDRV14_A0221_U_0P5)    0.01 & 0.57 r
comp_u/U44/X (SAEDRV14_A0221_U_0P5)    0.01 & 0.67 r
comp_u/U36/X (SAEDRV14_A0221_U_0P5)    0.03 & 0.61 r
comp_u/U37/X (SAEDRV14_INV_0P5)         0.00 & 0.69 r
comp_u/U48/X (SAEDRV14_NR2_MM_0P5)      0.04 & 0.73 f
comp_u/motionX[2]/CK (SAEDRV14_A0121_0P7) 0.00 & 0.77 r
comp_u/motionX[2]/O (SAEDRV14_FDP_V2LP_1) 0.00 & 0.74 r
data arrival time                         0.74

clock ideal_clock1 (rise edge)           3.80      3.80
clock network delay (propagated)        0.01      3.81
clock reconvergence pessimism          0.00      3.81
comp_u/motionX[2]/CK (SAEDRV14_FDP_V2LP_1)           3.81 r
library setup time                      -0.01      3.80
data required time                      3.80
data arrival time                        -0.74
slack (MET)                            3.06


```

Setup

```

*****
Report: timing
-path_type full
-delay_type min
-max_paths 1
-sort_by slack
Design : top
Version: S-2021.06-SP4
Date   : Wed Dec 4 13:54:47 2024
*****



Startpoint: pe_u/p5/Rpipe_reg[5]
            (rising edge-triggered flip-flop clocked by ideal_clock1)
Endpoint: pe_u/p6/Rpipe_reg[5]
            (rising edge-triggered flip-flop clocked by ideal_clock1)
Last common pin: clock
Path Group: ideal_clock1
Path Type: min

Point                                     incr      Path
clock ideal_clock1 (rise edge)           0.00      0.00
clock network delay (propagated)        0.00      0.00
pe_u/p5/Rpipe_reg[5]/CK (SAEDRV14_FDP_V2LP_0P5) 0.00      0.00 r
pe_u/p5/Rpipe_reg[5]/O (SAEDRV14_FDP_V2LP_0P5) 0.02 & 0.03 f
pe_u/p6/Rpipe_reg[5]/D (SAEDRV14_FDP_V2LP_0P5) 0.00 & 0.03 f
data arrival time                         0.03

clock ideal_clock1 (rise edge)           0.00      0.00
clock network delay (propagated)        0.02      0.02
clock reconvergence pessimism          0.00      0.02
pe_u/p6/Rpipe_reg[5]/CK (SAEDRV14_FDP_V2LP_0P5) 0.02 & 0.02 r
library hold time                      0.01      0.02
data required time                      0.02
data arrival time                        -0.03
slack (MET)                            0.00


```

Hold




*Thank you*