

# **Reducing the Power Consumption of an 8-Tap FIR Filter: Low Power Challenge**

Matteo Matta, Fabio Piras



Which architecture?

- 2 MUL?
- 4 MUL?
- PIPELINED?
- PARALLEL?

→ Fixed throughput! ($36 \times 10^6$ outputs/s)

---

The higher the slack, the higher the margin to reduce power  
+  
lower power consumption to begin with

**PIPELINED** or **PARALLEL**?

So...


**PIPELINED!**

# Pipeline Architecture



$\text{THR} = 36 \times 10^6 \text{ outputs/s}$ (one output per clock cycle)

$T_{\text{clock}} = 27.7 \text{ ns}$
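
As a quick check of the two figures above, the clock period follows from the throughput target when one output is produced per cycle. This is a worked sketch: the 27700 ps value is the one that appears in the timing reports of this deck, and the rounding is mine.

$$
T_{\text{clock}} \le \frac{1}{\text{THR}} = \frac{1}{36 \times 10^{6}\ \text{s}^{-1}} \approx 27.78\ \text{ns}
\qquad\Rightarrow\qquad
T_{\text{clock}} = 27.7\ \text{ns}\ (27700\ \text{ps}),\quad
f_{\text{clock}} = \frac{1}{27.7\ \text{ns}} \approx 36.1\ \text{MHz}
$$

A 27.7 ns clock therefore meets the throughput target with a small margin.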

- Simplifying the datapath and the control logic



```
module fsm(
    input clk, rst, en,
    output reg y_mult,
    output reg valid,
    output reg ready
);

    // Simplified control logic: the clear branch is gone and every
    // register updates only while en is high.
    always @ (posedge clk)
        if(en)
            begin
                ready <= ~rst;
                y_mult <= en;
                valid <= y_mult;
            end
endmodule
```

&

```
always @ (posedge clk)
    // Clear branch removed to simplify the datapath:
    //if(x_clr) begin
    //    for(i=0;i<8;i=i+1)
    //        x_shift[i] <= 0;
    //end
    //else
    if(en)
        begin
            x_shift[0] <= x;
            for(i=1;i<8;i=i+1) begin
                x_shift[i] <= x_shift[i-1];
            end
        end
```

&

```
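always @(*)
    // Clear branch and y_mult gating removed: y is purely combinational.
    // if(y_clr)
    // begin
    //     y <= 0;
    // end
    // else
    // begin
    //if(y_mult)
    begin
        // Blocking assignment, as appropriate for combinational logic
        y = mul_0_out_floor + mul_0_out_fra;
    end
    // else y = 0;
endmodule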
```

- Encouraging the synthesizer to infer clock gating

```
always @ (posedge clk)
    if (y_mult) begin
        mul_0_out <= x_shift[0]*b0;
        mul_1_out <= x_shift[1]*b1;
        mul_2_out <= x_shift[2]*b2;
        mul_3_out <= x_shift[3]*b3;
        mul_4_out <= x_shift[4]*b4;
        mul_5_out <= x_shift[5]*b5;
        mul_6_out <= x_shift[6]*b6;
        mul_7_out <= x_shift[7]*b7;
    end
    // else begin
    //     mul_0_out <= mul_0_out;
    //     mul_1_out <= mul_1_out;
    //     mul_2_out <= mul_2_out;
    //     mul_3_out <= mul_3_out;
    //     mul_4_out <= mul_4_out;
    //     mul_5_out <= mul_5_out;
    //     mul_6_out <= mul_6_out;
    //     mul_7_out <= mul_7_out;
    // end
// end
```

&

```
integer i;
always @ (posedge clk)
    if(en)
        begin
            x_shift[0] <= x;
            for(i=1;i<8;i=i+1) begin
                x_shift[i] <= x_shift[i-1];
            end
        end
```
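
The enable-only registers above are the pattern a synthesizer can map to an integrated clock-gating cell. The following is a minimal behavioral sketch of that idea, not the cell the tool actually inserted. The module names `icg_sketch` and `gated_bank_sketch`, the parameter `W`, and the ports `d`, `q` and `gclk` are hypothetical and exist only for illustration.

```verilog
// Illustrative behavioral model of a latch-based clock-gating cell.
// All names here are assumptions, not taken from the original design.
module icg_sketch (
    input  wire clk,
    input  wire en,
    output wire gclk
);
    reg en_latched;

    // Transparent-low latch: en is captured while clk is low,
    // so a change of en during the high phase cannot glitch gclk.
    always @(clk or en)
        if (!clk)
            en_latched <= en;

    assign gclk = clk & en_latched;
endmodule

// Register bank clocked by the gated clock. Intended to behave like
//   always @(posedge clk) if (en) q <= d;
// but the flip-flops see no clock edge at all while en is low.
module gated_bank_sketch #(
    parameter W = 8               // assumed register width
) (
    input  wire         clk,
    input  wire         en,
    input  wire [W-1:0] d,
    output reg  [W-1:0] q
);
    wire gclk;

    icg_sketch u_icg (.clk(clk), .en(en), .gclk(gclk));

    always @(posedge gclk)
        q <= d;
endmodule
```

Dropping the explicit `else` branches and the clear branches leaves each register bank with a single enable, which is the simplest form for this kind of mapping.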

- Moving the output register
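
The fragments above can be assembled into a single module. The following is a minimal sketch under stated assumptions: the module name `fir8_sketch`, the width parameter `W`, the coefficient ports, the `mul_out` array and the plain sum of all eight products are mine and are not taken from the original design. It reads "moving the output register" as placing the pipeline registers at the multiplier outputs and leaving the adder tree combinational, which is one plausible reading of the code shown in this deck.

```verilog
// Illustrative 8-tap FIR datapath; widths, ports and the final sum are assumptions.
module fir8_sketch #(
    parameter W = 8                        // assumed sample/coefficient width
) (
    input  wire                  clk,
    input  wire                  en,       // a new input sample is available
    input  wire                  y_mult,   // from the control FSM: capture the products
    input  wire signed [W-1:0]   x,
    input  wire signed [W-1:0]   b0, b1, b2, b3, b4, b5, b6, b7,
    output reg  signed [2*W+2:0] y         // 2W-bit products plus 3 bits of growth for 8 terms
);
    reg signed [W-1:0]   x_shift [0:7];
    reg signed [2*W-1:0] mul_out [0:7];
    integer i, k;

    // Delay line: no clear branch, only an enable.
    always @(posedge clk)
        if (en) begin
            x_shift[0] <= x;
            for (i = 1; i < 8; i = i + 1)
                x_shift[i] <= x_shift[i-1];
        end

    // Pipeline registers at the multiplier outputs: no else branch.
    always @(posedge clk)
        if (y_mult) begin
            mul_out[0] <= x_shift[0] * b0;
            mul_out[1] <= x_shift[1] * b1;
            mul_out[2] <= x_shift[2] * b2;
            mul_out[3] <= x_shift[3] * b3;
            mul_out[4] <= x_shift[4] * b4;
            mul_out[5] <= x_shift[5] * b5;
            mul_out[6] <= x_shift[6] * b6;
            mul_out[7] <= x_shift[7] * b7;
        end

    // Combinational adder tree: no register after the sum.
    always @(*) begin
        y = 0;
        for (k = 0; k < 8; k = k + 1)
            y = y + mul_out[k];
    end
endmodule
```

With the registers at the multiplier outputs, the multipliers and the adder tree sit in separate pipeline stages, so each stage has a shorter path than a single multiply-and-add stage would.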



# Computing the final power consumption...

gate-level switching activity

```
"clk"  : "0.500262 1805";  
"rst"  : "0.000880 1";  
"en"   : "0.433120 843";
```

+

enabling clock-gating

```
set_attribute lp_insert_clock_gating true /
```

+

setting a constraint for dynamic power

```
set_attribute max_dynamic_power 200 /designs/top
```

+

scaling down the supply voltage (1.80 V → 1.20 V)

```
set_attribute library liberty_LPMOS/v3_0_1/PVT_1_20V_range/D_CELLS_HD_LPMOS_typ_1_20V_25C.lib
```
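
To see why the supply voltage is the strongest lever, the first-order switching power model can be applied to the two library corners used here. This is a textbook approximation, not a result from the tool, and the symbols $\alpha$ (switching activity), $C$ (switched capacitance) and $f$ (clock frequency) are introduced only for this sketch.

$$
P_{\text{dyn}} \approx \alpha\, C\, V_{DD}^{2}\, f
\qquad\Rightarrow\qquad
\frac{P_{\text{dyn}}(1.20\ \text{V})}{P_{\text{dyn}}(1.80\ \text{V})} \approx \left(\frac{1.20}{1.80}\right)^{2} \approx 0.444
$$

With $\alpha$, $C$ and $f$ unchanged, this predicts a reduction of roughly 55.6% from voltage scaling alone. The price is slower gates, which is why the timing slack matters.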

# Results at each optimization stage

## Starting Design

```
legacy_genus:/> report power
=====
Generated by: Genus(TM) Synthesis Solution 18.10-p003_1
Generated on: Dec 17 2025 03:19:16 pm
Module: top
Technology library: D_CELLS_HD_LPMOS_typ_1_80V_25C 3.0.1
Operating conditions: typ_1_80V_25C
Interconnect mode: global
Area mode: physical library
=====
                            Leakage    Dynamic      Total
Instance        Cells     Power(uW)  Power(uW)  Power(uW)
---------------------------------------------------------
top              1129         0.007   1780.328   1780.335
  datapath       1125         0.007   1637.222   1637.230
  control_unit      4         0.000      8.391      8.391
```

```
legacy_genus:/> report area
=====
Generated by: Genus(TM) Synthesis Solution 18.10-p003_1
Generated on: Dec 17 2025 03:21:16 pm
Module: top
Technology library: D_CELLS_HD_LPMOS_typ_1_80V_25C 3.0.1
Operating conditions: typ_1_80V_25C
Interconnect mode: global
Area mode: physical library
=====
Instance        Module  Cell Count  Cell Area   Net Area  Total Area
--------------------------------------------------------------------
top                           1129  35795.558  17313.145   53108.703
  datapath      dp            1125  35652.557  14505.857   50158.414
  control_unit  fsm              4    143.002      7.824     150.825
```

| Type           | Instances | Area      | Area % |
|----------------|-----------|-----------|--------|
| sequential     | 131       | 7842.509  | 21.9   |
| inverter       | 17        | 127.949   | 0.4    |
| logic          | 981       | 27825.101 | 77.7   |
| physical_cells | 0         | 0.000     | 0.0    |
| **total**      | 1129      | 35795.558 | 100.0  |

```
(clock clk) capture 27700 R
Cost Group : 'clk' (path_group 'clk')
Timing slack : 19970ps
Start-point : datapath/x_shift_reg[6][4]/C
End-point : datapath/mul_6_out_reg[14]/SD
```

## Gated Design

**-11.7%** total power vs. the starting design (1780.335 µW → 1571.774 µW)

```
legacy_genus:/> report power
=====
Generated by: Genus(TM) Synthesis Solution 18.10-p003_1
Generated on: Dec 17 2025 03:30:31 pm
Module: top
Technology library: D_CELLS_HD_LPMOS_typ_1_80V_25C 3.0.1
Operating conditions: typ_1_80V_25C
Interconnect mode: global
Area mode: physical library
=====
                            Leakage    Dynamic      Total
Instance        Cells     Power(uW)  Power(uW)  Power(uW)
---------------------------------------------------------
top              1131         0.007   1571.767   1571.774
  datapath       1127         0.007   1558.438   1558.445
  control_unit      4         0.000      8.256      8.256
```

```
legacy_genus:/> report area
=====
Generated by: Genus(TM) Synthesis Solution 18.10-p003_1
Generated on: Dec 17 2025 03:31:08 pm
Module: top
Technology library: D_CELLS_HD_LPMOS_typ_1_80V_25C 3.0.1
Operating conditions: typ_1_80V_25C
Interconnect mode: global
Area mode: physical library
=====
Instance        Module  Cell Count  Cell Area   Net Area  Total Area
--------------------------------------------------------------------
top                           1131  33949.082  16345.018   50294.099
  datapath      dp            1127  33806.080  14017.982   47824.062
  control_unit  fsm              4    143.002      7.824     150.825
```

| Type                         | Instances | Area      | Area % |
|------------------------------|-----------|-----------|--------|
| sequential                   | 131       | 5915.750  | 17.4   |
| inverter                     | 17        | 127.949   | 0.4    |
| clock_gating_integrated_cell | 2         | 80.282    | 0.2    |
| logic                        | 981       | 27825.101 | 82.0   |
| physical_cells               | 0         | 0.000     | 0.0    |
| **total**                    | 1131      | 33949.082 | 100.0  |

```
(clock clk) capture 27700 R
Cost Group : 'clk' (path_group 'clk')
Timing slack : 20329ps
Start-point : datapath/x_shift_reg[0][4]/C
End-point : datapath/mul_0_out_reg[14]/D
```

## Final Design

**-57.7%** total power vs. the gated design (1571.774 µW → 663.977 µW)

```
legacy_genus:/> report power
=====
Generated by: Genus(TM) Synthesis Solution 18.10-p003_1
Generated on: Dec 17 2025 03:35:41 pm
Module: top
Technology library: D_CELLS_HD_LPMOS_typ_1_20V_25C 3.0.1
Operating conditions: typ_1_20V_25C
Interconnect mode: global
Area mode: physical library
=====
                            Leakage    Dynamic      Total
Instance        Cells     Power(uW)  Power(uW)  Power(uW)
---------------------------------------------------------
top              1130         0.004    663.973    663.977
  datapath       1126         0.004    658.269    658.273
  control_unit      4         0.000      3.564      3.564
```

```
legacy_genus:/> report area
=====
Generated by: Genus(TM) Synthesis Solution 18.10-p003_1
Generated on: Dec 17 2025 03:37:31 pm
Module: top
Technology library: D_CELLS_HD_LPMOS_typ_1_20V_25C 3.0.1
Operating conditions: typ_1_20V_25C
Interconnect mode: global
Area mode: physical library
=====
Instance        Module  Cell Count  Cell Area   Net Area  Total Area
--------------------------------------------------------------------
top                           1130  33949.082  16337.194   50286.275
  datapath      dp            1126  33806.080  14010.158   47816.238
  control_unit  fsm              4    143.002      7.824     150.825
```

| Type                         | Instances | Area      | Area % |
|------------------------------|-----------|-----------|--------|
| sequential                   | 131       | 5915.750  | 17.4   |
| inverter                     | 17        | 127.949   | 0.4    |
| clock_gating_integrated_cell | 2         | 80.282    | 0.2    |
| logic                        | 980       | 27825.101 | 82.0   |
| physical_cells               | 0         | 0.000     | 0.0    |
| **total**                    | 1130      | 33949.082 | 100.0  |

```
(clock clk) capture 27700 R
Cost Group : 'clk' (path_group 'clk')
Timing slack : 10959ps
Start-point : datapath/x_shift_reg[0][4]/C
End-point : datapath/mul_0_out_reg[14]/D
```
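
The three timing reports show how much of the clock period each stage leaves unused. A rough worked sketch follows, reading "capture time minus slack" as an upper bound on the critical-path delay. Setup time and clock uncertainty are folded into this bound, and the symbol $t_{\text{path}}$ is introduced only for this illustration.

$$
\begin{aligned}
t_{\text{path}}^{\text{start}} &\lesssim 27700 - 19970 = 7730\ \text{ps} \\
t_{\text{path}}^{\text{gated}} &\lesssim 27700 - 20329 = 7371\ \text{ps} \\
t_{\text{path}}^{\text{final}} &\lesssim 27700 - 10959 = 16741\ \text{ps}
\end{aligned}
$$

At 1.20 V the critical path is roughly 2.3 times slower than in the gated design at 1.80 V, yet about 11 ns of slack remain. This is the margin the pipelined architecture was chosen to provide.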

# Final result

**Total power:** 663.977 µW (**-62.7%** vs. the starting design)

**Total area:** **-5.3%** vs. the starting design

**Energy/output:** 18.3927 pJ
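
The summary figures can be cross-checked against the power and area reports. This is a worked sketch using the reported totals and the 27.7 ns clock period, with one output per cycle; the rounding is mine.

$$
\begin{aligned}
\text{Power reduction} &= 1 - \frac{663.977}{1780.335} \approx 0.627 \\
\text{Area reduction} &= 1 - \frac{50286.275}{53108.703} \approx 0.053 \\
E_{\text{output}} &= P_{\text{total}} \cdot T_{\text{clock}} = 663.977\ \mu\text{W} \times 27.7\ \text{ns} \approx 18.39\ \text{pJ}
\end{aligned}
$$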