

# FPGA Implementation of Adaptive LMS Filter using Offset Binary Coding (OBC)

Names: Rajdeep Saha, Ramkushal B.  
Roll Nos: IMT2023600, IMT2023601

November 25, 2025

## Abstract

An Adaptive Noise Canceller based on the Sign-Error LMS algorithm with Offset Binary Coding was implemented on an Artix-7 FPGA. Python-based Golden Model verification and post-implementation timing, power, resource, and critical-path reports demonstrate that the architecture is efficient, scalable, and capable of real-time DSP performance.

## Contents

|          |                                                   |          |
|----------|---------------------------------------------------|----------|
| <b>1</b> | <b>Introduction</b>                               | <b>2</b> |
| <b>2</b> | <b>Mathematical Background</b>                    | <b>2</b> |
| 2.1      | Adaptive Noise Cancellation . . . . .             | 2        |
| 2.2      | Sign-Error LMS Algorithm . . . . .                | 2        |
| 2.3      | Offset Binary Coding (OBC) . . . . .              | 2        |
| <b>3</b> | <b>System Architecture</b>                        | <b>3</b> |
| 3.1      | Block Diagram . . . . .                           | 3        |
| 3.2      | Signal Definitions & Mathematical Model . . . . . | 3        |
| 3.3      | Hardware Implementation Details . . . . .         | 4        |
| 3.4      | Design Constraints . . . . .                      | 4        |
| 3.5      | Simulation Testbench . . . . .                    | 4        |
| <b>4</b> | <b>Verification and Results</b>                   | <b>5</b> |
| 4.1      | Python Golden Model . . . . .                     | 5        |
| 4.2      | Hardware Simulation . . . . .                     | 5        |
| 4.3      | Hardware ILA Results . . . . .                    | 6        |
| 4.4      | Key Insights . . . . .                            | 6        |
| <b>5</b> | <b>Post-Implementation Reports</b>                | <b>6</b> |
| 5.1      | Timing Summary . . . . .                          | 7        |
| 5.2      | Critical Path Analysis . . . . .                  | 7        |
| 5.3      | Resource Utilization . . . . .                    | 7        |
| 5.4      | Power Report . . . . .                            | 8        |
| <b>6</b> | <b>Conclusion</b>                                 | <b>8</b> |
| <b>7</b> | <b>Project Repository</b>                         | <b>8</b> |

# 1 Introduction

Adaptive filters are essential components in digital signal processing, widely used for applications such as noise cancellation, system identification, and echo cancellation. This project focuses on the hardware implementation of an Adaptive Noise Canceller (ANC) on an Artix-7 FPGA.

The core objectives of this project were:

- To implement the Least Mean Squares (LMS) adaptive algorithm.
- To utilize the **Sign-Error** variant of LMS to reduce hardware complexity.
- To employ **Offset Binary Coding (OBC)** for the inner product calculation, replacing standard multipliers with shift-add logic.
- To verify the design using Python Golden Model and hardware Integrated Logic Analyzer (ILA).

## 2 Mathematical Background

### 2.1 Adaptive Noise Cancellation

The system takes two inputs: a primary signal  $d[n]$  containing the desired signal plus noise, and a reference signal  $x[n]$  containing correlated noise. The filter estimates the noise  $y[n]$  and subtracts it from  $d[n]$  to produce the clean error signal  $e[n]$ .

$$e[n] = d[n] - y[n] \quad (1)$$

### 2.2 Sign-Error LMS Algorithm

The standard LMS update rule  $w_k[n + 1] = w_k[n] + \mu e[n]x[n - k]$  requires significant multiplication resources. To optimize for hardware, we implemented the Sign-Error LMS variant:

$$w_k[n + 1] = w_k[n] + \mu \cdot \text{sign}(e[n]) \cdot x[n - k] \quad (2)$$

Here,  $\mu$  denotes the learning rate and  $\mathbf{w} = [w_0, w_1, \dots, w_{N-1}]$  represents the adaptive filter weights. This simplifies the update step to a conditional addition or subtraction of the input vector.

### 2.3 Offset Binary Coding (OBC)

To compute the inner product  $\mathbf{y}[\mathbf{n}] = \mathbf{w}^T \mathbf{x}$  without standard multipliers, we utilized Offset Binary Coding principles in a distributed arithmetic style logic. The inner product is decomposed into bit-slices:

$$y[n] = \sum_{j=0}^{B-1} D_j 2^{-j} + D_{\text{offset}} \quad (3)$$

Where  $D_j$  is a partial sum of weights conditioned on the input bits, and  $D_{\text{offset}}$  is a pre-calculated correction term  $(-\frac{1}{2} \sum w_k)$ . This allows the filter to be implemented using only adders, shifters, and accumulators.

### 3 System Architecture

#### 3.1 Block Diagram

The system consists of an 8-tap FIR filter structure with feedback control for weight updates.



Figure 1: Block Diagram of the 8-Tap Adaptive Sign-Error LMS Filter with OBC

#### 3.2 Signal Definitions & Mathematical Model

The system signals and governing equations are defined as follows:

##### Explanation of Signals:

- $x_{in}[n]$ : Reference noise input.
- $d_{in}[n]$ : Primary input (containing Clean Signal + Noise).
- $x_0 \dots x_7$ : Stored current and past 7 samples of  $x_{in}$ .
- $y_{out}[n]$ : Estimated noise generated by the adaptive filter.
- $e_{out}[n]$ : Cleaned signal (Noise-reduced output).

##### Mathematical Equations:

$$d_{in}[n] = s[n] + p[n]$$

$$y_{out}[n] = \sum_{k=0}^7 w_k[n] x_{in}[n - k]$$

$$e_{out}[n] = d_{in}[n] - y_{out}[n]$$

$$w_k[n + 1] = w_k[n] + \mu \cdot \text{sign}(e_{out}[n]) \cdot x_{in}[n - k]$$

### 3.3 Hardware Implementation Details

The design was implemented in Verilog HDL using Vivado 2022.2.

- **Memory (rom\_data.v):** Stores 1024 samples of noisy sine wave data generated via Python Golden Model. Implemented as a synthesizable ROM.
- **LMS Core (lms\_core.v):** Implements the OBC Pipelined Architecture.
  - *Stage 1 (Offset)*: Calculates the OBC offset term.
  - *Stage 2 (Slicing)*: Determines the weight sum for the current bit position.
  - *Stage 3 (Accumulate)*: Shifts and accumulates partial results into a 40-bit accumulator to prevent overflow.
  - *Stage 4 (Update)*: Updates weights based on the sign of the error.
- **Top Level (top\_lms\_system.v):** Integrates the ROM, LMS Core, and Integrated Logic Analyzer (ILA) for real-time debugging.

### 3.4 Design Constraints

The system targets the Basys 3 FPGA board (Artix-7 XC7A35T).

- **Clock:** A 100 MHz system clock is sourced from the onboard oscillator (Pin W5). This high frequency ensures the multi-cycle OBC calculation completes well within the sampling period of audio-range signals.
- **Reset:** The reset signal is mapped to **Switch 0 (Pin V17)**. This allows for a stable manual reset, which is crucial for the ILA trigger setup. A 'set\_false\_path' constraint is applied to this pin to safely ignore timing analysis on this asynchronous human input.
- **ILA Trigger:** The debug core is configured with 'Capture Control' enabled. This is essential because the OBC core takes multiple clock cycles to process one sample. The trigger condition 'core\_done == 1' ensures the ILA only records valid output samples, filtering out intermediate calculation states.

### 3.5 Simulation Testbench

A behavioral testbench was developed to validate the logic before synthesis.

- **Clock Generation:** Generates a 100 MHz clock (period = 10ns).
- **Duration:** The simulation runs for 1,000,000 ns (1 ms). This duration is sufficient to process thousands of samples, allowing the adaptive filter enough time to converge and visibly reduce the error signal in the waveform viewer.

## 4 Verification and Results

### 4.1 Python Golden Model

A Python script was developed to simulate the Sign-Error LMS algorithm and generate the test vectors. This served as the reference for hardware verification.



Figure 1: Python Golden Model Output: Comparison of Clean Signal vs Filtered Error

### 4.2 Hardware Simulation

Behavioral simulation was performed to verify the timing and logic correctness before synthesis.



Figure 2: Vivado Behavioral Simulation Results showing convergence over time.

## 4.3 Hardware ILA Results

The design was deployed on the Basys 3 FPGA. The ILA waveform captures demonstrate the real-time noise cancellation capability. By using Capture Control, valid output samples were isolated from the calculation latency.



Figure 3: ILA Hardware Capture showing  $d_{in}$  (Noisy Input) vs  $e_{out}$  (Clean Output).

## 4.4 Key Insights

- The adaptive LMS filter does not cancel noise instantly, it requires multiple iterations for the weights to converge and learn the noise characteristics.
- Initially, the error signal  $e_{out}[n]$  contains strong noise because the filter has not yet estimated the noise component accurately.
- As the weights update over time, the filter estimate  $y_{out}[n]$  becomes increasingly similar to the reference noise  $x_{in}[n]$ , indicating successful modelling of the noise.
- After convergence, subtracting  $y_{out}[n]$  from the primary input  $d_{in}[n]$  results in a smooth sinusoidal signal, proving that the clean target signal  $s[n]$  has been recovered.
- The similarity between behavioral simulation results and real FPGA ILA capture confirms the correctness of both the algorithm and the hardware implementation.

## 5 Post-Implementation Reports

After synthesis and implementation, several hardware performance reports were generated in Vivado to verify timing closure, resource efficiency, and power consumption of the proposed LMS–OBC architecture. The main results are summarised below.

## 5.1 Timing Summary

The timing report confirms that all user-defined timing constraints are satisfied. The design meets a clock period requirement of 10 ns (100 MHz) with a Worst Negative Slack (WNS) of 0.555 ns, indicating safe timing margins.



Figure 4: Timing Summary



Figure 5: Critical Path Properties

## 5.2 Critical Path Analysis

The worst-case datapath was traced using Vivado's implemented schematic viewer. The path runs through the OBC accumulator and CARRY4 chain, confirming that the inner product logic is the dominant contributor to total delay, an expected outcome for adaptive FIR filters.



Figure 6: Critical Path Schematic generated from implemented design

## 5.3 Resource Utilization

FPGA resource usage was analyzed post-implementation. The design occupies a small percentage of overall device capacity, showing high efficiency due to the OBC approach requiring no DSP slices.



Figure 7: Device Resource Utilization



Figure 8: Power Summary

## 5.4 Power Report

Power estimation was performed using switching activity extracted from implementation. The design shows very low on-chip power consumption of **0.092 W**, demonstrating suitability for low-power embedded signal-processing applications.

Overall, the timing, power, and resource reports confirm that the LMS filter operates efficiently at 100 MHz, converges reliably in real time, and fits comfortably within the available FPGA resources.

## 6 Conclusion

The project successfully demonstrated the implementation of an adaptive filter using OBC logic on an FPGA. The 8-tap Sign-Error LMS filter effectively removed noise from the input signal. The OBC implementation reduced reliance on multipliers by leveraging shift-add logic, while the pipelined architecture ensured timing closure at 100 MHz.

## 7 Project Repository

- GitHub Link: [https://github.com/dipsy32/Adaptive\\_LMS\\_FPGA](https://github.com/dipsy32/Adaptive_LMS_FPGA)