

# Hybrid FPGA and GPP Implementation of IEEE 802.15.4 Physical Layer

Jeong-O Jeong

Thesis submitted to the Faculty of the  
Virginia Polytechnic Institute and State University  
in partial fulfillment of the requirements for the degree of

Master of Science

in

Electrical Engineering

Carl B. Dietrich

Jeffrey H. Reed

Peter Athanas

July 30, 2012

Blacksburg, Virginia

Keywords: Software Defined Radio, FPGA, IEEE 802.15.4, ZigBee, USRP N210

Copyright 2012, Jeong-O Jeong

# Hybrid FPGA and GPP Implementation of IEEE 802.15.4 Physical Layer

Jeong-O Jeong

## ABSTRACT

In this thesis, two different cases of hybrid IEEE 802.15.4 PHY (Physical Layer) implementation are explored. The first case is an FPGA implementation of IEEE 802.15.4 PHY on the Xilinx Spartan-3A DSP FPGA of USRP N210. All of the signal processing tasks are performed on the FPGA, while less complex MAC (Media Access Control) layer tasks are performed in GNU Radio on the host. The second case is an implementation of a multi-channel IEEE 802.15.4 receiver. A four-channel channelizer is implemented on the external Virtex 5 FPGA, while the IEEE 802.15.4 receiver is implemented in GNU Radio on the host. The first case demonstrates how spare resources in USRP's FPGA can be used to implement signal processing task while still interfacing with GNU Radio. The second case builds a platform on which a combination of GNU Radio and an external FPGA can be used for signal processing and USRP as an RF source. This thesis lays out the groundwork for more complex wireless protocols to be implemented on any combination of USRP's FPGA, an external FPGA, and GNU Radio.

# Acknowledgments

I am grateful to Dr. Carl Dietrich who have guided and supported me throughout my years in graduate school. This work would not have been possible without the help and support from Dr. Dietrich, Dr. Reed, Dr. Athanas, Dr. Gaeddert, and friends from Wireless @ VT and CCM Lab.

# **Dedication**

To my family

# Contents

|          |                                             |          |
|----------|---------------------------------------------|----------|
| <b>1</b> | <b>Introduction</b>                         | <b>1</b> |
| 1.1      | Motivation . . . . .                        | 1        |
| 1.2      | Previous Studies . . . . .                  | 3        |
| 1.3      | Goals . . . . .                             | 5        |
| 1.4      | Accomplishments and Contributions . . . . . | 5        |
| <b>2</b> | <b>Background</b>                           | <b>7</b> |
| 2.1      | Software Defined Radio . . . . .            | 7        |
| 2.1.1    | SDR Platforms . . . . .                     | 8        |
| 2.1.2    | Summary . . . . .                           | 15       |
| 2.2      | ZigBee . . . . .                            | 16       |
| 2.2.1    | Overview . . . . .                          | 16       |

|          |                                                       |           |
|----------|-------------------------------------------------------|-----------|
| 2.2.2    | IEEE 802.15.4 . . . . .                               | 17        |
| 2.3      | Polyphase Filter-Bank Channelizer . . . . .           | 21        |
| 2.4      | USRP N210 . . . . .                                   | 25        |
| 2.4.1    | Signal Processing in FPGA of USRP N210 . . . . .      | 26        |
| <b>3</b> | <b>Methodology and Implementation</b>                 | <b>32</b> |
| 3.1      | IEEE 802.15.4 PHY on FPGA . . . . .                   | 32        |
| 3.1.1    | Configuration . . . . .                               | 34        |
| 3.1.2    | Transmitter . . . . .                                 | 35        |
| 3.1.3    | Receiver . . . . .                                    | 39        |
| 3.2      | Hybrid Multi-Channel IEEE 802.15.4 Receiver . . . . . | 60        |
| 3.2.1    | Configuration . . . . .                               | 60        |
| 3.2.2    | Channelizer . . . . .                                 | 62        |
| 3.2.3    | Energy Detector . . . . .                             | 68        |
| 3.2.4    | Resampler 4/5 . . . . .                               | 68        |
| 3.2.5    | Ethernet interface . . . . .                          | 70        |
| <b>4</b> | <b>Results</b>                                        | <b>74</b> |

|                     |                                                                           |            |
|---------------------|---------------------------------------------------------------------------|------------|
| 4.1                 | Resource Utilization . . . . .                                            | 74         |
| 4.1.1               | Transmitter . . . . .                                                     | 74         |
| 4.1.2               | Receiver . . . . .                                                        | 76         |
| 4.1.3               | Channelizer . . . . .                                                     | 78         |
| 4.2                 | Interoperability . . . . .                                                | 79         |
| 4.3                 | Performance . . . . .                                                     | 79         |
| 4.3.1               | IEEE 802.15.4 Receiver . . . . .                                          | 79         |
| 4.4                 | Radio Characteristics . . . . .                                           | 91         |
| <b>5</b>            | <b>Conclusion and Future Work</b>                                         | <b>94</b>  |
| <b>Bibliography</b> |                                                                           | <b>96</b>  |
| <b>A</b>            | <b>Verilog Source Code for IEEE 802.15.4 Receiver on USRP N210's FPGA</b> | <b>100</b> |
| A.1                 | Receiver Top Level . . . . .                                              | 100        |
| A.2                 | Strober . . . . .                                                         | 107        |
| A.3                 | AGC . . . . .                                                             | 108        |
| A.4                 | Delay-conjugate-multiply . . . . .                                        | 114        |
| A.5                 | Clock Recovery . . . . .                                                  | 117        |

|          |                                                                              |            |
|----------|------------------------------------------------------------------------------|------------|
| A.6      | Symbol Correlation . . . . .                                                 | 125        |
| A.7      | Find Max 16-input . . . . .                                                  | 127        |
| A.8      | Find Max 2-input . . . . .                                                   | 129        |
| A.9      | CRC-16 . . . . .                                                             | 132        |
| A.10     | MAC State Machine . . . . .                                                  | 133        |
| <b>B</b> | <b>Verilog Source Code for IEEE 802.15.4 Transmitter on USRP N210's FPGA</b> | <b>141</b> |
| B.1      | Top Level Transmitter . . . . .                                              | 141        |
| B.2      | Symbols to Chips . . . . .                                                   | 145        |
| B.3      | GNU Radio Packed to Unpacked . . . . .                                       | 147        |
| B.4      | GNU Radio Chunks to Symbols . . . . .                                        | 149        |
| B.5      | Upsampler K=4 . . . . .                                                      | 151        |
| B.6      | Half-Sine Pulse Shaper . . . . .                                             | 153        |
| B.7      | Delay Quadrature . . . . .                                                   | 156        |
| <b>C</b> | <b>Verilog Source Code for Multi-channel IEEE 802.15.4 Receiver</b>          | <b>158</b> |
| C.1      | Top Level Multi-Channel Receiver . . . . .                                   | 158        |
| C.2      | 1:4 Commutator . . . . .                                                     | 161        |

|          |                                             |            |
|----------|---------------------------------------------|------------|
| C.3      | Four-Channel Channelizer . . . . .          | 163        |
| C.4      | Polyphase Filter Bank . . . . .             | 166        |
| C.5      | Four-point FFT . . . . .                    | 172        |
| C.6      | Complex Mixer . . . . .                     | 174        |
| C.7      | Energy Detector . . . . .                   | 176        |
| C.8      | Resampler 4/5 . . . . .                     | 180        |
| C.9      | Accumulator and Overflow Detector . . . . . | 183        |
| <b>D</b> | <b>C++ Source Code for GNU Radio Blocks</b> | <b>186</b> |
| D.1      | GNU Radio Transmitter .h and .cc . . . . .  | 186        |
| D.2      | GNU Radio Receiver .h and .cc . . . . .     | 191        |

# List of Abbreviations

|         |                                                   |
|---------|---------------------------------------------------|
| ASIC    | Application-Specific Integrated Circuit           |
| ASK     | Amplitude-Shift Keying                            |
| CIC     | Cascaded Integrator-Comb                          |
| CLB     | Configurable Logic Block                          |
| CRC     | Cyclic Redundancy Check                           |
| CSMA-CA | Carrier Sense Multiple Access Collision Avoidance |
| DAC     | Digital-to-Analog Converter                       |
| DFT     | Discrete Fourier Transform                        |
| DSP     | Digital Signal Processor                          |
| DSSS    | Direct Sequence Spread Spectrum                   |
| FFT     | Fast Fourier Transform                            |

|        |                                      |
|--------|--------------------------------------|
| FIR    | Finite Impulse Response              |
| FPGA   | Field-Programmable Gate Array        |
| GPP    | General Purpose Processor            |
| HDL    | Hardware Description Language        |
| ISM    | Industrial, Scientific and Medical   |
| LUT    | Look-Up Table                        |
| MAC    | Media Access Control                 |
| MAC    | multiply-accumulate                  |
| MSK    | Minimum-Shift Keying                 |
| O-QPSK | Offset Quadrature Phase-Shift Keying |
| PHR    | PHY Header                           |
| PHY    | Physical Layer                       |
| PN     | Pseudo-random Noise                  |
| PPDU   | PHY Protocol Data Unit               |
| PSDU   | Physical Layer Service Data Unit     |
| RTL    | Register-Transfer Level              |

|      |                                        |
|------|----------------------------------------|
| SDR  | Software Defined Radio                 |
| SFD  | Start of Frame Delimiter               |
| SHR  | Synchronization Header                 |
| SOC  | System on Chip                         |
| SVD  | Singular Value Decomposition           |
| UDP  | User Datagram Protocol                 |
| USRP | Universal Software Radio Peripheral    |
| VITA | VMEbus International Trade Association |
| VRT  | VITA Radio Transport                   |
| WPAN | Wireless Personal Area Network         |

# List of Figures

|      |                                                       |    |
|------|-------------------------------------------------------|----|
| 2.1  | Ideal Software Defined Radio . . . . .                | 8  |
| 2.2  | IEEE 802.15.4 Data Frame [1] . . . . .                | 18 |
| 2.3  | O-QPSK Chip Offset [1] . . . . .                      | 21 |
| 2.4  | Polyphase Channelizer . . . . .                       | 22 |
| 2.5  | Single Channel M-to-1 Resampler . . . . .             | 23 |
| 2.6  | Input to Eight-Channel Channelizer . . . . .          | 24 |
| 2.7  | Output of Each Channel . . . . .                      | 25 |
| 2.8  | SOC in USRP's FPGA . . . . .                          | 27 |
| 2.9  | Transmit Signal Processing Chain . . . . .            | 27 |
| 2.10 | Time-Domain Plot of First Half-Band Filter . . . . .  | 28 |
| 2.11 | Time-Domain Plot of Second Half-Band Filter . . . . . | 29 |
| 2.12 | Frequency Response of Two Halfband Filters . . . . .  | 29 |

|      |                                                                       |    |
|------|-----------------------------------------------------------------------|----|
| 2.13 | CIC Interpolator in USRP N210 . . . . .                               | 30 |
| 2.14 | Frequency Response of CIC Interpolator . . . . .                      | 30 |
| 2.15 | Receiver Signal Processing Chain . . . . .                            | 31 |
| 3.1  | Two Paths Explored in FPGA Implementation . . . . .                   | 33 |
| 3.2  | Overall Setup of IEEE 802.15.4 PHY on FPGA . . . . .                  | 34 |
| 3.3  | Overall Transmitter Chain . . . . .                                   | 35 |
| 3.4  | Simulation of O-QPSK at Baseband . . . . .                            | 35 |
| 3.5  | Transmitter Chain . . . . .                                           | 36 |
| 3.6  | ModelSim simulation of transmitted signal . . . . .                   | 38 |
| 3.7  | Transmitter module inside USRP's FPGA . . . . .                       | 39 |
| 3.8  | Chipscope showing interpolated in-phase waveform of O-QPSK . . . . .  | 40 |
| 3.9  | Overall receiver algorithm . . . . .                                  | 40 |
| 3.10 | AGC Structure . . . . .                                               | 41 |
| 3.11 | Delay-conjugate-multiply . . . . .                                    | 42 |
| 3.12 | Output of delay-conjugate-multiply . . . . .                          | 42 |
| 3.13 | Clock Recovery Block Structure . . . . .                              | 43 |
| 3.14 | Comparison of Sampled Chips With and Without Clock Recovery . . . . . | 43 |

|                                                                                   |    |
|-----------------------------------------------------------------------------------|----|
| 3.15 Preamble and SFD correlations . . . . .                                      | 44 |
| 3.16 BER Simulation of Soft vs. Hard Correlations . . . . .                       | 47 |
| 3.17 Packet Detection Rate Simulation of Soft vs. Hard Correlations . . . . .     | 47 |
| 3.18 Output of delay-conjugate-multiply with frequency offset of 20 KHz . . . . . | 48 |
| 3.19 Frequency Offset Estimation . . . . .                                        | 51 |
| 3.20 Delay-conjugate-multiply Block in Simulink . . . . .                         | 51 |
| 3.21 Complex Mulitply Block in Simulink . . . . .                                 | 52 |
| 3.22 Black Box wrapper for a simple MAC processing . . . . .                      | 52 |
| 3.23 Output of CORDIC atan without AGC . . . . .                                  | 53 |
| 3.24 Output of CORDIC atan with AGC . . . . .                                     | 54 |
| 3.25 ModelSim Simulation Showing Preamble and SFD Corrleations . . . . .          | 55 |
| 3.26 CRC-16 . . . . .                                                             | 55 |
| 3.27 State machine for MAC layer . . . . .                                        | 56 |
| 3.28 Receiver module inside USRP's FPGA . . . . .                                 | 57 |
| 3.29 GNU Radio companion view of ZigBee RX . . . . .                              | 59 |
| 3.30 Modified UHD USRP Source Block . . . . .                                     | 59 |
| 3.31 Control Chain in Modifying Registers Inside FPGA from GNU Radio . . . . .    | 61 |

|                                                                                    |    |
|------------------------------------------------------------------------------------|----|
| 3.32 System Configuration of Hybrid Multi-Channel IEEE 802.15.4 Receiver . . . . . | 62 |
| 3.33 Channelizer 4:1 . . . . .                                                     | 63 |
| 3.34 Prototype Lowpass Filter . . . . .                                            | 64 |
| 3.35 Input to Four-Channel Channelizer . . . . .                                   | 64 |
| 3.36 Output of Four-Channel Channelizer . . . . .                                  | 65 |
| 3.37 Filter with Real Coefficients . . . . .                                       | 65 |
| 3.38 Filter with Complex Coefficients . . . . .                                    | 65 |
| 3.39 Energy Detector . . . . .                                                     | 68 |
| 3.40 Resampler 4/5 . . . . .                                                       | 69 |
| 3.41 Resampler 4/5 Simulation . . . . .                                            | 69 |
| 3.42 Flow of Data and Control Packets . . . . .                                    | 70 |
| 3.43 Ethernet interface in XUPV5 . . . . .                                         | 72 |
| 3.44 Frames at Different OSI Layers . . . . .                                      | 72 |
| 4.1 Message from FPGA received by Xbee console . . . . .                           | 80 |
| 4.2 Chip Error Rate Simulation . . . . .                                           | 81 |
| 4.3 BER Simulation . . . . .                                                       | 82 |
| 4.4 Received Packets in Time Domain . . . . .                                      | 83 |

|      |                                                                           |    |
|------|---------------------------------------------------------------------------|----|
| 4.5  | Percentages of Packets Detected . . . . .                                 | 86 |
| 4.6  | Packet Error Rate . . . . .                                               | 86 |
| 4.7  | Bit Error Rate . . . . .                                                  | 87 |
| 4.8  | Output of Inverse Tangent with Correct Sign . . . . .                     | 88 |
| 4.9  | Output of Inverse Tangent with Mismatched Sign . . . . .                  | 89 |
| 4.10 | Output of Inverse Tangent with Increased CORDIC Iterations . . . . .      | 89 |
| 4.11 | Packet Detection Rate with Varying Word Length . . . . .                  | 90 |
| 4.12 | Spectral Mask of IEEE 802.15.4 transmitted using USRP N210 FPGA . . . . . | 92 |
| 4.13 | Center Frequency of IEEE 802.15.4 Signal from USRP N210 FPGA . . . . .    | 93 |
| 4.14 | Occupied Bandwidth . . . . .                                              | 93 |
| 4.15 | O-QPSK Constellation . . . . .                                            | 93 |

# List of Tables

|     |                                                                                               |    |
|-----|-----------------------------------------------------------------------------------------------|----|
| 2.1 | ZigBee Channels in 2.4 GHz ISM Band . . . . .                                                 | 17 |
| 2.2 | Preamble Field [1] . . . . .                                                                  | 19 |
| 2.3 | SFD Field [1] . . . . .                                                                       | 19 |
| 2.4 | Symbol to Chip Mapping [1] . . . . .                                                          | 20 |
| 2.5 | USRP N210 Specification . . . . .                                                             | 25 |
| 2.6 | FPGAs of USRP N210 (3A3400) and USRP2 (3S2000) . . . . .                                      | 26 |
| 3.1 | Output data format for USRP . . . . .                                                         | 58 |
| 4.1 | Device Utilization Summary of ZigBee TX core on Xilinx Spartan 3-2000 . .                     | 75 |
| 4.2 | Device Utilization Summary of ZigBee TX core and USRP core on Xilinx Spartan 3-2000 . . . . . | 75 |
| 4.3 | Device Utilization Summary of ZigBee RX core on Xilinx Spartan 3A-DSP3400                     | 76 |

|                                                                                                       |    |
|-------------------------------------------------------------------------------------------------------|----|
| 4.4 Device Utilization Summary of ZigBee RX core and USRP core on Xilinx Spartan 3A-DSP3400 . . . . . | 76 |
| 4.5 Device Utilization Summary of Karve's ZigBee RX core (14-bit) on XUPV5 [2] . . . . .              | 77 |
| 4.6 Post-Synthesis Timing Summary . . . . .                                                           | 77 |
| 4.7 Post-PAR Timing Summary . . . . .                                                                 | 77 |
| 4.8 Device Utilization Summary of Channelizer on Xilinx Virtex 5 LX110T . . . . .                     | 78 |
| 4.9 Device Utilization Summary of Channelizer with Ethernet MAC on Xilinx Virtex 5 LX110T . . . . .   | 78 |
| 4.10 Interoperability Chart . . . . .                                                                 | 79 |

# **Chapter 1**

## **Introduction**

### **1.1 Motivation**

Software Defined Radio (SDR) attempts to leverage the flexibility of software or reconfigurable hardware to implement flexible radios that can easily switch to different waveforms and standards. This is achieved by migrating most of the signal processing traditionally implemented on hardware onto software. By implementing the signal processing region with software, the matter of reconfiguring the system is simply loading a different image, which can be done even in real-time. The most common platforms on which SDR is implemented are GPP (general purpose processor), DSP (digital signal processor), and FPGA (field-programmable gate array). Each of these platforms has its own advantages and disadvantages. Although GPPs are relatively easy to program, test, and verify, they

are often not well-suited for parallel, signal-processing intensive, or real-time constrained operations. DSPs are optimized for certain signal processing tasks, but they are harder to program than GPPs and not optimized for parallel computations. FPGAs are well-suited for parallel and signal-processing intensive computations, but they are much harder to program and verify. However, many high-level tools such as Simulink and ImpulseC are available to facilitate the programming of FPGA. These tools allow engineers to describe the algorithm at high-level that can be automatically converted to HDL (Hardware Description Language) to be implemented on the FPGA.

As mentioned, GPPs and DSPs are not the best platforms on which signal-processing intensive algorithms can be implemented. Processing power necessary for even one of the simplest wireless protocols such as IEEE 802.15.4 can be too demanding for GPP unless it is a very high-performance GPP. Even with a high-performance GPP, signal processing required for IEEE 802.15.4 consumes most of CPU cycles. It can be seen that wide-band wireless protocols such as IEEE 802.11a, which has 20 MHz of bandwidth and complex modulation scheme such as 52-subcarrier OFDM will be too demanding for GPPs. However, FPGAs, with their ability to perform computations in parallel, support high throughput and high sampling rate that GPPs or DSPs are not able to achieve. ASICs are a popular platform for implementing high-performance protocols, but they have very low re-configurability. With its software-like configurability, FPGAs bring significant reduction in NRE (non-recurring engineering) costs. Once a prototype is built and fault is found in the design, FPGAs can be reconfigured with a modified bit-stream and can

be tested again, whereas ASICs need to be re-spun at a very high-cost, often in millions of dollars [3]. Because FPGAs are high-performance like ASICs and flexible like GPPs and DSPs, they are an ideal platform on which prototype wireless protocols can be built and tested. Additionally, the partial reconfiguration ability of FPGA allows it to reconfigure itself in real-time, which unlocks the SDR's promises of concurrent multi-protocol operation.

## 1.2 Previous Studies

Many implementations of SDR have been done on FPGAs. At Virginia Tech, Charles Irick from Configurable Computing Lab developed an SDR framework which improves upon the GNU Radio framework by allowing an auxiliary Virtex-5 FPGA. In the enhanced GNU Radio framework, a software block representing the auxiliary FPGA controls the dataflow within the software environment. A single high-level software description of GNU Radio and FPGA leaves the mixture of software blocks and the FPGA block transparent to the programmer [4]. More recently, Richard Stroop from the same lab has extended Irick's work to develop a framework called GReasy (GNU Radio Easy). Whereas Irick's work only used a USRP2 as RF front-end and a single auxiliary FPGA, GReasy has successfully worked with not only USRP2 but also a 3.6GSPS ADC as an RF front-end. It has also interfaced with four auxiliary FPGAs for distributed FPGA processing. More importantly, the framework allows rapid reconfiguration of the FPGA by placing and

routing pre-compiled modules. Thus, GReasy presents the user interface where FPGA blocks can be as easily placed and rearranged as software blocks, and the underlying mechanism for compiling the FPGA remains transparent to the user.

In Karve's Master's thesis, the enhanced GNURadio framework developed by Irick was used to implement a ZigBee receiver. The O-QPSK demodulator was implemented on the FPGA. An off-the-shelf ZigBee-compliant solution called XBee was used to transmit packets, which were then received by a ZigBee receiver developed with the framework for verification of interoperability [2].

Implementations of SDR based on other platforms such as GPPs and DSPs are also available. An SDR platform known as SORA is developed on commodity PC architectures. The platform consists of a radio-front end, a radio control board, and a PCIe bus to the CPU cores. Researchers have developed full IEEE 802.11 a/b/g PHY and MAC layers on the SORA platform and successfully interoperated with commercial IEEE 802.11 a/b/g devices. It was possible to implement such a complex standard by introducing optimizations such as replacing complex computations with extensive use of LUTs (Look-Up Tables) in L2 cache and use of SIMD (Single Instruction Multiple Data) instruction sets [5].

Other SDR platforms are based on custom reconfigurable hardware. The AsAP2 (Asynchronous Array of Simple Processors) platform from University of California Davis is composed of an array of 164 simple processors and Viterbi and FFT accelerators. Each processor has own instruction and data memory as well as arithmetic and logic unit. Using this platform, they were able to implement a complete IEEE 802.11a baseband receiver

[6].

### 1.3 Goals

The following list outlines the goals for the thesis.

- To implement IEEE 802.15.4 PHY on Xilinx Spartan 3A-DSP of USRP N210
- To implement a channelizer on an external Virtex 5 FPGA
- To interface GNU Radio and the external Virtex 5 FPGA for hybrid implementations
- To compare performance between GNU Radio and FPGA implementations of IEEE 802.15.4 PHY
- To lay the groundwork for more complex wireless communication protocols and applications to be implemented on FPGA with USRP N210 as an RF front end
- To develop open-source IP cores to be freely used in other SDR projects

### 1.4 Accomplishments and Contributions

The main accomplishment of this thesis is the implementation of the IEEE 802.15.4 PHY on the FPGA. The FPGA implementation was able to successfully inter-operate with a commercially available, standard-compliant ZigBee module as well as an open-source

GNU Radio implementation. Secondly, the interfacing between USRP N210, an external FPGA and GNU Radio with UHD (Universal Hardware Driver) has been implemented. This enables the hybrid implementation of a waveform where a more complex signal processing task such as channelization is performed on the external FPGA, while a simpler task such as demodulation of an IEEE 802.15.4 packet is performed on GNU Radio. Finally, numerous FPGA blocks developed in the course of this thesis work have been made available to Configurable Computing Lab at Virginia Tech for the GReasy project.

# **Chapter 2**

## **Background**

### **2.1 Software Defined Radio**

The term Software Defined Radio, coined by Joe Mitola in 1991, describes a radio whose physical layer is implemented mostly in software. It is a radio “whose physical layer behavior can be significantly altered through changes to its software” [7]. Common platforms for SDR include General Purpose Processors (GPP), Digital Signal Processors (DSP), Field Programmable Gate Arrays (FPGA), and recently even Graphics Processing Unit (GPU). Traditionally, radios are implemented in hardware, which makes it difficult to modify or upgrade after deployment. However, the software-defined nature of SDRs allows ease of modification and flexibility not found in hardware-defined radios. Because of its flexibility, SDRs can be prepared for “proliferation of wireless standards in the fu-



Figure 2.1: Ideal Software Defined Radio

ture” [8]. Due to the flexible and reconfigurable nature of software, SDRs can support multiple air interfaces and multiple modulation schemes as needed.

In order to reach this stage of flexibility, a radio platform close to an ideal SDR model shown in 2.1 is required. An ideal SDR radio would only consist of wide-band antennae that cover the entire RF spectrum, ADC and DAC fast enough to convert that spectrum to digital and analog domains, and processing elements that can process the wideband data. It may include an amplifier for better SNR. All the filtering, up-conversion, and down-conversion are performed in the software-defined processing element. This enables an ideal SDR that is not limited to any band or any modulation scheme.

### 2.1.1 SDR Platforms

The three most popular SDR platforms are GPP, DSP, and FPGA. Each has its own unique advantages and disadvantages.

### GPP (General Purpose Processor)

GPP is a processing unit used for a variety of purposes such as fixed and floating point arithmetic, memory interface, and general input and output. It supports multiple high-level programming languages, and therefore it is the most flexible and easiest to program. However, GPPs generally do not perform computation in parallel and are not optimized for arithmetic operations. For example, the multiply-add operation, the most common operation in signal processing, is not supported in hardware in GPPs.

However, modern microprocessors have started to employ different ways of parallelism to improve DSP and graphics operations. One example is hyperthreading in Intel microprocessors. Hyperthreading allows a single core to act as two logical cores that can execute threads in parallel. Each logical core has its own processor architectural state and shares the execution resources of the physical core. This results in a performance gain of 30 percent when executing multithreaded applications compared to a processor without hyperthreading [9].

Another example of parallelism is the SIMD (single instruction multiple data) technology, known as SSE (Streaming SIMD Extensions) in Intel and 3DNow! in AMD processors. It allows a single instruction to be applied on multiple data simultaneously. This is more efficient than the traditional SISD (single instruction single data) where a single instruction is executed on single data at a time. FFTW, the fastest software implementation of the FFT algorithm, takes advantage of SIMD to achieve its title [10].

The most well-known and established SDR platform for GPP is an open-source software called GNU Radio. It is a software development platform that enables researchers to build software radios from a library of signal processing blocks. While the signal processing blocks are written in C++, the glue logic that connects the blocks is written in Python. Users can also develop their own custom signal processing blocks for their application.

Like FFTW, GNU Radio also takes advantage of SIMD with its VOLK (vector-optimized library of kernels) library. The library consists of vector operations that provide much improvement in performance. For example, a function called *volk\_32fc\_multiply\_aligned16(c, a, b, N)* can perform vector multiplication of two vectors with N items. Without SIMD, the operation would be performed using a standard for-loop that multiplies each vector element, but SIMD allows simultaneous multiplication of the vector elements. Rondeau reports ten percent improvement in speed when using the function in the FFT filters where large number of multiplies may be required [11].

An example of a wireless standard implemented in GNU Radio is the ZigBee PHY implementation by Thomas Schmid from UCLA [12]. The signal processing is done on the general purpose processor with the USRP (Universal Software Radio Peripheral) as the RF front-end. The implementation was verified with a commercially available ZigBee radio compliant with the IEEE 802.15.4 standard. Because of issues such as high latency in GNU Radio, the full protocol stack could not be implemented, but only the physical layer was implemented. The physical layer implemented in this project was 2450 MHz O-QPSK PHY.

Schmid reports that even the high-performance machine they used, a dual Pentium IV, 2.8 GHz with hyperthreading and 4 GB of RAM, could not decode a constant stream of data from the USRP. To measure its performance, the throughput of GNU Radio implementation was measured. At 45 bytes per message, including both the MAC layer payload of 27 bytes and extra bytes at the PHY layer, the GNU Radio implementation could decode slightly above 200 messages per second.

### DSP (Digital Signal Processor)

DSP is a microprocessor optimized for mathematical operations, specifically multiply-accumulate (MAC) functions. Unlike GPPs which use Von Nuemann architecture, DSPs use the Harvard architecture. While the Von Nuemann architecture provides a single bus to fetch both instruction and data from memory, the Harvard architecture provides separate buses for instructions and data. This allows instructions and data to be accessed at the same time for faster computation.

Modern DSPs also employ the VLIW (very long instruction word) architecture. VLIW allows the processor to run multiple independent instructions in a single clock cycle, thus increasing parallelism. DSPs with VLIW provides a performance gain of 1.8 to 2.8 times over traditional DSPs without VLIW [13].

The most distinguishing feature of DSPs is the hardware MAC unit. It performs multiply-accumulate operation which forms the basis of essential DSP operations such as FIR fil-

tering, correlations, and FFTs. With specialized MAC blocks, DSPs can perform multiply-accumulate in one or two clock cycles, while it can take multiple clock cycles in GPPs [14].

However, even with dedicated MAC units, data-intensive processes such as Viterbi encoding and decoding are difficult to accelerate. Some DSP chips provide dedicated hardware-based co-processors such as Viterbi-decoder and turbo decoder, but they cannot be customized to specific design needs [15].

Because of special instruction sets and specialized architectural features, DSPs are usually programmed in low level languages such as assembly and C.

### FPGA (Field-Programmable Gate Array)

FPGA is a reconfigurable hardware consisting of configurable logic blocks (CLBs) and macro blocks connected via programmable interconnects. Xilinx Spartan-3A DSP FPGA targeted for this thesis consists of four types of macro blocks in addition to CLBs: XtremeDSP DSP48A Slice, Block RAM, Input/Output Blocks, and Digital Clock Manager.

CLBs usually consist of look-up tables, flip-flops, and multiplexers, but they can vary among different FPGA devices. Xilinx Spartan-3A DSP FPGA has four slices in each CLB. A slice consists of two LUTs (Look-Up Tables), two flip-flops, two multiplexers, and a carry-chain. Although CLBs can be used to implement multipliers and adders, Spartan-3A FPGA provides XtremeDSP DSP48A slices that are dedicated for 18-bit by 18-

bit multiplication and 48-bit accumulation for MAC (multiply-accumulate) operations. The DSP48A slices are ideal for implementing FIR filters which require adder, multiplier, and storage elements. They can be highly pipelined to provide maximum clock frequencies of 250 MHz.

Virtex 5 LX110T, the other FPGA targeted for this thesis, is a more advanced FPGA than Spartan 3A-DSP. Each slice contains four LUTs, four flipflops, multiplexers, and carry logic. The FPGA also has DSP48E slices which supports 25-bit by 18-bit multiplication, 48-bit adder, and accumulator. With maximum pipelining the DSP48E slices can operate at the maximum frequency of 550 MHz [16].

$$y[n] = \sum_{i=0}^{N-1} x[n-i]h[i] \quad (2.1)$$

When performing FIR (Finite Impulse Response) filtering shown in 2.1, DSP48E slices can be arranged in multiple ways to trade-off speed and resource usage. The Single-Multiplier MACC FIR Filter structure uses the least number of DSP48 slices but has the lowest throughput. This structure performs multiply and accumulate on a pair of input sample and filter coefficient at a time. Thus, it takes N clock cycles to produce a single output for a N-tap filter. The Parallel FIR Filter structure uses N DSP48 slices to perform simultaneous multiplication of N coefficients and N respective input samples and accumulate the results. Thus, it takes a single clock cycle to produce a new output sample. Therefore, the Parallel FIR structure has the highest throughput but uses the most num-

ber of DSP48 slices. The Semi-Parallel FIR Filter structure forms a hybrid between the two structures to obtain a higher throughput than the Single-Multiplier structure but uses less slices than the Parallel structure.

In addition to FIR filtering, other signal processing operations lend themselves well to FPGA implementation. For example, taking singular value decomposition (SVD), shown in 2.2, of a matrix is a common operation in MIMO, radar, or image processing applications.

$$M = U\Sigma V^* \quad (2.2)$$

It is a very computationally intensive process that requires a large number of clock cycles in sequential processors, but the algorithm to produce SVD can be parallelized to be sped up greatly in FPGAs. One well-known systolic array implementation of SVD by Brent can compute SVD in  $O(n \log n)$  time using  $O(n^2)$  processing elements [17]. AccelChip reports a factor of 50 times increase in speed up with FPGA fixed-point implementation of SVD of an 8x8 matrix compared to floating-point implementation done in a DSP chip TI TMS320C67x [18].

Another advantage of FPGA over other platforms is that it is easy to trade-off between speed and resource usage since the hardware is highly configurable. If the design takes up too much resource, the design can be easily modified to be more resource-efficient at the penalty of decrease in speed. On the other hand, if there is plenty of resource available,

the design can be fully parallelized to achieve maximum speed.

Since it is easily reconfigurable, FPGA can also be a great platform for prototyping a final ASIC design. Once the design is finalized and HDL (Hardware Description Language) is written for FPGA, it is often easy to port to ASIC for final production.

### **2.1.2 Summary**

The three most common SDR platforms have been reviewed. Each platform has its own advantages and disadvantages. GPPs are the easiest to program, test, and verify, but they are often too slow to perform complex computations at high sample rates. With dedicated MAC units, DSPs are better suited for signal processing than GPPs, but they too reach the performance limit as the complexity of the application increases. FPGAs can meet performance requirement needed for high bandwidth applications. However, they are often much harder to program, test, and verify than the other two platforms. Fortunately, recent developments in high-level tools such as Simulink and ImpulseC enable algorithm developers to program FPGAs more easily. Also, tools such as AutoESL's AutoPilot and Synopsys Symphony C Compiler make it easy for DSP software engineers to convert their high-level code in C/C++ to RTL (Register-Transfer-Level). These tools have reportedly been able to achieve comparable resource utilization as manually written RTL code [19]. FPGAs may also be almost as energy efficient as DSPs when considering highly complex signal processing applications. Since a large portion of DSP's circuitry is dedicated to

data transfer, the overall energy consumption per computation of FPGA may be better than that of DSP for some applications [20]. A hybrid approach where complex signal processing is partitioned to FPGA while control-type operation is partitioned to microprocessors maybe the best approach. In this thesis, such hybrid implementations of SDR will be explored.

## 2.2 ZigBee

### 2.2.1 Overview

ZigBee is a LR-WPAN (Low Rate-Wireless Personal Area Network) standard commonly used for home control applications and wireless sensor networks. The ZigBee standard defines the application, security, and network layers of the protocol stack, while the physical (PHY) and medium access control (MAC) layers of the standard are based on IEEE 802.15.4. The standard specifies the maximum data rate to be 250kbps and the maximum range to be 100m. Because of its low data rate and simple architecture, it is low cost and consumes less power compared to other WPAN protocols such as Bluetooth. ZigBee devices can last as long as five years on a pair of AA batteries [21].

The network layer of ZigBee supports different types of network topologies such as star, tree, and mesh networks. It supports ad-hoc networking where routes are automatically discovered as new nodes join the network. It has self-healing ability which allows nodes

Table 2.1: ZigBee Channels in 2.4 GHz ISM Band

| Channels                 | 11   | 12   | 13   | 14   | 15   | 16   | 17   | 18   |
|--------------------------|------|------|------|------|------|------|------|------|
| Center Frequencies (MHz) | 2405 | 2410 | 2415 | 2420 | 2425 | 2430 | 2435 | 2440 |
| Channels                 | 19   | 20   | 21   | 22   | 23   | 24   | 25   | 26   |
| Center Frequencies (MHz) | 2445 | 2450 | 2455 | 2460 | 2465 | 2470 | 2475 | 2480 |

to discover new routes if intermediary nodes in the route fail. The MAC layer uses CSMA-CA (Carrier Sense Multiple Access Collision Avoidance) to avoid packet collisions among multiple nodes.

This thesis implements the PHY layer as specified in IEEE 802.15.4 and a MAC-like control layer, but leaves out the higher MAC, network, and application layers.

## 2.2.2 IEEE 802.15.4

The IEEE 802.15 is a working group formed to define Wireless Personal Area Network (WPAN) standards. There are seven task groups in the working group. The ZigBee protocol is based on the IEEE 802.15.4 task group which defines low rate WPAN of 20 kbps, 40 kbps, and 250 kbps. The standard developed by the IEEE 802.15.4 task group allows for 16 channels in the 2.4 GHz ISM band, 10 channels in the 915 MHz band, and one channel in the 868MHz band. The center frequencies used for the 2.4 GHz band are shown in Table 2.1. The primary target applications for the standard are in home control, sensors, interactive toys, smart badges, and remote controls. The standard serves as PHY and MAC layers for the ZigBee standard.



Figure 2.2: IEEE 802.15.4 Data Frame [1]

### IEEE 802.15.4 Frame Structure

There are four different types of frames defined at the MAC sub-layer. They include beacon frame, data frame, acknowledgment frame, and MAC command frame. Beacon frames are used to synchronize the nodes in the network for slotted CSMA-CA. Data frames are used for all transfers of data between nodes. Acknowledgment frames are used for confirmation of successful reception of data or MAC command frames. If the transmitter does not receive the acknowledgment frame, it will retransmit. MAC command frames are used for transmitting low-level MAC commands. The data frame shown in Figure 2.2 was implemented in this thesis.

The MAC sub-layer frame is embedded into the PSDU (Physical Layer Service Data Unit) of the PHY layer. The PSDU is prefixed with SHR (synchronization header) and PHR (PHY header) to be transmitted over the air. Within SHR, Preamble Sequence is used by the receiver to detect and synchronize to the received packets. The preamble is simply all zero bits for all PHYs except for the ASK PHY. For the 2.4 GHz O-QPSK , the preamble is 4 octets of zeros, as shown in Table 2.2, equivalent to 8 symbols, and it is 128 us long [1].

The SFD (Start of Frame Delimiter) field indicates the start of PHR. For 2.4 GHz O-QPSK,

Table 2.2: Preamble Field [1]

| Bits 0: | 1  | 2  | 3  | 4  | 5  | 6  | 7  | 8  | 9  | 10 | 11 | 12 | 13 | 14 | 15 |
|---------|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|
| 0       | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  |
| 16      | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 |
| 0       | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  |

Table 2.3: SFD Field [1]

| Bits 0: | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---------|---|---|---|---|---|---|---|
| 1       | 1 | 1 | 0 | 0 | 1 | 0 | 1 |

the SFD field is 8 bits long as shown in Table 2.3.

The Frame Length field is a 7 bits long field indicating the total number of octets in the PSDU. The valid frame length values are from 9 to  $aMaxPHYPacketSize$  of 127. The Frame Length field is followed by a reserved bit.

## 2450 MHz PHY specifications

The 2450 MHz PHY specification of IEEE 802.15.4 supports data rate of up to 250kb/s. It uses DS-SS (Direct Sequence Spread Spectrum) with O-QPSK (Offset-QPSK) for modulation. Each O-QPSK symbol is one of 16 quasi-orthogonal pseudo-random noise (PN) sequences. Each symbol is 32-chip long and corresponds to one of 16 possible combinations of four information bits. The mapping of 4-bit symbol to 32-chip sequence is shown in Figure 2.4.

Table 2.4: Symbol to Chip Mapping [1]

| Data Symbol ( $b_0 b_1 \dots b_3$ ) | Chip values ( $c_0 c_1 \dots c_{31}$ )                                |
|-------------------------------------|-----------------------------------------------------------------------|
| 0 0 0 0                             | 1 1 0 1 1 0 0 1 1 0 0 0 0 1 1 0 1 0 1 0 0 1 0 0 0 1 0 1 1 0           |
| 1 0 0 0                             | 1 1 1 0 1 1 0 1 1 0 0 1 1 0 0 0 0 1 1 0 1 0 1 0 0 1 0 0 0 1 0         |
| 0 1 0 0                             | 0 0 1 0 1 1 1 0 1 1 0 1 1 0 0 1 1 1 0 0 0 0 1 1 0 1 0 1 0 0 1 0       |
| 1 1 0 0                             | 0 0 1 0 0 0 1 0 1 1 0 1 1 0 1 1 0 0 1 1 1 0 0 0 0 1 1 0 1 0 1 0 1     |
| 0 0 1 0                             | 0 1 0 1 0 0 1 0 0 0 1 0 1 1 0 1 1 0 1 1 0 0 1 1 1 0 0 0 0 1 1         |
| 1 0 1 0                             | 0 0 1 1 0 1 0 1 0 0 1 0 0 0 1 0 1 1 0 1 1 0 1 1 0 0 1 1 1 0 0         |
| 0 1 1 0                             | 1 1 0 0 0 0 1 1 0 1 0 1 0 0 1 0 0 0 1 0 1 1 1 0 1 1 0 1 1 0 0 1       |
| 1 1 1 0                             | 1 0 0 1 1 1 0 0 0 0 1 1 0 1 0 0 1 0 0 0 1 0 1 1 1 0 1 1 0 1 0 1       |
| 0 0 0 1                             | 1 0 0 0 1 1 0 0 1 0 0 1 0 1 1 0 0 0 0 0 1 1 1 0 1 1 1 0 1 1           |
| 1 0 0 1                             | 1 0 1 1 1 0 0 0 1 1 0 0 1 0 0 1 0 1 1 0 0 0 0 0 1 1 1 0 1 1 1         |
| 0 1 0 1                             | 0 1 1 1 1 0 1 1 0 0 0 1 1 0 0 1 0 0 1 0 1 1 0 0 0 0 0 1 1 1           |
| 1 1 0 1                             | 0 1 1 1 0 1 1 1 0 1 1 0 0 0 1 1 0 0 1 0 0 1 0 1 1 0 0 0 0 0 1 1 0     |
| 0 0 1 1                             | 0 0 0 0 0 1 1 0 1 1 1 0 1 1 1 0 0 0 1 1 0 0 1 0 0 1 0 0 1 0 1 1 0     |
| 1 0 1 1                             | 0 1 1 0 0 0 0 0 1 1 1 0 1 1 1 0 1 1 1 0 0 0 1 1 0 0 1 0 0 1 0 0 1     |
| 0 1 1 1                             | 1 0 0 1 0 1 1 0 0 0 0 0 1 1 0 1 1 1 1 0 1 1 1 0 0 0 1 1 0 0 0 1 1 0 0 |
| 1 1 1 1                             | 1 1 0 0 1 0 0 1 0 1 1 0 0 0 0 0 1 1 1 0 1 1 1 1 0 1 1 1 0 0 0 1 1 0 0 |



Figure 2.3: O-QPSK Chip Offset [1]

Chip sequences are modulated onto the carrier using O-QPSK with half-sine pulse shaping. With the half-sine pulse shaping, the O-QPSK modulation can be treated as MSK (Minimum-Shift Keying) with a modulation index of  $h=0.5$ , which allows for a simple MSK demodulator to be used at the receiver. The biggest advantage of O-QPSK is that relatively non-linear amplifier can be used since the envelope of the signal remains constant.

Figure 2.3 shows how O-QPSK signal is generated from the chip sequences. The even-indexed chips are modulated onto the in-phase carrier, and odd-indexed chips are modulated onto the quadrature-phase carrier. The Q-phase is delayed by  $T_c$  with respect to I-phase chips to create O-QPSK from QPSK.

## 2.3 Polyphase Filter-Bank Channelizer

A polyphase filter-bank channelizer is an efficient signal processing technique used to divide up a wideband spectrum into a number of smaller evenly spaced bands. Its structure consists of a commutator, a polyphase filter bank, and a DFT block as shown in Figure 2.4. The polyphase filterbank consists of  $M$  filters created from a lowpass filter where  $M$



Figure 2.4: Polyphase Channelizer

is the number of channels.

The lowpass filter used to create the polyphase bank can be written as a one-dimensional array of coefficients as the following in the z-domain.

$$\begin{aligned}
 H(z) &= \sum_{n=0}^{N-1} h[n]z^{-n} \\
 &= h[0] + h[1]z^{-1} + h[2]z^{-2} + \dots + h[N-1]z^{-(N-1)}
 \end{aligned} \tag{2.3}$$

The coefficients can be rearranged as a two-dimensional array where the number of rows is M as shown in Equation 2.4.

$$\begin{aligned}
 H(z) &= h[0] + h[M+0]z^{-M} + h[2M+0]z^{-2M} + \dots \\
 &\quad + h[1]z^{-1} + h[M+1]z^{-(M+1)} + h[2M+1]z^{-(2M+1)} + \dots \\
 &\quad + h[2]z^{-2} + h[M+2]z^{-(M+2)} + h[2M+2]z^{-(2M+2)} + \dots \\
 &\quad \dots \\
 &\quad + h[M-1]z^{-(M-1)} + h[2M-1]z^{-(2M-1)} + h[3M-1]z^{-(3M-1)} + \dots
 \end{aligned} \tag{2.4}$$



Figure 2.5: Single Channel M-to-1 Resampler

Each row of Equation 2.4 can be grouped together so that  $H(z)$  can be re-written in the following way.

$$\begin{aligned}
 H(z) &= H_0(z^M) \\
 &+ z^{-1}H_1(z^M) \\
 &+ z^{-2}H_2(z^M) \\
 &+ \dots \\
 &+ z^{-(M-1)}H_{M-1}(z^M)
 \end{aligned} \tag{2.5}$$

where  $H_0(z^M)$  is the terms in the first row of Equation 2.4,  $z^{-1}H_1(z^M)$  is the terms in the second row, and so on.

The  $M$  terms in Equation 2.5 correspond to the  $M$  filters in the polyphase filter bank. The commutator act as the delay factors in front of the  $M$  terms. Given  $M$  channels, Figure 2.5 shows the resampler structure where a channel centered at  $k^{th}$  center frequency can



Figure 2.6: Input to Eight-Channel Channelizer

be extracted. The output of the  $k^{th}$  channel can be written as Equation 2.6

$$y_k[nM] = \sum_{r=0}^{M-1} y_r[nM] e^{j2\pi \frac{k}{M} r} \quad (2.6)$$

where  $y_r[nM]$  is output of  $r^{th}$  filter in the filter bank. This process is similar to IDFT as defined in Equation 2.7. The IDFT block effectively calculates Equation 2.6 for  $k = 0, 1, \dots, M - 1$ .

$$X[k] = \sum_{n=0}^{M-1} x_n e^{j2\pi \frac{k}{M} n} \quad (2.7)$$

The output of IDFT is then the  $M$  channels evenly spaced and sampled at  $\frac{F_s}{M}$ .

As an illustration, Figure 2.6 shows the positions of eight channels and two input signals occupying two different channels. Figure 2.7 shows the output of each channel. As expected *Channel 0* and *Channel 1* contain the two input signals, while the rest of the channels are empty.



Figure 2.7: Output of Each Channel

Table 2.5: USRP N210 Specification

|              |                                                                                                           |
|--------------|-----------------------------------------------------------------------------------------------------------|
| FPGA         | Xilinx Spartan 3A-DSP3400                                                                                 |
| ADC          | 14-bits 100 MS/s                                                                                          |
| DAC          | 16-bits 400 MS/s                                                                                          |
| RF Bandwidth | 50 MHz of instantaneous RF bandwidth in 8-bit mode<br>25 MHz of instantaneous RF bandwidth in 16-bit mode |
| Connectivity | Gigabit Ethernet Interface                                                                                |

## 2.4 USRP N210

The USRP (Universal Software Radio Peripheral) is a family of software radio platforms developed by Ettus Research LLC. Table 2.5 shows the summary of specifications of USRP N210 [22].

The USRP2, the predecessor of the USRP N210, has almost identical features as the USRP N210, except that it has a smaller Xilinx Spartan 3-2000 FPGA. Table 2.6 compares the FPGAs on USRP N210 and USRP2.

Table 2.6: FPGAs of USRP N210 (3A3400) and USRP2 (3S2000)

| FPGA   | System Gates | Equivalent Logic Cells | Total Slices | Distributed RAM Bits | Block RAM Bits | DSP48As | Dedicated Multipliers | DCMs | Maximum USER I/O |
|--------|--------------|------------------------|--------------|----------------------|----------------|---------|-----------------------|------|------------------|
| 3A3400 | 3400K        | 53,712                 | 23,872       | 373K                 | 2258K          | 126     | N/A                   | 8    | 469              |
| 3S2000 | 2000K        | 46,080                 | 20,480       | 320K                 | 720K           | N/A     | 40                    | 4    | 565              |

### 2.4.1 Signal Processing in FPGA of USRP N210

The FPGA is configured as a SOC (System On Chip) where different IP cores are connected via the Wishbone bus. Figure 2.8 shows the interconnection between different modules inside the FPGA. The Wishbone interconnect has a 32-bit soft-core processor called ZPU as the master. All other fourteen blocks connected to the Wishbone interconnect are slaves. The slave block of most interest is the *Buffer Pool* block. It routes received samples from *DSP\_CORE\_RX0* and *DSP\_CORE\_RX1* to Ethernet MAC for transmission to the host. It also receives frames containing samples to be transmitted over the air from the host and stores into *EXT\_FIFO*. The *VITA\_TX\_CHAIN* then reads the frames from *EXT\_FIFO* and strips the VITA headers. Once the headers are removed, samples are sent to DAC for transmission.

Within the *VITA\_TX\_CHAIN* block interpolation is done so that the sample rate matches the sample rate of the DAC. Figure 2.9 shows how samples from the host are interpolated before being sent to the DAC for transmission. The 32-bit input samples from the host



Figure 2.8: SOC in USRP's FPGA



Figure 2.9: Transmit Signal Processing Chain



Figure 2.10: Time-Domain Plot of First Half-Band Filter

are divided into 16-bit in-phase samples and 16-bit quadrature-phase samples. The first half-band filter interpolates the samples by a factor of two if the filter is enabled. The second half-band filter interpolates the signal again by another factor of two if enabled. The CIC interpolator can interpolate the signal up to a factor of 128. Thus, the maximum interpolation rate within the FPGA is 512. The 24-stage CORDIC block rotates the signal to give frequency offset before being up-converted by the daughterboard.

The first half-band filter shown in Figure 2.10 has 31 coefficients generated by  $\text{myfilt} = \text{halfgen4}(.7/4,8)$  in MATLAB, while the second half-band filter shown in Figure 2.11 only has 7 coefficients generated by  $\text{myfilt} = \text{halfgen4}(.75/8,2)$ . The first half-band filter is enabled when the interpolation factor is a multiple of two. Both half-band filters are enabled if the interpolation factor is a multiple of four. Figure 2.12, shows the frequency responses of the two half-band filters.

Figure 2.13 shows the four-stage CIC interpolator used in the USRP N210. The CIC filter



Figure 2.11: Time-Domain Plot of Second Half-Band Filter



Figure 2.12: Frequency Response of Two Halfband Filters



Figure 2.13: CIC Interpolator in USRP N210



Figure 2.14: Frequency Response of CIC Interpolator

has the Hogenauer structure, where all the differentiators are on one side and all the integrators are on the other side of the resampling switch [23]. The interpolation factor of the CIC in USRP N210 can be set as high as M=128.

For the implementation of IEEE 802.15.4 PHY, the sampling rate of 4 MHz was needed. Since the clock in the FPGA of USRP operates at 100 MHz, the interpolation factor M was set to be 25. Figure 2.14 shows the frequency response of the CIC filter when M=25.



Figure 2.15: Receiver Signal Processing Chain

In the receiver chain, the reverse of the transmitter chain happens. The *DSP\_CORE\_RX* core in the SOC receives the samples from the ADC. Inside the *DSP\_CORE\_RX* core, the CORDIC block performs digital down-conversion (DDC). After the DDC is performed, the samples are decimated with the two half-band decimators and the CIC filter. After decimation, the samples are sent to the VRT (VITA Radio Transport) core for packetization. The packets are then sent to the Ethernet MAC to be transported to the host computer.

# **Chapter 3**

## **Methodology and Implementation**

For this thesis, two major signal processing blocks, IEEE 802.15.4 PHY and a polyphase filter-bank channelizer, have been implemented in Spartan 3A-DSP FPGA of USRP N210 and Virtex 5 FPGA, respectively. This section illustrates the steps taken to simulate and implement the blocks.

### **3.1 IEEE 802.15.4 PHY on FPGA**

There are two paths explored in implementing the IEEE 802.15.4 PHY, as shown in Figure 3.1. The first path involves Simulink modeling, while the second one does not. In the first path, the floating-point simulation is performed in MATLAB which is then converted to a Simulink model. At this stage, the Simulink model is still in floating-point. In order to



Figure 3.1: Two Paths Explored in FPGA Implementation

generate Verilog code, the model is converted to a fixed-point model. Next, the Simulink HDL Coder is used to automatically generate Verilog or VHDL code from the fixed-point model.

The second path is to perform both floating-point and fixed-point simulations in MATLAB. The Verilog code is then written manually based on the MATLAB simulation, skipping the Simulink model and automatic generation of the Verilog code entirely. Once the ModelSim simulation of the Verilog code is finished, it is compiled and implemented on the FPGA. For the final implementation, the second path was taken.



Figure 3.2: Overall Setup of IEEE 802.15.4 PHY on FPGA

### 3.1.1 Configuration

Figure 3.2 shows the overall setup of the system. In normal receive mode, USRP N210 sends the data samples to GNU Radio after down-conversion and some filtering. However, when the IEEE 802.15.4 PHY receiver is enabled, the receiver resides between the DDC and Ethernet interface and intercepts the samples and processes them before sending the result to the Ethernet interface. Likewise, in transmit mode, USRP N210 normally sends samples from GNU Radio to DUC for transmission. However, when the IEEE 802.15.4 PHY transmitter is enabled, it intercepts the samples from GNU Radio and modulates the samples before sending them to DUC.



Figure 3.3: Overall Transmitter Chain



Figure 3.4: Simulation of O-QPSK at Baseband

### 3.1.2 Transmitter

#### MATLAB Simulation

The first step in implementation of the IEEE 802.15.4 transmitter is to simulate the signal processing chain, shown in Figure 3.3, in MATLAB. IEEE 802.15.4 frames are generated using the PPDU (PHY Protocol Data Unit) generator function. The PPDU bits are then grouped as quad-bits and mapped to symbols. The symbols are spread to give overall spreading factor of eight. The final output of the transmitter chain is the baseband I and Q as shown in Figure 3.4. The baseband signal is an O-QPSK signal where the quadrature



Figure 3.5: Transmitter Chain

component is delayed by half the symbol period.

### RTL Implementation/Simulation

After simulation in MATLAB and/or Simulink, the algorithm must be converted to Verilog. The signal chain is modularized so that blocks can be mapped onto the existing GNU Radio implementation of ZigBee Radio [12]. Some of the blocks such as *gr.packed\_to\_unpacked* and *gr.chunks\_to\_symbols* are based on the basic blocks provided by the GNU Radio library written in C++. The blocks developed for the thesis can be used in GReasy as basic building blocks for other complex signal processing applications.

Figure 3.5 shows the RTL blocks in the transmitter signal processing chain. The *zb\_symbols\_to\_chips* block breaks the bytes from the incoming payload into two separate four-bit symbols, each of which are then mapped to one of the sixteen chip sequences as was shown in Figure 2.4.

For example, if the current byte is 0x14, it is broken into two four-bit symbols, 0x1 and 0x4. The first symbol 0x1 is mapped to the chip  $3986437410_{10}$ , and the second symbol 0x4 is mapped to the chip  $1378802115_{10}$ . The clock rate increases from 250kbps to 2Mchips/s

at the output.

The *gr.packed\_to\_unpacked* block breaks the 32-chip long chip sequences into 16 two-chip chunks. For example, the chip  $1378802115_{10}$  ( $01010010001011101101100111000011_2$ ), is broken into chunks of 01, 01, 00, 10, 00, 10, 11, 10, 11, 01, 10, 01, 11, 00, 00, and 11. The clock rate is decreased by a factor of two from 2Mchips/s to 1Mchunks/s, or 1 M QPSK symbols/s at the output.

The *gr.chunks\_to\_symbols* block maps each chunk to a point on the QPSK constellation. Using gray coding, the chunk 00 is mapped to  $-1-j$ , 01 to  $-1+j$ , 10 to  $1-j$ , and 11 to  $1+j$ . The clock rate stays the same at the output.

The *zb.half\_sine\_pulse* block performs half-sine pulse-shaping of the QPSK symbols. The pulse-shaping is done by a simple mapping of the constellations to a positive half-sine pulse or a negative half-sine pulse depending on the data. For example, when the in-phase or the quadrature-phase is a '1', it is mapped to the positive pulse. If it is a '-1', it is mapped to a negative half-sine pulse. The pulses used are up-sampled by a factor of four. Therefore, the clock rate is increased by a factor of four at the output to be 4 M samples/s.

Finally, in order to generate O-QPSK signal from the QPSK signal, the quadrature component is delayed by two samples by the *zb.delay\_cc* block.

Figure 3.6 shows the I and Q signals generated by the RTL blocks. The signals *bb\_i* and *bb\_q* generated by the IEEE 802.15.4 transmitter module are taken from the output of the *zb.delay\_cc* block. The signals *i\_interp* and *q\_interp* are interpolated versions of *bb\_i* and



Figure 3.6: ModelSim simulation of transmitted signal

*bb\_q*. The interpolation is done inside the USRP N210's FPGA CIC interpolator blocks. The interpolated signal is up-converted and sent to the DAC for transmission over the air.

## FPGA Implementation

Once the transmitter module is thoroughly tested in the RTL simulation stage, it is integrated into the USRP N210's FPGA code. Figure 3.7 shows the modification done to the USRP FPGA code to integrate the IEEE 802.15.4 module. In normal transmit mode, GNU Radio sends a UDP packet containing samples to the USRP. The Ethernet MAC and *packet\_router* inside the USRP's FPGA decodes the UDP packet and relays the payload to



Figure 3.7: Transmitter module inside USRP’s FPGA

*vita\_tx\_chain*. The VRT headers are then removed by *deframer* and *vita\_tx\_control*, and the extracted samples are sent to *dsp\_tx\_core* where they are interpolated and sent to the DAC for transmission. The FPGA code was modified so that the samples from *vita\_tx\_control* are sent to the IEEE 802.15.4 transmitter instead of *dsp\_tx\_core*. The output of the IEEE 802.15.4 transmitter is then sent to *dsp\_tx\_core* for transmission. Using Chipscope, the actual signals inside the FPGA can be probed. Figure 3.8 shows the half-sine waves, generated by the IEEE 802.15.4 module, interpolated by the CIC filter in *dsp\_tx\_core*.

### 3.1.3 Receiver

#### MATLAB Simulation

Like the transmitter, the first step in implementation of the receiver is to simulate the algorithm in MATLAB. The overall receiver signal processing chain is shown in Figure



Figure 3.8: Chipscope showing interpolated in-phase waveform of O-QPSK



Figure 3.9: Overall receiver algorithm



Figure 3.10: AGC Structure

3.9.

### Automatic Gain Control

The automatic gain control (AGC) block is needed at the input of the receiver to utilize as much dynamic range as possible. When the magnitude of the input samples is small, the inverse tangent function block in FPGA implementation is not able to produce accurate outputs as shown in the later section. The structure of the AGC block based on the GNU Radio implementation *AGC2* is shown in Figure 3.10.

### Demodulation of O-QPSK as MSK

The O-QPSK modulated waveform can be demodulated as MSK when half-sine pulse shaping is used, as is the case with IEEE 802.15.4 [24]. An MSK waveform can be demodulated using the delay-conjugate-multiply shown in Figure 3.11, since the output of the delay-conjugate-multiply block is the phase change between two consecutive sam-



Figure 3.11: Delay-conjugate-multiply



Figure 3.12: Output of delay-conjugate-multiply

ples. The positive phase change shows that  $f_c + \frac{1}{4T_b}$  was sent while negative phase change shows that  $f_c - \frac{1}{4T_b}$  was sent. The output of delay-conjugate-multiply using simulated input data is shown in Figure 3.12.

## Clock Recovery

At the output of the delay-conjugate-multiply block, the sample rate is at 4 MHz. Since the chip rate is at 2M chips/sec, two samples are available for each chip. The clock recovery block based on Mueller and Muller algorithm is used to find the optimal sample instance of the chip [25]. Figure 3.14 shows the chips sampled without clock recovery and the chips sampled with clock recovery. The structure of the clock recovery block is shown in



Figure 3.13: Clock Recovery Block Structure



Figure 3.14: Comparison of Sampled Chips With and Without Clock Recovery



Figure 3.15: Preamble and SFD correlations

Figure 3.13.

### Preamble and SFD correlation thresholds

Figure 3.15 shows the preamble and SFD correlations for four consecutive PPDU frames. Once the preamble and SFD correlations are found, they have to be compared to some thresholds to determine if an IEEE 802.15.4 packet was actually present or not. If the correlation values,  $\lambda_{preamble}$  and  $\lambda_{SFD}$ , are greater than some thresholds  $\tau_{preamble}$  and  $\tau_{SFD}$ , then it is determined that a valid packet is present and the incoming samples should be demodulated. If the correlation values are lower than the thresholds, it is determined that there is no valid packet present. A simple binary hypothesis testing can be used to

determine the thresholds. Given the hypothesis,

$$H_0 = \text{Packet not present} \quad (3.1)$$

$$H_1 = \text{Packet present}$$

The thresholds  $\tau_{preamble}$  and  $\tau_{SFD}$  must be such that the false alarm rate  $P_{fa} = P[\lambda_{preamble} > \tau_{preamble}|H_0]P[\lambda_{SFD} > \tau_{SFD}|H_0]$  is less than an acceptable level.

The maximum values for the thresholds can be found theoretically by analyzing the properties of MSK waveform. The MSK waveform can be written as the following.

$$s(t) = \sqrt{\frac{2E_b}{T_b}} \cos(2\pi f_c t + \theta(t)) \quad (3.2)$$

$$\theta(t) = \theta(0) + \frac{\pi}{2T_b}, \text{ '1' was sent}$$

$$\theta(t) = \theta(0) - \frac{\pi}{2T_b}, \text{ '0' was sent}$$

Depending on whether '1' or '0' was sent, the carrier phase is rotated by  $\pm\frac{\pi}{2}$  over the symbol duration  $T_b$ . Therefore, when two samples per chip are available, the maximum absolute value of output of the delay-conjugate-multiply is half of  $\frac{\pi}{2}$ , or  $\frac{\pi}{4}$ .

Since the maximum absolute value of output of the delay-conjugate-multiply is  $\frac{\pi}{4}$ , the theoretical maximum value of preamble correlation is  $N_{Preamble\ chips}(\frac{\pi}{4}) = 201.06$  and SFD correlation  $N_{SFD\ chips}(\frac{\pi}{4}) = 50.27$ , where  $N_{Preamble\ chips} = 256$  and  $N_{SFD\ chips} = 64$ . However, because of noise the correlation values will never reach the theoretical max-

imum values. Therefore, the thresholds of preamble and SFD correlations have to be somewhat lower for the packets to be detected. The actual threshold values used for implementation are determined experimentally as shown in later sections.

## Symbol Correlations

Once the SFD is found, the receiver must decide which symbol was sent by the transmitter. The decision of which symbol was sent is made by picking the symbol that gives the highest correlation value. In the GNU Radio implementation, the correlation values are found by first slicing the chips at the output of the delay-conjugate-multiply module, which is similar to hard-decision decoding. In this implementation, the chips are not sliced, but instead the soft-decision values at the output of the delay-conjugate-multiply block is input to the correlators. As shown in Figures 3.16 and 3.17, the performance of the receiver is better with soft decision decoding. The disadvantage of soft-decision decoding is that it takes more processing power. In the hard-decision decoding, the correlation value can be obtained by simply performing an XOR operation of the input chips and the reference chips and then counting the number of bits that are '0'. Whereas in soft-decision decoding, 32-tap FIR filtering must be done for all 16 symbols. However, soft-decision decoding gives better performance at low SNR. This is in line with the result that Viterbi decoders give better results with soft-decision decoding in error-correcting codes [26].



Figure 3.16: BER Simulation of Soft vs. Hard Correlations



Figure 3.17: Packet Detection Rate Simulation of Soft vs. Hard Correlations



Figure 3.18: Output of delay-conjugate-multiply with frequency offset of 20 KHz

### Frequency Offset Estimation

Frequency offset occurs when the local oscillators at the receiver and the transmitter are not completely in sync. This results in a slight frequency shift when the RF signal is converted to baseband. For the O-QPSK case, when the baseband signal is demodulated using the delay-conjugate-multiply method, it results in a slight DC offset as shown in Figure 3.18.

The IEEE 802.15.4 standard uses a similar preamble structure as the IEEE 802.11a standard. The IEEE 802.11a standard has a preamble composed of ten short preambles repeating every 0.8us. Similarly, IEEE 802.15.4 has a preamble structure which consists of eight repeating sequence of chips every 16us. Thus, the frequency offset of the IEEE 802.15.4 packets can be estimated in a similar way as is done for the IEEE 802.11a packets [27].

The frequency offset of the received signal can be estimated using the received preamble

at two different sample times.

$$y(t) = x(t)e^{j2\pi f_\Delta t} \quad (3.3)$$

and

$$y(t - T) = x(t - T)e^{j2\pi f_\Delta(t-T)} \quad (3.4)$$

Where  $y(t)$  and  $y(t - T)$  are received preambles and  $x(t)$  and  $x(t - T)$  are transmitted preamble.  $T$  is the period of repetition in the preamble. The complex sinusoid gives a frequency offset of  $f_\Delta$  to the transmitted preamble. Since the preamble  $x(t)$  repeats with the period  $T$ ,  $y(t - T)$  can be re-written as

$$y(t - T) = x(t)e^{j2\pi f_\Delta(t-T)} \quad (3.5)$$

Multiplying the received signal  $y(t - T)$  and the conjugate of  $y(t)$  gives the following expressions.

$$\begin{aligned} y^*(t)y(t - T) &= x^*(t)e^{-j2\pi f_\Delta(t)}x(t)e^{j2\pi f_\Delta(t-T)} \\ &= x^*(t)x(t)e^{-j2\pi f_\Delta(t)}e^{j2\pi f_\Delta(t-T)} \\ &= |x(t)|^2 e^{j2\pi f_\Delta(-t+t-T)} \\ &= |x(t)|^2 e^{j2\pi f_\Delta(-T)} \end{aligned} \quad (3.6)$$

Taking the angle of the expression in 3.6 gives the following result.

$$\angle y^*(t)y(t-T) = 2\pi f_\Delta(-T) \quad (3.7)$$

Dividing by an appropriate factor reveals the frequency offset of the received signal.

$$f_\Delta = \frac{2\pi f_\Delta(-T)}{2\pi(-T)} \quad (3.8)$$

Since the range of valid values for the result of the angle function is  $(-\pi, \pi]$ , the minimum and maximum frequency offset that can be estimated are the following.

$$f_{\Delta min} = \frac{-\pi}{2\pi(T)} = -31.25 \text{ kHz} \quad (3.9)$$

$$f_{\Delta max} = \frac{\pi}{2\pi(T)} = 31.25 \text{ kHz}$$

Figure 3.19 shows the output of the frequency offset estimation. During the preamble, the simulated frequency offset of  $f_\Delta = 20$  kHz is correctly estimated. However, as seen in Figure 3.18 the delay-conjugate-multiply demodulation scheme is very robust to the effects of frequency offset. Therefore, no frequency offset correction is needed.

## Simulink Simulation

Once the signal processing chain is verified with MATLAB simulations, it is broken down into individual blocks to be implemented in Simulink. The same input test data used



Figure 3.19: Frequency Offset Estimation



Figure 3.20: Delay-conjugate-multiply Block in Simulink

for MATLAB simulation is used for the Simulink model to see that the results of the two simulations matched. Figure 3.20 and 3.21 show two of the blocks implemented for the Simulink model. In addition to using the library of signal processing blocks available in Simulink, the user can also create custom blocks written in HDL and use wrappers known as *Black Box* for co-simulation of the custom HDL blocks in the Simulink environment. This is a very powerful ability that lets users combine custom HDL blocks with standard Simulink blocks in a design. The user is able to pick and choose which blocks to implement in hand-written HDL and which blocks to implement using just the standard libraries. Furthermore, once the Simulink-only blocks are verified, the user can



Figure 3.21: Complex Multiply Block in Simulink



Figure 3.22: Black Box wrapper for a simple MAC processing

replace with the blocks with hand-optimized HDL blocks for better performance and still test the system in a Simulink environment. Figure 3.22 shows an example of a Black Box wrapper for a simple MAC processing. Although Simulink blocks are useful for streaming applications, it is often hard to implement control logic with finite states in Simulink. MathWorks does provide a tool called StateFlow for building state machines, but they are often cumbersome to use. Black Box allows users to implement complicated state machines in HDL rather than Simulink blocks or StateFlow.

Once the algorithm is tested in Simulink, the signals need to be converted to fixed-point format. When the signals are all converted to fixed-point format, the algorithm is tested and verified once again. After the results are verified, the Simulink model is ready for conversion to Verilog or VHDL code. The HDL Coder takes the Simulink model and



Figure 3.23: Output of CORDIC atan without AGC

converts the blocks into "bit-true and cycle-accurate" Verilog or VHDL code [28]. The Black Box implementations that are already written in HDL are used as they are written.

### RTL Implementation

For RTL implementation, the blocks shown in Figure 3.9 are converted to Verilog modules manually. The AGC block is required to utilize as much dynamic range as possible.

Figure 3.23 shows the effects of limited word-length in fixed-point implementation. When the gain of the USRP is set to a low value, the magnitude of the I and Q samples are small. This leads to limited dynamic range in the received samples. When these values are input to the CORDIC atan module, because of the small dynamic range, the atan block is not able to compute the output accurately. Therefore, the AGC block is needed



Figure 3.24: Output of CORDIC atan with AGC

at the front of the receiver to maximize the dynamic range of the input samples. Figure 3.24 shows that the output of the atan block matches the ideal MATLAB reference when the AGC block is introduced.

The correlations are performed using the Xilinx FIR cores. For the preamble and SFD correlations, a 256-tap FIR filter and a 64-tap FIR filter are generated using Xilinx CORE Generator. For the symbol correlations, 16 32-tap FIR filters are generated for each of the 16 chip sequences. The inputs to each filter are 16-bit signed soft-decision values. The input clock period for the filters are set to 50 so that the multipliers in the DSP48 slices could be used more efficiently by time-multiplexing. In contrast to the GNU Radio implementation, the FPGA implementation can calculate 16 correlation values simultaneously with soft-decision values because it supports parallelism. However, the GNU Radio im-



Figure 3.25: ModelSim Simulation Showing Preamble and SFD Correlations



Figure 3.26: CRC-16

plementation must rely on a for-loop to iterate through the 16 correlation computations with hard-decision chips. If the processor is not fast enough to execute the for-loop, an overflow may occur.

Figure 3.25 shows the preamble and SFD correlations in ModelSim. The peaks in the correlations show where the packets are present.

Once the correlation values are computed, the state machine shown in Figure 3.27 searches for a valid packet by comparing the preamble and SFD correlations to the thresholds. If the correlation values are above the thresholds, the state machine decides that a valid packet is present and starts decoding the frame. As it decodes the payload, it generates



Figure 3.27: State machine for MAC layer

the checksum using a CRC module and compares against the received checksum. If the two checksums are identical, the MAC layer signals that the packet was valid. The decoded payload is sent off to the Ethernet MAC to be processed by GNU Radio. Figure 3.26 shows the structure of the CRC module.

## FPGA Implementation

Similar to the transmitter implementation, the IEEE 802.15.4 receiver module is integrated into the USRP FPGA code as shown in Figure 3.28. Normally, the real and complex samples from the *dsp\_rx0* core are sent to *vita\_rx\_chain* directly to be formatted and sent over to the host. However, for the implementation, the samples are intercepted by the IEEE 802.15.4 receiver to be processed. The decoded bytes from the receiver module are sent out to the *vita\_rx\_chain* core which frames the bytes with VITA headers and sends out



Figure 3.28: Receiver module inside USRP’s FPGA

to the Ethernet MAC module.

The data format sent from the FPGA to GNU Radio is normally a 32-bit word consisting of 16-bit I and 16-bit Q data. However, since the data being sent from the FPGA is no longer samples but decoded bytes, the format has to be changed. Table 3.1 shows the new data format sent from FPGA to GNU Radio. The 32-bit data contains not only the decoded byte but also the state of the state machine, a strobe signal for when a valid byte is output, a flag to indicate if the checksum was correct, and debug signals.

### GNU Radio Signal Processing Block

In order to interface with the modified FPGA of USRP N210, a custom GNU Radio signal processing block was created. The custom block is able to decode the new 32-bit data format shown in Table 3.1. Figure 3.29 shows the GNU Radio Companion view

Table 3.1: Output data format for USRP

|               |    |                 |    |                 |    |            |    |
|---------------|----|-----------------|----|-----------------|----|------------|----|
| 31            | 30 | 29              | 28 | 27              | 26 | 25         | 24 |
| 0             | 0  | out strobe byte |    | out CRC correct |    | out symbol |    |
| 23            | 22 | 21              |    | 20              |    | 19         | 18 |
| out state     |    |                 |    | 1               | 1  | 1          | 1  |
| 15            | 14 | 13              | 12 | 11              | 10 | 9          | 8  |
| debug_signals |    |                 |    |                 |    |            |    |
| 7             | 6  | 5               | 4  | 3               | 2  | 1          | 0  |
| out_byte      |    |                 |    |                 |    |            |    |

of the working ZigBee Radio. All of the IEEE 802.15.4 receiver radio logic resides inside the USRP's FPGA. Hence, the user only sees the *UHD: USRP Source* block and the *zigbee\_rx\_mod* block. The *zigbee\_rx\_mod* block is a custom block that was created to correctly parse the output data coming from the USRP.

## Bypass Mode

In order to let users use the USRP as originally intended, a bypass logic was added to the FPGA. This bypass logic allows the user to select whether to operate the USRP as a IEEE 802.15.4 receiver, or as a normal USRP by bypassing the IEEE 802.15.4 receiver logic inside the FPGA. Figure 3.30 shows the modified *UHD: USRP Source* block where the user is able to select whether to operate the USRP as a ZigBee Radio or as a normal USRP.

In order to switch between the normal mode and IEEE 802.15.4 receiver mode, a register inside the FPGA needs to be toggled from the GNU Radio environment. This is accom-



Figure 3.29: GNU Radio companion view of ZigBee RX



Figure 3.30: Modified UHD USRP Source Block

plished by modifying the whole control chain that spans GNU Radio, UHD, and FPGA. Figure 3.31 shows the series of modifications made to enable changing the register inside the FPGA from a GNU Radio block.

Using the same logic for enabling the bypass mode, the thresholds for preamble and SFD can be controlled also by changing the registers values inside the FPGA. Since usually only a single receive channel *dsp\_rx0* on the USRP is used, the registers in *dsp\_rx1* were re-routed to control the bypass register and the threshold registers.

## 3.2 Hybrid Multi-Channel IEEE 802.15.4 Receiver

### 3.2.1 Configuration

The hybrid implementation of multi-channel IEEE 802.15.4 receiver consists of three major processing entities as shown in Figure 3.32: USRP N210, XUPV5, and GNU Radio. The USRP acts as an RF front-end and a digital down converter. XUPV5 processes high sample rate signal, and GNU Radio processes low sample rate signal. A channelizer resides in XUPV5's Virtex 5 FPGA to split a 20 MHz bandwidth signal into four 5 MHz channels. An energy detector in the same FPGA then detects which channel is occupied by the IEEE 802.15.4 transmitter. The energy detector then tells a 4:1 multiplexer to send only the output of the occupied channel to GNU Radio. The IEEE 802.15.4 receiver in GNU Radio then demodulates the received channel. When the transmitter switches to a



Figure 3.31: Control Chain in Modifying Registers Inside FPGA from GNU Radio



Figure 3.32: System Configuration of Hybrid Multi-Channel IEEE 802.15.4 Receiver

different channel, the energy detector correctly identifies the newly occupied channel and multiplexes the correct channel to GNU Radio.

### 3.2.2 Channelizer

In order to implement a multi-channel IEEE 802.15.4 receiver, a channelizer is needed. The IEEE 802.15.4 channels are separated by 5 MHz as specified in the standard. The maximum sample rate of USRP N210 is 25 MHz at baseband with 16-bit I and Q samples. Therefore, a maximum of five channels ( $25 \text{ MHz}/5 \text{ MHz}$ ) can be simultaneously channelized. However, a five-channel channelizer requires a five-point IDFT block. In order to utilize the efficient radix-4 FFT algorithm, a four-channel channelizer is implemented instead using a four-point FFT. Thus, the USRP N210 runs at the sampling frequency of 20 MHz instead of 25 MHz.



Figure 3.33: Channelizer 4:1

### MATLAB Simulation

Figure 3.33 shows the architecture of a four-channel channelizer. The channelizer presented in [23] has the first channel (channel 0) centered at baseband. An example of such channelizer was shown in Figure 2.6 in the earlier section. However such channel positioning makes one half of the edge channel (channel 4 in Figure 2.6) alias to the opposite end of the spectrum. Therefore when the baseband signal from the USRP is channelized, only the seven channels in the middle are useful, since the edge channel contains two halves of opposite ends of the spectrum. In order to prevent this aliasing, the lowpass filter is up-converted to the center frequency of  $F=0.125$  by multiplying with a complex sinusoid so that the center of edge channels will not be at the sampling frequency. Figure 3.34 shows the complex lowpass filter up-converted to  $F=0.125$ . Using the complex lowpass filter as a starting point, the channels are now evenly distributed across 20 MHz bandwidth as shown in Figure 3.35 with the edge channel not split between the opposite



Figure 3.34: Prototype Lowpass Filter



Figure 3.35: Input to Four-Channel Channelizer

ends of the spectrum. The disadvantage of this process is the filter coefficients of the filter bank are now complex. Each channel also needs to be down-converted to baseband at the output of the channelizer, since the prototype lowpass filter is not centered at baseband. Figure 3.36 shows the IEEE 802.15.4 signal residing in *channel 0*.



Figure 3.36: Output of Four-Channel Channelizer



Figure 3.37: Filter with Real Coefficients



Figure 3.38: Filter with Complex Coefficients

## FPGA Implementation

Since the lowpass filter is up-converted to  $F=0.125$ , the coefficients of the new lowpass filter are complex-valued. When a filter is real, the i-phase and q-phase parts of the signal can be independently filtered as shown in Figure 3.37. However, when a filter is complex, the i-phase and q-phase have to be both filtered by real and imaginary coefficients of the filter and combined as shown in Figure 3.38. Therefore, complex filters require twice as many resources as real filters.

Once the samples are filtered, they are input to the FFT block. The four-point FFT can be easily built with the radix-4 FFT architecture. Since  $N=4$ , only a single radix-4 butterfly is required. In matrix form, four-point DFT can be expressed as Equation 3.10.

$$\begin{bmatrix} X[0] \\ X[1] \\ X[2] \\ X[3] \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -j & -1 & j \\ 1 & -1 & 1 & -1 \\ 1 & j & -1 & -j \end{bmatrix} \times \begin{bmatrix} x[0] \\ x[1] \\ x[2] \\ x[3] \end{bmatrix} \quad (3.10)$$

The matrix equation can be simplified as Equation 3.11 which only consists of additions and subtractions. Thus, using the radix-4 butterfly, the four-point FFT can be reduced to

a series of simple additions and subtractions.

$$X(0) = x[0] + x[1] + x[2] + x[3] \quad (3.11)$$

$$= x_i[0] + x_i[1] + x_i[2] + x_i[3] + j(x_q[0] + x_q[1] + x_q[2] + x_q[3])$$

$$X(1) = x[0] - jx[1] - x[2] + jx[3]$$

$$= x_i[0] + x_q[1] - x_i[2] - x_q[3] + j(x_q[0] - x_i[1] - x_q[2] + x_i[3])$$

$$X(2) = x[0] - x[1] + x[2] - x[3]$$

$$= x_i[0] - x_i[1] + x_i[2] - x_i[3] + j(x_q[0] - x_q[1] + x_q[2] - x_q[3])$$

$$X(3) = x[0] + jx[1] - x[2] - jx[3]$$

$$= x_i[0] - x_q[1] - x_i[2] + x_q[3] + j(x_q[0] + x_i[1] - x_q[2] - x_i[3])$$

After IFFT, each channel needs to be down-converted by  $F = 0.125M$  to compensate for the up-converted lowpass filter. The down-conversion is achieved by multiplying each channel with the complex sinusoid  $e^{-j2\pi M 0.125(m)} = e^{-j2\pi 0.5(m)}$ . The complex sinusoid at  $F=0.5$  simplifies to  $1, -1, 1, -1, 1, \dots$ . Therefore, down-conversion for each channel simplifies to flipping the sign of every other sample.



Figure 3.39: Energy Detector

### 3.2.3 Energy Detector

#### FPGA Implementation

Figure 3.39 shows the energy detector to detect the channel where signal is present. The average of four consecutive samples of magnitude of each channel is compared, and the channel with most energy is sent through a multiplexer to the resampler.

### 3.2.4 Resampler 4/5

Since the sample rate of each channel is 5 MHz, a resampler is needed to convert the sample rate to 4 MHz for the demodulator to work properly. A simple polyphase filter-



Figure 3.40: Resampler 4/5



Figure 3.41: Resampler 4/5 Simulation



Figure 3.42: Flow of Data and Control Packets

bank resampler shown in Figure 3.40 is used to convert the sample rate from 5 MHz to 4 MHz. Figure 3.41 shows the simulation in MATLAB of resampling the input signal by a factor of 4/5.

### 3.2.5 Ethernet interface

#### UHD Modifications

For the external FPGA to process the samples from the USRP N210 before sending the result to GNU Radio, the USRP N210 must be configured to send the samples to the external FPGA instead of GNU Radio. By fixing the MAC destination of the UDP packets to be that of the external FPGA, the sample packets from USRP N210 can be redirected to the external FPGA. Figure 3.42 shows the flow of data and control packets. After processing the samples, XUPV5 sends the processed samples to GNU Radio as if the samples were coming straight from USRP N210. GNU Radio then accepts the samples for further processing.

Simple changes in the firmware of USRP N210 can redirect sample packets to designated MAC address while control packets still communicate with the host.

```

1 static void setup_network(void){
2     //setup ethernet header machine
3     /*sr_udp_sm->eth_hdr.mac_dst_0_1 = (fp_mac_addr_dst.addr[0] << 8) |
4         fp_mac_addr_dst.addr[1];
5     sr_udp_sm->eth_hdr.mac_dst_2_3 = (fp_mac_addr_dst.addr[0] << 8) |
6         fp_mac_addr_dst.addr[1];
7     sr_udp_sm->eth_hdr.mac_dst_4_5 = (fp_mac_addr_dst.addr[0] << 8) |
8         fp_mac_addr_dst.addr[1];*/
9
10    // Fix destination MAC address to be that of the external FPGA
11    sr_udp_sm->eth_hdr.mac_dst_0_1 = (0x00 << 8) | 0x0A;
12    sr_udp_sm->eth_hdr.mac_dst_2_3 = (0x35 << 8) | 0x00;
13    sr_udp_sm->eth_hdr.mac_dst_4_5 = (0x01 << 8) | 0x02;
14
15    sr_udp_sm->eth_hdr.mac_src_0_1 = (fp_mac_addr_src.addr[0] << 8) |
16        fp_mac_addr_src.addr[1];
17    sr_udp_sm->eth_hdr.mac_src_2_3 = (fp_mac_addr_src.addr[2] << 8) |
18        fp_mac_addr_src.addr[3];
19    sr_udp_sm->eth_hdr.mac_src_4_5 = (fp_mac_addr_src.addr[4] << 8) |
20        fp_mac_addr_src.addr[5];
21
22    ...
23    uint32_t dst_ip_addr = fp_socket_dst.addr.addr+2; // Change destination
24        address from 192.168.10.1 to 192.168.10.3 (IP of XUPV5)
25    ...
26
27    sr_udp_sm->udp_hdr.checksum = UDP_SM_LAST_WORD;
28 }
```

## FPGA

Once the packets arrive at XUPV5 from USRP N210, the FPGA must know how to decode the UDP packets. The PHY and MAC layers of Ethernet are handled by Ethernet PHY and the Xilinx Embedded Tri-Mode Ethernet MAC core. Figure 3.43 shows the Ethernet interface and packet encoders and decoders to handle UDP packets inside XUPV5.



Figure 3.43: Ethernet interface in XUPV5



Figure 3.44: Frames at Different OSI Layers

Figure 3.44 shows the structure of UDP packet for transmitting the received samples from USRP N210 to GNU Radio. XUPV5 must be able to extract the samples embedded within the payload of the UDP packet.

When a new packet arrives, the packet decoder module extracts the payload of the UDP packet and strips the VRT header and tail. The extracted samples are then sent to the channelizer module for signal processing. The output of the channelizer is then sent to the packet encoder module which constructs a valid frame that can be read by GNU Radio.

# **Chapter 4**

## **Results**

### **4.1 Resource Utilization**

The tables in this section show the resource utilizations of the FPGA implementations of the IEEE 802.15.4 PHY and the multi-channel receiver. For the IEEE 802.15.4 PHY, the transmitter is implemented on USRP2, while the receiver is implemented on USRP N210. The receiver is implemented on the USRP N210 instead of USRP2 because the latter does not have enough resources to support signal processing for the receiver algorithm.

#### **4.1.1 Transmitter**

Table 4.1 shows the resource utilization of the IEEE 802.15.4 transmitter by itself. No multiplier is used for the transmitter even though the transmitter core has a half-sin pulse

Table 4.1: Device Utilization Summary of ZigBee TX core on Xilinx Spartan 3-2000

| Logic Utilization | Used | Available | Utilization |
|-------------------|------|-----------|-------------|
| Slice Flip Flops  | 786  | 40,960    | 1%          |
| 4 input LUTs      | 782  | 40,960    | 1%          |
| Occupied Slices   | 820  | 20,480    | 4%          |
| MULT 18x18s       | 0    | 40        | 0%          |
| RAMB 16s          | 0    | 40        | 0%          |

Table 4.2: Device Utilization Summary of ZigBee TX core and USRP core on Xilinx Spartan 3-2000

| Logic Utilization | Used   | Available | Utilization |
|-------------------|--------|-----------|-------------|
| Slice Flip Flops  | 20,055 | 40,960    | 48%         |
| 4 input LUTs      | 29,024 | 40,960    | 70%         |
| Occupied Slices   | 17,130 | 20,480    | 83%         |
| MULT 18x18s       | 31     | 40        | 77%         |
| RAMB 16s          | 25     | 40        | 63%         |

shaping filter. It is possible to eliminate the need of multipliers since the input to the filter is always either -1 or 1, and therefore there are only two possible output sequences from the pulse shaping filter: half-sin pulse or inverted half-sin pulse. Depending on the input, the filter can simply output one of the two possible sequences, instead of performing the actual multiply-accumulate operations. Since distributed RAM instead of block RAM is used for FIFO between the signal processing blocks, no RAMB 16s are used either.

Table 4.2 shows the resource utilization when the transmitter module is integrated with the USRP2 FPGA. The 27 multipliers are used by USRP2 for scaling and interpolation operations.

Table 4.3: Device Utilization Summary of ZigBee RX core on Xilinx Spartan 3A-DSP3400

| Logic Utilization | Used  | Available | Utilization |
|-------------------|-------|-----------|-------------|
| Slice Flip Flops  | 6,907 | 47,744    | 14%         |
| 4 input LUTs      | 7,372 | 47,744    | 7%          |
| DSP48As           | 39    | 126       | 30%         |
| RAM16             | 47    | 126       | 37%         |

Table 4.4: Device Utilization Summary of ZigBee RX core and USRP core on Xilinx Spartan 3A-DSP3400

| Logic Utilization | Used   | Available | Utilization |
|-------------------|--------|-----------|-------------|
| Slice Flip Flops  | 25,931 | 47,744    | 54%         |
| 4 input LUTs      | 35,097 | 47,744    | 73%         |
| DSP48As           | 66     | 126       | 52%         |
| RAM16             | 75     | 126       | 59%         |

### 4.1.2 Receiver

Table 4.3 shows the resource utilization of the IEEE 802.15.4 receiver by itself. In contrast to the transmitter, the receiver uses 39 multipliers because of the correlation operations done by Xilinx FIR cores and the interpolation operation by the clock recovery module. A lot more slices are used compared to the transmitter because of its higher complexity.

Table 4.4 shows the resource utilization when the receiver module is integrated into USRP N210's FPGA. There are still enough room to implement additional signal processing blocks such as symbol timing synchronization to improve the performance of the receiver. Table 4.5 shows the resource utilization by Karve's receiver implementation [2]. Karve's implementation employs only hard-decision correlation and no clock recovery module,

Table 4.5: Device Utilization Summary of Karve's ZigBee RX core (14-bit) on XUPV5 [2]

| Logic Utilization | Used  | Available | Utilization |
|-------------------|-------|-----------|-------------|
| Slice Flip Flops  | 1,954 | 69,120    | 2%          |
| Slice LUTs        | 3,182 | 69,120    | 4%          |
| DSP48Es           | 10    | 64        | 15%         |
| Block RAM         | 9     | 148       | 6%          |
| TEMACs            | 1     | 2         | 50%         |

Table 4.6: Post-Synthesis Timing Summary

|                   |            |
|-------------------|------------|
| Minimum Period    | 12.246 ns  |
| Maximum Frequency | 81.659 MHz |

therefore it uses much less resource, even though it includes the Ethernet MAC module.

The post-synthesis and post-PAR timing summaries in Tables 4.6 and 4.7 show that the timing constraint of 100 MHz system clock is not met. However, using a different version of ISE, update USRP N210 FPGA code, and setting different synthesis, translate, and map properties may enable the tool to meet the timing constraints [29].

Table 4.7: Post-PAR Timing Summary

|                   |            |
|-------------------|------------|
| Minimum Period    | 16.804 ns  |
| Maximum Frequency | 59.510 MHz |

Table 4.8: Device Utilization Summary of Channelizer on Xilinx Virtex 5 LX110T

| Logic Utilization | Used  | Available | Utilization |
|-------------------|-------|-----------|-------------|
| Slice Registers   | 3,224 | 69,120    | 4%          |
| Slice LUTs        | 3,410 | 69,120    | 4%          |
| Block RAM/FIFO    | 1     | 148       | 1%          |
| DSP48Es           | 28    | 64        | 43%         |

Table 4.9: Device Utilization Summary of Channelizer with Ethernet MAC on Xilinx Virtex 5 LX110T

| Logic Utilization | Used  | Available | Utilization |
|-------------------|-------|-----------|-------------|
| Slice Registers   | 4,338 | 69,120    | 6%          |
| Slice LUTs        | 4,250 | 69,120    | 6%          |
| Block RAM/FIFO    | 10    | 148       | 6%          |
| DSP48Es           | 28    | 64        | 43%         |

### 4.1.3 Channelizer

Table 4.8 shows the device utilization of the channelizer module only. It uses a number of DSP48E slices because of the number of filters required in the polyphase filter bank and the resampler module. Table 4.9 shows the device utilization when the channelizer is combined with the Ethernet MAC module. Even with the MAC, sufficient slices are left for further use.

|              | FPGA RX       | GNU Radio RX  | Xbee RX       | Multi-Channel Rx |
|--------------|---------------|---------------|---------------|------------------|
| FPGA TX      | interoperable | interoperable | interoperable | interoperable    |
| GNU Radio TX | interoperable | interoperable | interoperable | interoperable    |
| XBee TX      | interoperable | interoperable | interoperable | interoperable    |

Table 4.10: Interoperability Chart

## 4.2 Interoperability

Combinations of different receiver and transmitter implementations are used to test interoperability. Table 4.10 shows that all three implementations of IEEE 802.15.4 are interoperable with each other.

Figure 4.1 shows that the commercially available X-Bee module is able to receive the IEEE 802.15.4 packets sent from the FPGA implementations on USRP 2.

## 4.3 Performance

### 4.3.1 IEEE 802.15.4 Receiver

The performance of the receiver is first simulated, and the actual bit error rate, packet error rate, and packet detection rate are measured over the air.



Figure 4.1: Message from FPGA received by Xbee console

## Simulation

Figure 4.2 shows the chip error rate of O-QPSK signal. The theoretical curve is simply the bit error rate curve of the O-QPSK modulation. The simulation of O-QPSK demodulation very closely matches the theoretical curve. As expected, differential demodulation using the delay-conjugate-multiply scheme results in poorer performance. However, matched-filtering the input signal with half-sine pulse shape increases the performance of the differential demodulation. The simulation done in Karve's thesis is also shown [2]. All simulations assume perfect synchronization. Figure 4.3 shows the simulated BER curves. The "Coherent Demodulation" curve assumes perfect receiver synchronization. The O-QPSK chips are demodulated coherently and then despread to recover the information



Figure 4.2: Chip Error Rate Simulation

bits. The “Coherent Differential Demodulation” curve assumes perfect receiver synchronization, but the O-QPSK chips are matched-filtered and demodulated differentially with the delay-conjugate-multiply block. The later case is closer to the implementation done for the thesis. As expected, the coherent demodulation of the O-QPSK symbol results in better performance and closely matches the simulation curve, while the differential demodulation results in poorer performance. The theoretical curve in Figure 4.3 is specified in the standard [1] as the following.

$$P_{BER,802.15.4} = \frac{8}{15} \frac{1}{16} \sum_{k=2}^{16} -1^k \binom{16}{k} e^{20SINR(\frac{1}{k}-1)} \quad (4.1)$$



Figure 4.3: BER Simulation

### SNR Measurement

The SNR values are calculated in the *bypass* mode, so that the raw received samples can be saved to a file and analyzed.

Two different measurements are done to calculate the SNR value. In the first measurement, only the noise power is measured. The following command is used to collect the samples from the USRP. No packets are sent from the Xbee module, so that only noise samples are collected.

```
1 ./rx_samples_to_file --freq=2480000000 --rate=4000000 --gain=70 --nsamps
=4000000 --filename=noise_70dB.dat
```



Figure 4.4: Received Packets in Time Domain

In the second measurement, the packets are sent from the XBee module so that the signal of interest as well as the noise samples are captured. The gain on the receiver is set to be the same so that the noise power is the same as in the first measurement.

```
1 ./ rx_samples_to_file --freq=2480000000 --rate=4000000 --gain=70 --nsamps
   =4000000 --filename=signal_70dB.dat
```

Figure 4.4 shows one instance of the second measurement, where two packets are clearly seen. The following equation is used to calculate the SNR from the two measurements.

$$\frac{S}{N} = \frac{P_{signal}}{P_{noise}} \quad (4.2)$$

$$\frac{E_b}{N_0} = \frac{S}{N} \quad (4.3)$$

$$\begin{aligned}
E_{total} &= \sum_0^{N-1} |x[n]|^2 \\
&= P_{signal} N_{signal} + P_{noise} N_{noise}
\end{aligned} \tag{4.4}$$

where  $N = N_{signal} + N_{noise}$  is the total number of samples,  $N_{signal}$  is the number of signal samples, and  $N_{noise}$  is the number of noise samples. In the first measurement, since there is no signal present, the equation can be re-written as the following.

$$\begin{aligned}
E_{total} &= \sum_0^{N-1} |x[n]|^2 \\
&= P_{noise} N_{noise} \\
&= P_{noise} N
\end{aligned} \tag{4.5}$$

Solving for  $P_{noise}$ , the noise power can be calculated.

$$\begin{aligned}
P_{noise} &= \frac{\sum |x[n]|^2}{N_{noise}} \\
&= \sum |x[n]|^2 / N
\end{aligned} \tag{4.6}$$

(4.7)

In Equation 4.4,  $N_{signal}$  is known because the number of packets and the length of each packet can be controlled.  $P_{signal}$  is then the following.

$$P_{signal} = \frac{E_{total} - P_{noise}N_{noise}}{N_{signal}} \quad (4.8)$$

Substituting  $P_{signal}$  and  $P_{noise}$  into Equation 4.2 gives the SNR.

### Packet Detection Rate

The packet detection rate is calculated using the following formula.

$$\text{Packet Detection Rate} = \frac{\text{Number of Packets Detected}}{\text{Number of Packets Transmitted}} \quad (4.9)$$

Figure 4.5 shows the packet detection rates for the FPGA and GNU Radio implementations. The packet detection rate is higher for the GNU Radio implementation.

### Packet Error Rate

The packet error rate is calculated using the following equation.

$$\begin{aligned} & \text{Packet Error Rate} \\ &= \frac{\text{Number of Packets With Incorrect CRC} + \text{Number of Packets Not Detected}}{\text{Number of Packets Transmitted}} \end{aligned} \quad (4.10)$$

Figure 4.6 shows the plot of the packet error rates.



Figure 4.5: Percentages of Packets Detected



Figure 4.6: Packet Error Rate



Figure 4.7: Bit Error Rate

### Bit Error Rate

The bit error rate is calculated using the following formula. Figure 4.7 shows the bit error rate of GNU Radio and FPGA implementations.

$$\text{Bit Error Rate} = \frac{\text{Number of Incorrectly Decoded Bits}}{\text{Number of Bits Received}} \quad (4.11)$$

In the Packet Detection Rate and Packet Error Rate measurements, the GNU Radio implementation and the FPGA implementation with clock recovery give comparable results, while the FPGA implementation without clock recovery gives the worst result. In the BER measurements, the GNU Radio implementation outperforms both of the FPGA implementations. The FPGA implementation with clock recovery performs better than the one without clock recovery as expected.



Figure 4.8: Output of Inverse Tangent with Correct Sign

There are multiple reasons why the GNU Radio implementation outperforms the FPGA implementation.

- Algorithmic

The output of the inverse tangent in the FPGA implementation is intentionally set to zero when it is outside  $[-\pi/2, \pi/2]$ , with some headroom, to decrease the correlation value when noise is present. The rationale behind this is that when the signal is present, the output of inverse tangent should be within  $[-\pi/4, \pi/4]$ , and when only noise is present, the output of inverse tangent is outside  $[-\pi/4, \pi/4]$ . By setting the values outside the range to be zero, the correlation values are lower when only noise is present, thereby decreasing the false alarm rate. However, at low SNR, the output of inverse tangent is outside the range of  $[-\pi/2, \pi/2]$  even when the signal is present. Therefore, the algorithm falsely decides that there is only noise present and



Figure 4.9: Output of Inverse Tangent with Mismatched Sign



Figure 4.10: Output of Inverse Tangent with Increased CORDIC Iterations



Figure 4.11: Packet Detection Rate with Varying Word Length

sets the output of inverse tangent to zero for some output samples. This decreases the correlation value and degrades the performance.

- Limited word-length

It is observed that the output of the CORDIC inverse tangent block does not match that of the floating-point inverse tangent function. Figure 4.8 shows the output of inverse tangent when the x value is fixed at  $x = -2$  while y is  $[-4 : 4]$ . This is a benign case where if the y values are big, the output of inverse tangent is not accurate, but the sign of the output is correct. Figure 4.9 shows the malignant case when y values are small, not only do the magnitudes mismatch, but also the signs of the output mismatch. The wrong sign results in wrong chip decision, which degrades the performance of the clock recovery module and decreases the correlation values. The output of the inverse tangent block can be improved by increasing the

number of CORDIC iterations, however, at the expense of increased latency. Figure 4.10 shows that output values match very closely when the number of CORDIC interations are increased.

The limited word-length of the internal registers of the clock recovery module leads to a poorer performance. This is verified by simulating the clock recovery module with different word lengths. The input to the clock recovery module is the output of the floating-point inverse tangent to make sure that only the clock recovery module affects the performance. When the word lengths of the internal registers are increased, the number of packets detected are increased. However, when the word lengths are decreased, the number of packets detected decreased. Figure 4.11 shows the packet detection rate as the word length is increased or decreased. Since the clock recovery module is very sensitive to noise, slight decrease in word length, actually increases the detection rate for some range, but the detection rate decreases to zero eventually. Similarly, when the word length is increased, the detection rate decreases slightly at one point, but eventually increases and stabilizes at large word length.

## 4.4 Radio Characteristics

Figure 4.12 shows the power spectral density of the IEEE 802.15.4 waveform transmitted from the FPGA implementation. The IEEE 802.15.4 standard states that the relative power



Figure 4.12: Spectral Mask of IEEE 802.15.4 transmitted using USRP N210 FPGA

in frequency  $|f - f_c| > 3.5\text{MHz}$  shall be less than -20 dB compared to the average spectral power measured within  $\pm 1\text{MHz}$  of the carrier frequency. The measurement shall be made using a 100 kHz resolution bandwidth. Figure 4.12 shows that power in the upper and lower bands 3.5 MHz away from the center frequencies are -30.47 dB and -29.81 dB, which satisfies the requirement.

The standards states that the transmitted center frequency tolerance shall be  $\pm 40\text{ppm} = 0.004\%$ . Figure 4.13 shows that the frequency deviation is about  $\frac{2.480008492\text{ GHz} - 2.48\text{ GHz}}{2.48\text{ GHz}} = 0.0003\%$ , which lies within the tolerance limit.

The 99% occupied bandwidth is approximately 2 MHz as shown in Figure 4.14.

Figure 4.15 shows the constellation plot of the O-QPSK signal transmitted by the USRP N210 FPGA implementation. As expected of O-QPSK signals, the constellation points do not travel across the origin.



Figure 4.13: Center Frequency of IEEE 802.15.4 Signal from USRP N210 FPGA



Figure 4.14: Occupied Bandwidth



Figure 4.15: O-QPSK Constellation

# **Chapter 5**

## **Conclusion and Future Work**

Two different versions of IEEE 802.15.4 PHY implementations have been explored. The first implementation demonstrates how a signal processing application such as IEEE 802.15.4 PHY can be embedded inside the USRP N210's FPGA. This enables developers to take advantage of the spare resources in the USRP N210's FPGA and delegate complex signal processing tasks from slow GPP to fast FPGA. GNU Radio is now free to do much more processing since FPGA takes care of all the complex tasks.

The second implementation demonstrates the use of an external FPGA in the USRP and GNU Radio environment. By modifying the firmware inside the USRP N210, it is possible to redirect packets from USRP N210 to the external FPGA. The external FPGA then extracts the samples from the UDP packets and performs signal processing on the samples. Because external FPGAs can be much larger than USRP's FPGA, even more complex

tasks can be performed on it. The example implementation of the multi-channel IEEE 802.15.4 demonstrates the viability of the platform by successfully decoding packets from commercially available IEEE 802.15.4 node. If desired, a signal processing chain can be distributed among all three platforms, the USRP's FPGA, an external FPGA, and GNU Radio. This platform is also extensible to additional external FPGAs if desired. The first FPGA can process the samples from USRP and relay to the next FPGA, and so on, until the result reaches GNU Radio.

Future work includes implementing a more complex signal processing applications such as spectrum sensing algorithms or IEEE 802.11a.

# Bibliography

- [1] *IEEE Std 802.15.4-2006*, IEEE Std.
- [2] M. P. Karve, "Evaluation of gnu radio platform enhanced for hardware accelerated radio design," Ph.D. dissertation, Virginia Polytechnic Institute and State University, 2010.
- [3] D. Bursky. (2006, May) Designers stretched in asic, fpga tug-of-war. [Online]. Available: <http://www.eetimes.com/design/automotive-design/4004080/> Designers-stretched-in-ASIC-FPGA-tug-of-war
- [4] C. R. Irick, "Enhancing gnu radio for hardware accelerated radio design," Ph.D. dissertation, Virginia Polytechnic Institute and State University, 2010.
- [5] K. Tan, J. Zhang, J. Fang, H. Liu, Y. Ye, S. Wang, Y. Zhang, H. Wu, W. Wang, and G. M. Voelker, "Sora: High performance software radio using general purpose multi-core processors," *NSDI 09: 6th USENIX Symposium on Networked Systems Design and Implementation*, 2009.

- [6] A. T. Tran, D. N. Truong, and B. M. Baas, "A complete real-time 802.11a baseband receiver implemented on an array of programmable processors," *Signals, Systems and Computers, 2008 42nd Asilomar Conference on*, pp. 165–170, 2008.
- [7] J. H. Reed, *Software Radio - A Modern Approach to Radio Engineering*. PEARSON Education, 2006.
- [8] D. Lau, J. Blackburn, and J. A. Seely, "The use of hardware acceleration in sdr waveforms," Altera Corporation, Tech. Rep., 2005.
- [9] Intel. (2003) Intel hyper-threading technology. Intel. [Online]. Available: [http://cache-www.intel.com/cd/00/00/01/77/17705\\_htt\\_user\\_guide.pdf](http://cache-www.intel.com/cd/00/00/01/77/17705_htt_user_guide.pdf)
- [10] Fftw. <http://www.fftw.org/>.
- [11] T. Rondeau. (2010, 12) Volk: Vector-optimized library of kernels. GNU Radio. [Online]. Available: <http://www.trondeau.com/blog/2010/12/11/volk-vector-optimized-library-of-kernels.html>
- [12] T. Schmid, "Gnu radio 802.15.4 en- and decoding," NESL Department of Electrical Engineering University of California, Los Angeles, Tech. Rep., 2006.
- [13] M. A. R. Saghir, P. Chow, and C. G. Lee. (1998) A comparison of traditional and vliw dsp architectures for compiled dsp applications. University of Toronto. [Online]. Available: [www.eecg.toronto.edu/~saghir/papers/CASES98.ps](http://www.eecg.toronto.edu/~saghir/papers/CASES98.ps)

- [14] D. Liu, *Embedded DSP Processor Design: Application Specific Instruction Set Processors*. Morgan Kaufmann, 2008.
- [15] Altera. (2001) Fpgas provide reconfigurable dsp solutions. [Online]. Available: [http://www.altera.com/literature/wp/wp\\_dsp\\_fpga.pdf](http://www.altera.com/literature/wp/wp_dsp_fpga.pdf)
- [16] Xilinx. (2012, January) Virtex-5 fpga xtremedsp design considerations. [Online]. Available: [http://www.xilinx.com/support/documentation/user\\_guides/ug193.pdf](http://www.xilinx.com/support/documentation/user_guides/ug193.pdf)
- [17] R. P. Brent, F. T. Luk, and C. V. Loan, "Computation of the singular value decomposition using mesh-connected processors," *Journal of VLSI and Computer Systems*, pp. 242–270, 1985.
- [18] AccelChip. Using fpgas to improve the performance of radar, navigation and guidance systems. [Online]. Available: <http://www.design-reuse.com/articles/11366/using-fpgas-to-improve-the-performance-of-radar-navigation-and-guidance-systems.html>
- [19] I. Berkeley Design Technology. (2010) High-level synthesis tools for xilinx fpgas. Berkeley Design Technology, Inc. [Online]. Available: [www.bdti.com/MyBDTI/pubs/Xilinx\\_hlstcp.pdf](http://www.bdti.com/MyBDTI/pubs/Xilinx_hlstcp.pdf)
- [20] ——. (2007, January) Fpgas vs. dsps: A look at the unanswered questions. [Online]. Available: <http://www.eetimes.com/design/signal-processing-dsp/4017003/FPGAs-vs-DSPs-A-look-at-the-unanswered-questions/>

- [21] D. Gislason, *ZigBee Wireless Networking*. Newnes, 2008.
- [22] E. Research. Ettus research. [Online]. Available: <http://ettus.com/>
- [23] fredric j harris, *Multirate Signal Processing For Communication Systems*. Prentice Hall, 2004.
- [24] J. Notor, A. Caviglia, and G. Levy. (2003) Cmos rfic architectures for ieee 802.15.4 networks. Cadence Design Systems, Inc. 6210 Old Dobbin Lane, Suite 100 Columbia, Maryland 21045, USA. [Online]. Available: <http://chipbook.com/databook/Interface/ZIGBEE/CMOSRFICArchforIEEE80215.pdf>
- [25] H. Meyr, M. Moeneclaey, and S. A. Fechtel, *Digital Communication Receivers*. John Wiley & Sons, 1998.
- [26] S. B. Wicker, *Error Control Systems*. Prentice Hall, 1995.
- [27] K. Sankar. (2008, March) Frequency offset estimation using 802.11a short preamble. [Online]. Available: <http://www.dsblog.com/2008/03/03/frequency-offset-estimation-using-80211a-short-preamble/>
- [28] MathWorks. Simulink hdl coder. MathWorks. [Online]. Available: <http://www.mathworks.com/products/slhdlcoder/>
- [29] J. Blum. (2012, May) Usrp users mailing list – universal software radio peripheral by ettus research. [Online]. Available: <http://comments.gmane.org/gmane.comp.hardware.usrp.e100/3719>

# Appendix A

## Verilog Source Code for IEEE 802.15.4 Receiver on USRP N210's FPGA

### A.1 Receiver Top Level

```
1  'timescale 1 ns / 1 ns
2  ///////////////////////////////////////////////////////////////////
3  // Company:      Wireless @ VT
4  // Author:       Jeong-O Jeong
5  //
6  // Date:         16:41:48 12/17/2011
7  // Design Name:
8  // Filename:     xbee_rx.v
9  // Project Name:
10 // Target Devices:
11 // Tool versions:
12 // Description:   Top file for the IEEE 802.15.4 receiver
13 //
14 // Dependencies:
15 //
16 // Revision:
17 // Revision 0.01 – File Created
18 // Additional Comments:
19 //
20 ///////////////////////////////////////////////////////////////////
21 module xbee_rx
22 (
23   input clk,                      // Clock 100 MHz
24   input reset,                    // Reset
25   input strobe_in,                // 4 MHz strobe
26   input signed [15:0] in_real,    // Input 16-bit real
27   input signed [15:0] in_imag,    // Input 16-bit imaginary
28   input signed [24:0] in_THP,     // Preamble Threshold
29   output signed [15:0] out_real,
30   output signed [15:0] out_imag,
31   output signed [24:0] out_THP
32 );
```

```

29  input signed [24:0] in_THP_NEXT,           // Secondary Preamble Threshold
30  input signed [22:0] in_THS,                // SFD Threshold
31  output reg  out_strobe_symbol,             // Symbol strobe
32  output reg  out_strobe_symbol_delayed,    // Delayed symbol strobe
33  output reg  out_strobe_byte,               // Delayed byte strobe
34  output reg  out_strobe_byte_pre,           // Byte strobe
35  output reg  [3:0] out_symbol,              // Decoded symbol
36  output reg  [7:0] out_byte,                // Decoded byte
37  output reg  [24:0] out_p_corr,              // Preamble correlation
38  output reg  [22:0] out_s_corr,              // SFD correlation
39  output reg  [3:0] out_state,               // MAC state
40  output reg  out_crc_correct,              // Strobe for correct CRC
41  output [511:0] debug                     // Debug signals
);

43
44 // maximum index of findmax16
45 wire [15:0] index_max;

46 // delay conjugate multiply
47 wire signed [15:0] out_phase;      // phase output
48 wire out_phase_ready;            // valid when phase output is ready

50
51 // output of fifo between clock recovery and correlation blocks
52 wire signed [15:0] fifo_avg_out;

53
54 // output of fifo between preamble correlation and MAC state machine
55 wire signed [24:0] fifo_preamble_dout;

56
57 // output of coregen_fir_preamble
58 wire signed [24:0] preamble_dout;
59 wire preamble_rfd;
60 wire preamble_rdy;

61
62 // output of fifo between sfd correlation and MAC state machine
63 wire signed [24:0] _fifo_sfd_dout;
64 wire signed [22:0] fifo_sfd_dout;
65 wire fifo_sfd_full, fifo_sfd_empty;

66
67 // output of coregen_fir_sfd
68 wire signed [22:0] sfd_dout;
69 wire sfd_rfd, sfd_rdy;

70
71 // output of MAC state machine
72 wire strobe_sym_from_mac;
73 wire strobe_byte_from_mac;
74 wire strobe_byte_pre_from_mac;
75 wire [3:0] out_state_from_mac;
76 wire out_crc_correct_from_mac;
77 wire [7:0] out_byte_from_mac;
78 wire [15:0] out_crc_from_mac;

```

79

```

// Debug signals
81 wire [511:0] debug_mac;

83 // Generate 2MHz strobe for chips
cic_strober #(.WIDTH(8)) // output 2MHz clk
85 strober_2(
    .clock(clk),
    .reset(reset),
    .enable(1'b1),
    .rate(2),
    .strobe_fast(strobe_in),
    .strobe_slow(strobe_out_2mhz)
);
93

// registering output of MAC state machine
95 always @(posedge clk) begin
    if(reset) begin
        out_state <= 0;
        out_crc_correct <= 0;
        out_byte <= 0;
        out_strobe_symbol <= 0;
        out_strobe_byte <= 0;
        out_strobe_symbol_delayed <= 0;
    end
    else begin
        out_state <= out_state_from_mac;
        out_crc_correct <= out_crc_correct_from_mac;
        out_byte <= out_byte_from_mac;
        out_strobe_symbol <= strobe_sym_from_mac;
        out_strobe_byte_pre <= strobe_byte_pre_from_mac;
        out_strobe_byte <= strobe_byte_from_mac;
        out_strobe_symbol_delayed <= out_strobe_symbol;
    end
end
113

115 // Registering symbol output
116 always @(posedge clk) begin
    if(reset) begin
        out_symbol <= 0;
    end
    else begin
        if(strobe_sym_from_mac)
            out_symbol <= index_max[3:0];
    end
end
125

// AGC
127 wire [32:0] out_agc;
128 wire [20:0] debug_agc;
129 agc agc(
    .clk(clk),

```

```

131     .rst(reset),
132     .in({in_real, in_imag, strobe_in}), // input
133     .out(out_agc), // output
134     .debug(debug_agc) // debug signals
135 );
136
137 // Delay-conjugate-multiply
138 fmdemod fmdemod (
139     .clk(clk),
140     .reset(reset),
141     .ce(out_agc[0]), // enable signal = valid from agc output
142     .in_real(out_agc[32:17]), // input real
143     .in_imag(out_agc[16:1]), // input imag
144     .in_real_valid(1'b1), // input valid
145     .in_imag_valid(1'b1), // input valid
146     .out_phase(out_phase), // output phase, fractional bits = 13
147     .out_phase_ready(out_phase_ready) // output phase available
148 );
149
150 // Clock Recovery
151 wire [32:0] out_mm;
152 clock_recovery_mm clock_recovery_mm (
153     .clk(clk),
154     .rst(reset),
155     .in({out_phase, 16'd0, strobe_in}/*out_agc*/),
156     .out(out_mm)
157 );
158
159 // FIFO to store output of clock recovery
160 coregen_fifo_512_16 coregen_fifo_avg_out (
161     .clk(clk), // input clk
162     .rst(reset), // input rst
163     .din(out_mm[32:17]), // input [15 : 0] din
164     .wr_en(~fifo_avg_out_full & out_mm[0]), // input wr_en
165     .rd_en(~fifo_avg_out_empty & preamble_rfd), // input rd_en
166     .dout(fifo_avg_out), // output [15 : 0] dout
167     .full(fifo_avg_out_full), // output full
168     .empty(fifo_avg_out_empty) // output empty
169 );
170
171 // Preamble correlation
172 coregen_fir_preamble coregen_fir_preamble (
173     .clk(clk), // 100MHz
174     .rfd(preamble_rfd), // ready for data
175     .rdy(preamble_rdy), // output data ready
176     .din(fifo_avg_out), // [15:0]
177     .dout(preamble_dout) // [24:0]
178 );
179
180 // FIFO to store preamble correlation
181 coregen_fifo_512_25 coregen_fifo_25_preamble (

```

```

183     .clk(clk),                                // input clk
184     .rst(reset),                               // input rst
185     .din(preamble_dout),                      // input [24 : 0] din
186     .wr_en(~fifo_preamble_full & preamble_rdy), // input wr_en
187     .rd_en(~fifo_preamble_empty & strobe_out_2mhz), // input rd_en
188     .dout(fifo_preamble_dout),                // output [24 : 0] dout
189     .full(fifo_preamble_full),                // output full
190     .empty(fifo_preamble_empty)               // output empty
191 );
192
193 // SFD correlation
194 coregen_fir_sfd coregen_fir_sfd (
195     .clk(clk),
196     .rfd(sfd_rfd),
197     .rdy(sfd_rdy),
198     .din(fifo_avg_out),
199     .dout(sfd_dout) // [22:0]
200 );
201
202 // FIFO to store SFD correlation
203 assign fifo_sfd_dout = _fifo_sfd_dout[22:0];
204 coregen_fifo_512_25 coregen_fifo_25_sfd (
205     .clk(clk),                                // input clk
206     .rst(reset),                             // input rst
207     .din({{2{ sfd_dout[22]}}, sfd_dout}), // input [24 : 0] din
208     .wr_en(~fifo_sfd_full & sfd_rdy),      // input wr_en
209     .rd_en(~fifo_sfd_empty & strobe_out_2mhz), // input rd_en
210     .dout(_fifo_sfd_dout),                  // output [24 : 0] dout
211     .full(fifo_sfd_full),                  // output full
212     .empty(fifo_sfd_empty)                 // output empty
213 );
214
215 // Symbol correlations
216 wire fir_bank_sym_out_rfd;
217 wire signed [21:0] fir_bank_sym_out_0;
218 wire signed [21:0] fir_bank_sym_out_1;
219 wire signed [21:0] fir_bank_sym_out_2;
220 wire signed [21:0] fir_bank_sym_out_3;
221 wire signed [21:0] fir_bank_sym_out_4;
222 wire signed [21:0] fir_bank_sym_out_5;
223 wire signed [21:0] fir_bank_sym_out_6;
224 wire signed [21:0] fir_bank_sym_out_7;
225 wire signed [21:0] fir_bank_sym_out_8;
226 wire signed [21:0] fir_bank_sym_out_9;
227 wire signed [21:0] fir_bank_sym_out_10;
228 wire signed [21:0] fir_bank_sym_out_11;
229 wire signed [21:0] fir_bank_sym_out_12;
230 wire signed [21:0] fir_bank_sym_out_13;
231 wire signed [21:0] fir_bank_sym_out_14;
232 wire signed [21:0] fir_bank_sym_out_15;

```

```

233 // Symbol Correlations bank
fir_bank_sym fir_bank_sym (
235 .clk(clk),
236 .reset(reset),
237 .strobe_in(strobe_out_2mhz), // Strobe at input chip rate
238 .strobe_out(strobe_out_2mhz), // Strobe at output correlation rate
239 .din(fifo_avg_out), // Input chips
240 .out_rfd(fir_bank_sym_out_rfd), // Ready-for-data of correlation blocks
241 .out_0(fir_bank_sym_out_0), // Correlation value for symbol 0
242 .out_1(fir_bank_sym_out_1), // Correlation value for symbol 1
243 .out_2(fir_bank_sym_out_2), // Correlation value for symbol 2
244 .out_3(fir_bank_sym_out_3), // Correlation value for symbol 3
245 .out_4(fir_bank_sym_out_4), // Correlation value for symbol 4
246 .out_5(fir_bank_sym_out_5), // Correlation value for symbol 5
247 .out_6(fir_bank_sym_out_6), // Correlation value for symbol 6
248 .out_7(fir_bank_sym_out_7), // Correlation value for symbol 7
249 .out_8(fir_bank_sym_out_8), // Correlation value for symbol 8
250 .out_9(fir_bank_sym_out_9), // Correlation value for symbol 9
251 .out_10(fir_bank_sym_out_10), // Correlation value for symbol 10
252 .out_11(fir_bank_sym_out_11), // Correlation value for symbol 11
253 .out_12(fir_bank_sym_out_12), // Correlation value for symbol 12
254 .out_13(fir_bank_sym_out_13), // Correlation value for symbol 13
255 .out_14(fir_bank_sym_out_14), // Correlation value for symbol 14
256 .out_15(fir_bank_sym_out_15) // Correlation value for symbol 15
257 );
258
259 // Findmax 16
findmax16 findmax16 (
260 .clk(clk),
261 .ce(strobe_out_2mhz),
262 .reset(reset),
263 .value0(fir_bank_sym_out_0),
264 .value1(fir_bank_sym_out_1),
265 .value2(fir_bank_sym_out_2),
266 .value3(fir_bank_sym_out_3),
267 .value4(fir_bank_sym_out_4),
268 .value5(fir_bank_sym_out_5),
269 .value6(fir_bank_sym_out_6),
270 .value7(fir_bank_sym_out_7),
271 .value8(fir_bank_sym_out_8),
272 .value9(fir_bank_sym_out_9),
273 .value10(fir_bank_sym_out_10),
274 .value11(fir_bank_sym_out_11),
275 .value12(fir_bank_sym_out_12),
276 .value13(fir_bank_sym_out_13),
277 .value14(fir_bank_sym_out_14),
278 .value15(fir_bank_sym_out_15),
279 .index_out(index_max),
280 .value_out(),
281 .passthrough()
282 );
283

```

```

285 // MAC
286 reg [24:0] delay_p [0:2];
287 reg [22:0] delay_s [0:2];
288 reg [24:0] in_p_corr;
289 reg [22:0] in_s_corr;

291 // need to delay preamble and sfd by 4 2mhz clock cycles because of findmax's
292 // latency
293 always @(posedge clk)
294 begin
295   if(reset) begin
296     in_p_corr <= 0;
297     in_s_corr <= 0;
298     delay_p[0] <= 0;
299     delay_p[1] <= 0;
300     delay_p[2] <= 0;
301     delay_s[0] <= 0;
302     delay_s[1] <= 0;
303     delay_s[2] <= 0;
304   end
305   else if(strobe_out_2mhz) begin
306     delay_p[2] <= fifo_preamble_dout;
307     delay_p[1] <= delay_p[2];
308     delay_p[0] <= delay_p[1];
309     in_p_corr <= delay_p[0];
310     delay_s[2] <= fifo_sfd_dout;
311     delay_s[1] <= delay_s[2];
312     delay_s[0] <= delay_s[1];
313     in_s_corr <= delay_s[0];
314   end
315 end

316 // MAC state machine
317 xbee_mac xbee_mac (
318   .clk(clk),
319   .reset(reset),
320   .strobe_in(strobe_out_2mhz),           // 2mhz
321   .in_p_corr(in_p_corr),                 // preamble correlation
322   .in_s_corr(in_s_corr),                 // SFD correlation
323   .in_sym(index_max[3:0]),               // symbol
324   .strobe_sym(strobe_sym_from_mac),      // output strobe 62.5khz
325   .strobe_byte(strobe_byte_from_mac),    // output strobe 31.25khz
326   .strobe_byte_pre(strobe_byte_pre_from_mac), // output strobe 31.25khz
327   .out_byte(out_byte_from_mac),          // output byte
328   .out_crc(out_crc_from_mac),           // output crc
329   .state(out_state_from_mac),           // output state
330   .out_crc_correct(out_crc_correct_from_mac), // output crc strobe
331   .debug(debug_mac),                   // debug
332   .in_THP(in_THP),                     // preamble threshold
333   .in_THP_NEXT(in_THP_NEXT),           // secondary preamble threshold

```

```

335     .in_THS(in_THS)                      // SFD threshold
      );
336
337 // Register preamble correlation
338   always @(posedge clk)
339   begin
340     if(reset)
341       out_p_corr <= 0;
342     else
343       out_p_corr <= fifo_preamble_dout;
344   end
345
346 // Register SFD correlation
347   always @(posedge clk)
348   begin
349     if(reset)
350       out_s_corr <= 0;
351     else
352       out_s_corr <= fifo_sfd_dout;
353   end
354
355 // Debug signals
356   assign debug = {debug_agc, in_real, in_imag, strobe_in, out_agc};
357
358 endmodule // xbee_rx

```

src/rx/xbee\_rx.v

## A.2 Strober

```

1 // 
2 //  USRP2 – Universal Software Radio Peripheral Mk II
3 //
4 //  Copyright (C) 2008 Matt Ettus
5 //
6 //  This program is free software; you can redistribute it and/or modify
7 //  it under the terms of the GNU General Public License as published by
8 //  the Free Software Foundation; either version 2 of the License, or
9 //  (at your option) any later version.
10 //
11 //  This program is distributed in the hope that it will be useful,
12 //  but WITHOUT ANY WARRANTY; without even the implied warranty of
13 //  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
14 //  GNU General Public License for more details.
15 //
16 //  You should have received a copy of the GNU General Public License
17 //  along with this program; if not, write to the Free Software
//  Foundation, Inc., 51 Franklin Street, Boston, MA 02110–1301 USA

```

```

19 ///
21 module cic_strober
22   #(parameter WIDTH=8)
23   ( input  clock ,
24     input  reset ,
25     input  enable ,
26     input [WIDTH-1:0] rate , // Rate should EQUAL to your desired divide
27       ratio , no more -1 BS
28     input  strobe_fast ,
29     output wire strobe_slow );
30
31 reg [WIDTH-1:0] counter;
32 wire      now = (counter==1);
33 assign    strobe_slow = now && enable && strobe_fast;
34
35 always @(posedge clock)
36   if (reset)
37     counter <= 8;
38   else if (~enable)
39     counter <= rate;
40   else if (strobe_fast)
41     if (now)
42       counter <= rate;
43   else
44     counter <= counter - 1;
45 endmodule // cic_strober

```

src/rx/cic\_strober.v

### A.3 AGC

```

1 'timescale 1ns / 1ps
2 //////////////////////////////////////////////////////////////////
3 // Company:      Wireless @ VT, Virginia Tech CCM Lab
4 // Author:       Jeong-O Jeong
5 //
6 // Date:         17:37:41 05/14/2012
7 // Design Name:
8 // Filename:     agc.v
9 // Project Name:
10 // Target Devices:
11 // Tool versions:
12 // Description: Automatic Gain Control based on GNU Radio's 'agc2' block
13 //
14 // Dependencies: mult - Xilinx core 16x16 multiplier
15 //                  sqrt - Xilinx core square root

```

```

// Revision:
// Revision 0.01 - File Created
// Additional Comments:
//
////////////////////////////////////////////////////////////////
module agc
#( parameter REFERENCE = 16'd512,      // 0.5 (10 decimal places)
parameter ATTACK RATE = 16'd32,        // 1e-3 (15 decimal places)
parameter DECAY RATE = 16'd32767,     // 0.99 (15 decimal places)
parameter INITIAL GAIN = 16'd16384,    // 4.0 (12 decimal places)
parameter MAXGAIN = 21'd262144       // 64.0 (12 decimal places)
)
(
    input clk ,
    input rst ,
    input [32:0] in ,           // 16-bit I, 16-bit Q, valid
    output reg [32:0] out,      // 16-bit I, 16-bit Q, valid
    output [127:0] debug       // debug signal
);
// wire signed [31:0] prod_error_rate; // 25 decimal places
// reg signed [15:0] error;          // 10 decimal places
// reg signed [15:0] rate;           // 15 decimal
// registered inputs
reg signed [15:0] in_i;
reg signed [15:0] in_q;
reg in_valid;
// register inputs
always @(posedge clk) begin
    if(rst) begin
        in_i <= 0;
        in_q <= 0;
    end
    else begin
        if(in[0]) begin
            in_i <= in[32:17];
            in_q <= in[16:1];
        end
        in_valid <= in[0];
    end
end
// set gain
wire signed [20:0] gain; // 21bits, 12 fractional
// pipeline registers for rdy_error_rate = rdy signal from sqrt block
reg rdy_error_rate[0:2];

```

```

67 // error * rate
68 wire signed [39:0] prod_error_rate_adjusted;
69 assign prod_error_rate_adjusted = {{3{prod_error_rate[31]}}, prod_error_rate, 5'0}; // need to make 25 -> 30 fractional
70
71 reg signed [39:0] full_gain; // 30 decimal places
72
73 // gain = (error * rate)
74 wire signed [39:0] diff_gain_prod_error_rate;
75 assign diff_gain_prod_error_rate = (full_gain - prod_error_rate_adjusted);
76
77 // maximum gain
78 wire signed [39:0] max_gain;
79 assign max_gain = {MAX_GAIN, 18'd0};
80
81 always @(posedge clk) begin
82   if(rst) begin
83     full_gain <= {INITIAL_GAIN, 18'd0}; // 18 = 30 decimal - 12 decimal places
84   end
85   else begin
86     if(rdy_error_rate[2]) begin // when rate*error signal is ready
87       if(diff_gain_prod_error_rate > max_gain)
88         full_gain <= max_gain;
89       else if(diff_gain_prod_error_rate < 0)
90         full_gain <= 1; // small number as per GNU Radio code
91       else
92         full_gain <= diff_gain_prod_error_rate;
93     end
94   end
95 end
96
97 // truncate gain to be 21 bits with 12 fractional
98 assign gain = full_gain[38:18];
99 wire signed [36:0] out_i; // 27 decimal places
100 wire signed [36:0] out_q; // 27 decimal places
101
102 // multiply inputs by gain to produce outputs
103 mult16x21 mult16x21_out_i (
104   .clk(clk), // input clk
105   .a(in_i), // input [15 : 0] a (15 decimal places)
106   .b(gain), // input [20 : 0] b (12 decimal places)
107   .ce(1'b1), // input ce
108   .p(out_i) // output [36 : 0] p (27 decimal places)
109 );
110 mult16x21 mult16x21_out_q (
111   .clk(clk), // input clk
112   .a(in_q), // input [15 : 0] a
113   .b(gain), // input [20 : 0] b
114   .ce(1'b1), // input ce
115   .p(out_q) // output [36 : 0] p
116 );

```

```

117 reg signed [15:0] out_i_hardlimit; // 11 fractional bits
119 reg signed [15:0] out_q_hardlimit;
120 parameter ONE_FRAC_27 = 32'd134217727; // 1, with 27 fractional bits
121 parameter NEG_ONE_FRAC_27 = -32'd134217728; // -1, with 27 fractional bits

123 // limit range of out_i and out_q
124 always @(posedge clk) begin
125   if(rst)
126     out_i_hardlimit <= 0;
127   else if(out_i[31]==1'b0 && out_i > ONE_FRAC_27) // if positive, and greater
128     than 0.9999
129     out_i_hardlimit <= ONE_FRAC_27[31:16];
130   else if(out_i[31]==1'b1 && out_i < NEG_ONE_FRAC_27) // if negative, and
131     smaller than -1
132     out_i_hardlimit <= NEG_ONE_FRAC_27[31:16];
133   else
134     out_i_hardlimit <= out_i[31:16];
135 end

136 always @(posedge clk) begin
137   if(rst)
138     out_q_hardlimit <= 0;
139   else if(out_q[31]==1'b0 && out_q > ONE_FRAC_27) // if positive, and greater
140     than 0.9999
141     out_q_hardlimit <= ONE_FRAC_27[31:16];
142   else if(out_q[31]==1'b1 && out_q < NEG_ONE_FRAC_27) // if negative, and
143     smaller than -1
144     out_q_hardlimit <= NEG_ONE_FRAC_27[31:16];
145   else
146     out_q_hardlimit <= out_q[31:16];
147 end

148 // pipeline valid signals
149 reg valid_prod_in_gain[3:0];
150 reg valid_out_square[4:0];

151 always @(posedge clk) begin
152   if(rst) begin
153     valid_prod_in_gain[0] <= 0;
154     valid_prod_in_gain[1] <= 0;
155     valid_prod_in_gain[2] <= 0;
156     valid_prod_in_gain[3] <= 0;
157     valid_out_square[0] <= 0;
158     valid_out_square[1] <= 0;
159     valid_out_square[2] <= 0;
160     valid_out_square[3] <= 0;
161     valid_out_square[4] <= 0;
162   end
163   else begin

```

```

165     valid_prod_in_gain[0] <= in_valid;
166     valid_prod_in_gain[1] <= valid_prod_in_gain[0];
167     valid_prod_in_gain[2] <= valid_prod_in_gain[1];
168     valid_prod_in_gain[3] <= valid_prod_in_gain[2];
169     valid_out_square[0] <= valid_prod_in_gain[3];
170     valid_out_square[1] <= valid_out_square[0];
171     valid_out_square[2] <= valid_out_square[1];
172     valid_out_square[3] <= valid_out_square[2];
173     valid_out_square[4] <= valid_out_square[3];
174 end
175
176 // real*real , imag*imag
177 wire [31:0] prod_i; // 22 decimal places
178 wire [31:0] prod_q; // 22 decimal places
179
180 mult mult16x16_i (
181     .clk(clk), // input clk
182     .a(out_i_hardlimit), // input [15 : 0] a
183     .b(out_i_hardlimit), // input [15 : 0] b
184     .ce(1'b1), // input ce
185     .p(prod_i) // output [31 : 0] p
186 );
187 mult mult16x16_q (
188     .clk(clk), // input clk
189     .a(out_q_hardlimit), // input [15 : 0] a
190     .b(out_q_hardlimit), // input [15 : 0] b
191     .ce(1'b1), // input ce
192     .p(prod_q) // output [31 : 0] p
193 );
194
195 parameter ONE_FRAC_15 = 16'd32767; // 1, with 15 fractional bits
196 parameter NEG_ONE_FRAC_15 = -16'd32768; // -1, with 15 fractional bits
197
198 // limit range of final output
199 always @(posedge clk) begin
200     if(rst)
201         out[32:17] <= 0;
202     else if(out_i[31]==1'b0 && out_i > ONE_FRAC_27) // if positive, and greater
203         than 0.9999
204         out[32:17] <= ONE_FRAC_27[27:12];
205     else if(out_i[31]==1'b1 && out_i < NEG_ONE_FRAC_27) // if negative, and
206         smaller than -1
207         out[32:17] <= NEG_ONE_FRAC_27[27:12];
208     else
209         out[32:17] <= out_i[27:12];
210 end
211
212 // Register output
213 always @(posedge clk) begin
214     if(rst)

```

```

213    out[16:1] <= 0;
214  else if(out_q[31]==1'b0 && out_q > ONE_FRAC_27) // if positive , and greater
215    than 0.9999
216    out[16:1] <= ONE_FRAC_27[27:12];
217  else if(out_q[31]==1'b1 && out_q < NEG_ONE_FRAC_27) // if negative , and
218    smaller than -1
219    out[16:1] <= NEG_ONE_FRAC_27[27:12];
220  else
221    out[16:1] <= out_q[27:12];
222 end
223
224 // Register output valid
225 always @(posedge clk) begin
226   if(rst)
227     out[0] <= 0;
228   else
229     out[0] <= valid_out_square[4];
230 end
231
232 // take sqrt of the sum to get signal magnitude
233 wire signed [32:0] sum;           // 22 decimal places
234 assign sum=prod_i+prod_q;
235
236 wire rdy;
237 wire signed [15:0] signal_mag; // 10 decimal places
238
239 sqrt sqrt(
240   .x_in(sum),          // input [32 : 0] x_in
241   .nd(valid_out_square[4]), // nd
242   .x_out(signal_mag),    // output [15 : 0] x_out
243   .rdy(rdy),           // output rdy
244   .clk(clk),            // input clk
245   .ce(1'b1)             // input ce
246 );
247
248 // error = magnitude - reference
249 always @(posedge clk) begin
250   if(rst) begin
251     error <= 0; // 10 decimal places
252   end
253   else begin
254     if(rdy) begin
255       error <= signal_mag - REFERENCE;
256     end
257   end
258 end
259
260 // pick ATTACK or DECAY rate
261 wire signed [20:0] error_adjusted; // match gain = 21 bits , 12 fractional
262 assign error_adjusted = {{3{error[15]}},error,2'd0};

```

```

263    always @(posedge clk) begin
264        if(rst) begin
265            rate <= 0;
266        end
267        else begin
268            if(rdy) begin
269                if(error_adjusted > gain) // since error has 10 decimal and gain has 12
270                    decimal
271                rate <= ATTACKRATE;
272                else
273                    rate <= DECAY_RATE;
274            end
275        end
276    end
277
278    // multiply rate and error
279    // prod_error_rate 25 fractional bits
280    mult mult_error_rate (
281        .clk(clk), // input clk
282        .a(error), // input [15 : 0] a // 10 decimal
283        .b(rate), // input [15 : 0] b // 15 decimal
284        .ce(1'b1), // input ce
285        .p(prod_error_rate) // output [31 : 0]
286    );
287
288    // rdy_error_rate[0] - high when sqrt is calculated
289    // rdy_error_rate[1] - high when error is calculated
290    // rdy_error_rate[2] - high when prod_error_rate is calculated (error * rate)
291    always @(posedge clk) begin
292        if(rst) begin
293            rdy_error_rate[0] <= 0;
294            rdy_error_rate[1] <= 0;
295            rdy_error_rate[2] <= 0;
296        end
297        else begin
298            rdy_error_rate[0] <= rdy;
299            rdy_error_rate[1] <= rdy_error_rate[0];
300            rdy_error_rate[2] <= rdy_error_rate[1];
301        end
302    end
303
304    assign debug = gain;
305 endmodule

```

src/rx/agc.v

## A.4 Delay-conjugate-multiply

```

'timescale 1 ns / 1 ns
// Company:      Wireless @ VT
// Author:       Jeong-O Jeong
//
// Date:          16:41:48 12/17/2011
// Design Name:
// Filename:     fmdemod.v
// Project Name:
// Target Devices:
// Tool versions:
// Description:   Delay-conjugate-multiply block
//
// Dependencies: Requires Xilinx CoreGen complex multiply and arctan
//
// Revision:
// Revision 0.01 - File Created
// Additional Comments:
//
module fmdemod
#(parameter WIDTH=16)
(
    input clk,
    input reset,
    input ce,
    input signed [WIDTH-1:0] in_real,           // Input 16-Bit real
    input signed [WIDTH-1:0] in_imag,           // Input 16-Bit imag
    input in_real_valid,                      // Valid for input
    input in_imag_valid,                      // Valid for input
    output reg signed [WIDTH-1:0] out_phase, // Output phase
    output reg out_phase_ready                // Valid when out_phase is available
);
// Delayed input values and valids
reg signed [WIDTH-1:0] in_real_delayed;
reg signed [WIDTH-1:0] in_imag_delayed;
reg in_real_valid_delayed;
reg in_imag_valid_delayed;
// Output of coregen_complex_mult
wire signed [32:0] out_real;
wire signed [32:0] out_imag;
// Output of coregen_arctan
wire signed [15:0] phase_out;
// Output of FIFO
wire [15:0] dout;
wire empty;

```

```

52 // Delay input
53 always @(posedge clk) begin
54   if(reset) begin
55     in_real_delayed <= 0;
56     in_imag_delayed <= 0;
57     in_real_valid_delayed <= 0;
58     in_imag_valid_delayed <= 0;
59   end
60   else if(ce) begin
61     in_real_delayed <= in_real;
62     in_imag_delayed <= in_imag;
63     in_real_valid_delayed <= in_real_valid;
64     in_imag_valid_delayed <= in_imag_valid;
65   end
66 end

67 // complexMult
68 // latency 6
69 coregen_complex_mult coregen_complex_mult (
70   .clk(clk),           // input clk
71   .ce(ce),             // input ce
72   .ar(in_real),        // input [15 : 0] ar
73   .ai(in_imag),        // input [15 : 0] ai
74   .br(in_real_delayed), // input [15 : 0] br
75   .bi(-in_imag_delayed), // input [15 : 0] bi
76   .pr(out_real),       // output [32 : 0] pr
77   .pi(out_imag));      // output [32 : 0] pi

78 // inverse tangent core
79 // latency : 20
80 coregen_arctan coregen_arctan (
81   .clk(clk),           // input clk
82   .ce(ce),             // input ce
83   .x_in(out_real),     // input [32 : 0] x_in
84   .y_in(out_imag),     // input [32 : 0] y_in
85   .phase_out(phase_out) // output [15 : 0] phase_out (fractional bits=13)
86 );
87

88 // output values
89 always @(posedge clk) begin
90   if(reset) begin
91     out_phase <= 0;
92     out_phase_ready <= 0;
93   end
94   else if(ce) begin
95     out_phase <= dout;
96     out_phase_ready <= ~empty;
97   end
98 end
99 end
100
```

```

102 // set to zero if outside [-pi/2:pi/2]
103 reg signed [15:0] clipped_phase_out;
104 reg signed [15:0] upperlimit = 16'd12868; // 12868 = pi/2
105 reg signed [15:0] lowerlimit = 16'd52668; // -pi/2
106
107 always @(posedge clk) begin
108   if(reset) begin
109     clipped_phase_out <= 0;
110   end
111   else if(ce) begin
112     // if outside [-pi/2:pi/2] set to zero
113     if(phase_out > upperlimit || phase_out < lowerlimit)
114       clipped_phase_out <= 16'd0;
115     else
116       clipped_phase_out <= phase_out;
117   end
118 end
119
120 // output fifo
121 coregen_fifo coregen_fifo (
122   .rst(reset),           // input rst
123   .wr_clk(clk),          // input wr_clk
124   .rd_clk(clk),          // input rd_clk
125   .din(clipped_phase_out), // input [15 : 0] din
126   .wr_en(~full & ce),    // input wr_en
127   .rd_en(~empty & ce),    // input rd_en
128   .dout(dout),           // output [15 : 0] dout
129   .full(full),           // output full
130   .empty(empty)           // output empty
131 );
132
133 endmodule // delayConjMult

```

src/rx/fmdemod.v

## A.5 Clock Recovery

```

1  `timescale 1ns / 1ps
2  ///////////////////////////////////////////////////////////////////
3  // Company:      Wireless @ VT
4  // Author:       Jeong-O Jeong
5  //
6  // Create Date:  11:33:56 07/11/2012
7  // Design Name:
8  // Module Name:  clock_recovery_mm
9  // Project Name:
10 // Target Devices:
11 // Tool versions:

```

```

// Description:      Mueller and Muller based clock recovery
13 //          Based on GNU Radio digital clock recovery mm block
//
15 // Dependencies:
//
17 // Revision:
// Revision 0.01 – File Created
19 // Additional Comments:
//
21 /////////////////////////////////////////////////
22 module clock_recovery_mm(
23     input clk,
24     input rst,
25     input [32:0] in,      // 16-bit I, 16-bit Q, valid
26     output [32:0] out    // 16-bit I, 16-bit Q, valid
27 );
28
29 parameter MU          = 32'd134217728; // 0.5, 28 fractional
30 parameter GAIN_OMEGA   = 16'd14;        // 2.25e-4 unsigned, 0 integer, 16
31             fractional
32 parameter GAIN_MU      = 16'd1966;      // 0.03, unsigned, 0 integer, 16
33             fractional
34
35 parameter OMEGA        = 32'd536870912; // 2, unsigned, 28 fractional
36 parameter MIN_OMEGA    = 32'd536763537; // unsigned, 28 fractional
37 parameter MAX_OMEGA    = 32'd536978286; // unsinged, 28 fractional
38
39 // interpolator ready counter
40 reg [3:0] rdy_counter;
41
42 // count the number of samples in memory
43 reg [3:0] total_elements;
44
45 // counter in state OMEGA
46 reg [1:0] state_counter_omega;
47
48 // State machine
49 parameter WAIT          = 3'd0;
50 parameter INTERPOLATE    = 3'd1;
51 parameter COMPUTE_MM     = 3'd2;
52 parameter COMPUTE_OMEGA  = 3'd3;
53 parameter COMPUTE_INCR   = 3'd4;
54 parameter COMPUTE_IQ      = 3'd5;
55 (* FSM_ENCODING="SEQUENTIAL", SAFE_IMPLEMENTATION="NO" *)
56 reg [2:0] state = INTERPOLATE;
57
58 // state machine
59 always@(posedge clk)
60     if (rst) begin
61         state <= WAIT;
62     end

```

```

61    else begin
62      case (state)
63        WAIT : begin
64          if (total_elements > 10)
65            state <= INTERPOLATE;
66          else
67            state <= WAIT;
68        end
69        INTERPOLATE : begin
70          if (rdy_counter == 7) // interpolation is finished with 8 input samples
71            ;
72            // 7, because rdy_counter changes to 8 one clock cycle after the 8th
73            // output is computed
74            state <= COMPUTEMM;
75          else
76            state <= INTERPOLATE;
77        end
78        COMPUTEMM : begin
79          if (1'b1)
80            state <= COMPUTEOMEGA;
81          else
82            state <= COMPUTEMM;
83        end
84        COMPUTEOMEGA : begin
85          if (state_counter_omega == 1)
86            state <= COMPUTE_INCR;
87          else
88            state <= COMPUTEOMEGA;
89        end
90        COMPUTE_INCR : begin
91          if (1'b1)
92            state <= COMPUTE_II;
93          else
94            state <= COMPUTE_INCR;
95        end
96        COMPUTE_II : begin
97          if (1'b1)
98            state <= WAIT;
99          else
100            state <= COMPUTE_II;
101        end
102      endcase
103    end
104
105    // input registers
106    reg signed [15:0] i_reg;
107    reg valid;
108
109    // fractional offset
110    reg signed [31:0] mu;           // 28 fractional
111    // samples per symbol

```

```

111    reg signed [31:0] omega; // 28 fractional
111    // error
111    reg signed [15:0] mm_val; // signed 4 integer, 12 fractional
113

115    // Interpolator wires
115    wire rfd, rdः;
117

119    // register inputs (only real value needed)
119    always @(posedge clk) begin
120        if(rst) begin
121            i_reg <= 0;
122        end
123        else if(in[0]) begin
124            i_reg <= in[32:17];
125        end
126        valid <= in[0];
127    end

129    // wires needed for 2 port block ram
130    reg [3:0] addr_write;
131    reg [3:0] ii; // pointer to start of read location
132    reg [3:0] addr_read; // actual memory read index
133    wire [15:0] dout_mem;
134    reg valid_mem;

135    reg [31:0] incr; // increment for read index, 28 fractional bit
137

139    // total elements counter
139    always @(posedge clk) begin
140        if(rst) begin
141            total_elements <= 0;
142        end
143        else if(valid) begin
144            total_elements <= total_elements + 1;
145        end
146        if(state == COMPUTE_II) begin
147            total_elements <= total_elements - incr[31:28];
148        end
149    end

151    // write address
151    always @(posedge clk) begin
152        if(rst) begin
153            addr_write <= 1'b0;
154        end
155        // whenever there is a new data increment write addr
156        else if(valid) begin
157            addr_write <= addr_write + 1;
158        end
159    end

```

```

161 // prepare for multiplication
163 wire signed [31:0] _gain_mu;
164 wire signed [31:0] _gain_omega;
165 wire signed [31:0] _mm_val;
166 assign _gain_omega = {16'b0, GAIN_OMEGA};
167 assign _gain_mu = {16'b0, GAIN_MU};
168 assign _mm_val = {{16{mm_val[15]}}, mm_val};

169 reg signed [31:0] prod_gainMu_mmVal; // 28 fractional
170 always @(posedge clk) begin
171   if(rst) begin
172     prod_gainMu_mmVal <= 0;
173   end
174   else begin
175     // 16 fractional * 12 fractional = 28 fractional
176     prod_gainMu_mmVal <= _gain_mu * _mm_val;
177   end
178 end

179

180 // increment for read index, 28 fractional bit
181 always @(posedge clk) begin
182   if(rst) begin
183     incr <= 0;
184   end
185   else if(state == COMPUTE_INCR) begin
186     //      28 + 28 + 28 fractional
187     incr <= mu + omega + prod_gainMu_mmVal;
188   end
189 end

190

191 wire [3:0] incr_int;
192 assign incr_int = incr[31:28];

193

194 // pointer to start index
195 always @(posedge clk) begin
196   if(rst) begin
197     ii <= 0;
198   end
199   else if(state == COMPUTE_II) begin
200     ii <= ii + incr[31:28];
201   end
202 end

203

204 // read address
205 always @(posedge clk) begin
206   if(rst) begin
207     addr_read <= 0;
208   end
209   else if(state == INTERPOLATE) begin

```

```

211      if(rfd && ~rdy) // using ~rdy because it goes high 8 cycles later since
212          first new data
213          addr_read <= addr_read + 1'b1;
214      end
215      else // set to ii
216          addr_read <= ii;
217      end
218
219      // rdy counter
220      always @ (posedge clk) begin
221          if(rst) begin
222              rd़y_counter <= 0;
223          end
224          else if(state == INTERPOLATE) begin
225              if(rdy)
226                  rd़y_counter <= rd़y_counter + 1;
227          end
228          else
229              rd़y_counter <= 0;
230      end
231
232      core_mem core_mem(
233          // port a
234          .clka(clk), // input clka
235          .wea(valid), // input [0 : 0] wea
236          .ena(~rst),
237          .addr(a(addr_write), // input [3 : 0] addra
238          .dina(i_reg), // input [15 : 0] dina
239          .douta(), // output [15 : 0] douta
240          // port b
241          .clkb(clk), // input clkb
242          .enb(~rst),
243          .web(1'b0), // input [0 : 0] web
244          .addrb(addr_read), // input [3 : 0] addrb
245          .dinb(16'b0), // input [15 : 0] dinb
246          .doutb(dout_mem) // output [15 : 0] doutb
247      );
248
249      wire signed [31:0] dout; // fractional 30
250      reg signed [15:0] outputs[0:1]; // 14 fractional bits
251
252      // store previous and current outputs of the interpolator
253      always @ (posedge clk) begin
254          if(rst) begin
255              outputs[0] <= 0;
256              outputs[1] <= 0;
257          end
258          else if(rdy_counter == 7) begin
259              outputs[0] <= dout[31:16]; // 14 fractional bits
              outputs[1] <= outputs[0];
          end
      
```

```

261    end

263 //assign out = {dout[30:15], 16'b0, rdy_counter == 7};
264 assign out = {dout[31:16], 16'b0, rdy_counter == 7};

265 wire signed [15:0] sample_current; // 14 fractional bits
266 wire signed [15:0] sample_last;
267 assign sample_current = outputs[0];
268 assign sample_last =outputs[1];

271 // mm error , 12 fractional bits
272 wire signed [15:0] mm_val_0, mm_val_1, mm_val_2, mm_val_3;
273 assign mm_val_0 = -sample_current + sample_last; // 14 fractional
274 assign mm_val_1 = sample_current + sample_last;
275 assign mm_val_2 = -sample_current - sample_last;
276 assign mm_val_3 = sample_current - sample_last;

277 // compute all four cases and pick one
278 always @(posedge clk) begin
279     if(rst) begin
280         mm_val <= 0;
281     end
282     else if(state == COMPUTEMM) begin
283         if(sample_last < 0 && sample_current < 0)
284             mm_val <= {{2{mm_val_0[15]}},mm_val_0[15:2]};
285         else if(sample_last > 0 && sample_current < 0)
286             mm_val <= {{2{mm_val_1[15]}},mm_val_1[15:2]};
287         else if(sample_last < 0 && sample_current > 0)
288             mm_val <= {{2{mm_val_2[15]}},mm_val_2[15:2]};
289         else
290             mm_val <= {{2{mm_val_3[15]}},mm_val_3[15:2]};
291     end
292 end
293

295 // OMEGA state counter
296 always @(posedge clk) begin
297     if(rst) begin
298         state_counter_omega <= 0;
299     end
300     else if(state == COMPUTEOMEGA) begin
301         state_counter_omega <= state_counter_omega + 1'b1;
302     end
303     else
304         state_counter_omega <= 0;
305 end

307 reg signed [31:0] prod_gainOmega_mmVal; // 28 fractional
308 // GAIN_OMEGA * mm_val
309 always @(posedge clk) begin
310     if(rst) begin
311         prod_gainOmega_mmVal <= 0;

```

```

313     end
314   else begin
315     // 16 fractional * 12 fractional = 28 fractional
316     prod_gainOmega_mmVal <= _gain_omega * _mm_val;
317   end
318
319   wire [32:0] omega_preclip = omega + prod_gainOmega_mmVal;
320
321   // omega = samples per symbol, 28 fractional
322   always @(posedge clk) begin
323     if(rst) begin
324       omega <= OMEGA;
325     end
326     else if(state == COMPUTEOMEGA & state_counter_omega == 1) begin
327       if(omega_preclip > MAXOMEGA)
328         omega <= MAXOMEGA;
329       else if(omega_preclip < MINOMEGA)
330         omega <= MINOMEGA;
331       else
332         omega <= omega_preclip;
333     end
334   end
335
336   // fractional offset, 28 fractional
337   always @(posedge clk) begin
338     if(rst) begin
339       mu <= MU;
340     end
341     else if(state == COMPUTEII) begin
342       mu <= {incr[27:0]};// fractional portion of incr
343     end
344   end
345
346   wire [7:0] filter_sel;
347   wire signed [31:0] round_mu;
348   wire signed [31:0] half;
349   assign half = 1'b1<<20;
350   assign round_mu = mu + half;
351   assign filter_sel = round_mu[28:21];
352
353   // register to indicate if the state is interpolate
354   reg reg_state_interpolate;
355   always @(posedge clk) begin
356     if(rst)
357       reg_state_interpolate <= 0;
358     else
359       reg_state_interpolate <= (state == INTERPOLATE);
360   end
361
362   wire nd;

```

```

363 assign nd = reg_state_interpolate & ~rdy & (state == INTERPOLATE);

365 // 8-tap interpolator
366 core_interpolator core_interpolator(
367   .sclr(~reg_state_interpolate), // input sclr
368   .clk(clk), // input clk
369   .ce(~rst), // input ce
370   .nd(nd), // input nd
371   .filter_sel(filter_sel), // input [7 : 0] filter_sel
372   .rfd(rfd), // output rfd
373   .rdy(rdy), // output rdy
374   .din(dout_mem), // input [15 : 0] din
375   .dout(dout)); // output [31 : 0] dout

376
377 endmodule

```

src/rx/clock\_recovery\_mm.v

## A.6 Symbol Correlation

```

1 'timescale 1ns / 1ps
2 ///////////////////////////////////////////////////////////////////
3 // Company:      Wireless @ VT
4 // Engineer:     Jeong-O Jeong
5 //
6 // Create Date:  01:14:33 12/11/2011
7 // Design Name:
8 // Module Name:  fir_bank_sym
9 // Project Name:
10 // Target Devices:
11 // Tool versions:
12 // Description:  Find the symbol correlations of 16 symbols
13 //
14 // Dependencies: CoreGen FIR and FIFO
15 //
16 // Revision:
17 // Revision 0.01 - File Created
18 // Additional Comments:
19 //
20 ///////////////////////////////////////////////////////////////////
21 module fir_bank_sym
22 #(parameter WIDTH=16, parameter N=16)
23 (
24   input clk,
25   input reset,
26   input strobe_in,      // 2Mhz strobe
27   input strobe_out,     // 2MHz strobe
28   input [WIDTH-1:0] din, // input chip

```

```

29   output out_rfd ,
30   output [21:0] out_0 ,      // symbol correlation output 0
31   output [21:0] out_1 ,
32   output [21:0] out_2 ,
33   output [21:0] out_3 ,
34   output [21:0] out_4 ,
35   output [21:0] out_5 ,
36   output [21:0] out_6 ,
37   output [21:0] out_7 ,
38   output [21:0] out_8 ,
39   output [21:0] out_9 ,
40   output [21:0] out_10 ,
41   output [21:0] out_11 ,
42   output [21:0] out_12 ,
43   output [21:0] out_13 ,
44   output [21:0] out_14 ,
45   output [21:0] out_15      // symbol correlation output 15
);
46
47 genvar i;
48
49
50 wire rfd [0:15];
51 wire rdy[0:15];
52 wire [21:0] dout[0:15];
53 wire fifo_full_sym [0:15];
54 wire fifo_empty_sym [0:15];
55 wire [24:0] fifo_dout_sym [0:15];
56
57 assign out_rfd = rfd[0];
58
59 // output signals
60 assign out_0 = fifo_dout_sym[0][21:0];
61 assign out_1 = fifo_dout_sym[1][21:0];
62 assign out_2 = fifo_dout_sym[2][21:0];
63 assign out_3 = fifo_dout_sym[3][21:0];
64 assign out_4 = fifo_dout_sym[4][21:0];
65 assign out_5 = fifo_dout_sym[5][21:0];
66 assign out_6 = fifo_dout_sym[6][21:0];
67 assign out_7 = fifo_dout_sym[7][21:0];
68 assign out_8 = fifo_dout_sym[8][21:0];
69 assign out_9 = fifo_dout_sym[9][21:0];
70 assign out_10 = fifo_dout_sym[10][21:0];
71 assign out_11 = fifo_dout_sym[11][21:0];
72 assign out_12 = fifo_dout_sym[12][21:0];
73 assign out_13 = fifo_dout_sym[13][21:0];
74 assign out_14 = fifo_dout_sym[14][21:0];
75 assign out_15 = fifo_dout_sym[15][21:0];
76
77 // generate 16 correlation FIR filters
78 generate
79

```

```

81  for(i=0;i<N;i=i+1) begin : coregen_fir_sym
82    coregen_fir_sym_0 coregen_fir_sym(
83      .clk(clk), // input clk
84      .filter_sel(i), // input [3 : 0] filter_sel
85      .rfd(rfd[i]), // output rfd
86      .rdy(rdy[i]), // output rdy
87      .din(din), // input [15 : 0] din
88      .dout(dout[i])); // output [21 : 0] dout
89  end
90  endgenerate

91 // generate 16 FIFO's
92 generate
93  for(i=0;i<N;i=i+1) begin : fifo_sym
94    coregen_fifo_512_25 coregen_fifo_25_sym (
95      .clk(clk), // input clk
96      .rst(reset), // input rst
97      .din({{3{dout[i][21]}},dout[i]}), // input [24 : 0] din
98      .wr_en(~fifo_full_sym[i] & rdy[i]), // input wr_en
99      .rd_en(~fifo_empty_sym[i] & strobe_out), // input rd_en
100     .dout(fifo_dout_sym[i]), // output [24 : 0] dout
101     .full(fifo_full_sym[i]), // output full
102     .empty(fifo_empty_sym[i]) // output empty
103   );
104 end
105 endgenerate
106
107 endmodule

```

src/rx/fir\_bank.sym.v

## A.7 Find Max 16-input

```

1 ///////////////////////////////////////////////////////////////////
2 // Company:      Wireless @ VT
3 // Author:       Jeong-O Jeong
4 //
5 // Date:         16:39:48 12/17/2011
6 // Design Name:
7 // Filename:     findmax.v
8 // Project Name:
9 // Target Devices:
10 // Tool versions:
11 // Description:  find maximum between two values
12 //
13 // Dependencies:
14 //
15 // Revision:

```

```

// Revision 0.01 - File Created
17 // Additional Comments:
//
19 /////////////////////////////////
'timescale 1 ns / 1 ns
21
22 module findmax
23 #(parameter WIDTH=22)
(
24   clk ,
25   ce ,
26   reset ,
27   index1 , // index of first element
28   value1 , // value of first element
29   index2 , // index of second element
30   value2 , // value of second element
31   index_out, // index of max element
32   value_out // value of max element
);
33
34
35   input    clk ;
36   input    ce ;
37   input    reset;
38   input [15:0] index1; // ufix16
39   input [15:0] index2; // ufix16
40   input signed [WIDTH-1:0] value1; // sfix16_En14
41   input signed [WIDTH-1:0] value2; // sfix16_En14
42
43   output reg [15:0] index_out; // ufix16
44   output reg [WIDTH-1:0] value_out; // sfix16_En14
45
46
47 // output the index of max element
48 always @(posedge clk)
49 begin
50   if(reset) begin
51     index_out = 0;
52   end
53   else begin
54     if(ce)
55       index_out = (value1 > value2) ? index1 : index2;
56   end
57 end
58
59 // output the value of max element
60 always @(posedge clk)
61 begin
62   if(reset)
63     value_out = 0;
64   else
65     if(ce)
66       value_out = (value1 > value2) ? value1 : value2;

```

```
67 end  
69 endmodule // findmax
```

src/rx/findmax.v

## A.8 Find Max 2-input

```
'timescale 1 ns / 1 ns  
////////////////////////////////////////////////////////////////  
// Company:      Wireless @ VT  
// Author:       Jeong-O Jeong  
//  
// Date:          16:39:59 12/17/2011  
// Design Name:  
// Filename:     findmax16.v  
// Project Name:  
// Target Devices:  
// Tool versions:  
// Description:   find the maximum among 16 different values  
//  
// Dependencies:  
//  
// Revision:  
// Revision 0.01 - File Created  
// Additional Comments:  
//  
////////////////////////////////////////////////////////////////  
module findmax16  
#(parameter WIDTH=22)  
(  
    input clk ,  
    input ce ,  
    input reset ,  
    input [WIDTH-1:0] value0 ,  
    input [WIDTH-1:0] value1 ,  
    input [WIDTH-1:0] value2 ,  
    input [WIDTH-1:0] value3 ,  
    input [WIDTH-1:0] value4 ,  
    input [WIDTH-1:0] value5 ,  
    input [WIDTH-1:0] value6 ,  
    input [WIDTH-1:0] value7 ,  
    input [WIDTH-1:0] value8 ,  
    input [WIDTH-1:0] value9 ,  
    input [WIDTH-1:0] value10 ,  
    input [WIDTH-1:0] value11 ,  
    input [WIDTH-1:0] value12 ,  
    input [WIDTH-1:0] value13 ,
```

```

41     input [WIDTH-1:0] value14 ,
42     input [WIDTH-1:0] value15 ,
43     output [15:0] index_out ,
44     output signed [WIDTH-1:0] value_out ,
45     output reg [WIDTH-1:0] passthrough
46 );
47
48 // max index of 8 values when comparing 16 values
49 wire [15:0] index_1_1 ;
50 wire [15:0] index_1_2 ;
51 wire [15:0] index_1_3 ;
52 wire [15:0] index_1_4 ;
53 wire [15:0] index_1_5 ;
54 wire [15:0] index_1_6 ;
55 wire [15:0] index_1_7 ;
56 wire [15:0] index_1_8 ;
57
58 // max index of 4 values when comparing 8 values
59 wire [15:0] index_2_1 ;
60 wire [15:0] index_2_2 ;
61 wire [15:0] index_2_3 ;
62 wire [15:0] index_2_4 ;
63
64 // max index of 2 values when comparing 4 values
65 wire [15:0] index_3_1 ;
66 wire [15:0] index_3_2 ;
67
68 // max 8 values when comparing 16 values
69 wire signed [WIDTH-1:0] value_1_1 ;
70 wire signed [WIDTH-1:0] value_1_2 ;
71 wire signed [WIDTH-1:0] value_1_3 ;
72 wire signed [WIDTH-1:0] value_1_4 ;
73 wire signed [WIDTH-1:0] value_1_5 ;
74 wire signed [WIDTH-1:0] value_1_6 ;
75 wire signed [WIDTH-1:0] value_1_7 ;
76 wire signed [WIDTH-1:0] value_1_8 ;
77
78 // max 4 values when comparing 8 values
79 wire signed [WIDTH-1:0] value_2_1 ;
80 wire signed [WIDTH-1:0] value_2_2 ;
81 wire signed [WIDTH-1:0] value_2_3 ;
82 wire signed [WIDTH-1:0] value_2_4 ;
83
84 // max 2 values when comparing 4 values
85 wire signed [WIDTH-1:0] value_3_1 ;
86 wire signed [WIDTH-1:0] value_3_2 ;
87
88 // 1st stage
89 // value0, value1
90 findmax findmax1(clk , ce , reset , 16'd0 , value0 , 16'd1 , value1 , index_1_1 ,
91   value_1_1 );

```

```

findmax findmax2(clk , ce , reset , 16'd2 , value2 , 16'd3, value3 , index_1_2 ,
    value_1_2);
92 findmax findmax3(clk , ce , reset , 16'd4 , value4 , 16'd5, value5 , index_1_3 ,
    value_1_3);
findmax findmax4(clk , ce , reset , 16'd6 , value6 , 16'd7, value7 , index_1_4 ,
    value_1_4);
94 findmax findmax5(clk , ce , reset , 16'd8 , value8 , 16'd9, value9 , index_1_5 ,
    value_1_5);
findmax findmax6(clk , ce , reset , 16'd10 ,value10 , 16'd11, value11 , index_1_6 ,
    value_1_6);
96 findmax findmax7(clk , ce , reset , 16'd12 ,value12 , 16'd13, value13 , index_1_7 ,
    value_1_7);
findmax findmax8(clk , ce , reset , 16'd14 ,value14 , 16'd15, value15 , index_1_8 ,
    value_1_8);
98
// 2nd stage
100 findmax findmax9(clk , ce , reset , index_1_1 , value_1_1 , index_1_2 , value_1_2 ,
    index_2_1 , value_2_1);
findmax findmax10(clk , ce , reset , index_1_3 , value_1_3 , index_1_4 , value_1_4 ,
    index_2_2 , value_2_2);
102 findmax findmax11(clk , ce , reset , index_1_5 , value_1_5 , index_1_6 , value_1_6 ,
    index_2_3 , value_2_3);
findmax findmax12(clk , ce , reset , index_1_7 , value_1_7 , index_1_8 , value_1_8 ,
    index_2_4 , value_2_4);
104
// 3rd stage
106 findmax findmax13(clk , ce , reset , index_2_1 , value_2_1 , index_2_2 , value_2_2 ,
    index_3_1 , value_3_1);
findmax findmax14(clk , ce , reset , index_2_3 , value_2_3 , index_2_4 , value_2_4 ,
    index_3_2 , value_3_2);
108
//4th stage
110 findmax findmax15(clk , ce , reset , index_3_1 , value_3_1 , index_3_2 , value_3_2 ,
    index_out , value_out);
112 // pass through for debug
always @(posedge clk)
begin
    if(reset)
        passthrough <= 0;
    else
        if(ce)
            passthrough <= value0;
end
118
120 endmodule // findmax16

```

src/rx/findmax16.v

## A.9 CRC-16

```
'timescale 1 ns / 1 ns
// Company:      Wireless @ VT
// Author:       Jeong-O Jeong
//
// Date:          16:39:35 12/17/2011
// Design Name:
// Filename:     crc16.v
// Project Name:
// Target Devices:
// Tool versions:
// Description:   CRC-16 for IEEE 802.15.4
//
// Dependencies:
//
// Revision:
// Revision 0.01 - File Created
// Additional Comments:
//
module crc16
(
    clk ,
    ce ,
    reset ,
    input_bit ,    // input bit
    output_crc    // output crc
);
    input clk;
    input ce;
    input reset;
    input input_bit;
    output reg [15:0] output_crc;
    wire s0 , s1 , s2 ;
    // shift registers
    always @(posedge clk or posedge reset) // asynchronous reset
    begin
        if(reset) begin
            output_crc <= 0;
        end
        else if(ce) begin
            output_crc[0] <= output_crc[1];
            output_crc[1] <= output_crc[2];
            output_crc[2] <= output_crc[3];
        end
    end
endmodule
```

```

50      output_crc[3] <= s1;
51      output_crc[4] <= output_crc[5];
52      output_crc[5] <= output_crc[6];
53      output_crc[6] <= output_crc[7];
54      output_crc[7] <= output_crc[8];
55      output_crc[8] <= output_crc[9];
56      output_crc[9] <= output_crc[10];
57      output_crc[10] <= s2;
58      output_crc[11] <= output_crc[12];
59      output_crc[12] <= output_crc[13];
60      output_crc[13] <= output_crc[14];
61      output_crc[14] <= output_crc[15];
62      output_crc[15] <= s0;
63  end
64 end
65
66 xor(s0 ,input_bit ,output_crc[0]);
67 xor(s1 ,s0 ,output_crc[4]);
68 xor(s2 ,s0 ,output_crc[11]);
69
70 endmodule // crc16

```

src/rx/crc16.v

## A.10 MAC State Machine

```

1  'timescale 1 ns / 1 ns
2  ///////////////////////////////////////////////////////////////////
3  // Company:      Wireless @ VT
4  // Author:       Jeong-O Jeong
5  //
6  // Date:         16:39:59 12/17/2011
7  // Design Name: xbee_mac.v
8  // Project Name:
9  // Target Devices:
10 // Tool versions:
11 // Description: state machine for detecting and decoding the received
12 //                  packet
13 //                  requires preamble and SFD correlation for packet detection
14 //                  outputs decoded byte and strobe for valid CRC
15 //
16 // Dependencies:
17 //
18 // Revision:
19 // Revision 0.01 – File Created
20 // Additional Comments:
21 //

```

```

////////// module xbee_mac
(
    input clk ,
    input reset ,
    input strobe_in ,           // strobe @ 2mhz
    input signed [24:0] in_p_corr , // preamble correlation
    input signed [22:0] in_s_corr , // SFD correlation
    input [3:0] in_sym ,        // symbol
    input signed [24:0] in_THP , // preamble threshold
    input signed [24:0] in_THP_NEXT , // secondary preamble threshold
    input signed [22:0] in_THS , // SFD threshold
    output strobe_sym ,        // symbol strobe @ 62.5k,
    output reg strobe_byte ,   // delayed byte strobe @ 31.25k,
    output strobe_byte_pre ,   // byte strobe 31.25khz
    output reg [7:0] out_byte , // byte output
    output reg [15:0] out_crc , // received crc OTA
    output out_crc_correct ,  // high for correct CRC
    output reg [3:0] state ,   // MAC FSM state
    output [511:0] debug      // debug port
);

// STATES
localparam FIND_PREAMBLE          = 4'd0; // search for preamble
localparam FIND_PEAK_PREAMBLE      = 4'd1; // search for peak preamble
localparam FIND_SFD                = 4'd2; // found a preamble
localparam DECODE_FRAMELENGTH     = 4'd3; // found a valid SFD
localparam DECODE_MHR              = 4'd4;
localparam DECODE_SYMBOLS          = 4'd5;
localparam DECODE_CRC_1ST_HALF    = 4'd6;
localparam DECODE_CRC_2ND_HALF    = 4'd7;
localparam CHECK_CRC               = 4'd8;

// control signals
reg [15:0] counter_since_preamble_found; // interval between preamble and SFD
reg [15:0] counter_since_sfd_found;       // interval between SFD and first
                                           symbol
reg [7:0] framelength;

wire strobe_crc;
wire strobe_preamble;

cic_strober #( .WIDTH(8) ) strober_3(
    .clock(clk),
    .reset(reset),
    .enable(state == DECODE_FRAMELENGTH || state == DECODE_MHR || state ==
            DECODE_SYMBOLS || state == DECODE_CRC_1ST_HALF || state ==
            DECODE_CRC_2ND_HALF || state == CHECK_CRC),

```

```

    .rate(32),
71   .strobe_fast(strobe_in),
    .strobe_slow(strobe_sym)
73 );
75 cic_strober #(.WIDTH(8)) // output 62.5k sym/s * 4 bit/sym = 250kbit/s
// 250kbit/s = 31.25B/s
77 strober_4 (
    .clock(clk),
79   .reset(reset),
    .enable(state == DECODE_SYMBOLS || state == DECODE_MHR || state ==
        DECODE_CRC_1ST_HALF || state == DECODE_CRC_2ND_HALF || state ==
        CHECK_CRC),
81   .rate(2),
    .strobe_fast(strobe_sym),
83   .strobe_slow(strobe_byte_pre)
);
85 cic_strober #(.WIDTH(8)) // strober @ 250khz
87 strober_5(
    .clock(clk),
89   .reset(reset),
    .enable(state == DECODE_MHR || state == DECODE_SYMBOLS || state ==
        DECODE_CRC_1ST_HALF),
91   .rate(8),
    .strobe_fast(strobe_in),
93   .strobe_slow(strobe_crc)
);
95 cic_strober #(.WIDTH(8)) // output 2MHz/32 clk = 62.5k for finding peak
    preamble
97 strober_6(
    .clock(clk),
99   .reset(reset),
    .enable(state == FIND_PEAK_PREAMBLE),
101  .rate(32),
    .strobe_fast(strobe_in),
    .strobe_slow(strobe_preamble)
);
105 // byte strobe
107 always @(posedge clk) begin
    if(reset) begin
        strobe_byte <= 0;
    end
    else begin
        strobe_byte <= strobe_byte_pre;
    end
end
115 // Preamble and SFD thresholds

```

```

117 wire signed [24:0] THP;
118 wire signed [24:0] THP_NEXT;
119 wire signed [22:0] THS; // 130*(pi/4)*2^13

121 assign THP = in_THP;
122 assign THP_NEXT = in_THP_NEXT;
123 assign THS = in_THS;

125 // control signals for FSM
126 wire [15:0] decode_symbols_limit;
127 wire [15:0] decode_crc_1st_half_limit;
128 wire [15:0] decode_crc_2nd_half_limit;
129 wire [15:0] check_crc_limit;

131 assign decode_symbols_limit = 64 + ((framelen-2)*2 * 32);
132 assign decode_crc_1st_half_limit = 64 + 32 + ((framelen-2)*2 * 32);
133 assign decode_crc_2nd_half_limit = 64 + ((framelen)*2 * 32);
134 assign check_crc_limit = 64 + ((framelen)*2 * 32) + 64;

135 //=====
136 // FSM
137 //=====

139 always @(posedge clk) begin
140   if (reset) begin
141     state <= FIND_PREAMBLE;
142   end
143   else begin
144     if(strobe_in) begin // 2mhz strobe
145       case (state)
146         FIND_PREAMBLE:
147           if(in_p_corr > THP) begin
148             state <= FIND_SFD;
149           end
150           else begin
151             state <= FIND_PREAMBLE;
152           end
153
154         FIND_SFD:
155           if(in_s_corr > THS)
156             state <= DECODE_FRAMELENGTH;
157           else if(counter_since_preamble_found > 16'd256)
158             state <= FIND_PREAMBLE;
159           else
160             state <= FIND_SFD;
161
162         DECODE_FRAMELENGTH:
163           if(counter_since_sfd_found <= 64) // This will always be false at
164             some point
165             state <= DECODE_FRAMELENGTH;
166           else
167             state <= DECODE_MHR;

```

```

167
168     DECODE_MHR:
169         if(counter_since_sfd_found <= 64 + 704 /* =11(bytes)*2(sym/byte)*32(
170             chips/sym) */)
171             state <= DECODE_MHR;
172         else
173             state <= DECODE_SYMBOLS;
174
175     DECODE_SYMBOLS:
176         if(counter_since_sfd_found <= decode_symbols_limit)
177             state <= DECODE_SYMBOLS;
178         else
179             state <= DECODE_CRC_1ST_HALF;
180
181     DECODE_CRC_1ST_HALF:
182         if(counter_since_sfd_found <= decode_crc_1st_half_limit)
183             state <= DECODE_CRC_1ST_HALF;
184         else
185             state <= DECODE_CRC_2ND_HALF;
186
187     DECODE_CRC_2ND_HALF:
188         if(counter_since_sfd_found <= decode_crc_2nd_half_limit)
189             state <= DECODE_CRC_2ND_HALF;
190         else
191             state <= CHECK_CRC;
192
193     CHECK_CRC:
194         if(counter_since_sfd_found <= check_crc_limit) begin
195             state <= CHECK_CRC;
196         end
197         else begin
198             state <= FIND_PREAMBLE;
199         end
200     endcase
201     end
202 end
203
204 //=====
205 // counters between preamble and SFD states
206 //=====
207 // start counting if preamble found
208 always @(posedge clk)
209 begin
210     if(reset) begin
211         counter_since_preamble_found <= 9'd0;
212     end
213     else if(strobe_in) begin // 2mhz
214         begin
215             if(state == FIND_SFD)
216                 counter_since_preamble_found <= counter_since_preamble_found + 9'd1;

```

```

217         else
218             counter_since_preamble_found <= 9'd0;
219     end
220 end
221
223 // start counting if sfd found
224 always @(posedge clk)
225 begin
226     if(reset)
227         counter_since_sfd_found <= 0;
228     else if(strobe_in) begin // 2mhz
229         if(state == DECODE_FRAMELENGTH || state == DECODE_MHR || state ==
230             DECODE_SYMBOLS || state == DECODE_CRC_1ST_HALF || state ==
231             DECODE_CRC_2ND_HALF || state == CHECK_CRC)
232             counter_since_sfd_found <= counter_since_sfd_found + 1;
233         else
234             counter_since_sfd_found <= 0;
235     end
236 end
237
238 // save framelength
239 always @(posedge clk)
240 begin
241     if(reset)
242         framelength <= 8'b0111_1111;
243     else if(strobe_sym & state == DECODE_FRAMELENGTH) begin
244         framelength[7:4] <= in_sym;
245         framelength[3:0] <= framelength[7:4];
246     end
247
248     // fix framelength if bigger than max allowed
249     if(framelength >= 8'b0111_1111 && state == DECODE_MHR)
250         framelength <= 8'b0111_1111;
251 end
252
253 // save byte
254 always @(posedge clk)
255 begin
256     if(reset)
257         out_byte[7:0] <= 0;
258     else if(strobe_sym & (state == DECODE_SYMBOLS || state == DECODE_MHR)) begin
259         out_byte[7:4] <= in_sym;
260         out_byte[3:0] <= out_byte[7:4];
261     end
262 end
263
264 // save CRC
265 always @(posedge clk)
266 begin
267     if(reset)

```

```

267     out_crc[15:0] <= 0;
268 else if(strobe_sym & (state == DECODE_CRC_1ST_HALF || state ==
269           DECODE_CRC_2ND_HALF)) begin
270   out_crc[15:12] <= in_sym;
271   out_crc[11:8] <= out_crc[15:12];
272   out_crc[7:4] <= out_crc[11:8];
273   out_crc[3:0] <= out_crc[7:4];
274 end
275 end

276 // count between symbols
277 reg [1:0] counter_between_symbols;
278
279 always @(posedge clk)
280 begin
281   if(reset)
282     counter_between_symbols <= 0;
283   else if((state == DECODEMHR || state == DECODE_SYMBOLS || state ==
284             DECODE_CRC_1ST_HALF)) begin
285     if(strobe_crc) begin
286       counter_between_symbols <= counter_between_symbols + 1;
287     end
288   end
289 end

290 // register symbol
291 reg [3:0] persistent_sym;
292 always @(posedge clk) begin
293   if(reset)
294     persistent_sym <= 0;
295   else if((state == DECODEMHR || state == DECODE_SYMBOLS)) begin
296     if(strobe_sym) begin
297       persistent_sym <= in_sym[3:0];
298     end
299   end
300   else if(state == FIND_PREAMBLE || state == FIND_SFD)
301     persistent_sym <= 0;
302 end

303 // delayed strobe_crc for the multiplexer
304 reg strobe_crc_delayed;
305 always @(posedge clk) begin
306   if(reset) begin
307     strobe_crc_delayed <= 0;
308   end
309   else
310     strobe_crc_delayed <= strobe_crc;
311 end

312 reg input_bit;
313 // multiplexer

```

```

315  always @(counter_between_symbols) begin
316    if(reset) begin
317      input_bit <= 0;
318    end
319    else if((state == DECODEMHR || state == DECODE_SYMBOLS || state ==
320      DECODE_CRC_1ST_HALF)) begin
321      if(strobe_crc_delayed) begin
322        if(counter_between_symbols == 0)
323          input_bit <= persistent_sym[0];                      // get it from input
324        else if(counter_between_symbols == 1)
325          input_bit <= persistent_sym[1];                      // get it from
326            symbol_persistent
327        else if(counter_between_symbols == 2)
328          input_bit <= persistent_sym[2];
329        else if(counter_between_symbols == 3)
330          input_bit <= persistent_sym[3];
331      end
332    end
333    else
334      input_bit <= 0;
335  end
336
337  // CRC16
338  wire [15:0] output_crc;
339  crc16 u_crc16(
340    .clk(clk),
341    .ce(strobe_crc_delayed & (state == DECODEMHR || state == DECODE_SYMBOLS ||
342      state == DECODE_CRC_1ST_HALF)),
343    .reset(reset || state == FIND_PREAMBLE || state == FIND_SFD),
344    .input_bit(input_bit),
345    .output_crc(output_crc)
346  );
347
348  // strobe if CRC is correct
349  assign out_crc_correct = (state == CHECK_CRC && (out_crc == output_crc));
350
351  // debug port
352  assign debug = {out_crc, output_crc};
353
354 endmodule

```

src/rx/xbee\_mac.v

# Appendix B

## Verilog Source Code for IEEE 802.15.4 Transmitter on USRP N210's FPGA

### B.1 Top Level Transmitter

```
1  'timescale 1 ns / 1 ns
2  ///////////////////////////////////////////////////////////////////
3  // Company:      Wireless @ VT
4  // Author:       Jeong-O Jeong
5  //
6  // Date:         10/30/2011
7  // Design Name:
8  // Filename:     zb_ieee_802_15_4_mod.v
9  // Project Name:
10 // Target Devices:
11 // Tool versions:
12 // Description:   Top file for IEEE 802.15.4 Transmitter
13 //
14 // Dependencies:
15 //
16 // Revision:
17 // Revision 0.01 – File Created
18 // Additional Comments:
19 //
20 ///////////////////////////////////////////////////////////////////
21
22 module zb_ieee_802_15_4_mod
23 #(parameter WIDTH=32)
24 (
25   input clk ,
26   input ce ,
27   input rst ,
28   input [7:0] byte ,           // input byte to be modulated
```

```

29   output [WIDTH-1:0] qpsk_pulse ,           // baseband qpsk pulse
30   output [WIDTH-1:0] oqpsk_pulse ,          // baseband oqpsk pulse
31   input  strobe_in ,                      // input strobe @ 4 MHz
32   output strobe_tx ,                     // output strobe @ byte rate
33   output strobe_out_qpsk_sym ,           // qpsk symbol strobe
34   output strobe_out_chips                // chip strobe
35 );

37
38   wire [3:0] bitpos ;
39   wire [31:0] pulse ;
40   wire [31:0] pulse_delayed ;
41   wire [1:0] chunk ;
42   wire [31:0] chip ;
43   wire [15:0] iphase , qphase , iphase_delayed , qphase_delayed ;
44   wire [WIDTH-1:0] output_data ;
45   wire [WIDTH-1:0] symbol ;

46 assign qpsk_pulse = pulse ;
47 assign oqpsk_pulse = pulse_delayed ;
48
49 assign iphase = pulse[31:16];
50 assign qphase = pulse[15:0];
51 assign iphase_delayed = pulse_delayed[31:16];
52 assign qphase_delayed = pulse_delayed[15:0];

53
54 wire strobe_out_pulse ;
55 wire strobe_in_byte ;
56
57 // FIFO signals
58 wire full_1 , empty_1 ;
59 wire full_2 , empty_2 ;
60 wire full_3 , empty_3 ;
61 wire full_4 , empty_4 ;
62 wire full_5 , empty_5 ;
63 wire full_6 , empty_6 ;
64
65 assign strobe_tx = strobe_in_byte ; // strobe @ byte rate
66 assign strobe_out_pulse = strobe_in ; // strobe_tx from dsp_tx_core @ 4 MHz

67
68 // input 4 MHz, output 1 MHz strobe
69 cic_strober #(.WIDTH(8))
70 strober_2(
71   .clock(clk) ,
72   .reset(rst) ,
73   .enable(1'b1) ,
74   .rate(4) ,
75   .strobe_fast(strobe_out_pulse) ,
76   .strobe_slow(strobe_out_qpsk_sym)
77 );
78
79

```

```

// output strobe @ quad-bit symbol rate
81 cic_strober #( .WIDTH(8) )
82 strober_3(
83   .clock(clk),
84   .reset(rst),
85   .enable(1'b1),
86   .rate(16),
87   .strobe_fast(strobe_out_qpsk_sym),
88   .strobe_slow(strobe_out_chips)
89 );
90
91 // output strobe @ byte rate
92 cic_strober #( .WIDTH(8) )
93 strober_4 (
94   .clock(clk),
95   .reset(rst),
96   .enable(1'b1),
97   .rate(2),
98   .strobe_fast(strobe_out_chips),
99   .strobe_slow(strobe_in_byte)
100 );
101
102 // convert bytes to 32-chip sequence symbols
103 zbsymbols_to_chips_fifo u_zb_symbols_to_chips_fifo
104 (
105   .clk(clk),
106   .ce(1'b1),
107   .reset(rst),
108   .strobe_in(strobe_in_byte), // strobe @ byte rate
109   .strobe_out(strobe_out_chips), // strobe @ quad-bit symbol rate
110   .byte(byte), // input byte
111   .chip(chip), // output 32-chip symbol
112   .full(full_1),
113   .empty(empty_1)
114 );
115
116 // convert chips to 2-bit symbols
117 gr_packed_to_unpacked_iififo u_gr_packed_to_unpacked_iififo (
118   .clk(clk),
119   .ce(1'b1),
120   .reset(rst),
121   .strobe_in(strobe_out_chips), // strobe @ quad-bit symbol rate
122   .strobe_out(strobe_out_qpsk_sym), // strobe @ qpsk symbol rate
123   .full(full_2),
124   .empty(empty_2),
125   .packed(chip), // input 32-chip symbol
126   .unpacked(chunk), // output 2-bit symbol
127   .bitpos(bitpos)
128 );
129
130 // map 2-bit symbols to QPSK constellation

```

```

131 gr_chunks_to_symbols_ic_fifo u_gr_chunks_to_symbols_ic_fifo (
132     .clk(clk), // fastest clk 100 MHz
133     .ce(1'b1),
134     .strobe_in(strobe_out_qpsk_sym), // strobe @ qpsk symbol rate
135     .strobe_out(strobe_out_qpsk_sym), // strobe @ qpsk symbol rate
136     .full(full_3),
137     .empty(empty_3),
138     .reset(rst),
139     .chunk(chunk), // input 2-bit symbol
140     .symbol(symbol) // output qpsk constellation symbol
141 );
142
143 // upsample QPSK symbols by 4
144 sp_upsample_cc_fifo #(.L(4))
145 u_sp_upsample_cc_fifo
146 (
147     .clk(clk),
148     .ce(1'b1),
149     .reset(rst),
150     .strobe_in(strobe_out_qpsk_sym), // strobe @ qpsk symbol rate
151     .strobe_out(strobe_out_pulse), // strobe @ 4 MHz
152     .full(full_4),
153     .empty(empty_4),
154     .input_data(symbol), // complex qpsk symbol
155     .output_data(output_data) // output upsampled qpsk symbol
156 );
157
158 // half-sin pulse shaper
159 //zb_half_sin_pulse_no_mult_fifo u_zb_half_sin_pulse_no_mult_fifo
160 zb_half_sin_pulse_fifo u_zb_half_sin_pulse_no_mult_fifo
161 (
162     .clk(clk),
163     .ce(1'b1),
164     .reset(rst),
165     .strobe_in(strobe_out_pulse), // 4 MHz strobe
166     .strobe_out(strobe_out_pulse), // 4 MHz strobe
167     .full(full_5),
168     .empty(empty_5),
169     .symbol(output_data), // 32 bit, 16-bit in-phase, 16-bit q-phase
170     .out(pulse) // output 32 bit I and Q baseband pulse
171 );
172
173 // delay quadarture to make QPSK to O-QPSK
174 zb_delay_cc_fifo u_zb_delay_cc_fifo
175 (
176     .clk(clk),
177     .ce(1'b1),
178     .reset(rst),
179     .strobe_in(strobe_out_pulse), // 4 MHz strobe
180     .strobe_out(strobe_out_pulse), // 4 MHz strobe
181     .full(full_6),

```

```

183     .empty(empty_6),
184     .input_data(pulse),
185     .output_data(pulse_delayed)
186   );
187 endmodule

```

src/tx/zb\_ieee\_802\_15\_4\_mod.v

## B.2 Symbols to Chips

```

1  'timescale 1 ns / 1 ns
2  ///////////////////////////////////////////////////////////////////
3  // Company:      Wireless @ VT
4  // Author:       Jeong-O Jeong
5  //
6  // Date:          10/17/2011
7  // Design Name:  zb_symbols_to_chips_fifo.v
8  // Project Name:
9  // Target Devices:
10 // Tool versions:
11 // Description:  Map quad-bits to chips
12 //
13 // Dependencies:
14 //
15 // Revision:
16 // Revision 0.01 – File Created
17 // Additional Comments:
18 //
19 ///////////////////////////////////////////////////////////////////
20 module zb_symbols_to_chips_fifo
21 (
22   input clk,
23   input ce,
24   input strobe_in,           // strobe @ byte rate
25   input strobe_out,          // strobe @ symbol rate
26   input reset,
27   input [7:0] byte,          // input byte
28   output reg [31:0] chip,    // output chip sequence
29   output full,              // FIFO full
30   output empty               // FIFO empty
31 );
32
33 // store upper and lower 4 bits
34 wire [3:0] lower;
35 wire [3:0] upper;
36
37

```

```

// signal to switch between upper and lower 4 bits
39 reg msb_or_lsb;

41 // FIFO signals
42 wire [31:0] dout;
43 wire wr_en, rd_en;
44 assign wr_en = strobe_in && ~full;
45 assign rd_en = strobe_in && ~empty;

47 // fifo to store input
48 fifo_generator_v6_2 fifo(
49   .rst(reset),
50   .wr_clk(clk),
51   .rd_clk(clk),
52   .din(byte),
53   .wr_en(wr_en),
54   .rd_en(rd_en),
55   .dout(dout),
56   .full(full),
57   .empty(empty)
58 );
59

// switch between upper and lower 4 bits
61 always @(posedge clk) begin
62   if(reset) begin
63     msb_or_lsb <= 0;
64   end
65   else if(strobe_in)
66     msb_or_lsb <= 0;
67   else if(strobe_out) begin
68     msb_or_lsb <= ~msb_or_lsb;
69   end
70 end
71

// split into lower and upper 4 bits
73 assign lower = dout[3:0];
74 assign upper = dout[7:4];
75

// lower 4 bits first
77 // map 4 bits to chips
78 always @(msb_or_lsb or lower or upper) begin
79   if(reset)
80     chip <= 0;
81   else begin
82     if(~msb_or_lsb) begin
83       case(lower)
84         0: chip <= 3653456430;
85         1: chip <= 3986437410;
86         2: chip <= 786023250;
87         3: chip <= 585997365;
88         4: chip <= 1378802115;

```

```

89      5: chip <= 891481500;
91      6: chip <= 3276943065;
93      7: chip <= 2620728045;
95      8: chip <= 2358642555;
97      9: chip <= 3100205175;
99     10: chip <= 2072811015;
101    11: chip <= 2008598880;
103    12: chip <= 125537430;
105    13: chip <= 1618458825;
107    14: chip <= 2517072780;
109    15: chip <= 3378542520;
111    default: chip <= 0;
113    endcase
115  end
117  else begin
119    case(upper)
121      0: chip <= 3653456430;
123      1: chip <= 3986437410;
125      2: chip <= 786023250;
127      3: chip <= 585997365;
129      4: chip <= 1378802115;
131      5: chip <= 891481500;
133      6: chip <= 3276943065;
135      7: chip <= 2620728045;
137      8: chip <= 2358642555;
139      9: chip <= 3100205175;
141      10: chip <= 2072811015;
143      11: chip <= 2008598880;
145      12: chip <= 125537430;
147      13: chip <= 1618458825;
149      14: chip <= 2517072780;
151      15: chip <= 3378542520;
153      default: chip <= 0;
155      endcase
157    end
159  end
161 end
163
165 endmodule

```

src/tx/zb\_symbols\_to\_chips\_fifo.v

### B.3 GNU Radio Packed to Unpacked

```

1  'timescale 1 ns / 1 ns
2  ///////////////////////////////////////////////////////////////////
3 // Company:      Wireless @ VT
// Author:       Jeong-O Jeong

```

```

5 // 
6 // Date:          09/19/2011
7 // Design Name:
8 // Filename:      gr_packed_to_unpacked_ii_fifo.v
9 // Project Name:
10 // Target Devices:
11 // Tool versions:
12 // Description:   break 32-chip sequence into 16 2-bit symbols
13 //
14 // Dependencies:
15 //
16 // Revision:
17 // Revision 0.01 – File Created
18 // Additional Comments:
19 //
20 /////////////////////////////////
21
22 module gr_packed_to_unpacked_ii_fifo
23 #(parameter BITS_PER_CHUNK=2)
24 (
25     input clk ,
26     input ce ,
27     input reset ,
28     input strobe_in ,                                // strobe @ quad-bit symbol rate
29     input strobe_out ,                               // strobe @ qpsk symbol rate
30     output full ,                                  // FIFO full
31     output empty,                                 // FIFO empty
32     input [31:0] packed ,                           // 32-chip sequence
33     output reg [BITS_PER_CHUNK-1:0] unpacked ,    // 2-chip symbol
34     output reg [3:0] bitpos                      // 0-15 bi-chip position
35 );
36
37 // output of FIFO (chip sequence)
38 wire [31:0] dout;
39
40 // ASSUME MSB, 2 bits per chunk
41 // unpacked
42 always @(bitpos or dout or reset) begin
43     if(reset)
44         unpacked <= 0;
45     else begin
46         case(bitpos)
47             0: unpacked <= dout[31:30];
48             1: unpacked <= dout[29:28];
49             2: unpacked <= dout[27:26];
50             3: unpacked <= dout[25:24];
51             4: unpacked <= dout[23:22];
52             5: unpacked <= dout[21:20];
53             6: unpacked <= dout[19:18];
54             7: unpacked <= dout[17:16];
55             8: unpacked <= dout[15:14];

```

```

57      9: unpacked <= dout[13:12];
58      10: unpacked <= dout[11:10];
59      11: unpacked <= dout[9:8];
60      12: unpacked <= dout[7:6];
61      13: unpacked <= dout[5:4];
62      14: unpacked <= dout[3:2];
63      15: unpacked <= dout[1:0];
64  endcase
65 end
66
67 // FIFO signals
68 wire wr_en, rd_en;
69 assign wr_en = strobe_in && ~full;
70 assign rd_en = strobe_in && ~empty;
71
72 // FIFO to store input
73 fifo_generator_v6_2 fifo(
74     .rst(reset),
75     .wr_clk(clk),
76     .rd_clk(clk),
77     .din(packed),
78     .wr_en(wr_en),
79     .rd_en(rd_en),
80     .dout(dout),
81     .full(full),
82     .empty(empty)
83 );
84
85 // bitpos
86 // control which two chips out of 32 chips to map to output
87 always @(posedge clk or posedge reset or posedge strobe_in) begin
88     if(reset)
89         bitpos <= 0;
90     else if(strobe_in)
91         bitpos <= 0;
92     else begin
93         if(strobe_out)
94             bitpos <= bitpos + 1;
95     end
96 end
97
98 endmodule

```

src/tx/gr\_packed\_to\_unpacked\_ii\_fifo.v

## B.4 GNU Radio Chunks to Symbols

```

'timescale 1 ns / 1 ns
2 ///////////////////////////////////////////////////////////////////
// Company:      Wireless @ VT
4 // Author:       Jeong-O Jeong
//
6 // Date:          09/19/2011
// Design Name:
8 // Filename:     gr_chunks_to_symbols_ic_fifo.v
// Project Name:
10 // Target Devices:
// Tool versions:
12 // Description: Convert 2 bits into QPSK symbols
//
14 // Dependencies:
//
16 // Revision:
// Revision 0.01 - File Created
18 // Additional Comments:
//
20 ///////////////////////////////////////////////////////////////////
22 module gr_chunks_to_symbols_ic_fifo
#(  parameter BITS_PER_CHUNK=2,
24 parameter WIDTH=32)
(
26   input clk ,
27   input ce ,
28   input reset ,
29   input strobe_in ,           // strobe @ qpsk symbol rate
30   input strobe_out ,         // strobe @ qpsk symbol rate
31   output full ,             // FIFO full
32   output empty ,            // FIFO empty
33   input [BITS_PER_CHUNK-1:0] chunk, // 32 bit integer
34   output reg [WIDTH-1:0] symbol // MSB=in-phase , LSB=q-phase
);
36
38 // FIFO signals
39 wire [31:0] dout;
40 wire wr_en , rd_en ;
42 // ASSUME QPSK
43 // symbol
44 // -1 : 16'b1111111111111111
45 // 1 : 16'b0000000000000001
46 always @(reset or dout) begin
47   if(reset)
48     symbol <= 0;
49   else begin
50     case(dout[1:0])

```

```

52      0: symbol <= 32'b1000000000000000_1000000000000000; // -1-1j
53      1: symbol <= 32'b1000000000000000_0111111111111111; // -1+1j
54      2: symbol <= 32'b0111111111111111_1000000000000000; // 1-1j
55      3: symbol <= 32'b0111111111111111_0111111111111111; // 1+1j
56    endcase
57  end
58
59 // FIFO to store input
60 assign wr_en = strobe_in && ~full;
61 assign rd_en = strobe_in && ~empty;
62
63 fifo_generator_v6_2 fifo(
64   .rst(reset),
65   .wr_clk(clk),
66   .rd_clk(clk),
67   .din(chunk),
68   .wr_en(wr_en),
69   .rd_en(rd_en),
70   .dout(dout),
71   .full(full),
72   .empty(empty)
73 );
74
75 endmodule

```

src/tx/gr\_chunks\_to\_symbols\_ic\_fifo.v

## B.5 Upsampler K=4

```

1  'timescale 1 ns / 1 ns
2  ///////////////////////////////////////////////////////////////////
3  // Company:      Wireless @ VT
4  // Author:       Jeong-O Jeong
5  //
6  // Date:          10/18/2011
7  // Design Name:
8  // Filename:     sp_upsample_cc_fifo
9  // Project Name:
10 // Target Devices:
11 // Tool versions:
12 // Description:   Upsample by a factor of 4
13 //
14 // Dependencies:
15 //
16 // Revision:
17 // Revision 0.01 - File Created
18 // Additional Comments:

```

```

19 ///////////////////////////////////////////////////////////////////
21 module sp_upsample_cc_fifo
23 # ( parameter WIDTH=32,
24   parameter L=4           // upsample factor
25 )
26 (
27   input clk ,
28   input ce ,
29   input reset ,
30   input strobe_in ,          // strobe @ qpsk symbol rate
31   input strobe_out ,         // strobe @ 4 MHz
32   output full ,
33   output empty ,
34   input [WIDTH-1:0] input_data , // I and Q baseband input
35   output reg [WIDTH-1:0] output_data // upsampled output
36 );
37
38 wire [31:0] dout;
39 reg [31:0] counter; // 32bit should be big enough
40
41 // counter for upsampling
42 always @(posedge clk or posedge reset) begin
43   if(reset) begin
44     counter <= 0;
45   end
46   else if(strobe_in)
47     counter <= 0;
48   else begin
49     if(strobe_out) begin
50       if(counter == L-1)
51         counter <= 0;
52       else
53         counter <= counter + 1'b1;
54     end
55   end
56 end
57
58 // output input sample and insert zeros for the rest
59 always @(posedge clk or posedge reset) begin
60   if(reset)
61     output_data <= 0;
62   else begin
63     if(strobe_out) begin
64       if(counter == 0)
65         output_data <= dout;
66       else
67         output_data <= 0;
68     end
69   end

```

```

1 end
71
2 // FIFO for input
3 wire wr_en, rd_en;
4 assign wr_en = strobe_in && ~full;
5 assign rd_en = strobe_in && ~empty;
6
7 fifo_generator_v6_2 fifo(
8   .rst(reset),
9   .wr_clk(clk),
10  .rd_clk(clk),
11  .din(input_data),
12  .wr_en(wr_en),
13  .rd_en(rd_en),
14  .dout(dout),
15  .full(full),
16  .empty(empty)
17 );
18
19 endmodule

```

src/tx/sp\_upsample\_cc\_fifo.v

## B.6 Half-Sine Pulse Shaper

```

1 'timescale 1 ns / 1 ns
2 /////////////////////////////////
3 // Company:      Wireless @ VT
4 // Author:       Jeong-O Jeong
5 //
6 // Date:          10/07/2011
7 // Design Name:  zb_half_sin_pulse_fifo.v
8 // Project Name: 
9 // Target Devices:
10 // Tool versions:
11 // Description:  Half-sine pulse shaper
12 //
13 // Dependencies:
14 //
15 // Revision:
16 // Revision 0.01 – File Created
17 // Additional Comments:
18 //
19 /////////////////////////////////
20 module zb_half_sin_pulse_fifo
21 # (parameter WIDTH=32)
22 (
23

```

```

25    input clk ,
26    input ce ,
27    input reset ,
28    input strobe_in ,           // strobe @ 4 MHz
29    input strobe_out ,         // strobe @ 4 MHz
30    output full ,
31    output empty ,
32    input [WIDTH-1:0] symbol , // 32 bit , 16-bit I, 16-bit Q
33    output reg [WIDTH-1:0] out // 32 bit , 16-bit I, 16-bit Q
);
// Samples per symbol = 4
35
36    wire [31:0] dout;
37
38 // input signals
39    wire signed [15:0] iphase = dout[31:16];
40    wire signed [15:0] qphase = dout[15:0];
41
42 // output signals
43    wire signed [31:0] i_out0 , q_out0;
44    wire signed [31:0] i_out1 , q_out1;
45    wire signed [31:0] i_out2 , q_out2;
46    wire signed [31:0] i_out3 , q_out3;
47
48    reg signed [15:0] i_z0 , i_z1 , i_z2 , i_z3 ;
49    reg signed [15:0] q_z0 , q_z1 , q_z2 , q_z3 ;
50
51 // (i , q) * 0.70710678 = 23170/32768
52    mult i_mult0 (.clk(clk) , .ce(ce) , .reset(reset) , .a(iphase) , .b(16'
53      b0101_1010_1000_0010) , .out(i_out0));
54    mult i_mult1 (.clk(clk) , .ce(ce) , .reset(reset) , .a(iphase) , .b(16'
55      b0111_1111_1111_1111) , .out(i_out1));
56    mult i_mult2 (.clk(clk) , .ce(ce) , .reset(reset) , .a(iphase) , .b(16'
57      b0101_1010_1000_0010) , .out(i_out2));
58    mult i_mult3 (.clk(clk) , .ce(ce) , .reset(reset) , .a(iphase) , .b(16'
59      b0000_0000_0000_0000) , .out(i_out3));
60
61 // (i , q) * 0.70710678 = 23170/32768
62    mult q_mult0 (.clk(clk) , .ce(ce) , .reset(reset) , .a(qphase) , .b(16'
63      b0101_1010_1000_0010) , .out(q_out0));
64    mult q_mult1 (.clk(clk) , .ce(ce) , .reset(reset) , .a(qphase) , .b(16'
65      b0111_1111_1111_1111) , .out(q_out1));
66    mult q_mult2 (.clk(clk) , .ce(ce) , .reset(reset) , .a(qphase) , .b(16'
67      b0101_1010_1000_0010) , .out(q_out2));
68    mult q_mult3 (.clk(clk) , .ce(ce) , .reset(reset) , .a(qphase) , .b(16'
69      b0000_0000_0000_0000) , .out(q_out3));
70
71 // pipeline registers
72 // simple FIR filter
73 always @(posedge clk) begin
74   if(reset) begin

```

```

67      i_z0 <= 0;
68      i_z1 <= 0;
69      i_z2 <= 0;
70  end
71 else begin
72   if(strobe_out) begin
73     i_z0 <= i_out0[31:16];
74     i_z1 <= i_z0 + i_out1[31:16];
75     i_z2 <= i_z1 + i_out2[31:16];
76   end
77 end
78
79 always @ (posedge clk) begin
80   if(reset) begin
81     q_z0 <= 0;
82     q_z1 <= 0;
83     q_z2 <= 0;
84   end
85 else begin
86   if(strobe_out) begin
87     q_z0 <= q_out0[31:16];
88     q_z1 <= q_z0 + q_out1[31:16];
89     q_z2 <= q_z1 + q_out2[31:16];
90   end
91 end
92
93 end
94
95 // register output
96 always @ (posedge clk) begin
97   if(reset) begin
98     out <= 0;
99   end
100 else begin
101   if(strobe_out) begin
102     out <= {i_z2 + i_out3[31:16], q_z2 + q_out3[31:16]};
103   end
104 end
105
106
107 // FIFO for input
108 wire wr_en, rd_en;
109 assign wr_en = strobe_in && ~full;
110 assign rd_en = strobe_in && ~empty;
111
112 fifo_generator_v6_2 fifo(
113   .rst(reset),
114   .wr_clk(clk),
115   .rd_clk(clk),
116   .din(symbol),
117   .wr_en(wr_en),

```

```

119     .rd_en(rd_en),
120     .dout(dout),
121     .full(full),
122     .empty(empty)
123 );
endmodule

```

src/tx/zb\_half\_sin\_pulse\_fifo.v

## B.7 Delay Quadrature

```

'timescale 1 ns / 1 ns
////////////////////////////////////////////////////////////////
// Company:      Wireless @ VT
// Author:       Jeong-O Jeong
//
// Date:          10/07/2011
// Design Name:
// Filename:     zb_delay_cc_fifo.v
// Project Name:
// Target Devices:
// Tool versions:
// Description:   delay quadrature phase
//
// Dependencies:
//
// Revision:
// Revision 0.01 - File Created
// Additional Comments:
//
////////////////////////////////////////////////////////////////

22 module zb_delay_cc_fifo
# ( parameter WIDTH=32 )
(
24   input clk,
26   input ce,
27   input reset,
28   input strobe_in,           // 4 MHz strobe
29   input strobe_out,          // 4 MHz strobe
30   output full,
31   output empty,
32   input [WIDTH-1:0] input_data, // input
33   output reg [WIDTH-1:0] output_data // output with delayed quadrature
34 );
35
36 wire [31:0] dout;

```

```

38 wire [15:0] iphase;
39 reg [15:0] qphase[0:1];
40
41 assign iphase = dout[31:16];
42
43 // delay pipeline for quadrature component
44 always @(posedge clk) begin
45   if(reset) begin
46     qphase[0] <= 0;
47     qphase[1] <= 0;
48   end
49   else if(strobe_out) begin
50     qphase[0] <= dout[15:0];
51     qphase[1] <= qphase[0];
52   end
53 end
54
55 // register output
56 always @(posedge clk) begin
57   if(reset)
58     output_data <= 0;
59   else begin
60     if(strobe_out) begin
61       output_data <= {iphase,qphase[1]};
62     end
63   end
64 end
65
66 // FIFO for input
67 wire wr_en, rd_en;
68 assign wr_en = strobe_in && ~full;
69 assign rd_en = strobe_in && ~empty;
70
71 fifo_generator_v6_2 fifo(
72   .rst(reset),
73   .wr_clk(clk),
74   .rd_clk(clk),
75   .din(input_data),
76   .wr_en(wr_en),
77   .rd_en(rd_en),
78   .dout(dout),
79   .full(full),
80   .empty(empty)
81 );
82
83 endmodule

```

src/tx/zb\_delay\_cc\_fifo.v

# Appendix C

## Verilog Source Code for Multi-channel IEEE 802.15.4 Receiver

### C.1 Top Level Multi-Channel Receiver

```
'timescale 1ns / 1ps
1 ///////////////////////////////////////////////////////////////////
// Company:
2 // Engineer:
3 //
4 // Create Date:      16:04:46 05/04/2012
5 // Design Name:
6 // Module Name:    multi_channel_xbee
7 // Project Name:
8 // Target Devices:
9 // Tool versions:
10 // Description:   Top file for 4:1 channelizer, energy detector, and
11 //                  resampler
12 //
13 // Dependencies:
14 //
15 // Revision:
16 // Revision 0.01 - File Created
17 // Additional Comments:
18 //
19 ///////////////////////////////////////////////////////////////////
20 module multi_channel_xbee
21 #(parameter WIDTH=16)
22 (
23     input clk,           // 100 MHz
24     input ce,
25     input rst,
26     input [WIDTH-1:0] in_i, // input I
```

```

28   input [WIDTH-1:0] in_q ,      // input Q
29   input in_valid ,           // 20 MHz
30   output [WIDTH-1:0] out_i ,    // output I
31   output [WIDTH-1:0] out_q ,    // output Q
32   output out_valid ,         // 4 MHz
33   output [35:0] control
34 );

36 // commutator
37 wire signed [WIDTH-1:0] out_i0_commutator;
38 wire signed [WIDTH-1:0] out_i1_commutator;
39 wire signed [WIDTH-1:0] out_i2_commutator;
40 wire signed [WIDTH-1:0] out_i3_commutator;
41 wire signed [WIDTH-1:0] out_q0_commutator;
42 wire signed [WIDTH-1:0] out_q1_commutator;
43 wire signed [WIDTH-1:0] out_q2_commutator;
44 wire signed [WIDTH-1:0] out_q3_commutator;

46 // channelizer
47 wire signed [WIDTH-1:0] out_r0;
48 wire signed [WIDTH-1:0] out_r1;
49 wire signed [WIDTH-1:0] out_r2;
50 wire signed [WIDTH-1:0] out_r3;
51 wire signed [WIDTH-1:0] out_i0;
52 wire signed [WIDTH-1:0] out_i1;
53 wire signed [WIDTH-1:0] out_i2;
54 wire signed [WIDTH-1:0] out_i3;
55 reg signed [WIDTH-1:0] out_r0_valid;
56 reg signed [WIDTH-1:0] out_r1_valid;
57 reg signed [WIDTH-1:0] out_r2_valid;
58 reg signed [WIDTH-1:0] out_r3_valid;
59 reg signed [WIDTH-1:0] out_i0_valid;
60 reg signed [WIDTH-1:0] out_i1_valid;
61 reg signed [WIDTH-1:0] out_i2_valid;
62 reg signed [WIDTH-1:0] out_i3_valid;

64 // valid signals
65 wire out_valid_commutator;
66 wire out_valid_channelizer;

68 // commutator
69 commutator_4_1 commutator_4_1 (
70   .clk(clk),
71   .ce(ce),
72   .rst(rst),
73   .in_valid(in_valid),          // 20 MHz input to commutator
74   .in_i(in_i),
75   .in_q(in_q),
76   .out_i0(out_i0_commutator), // 5 MHz output of commutator
77   .out_i1(out_i1_commutator),
78   .out_i2(out_i2_commutator),

```

```

80     .out_i3(out_i3_commutator),
81     .out_q0(out_q0_commutator),
82     .out_q1(out_q1_commutator),
83     .out_q2(out_q2_commutator),
84     .out_q3(out_q3_commutator),
85     .out_valid(out_valid_commutator)
86   );
87
88 // four-channel channelizer
89 channelizer4 channelizer4 (
90   .clk(clk),
91   .rst(rst),
92   .ce(ce),
93   .in_valid(in_valid),
94   .in_r0(out_i0_commutator), // input
95   .in_r1(out_i1_commutator),
96   .in_r2(out_i2_commutator),
97   .in_r3(out_i3_commutator),
98   .in_i0(out_q0_commutator),
99   .in_i1(out_q1_commutator),
100  .in_i2(out_q2_commutator),
101  .in_i3(out_q3_commutator),
102  .out_r0(out_r0),           // output of each channel
103  .out_r1(out_r1),
104  .out_r2(out_r2),
105  .out_r3(out_r3),
106  .out_i0(out_i0),
107  .out_i1(out_i1),
108  .out_i2(out_i2),
109  .out_i3(out_i3),
110  .out_valid(out_valid_channelizer)
111 );
112
113 // register valid output from channelizer
114 always @(posedge clk) begin
115   if(rst) begin
116     out_r0_valid <= 0;
117     out_r1_valid <= 0;
118     out_r2_valid <= 0;
119     out_r3_valid <= 0;
120     out_i0_valid <= 0;
121     out_i1_valid <= 0;
122     out_i2_valid <= 0;
123     out_i3_valid <= 0;
124   end
125   else if(out_valid_channelizer) begin
126     out_r0_valid <= out_r0;
127     out_r1_valid <= out_r1;
128     out_r2_valid <= out_r2;
129     out_r3_valid <= out_r3;
130     out_i0_valid <= out_i0;

```

```

130      out_i1_valid <= out_i1;
131      out_i2_valid <= out_i2;
132      out_i3_valid <= out_i3;
133  end
134 end

136 wire [WIDTH-1:0] out_r_energydetector;
137 wire [WIDTH-1:0] out_i_energydetector;
138 wire out_valid_energydetector;

140 // energy detector
141 energydetector energydetector(
142     .clk(clk),
143     .rst(rst),
144     .ce(ce),
145     .in_valid(out_valid_channelizer),
146     .in_r0(out_r0),           // input
147     .in_r1(out_r1),
148     .in_r2(out_r2),
149     .in_r3(out_r3),
150     .in_i0(out_i0),
151     .in_i1(out_i1),
152     .in_i2(out_i2),
153     .in_i3(out_i3),
154     .out_r(out_r_energydetector), // output
155     .out_i(out_i_energydetector),
156     .out_valid(out_valid_energydetector)
157 );
158
159 // resampler K=4/5
160 resampler_2_4 resampler_2_4 (
161     .clk(clk),
162     .ce(ce),
163     .rst(rst),
164     .in_valid(out_valid_energydetector),
165     .in_i(out_r_energydetector), // input
166     .in_q(out_i_energydetector),
167     .out_valid(out_valid),       // output
168     .out_i(out_i),
169     .out_q(out_q)
170 );
171
172 endmodule

```

src/multi\_channel/multi\_channel\_xbee.v

## C.2 1:4 Commutator

```

'timescale 1ns / 1ps
// Company:      Wireless @ VT
// Engineer:     Jeong-O Jeong
//
// Create Date:   16:15:01 05/04/2012
// Design Name:
// Module Name:   commutator_4_1
// Project Name:
// Target Devices:
// Tool versions:
// Description:   Commutator 1:4, used before four-channel channelizer
//
// Dependencies:
//
// Revision:
// Revision 0.01 - File Created
// Additional Comments:
//
module commutator_4_1
#(parameter WIDTH=16)
(
    input clk ,
    input ce ,
    input rst ,
    input in_valid ,
    input signed [WIDTH-1:0] in_i ,           // input
    input signed [WIDTH-1:0] in_q ,
    output reg signed [WIDTH-1:0] out_i0 , // output at 1/4 rate
    output reg signed [WIDTH-1:0] out_q0 ,
    output reg signed [WIDTH-1:0] out_i1 ,
    output reg signed [WIDTH-1:0] out_q1 ,
    output reg signed [WIDTH-1:0] out_i2 ,
    output reg signed [WIDTH-1:0] out_q2 ,
    output reg signed [WIDTH-1:0] out_i3 ,
    output reg signed [WIDTH-1:0] out_q3 ,
    output reg out_valid
);
// commutate input
always @(posedge clk) begin
    if(rst) begin
        out_i0 <= 0;
        out_q0 <= 0;
        out_i1 <= 0;
        out_q1 <= 0;
        out_i2 <= 0;
        out_q2 <= 0;
        out_i3 <= 0;

```

```

      out_q3 <= 0;
52    end
53    else if(ce & in_valid) begin
54      out_i0 <= in_i;
55      out_q0 <= in_q;
56      out_i1 <= out_i0;
57      out_q1 <= out_q0;
58      out_i2 <= out_i1;
59      out_q2 <= out_q1;
60      out_i3 <= out_i2;
61      out_q3 <= out_q2;
62    end
63  end
64
// counter 0-3
65 reg [1:0] counter;
66 always @(posedge clk) begin
67   if(rst) begin
68     counter <= 0;
69   end
70   else if(ce & in_valid) begin
71     counter <= counter + 1'b1;
72   end
73 end
74
// output valid
75 always @(posedge clk) begin
76   if(rst) begin
77     out_valid <= 0;
78   end
79   else if(ce) begin
80     out_valid <= (counter == 0) & in_valid;
81   end
82 end
83
84
85 endmodule

```

src/multi\_channel/commutator\_4\_1.v

### C.3 Four-Channel Channelizer

```

'timescale 1ns / 1ps
1 ///////////////////////////////////////////////////////////////////
2 // Company:      Wireless @ VT
3 // Engineer:    Jeong-O Jeong
4 //
5 // Create Date:  15:02:13 04/27/2012
6 // Design Name:

```

```

8 // Module Name:      channelizer4
9 // Project Name:
10 // Target Devices:
11 // Tool versions:
12 // Description:     Four-channel channelizer with complex coefficients
13 //
14 // Dependencies:
15 //
16 // Revision:
17 // Revision 0.01 – File Created
18 // Additional Comments:
19 //
20 /////////////////////////////////
21 module channelizer4
22 #(parameter WIDTH.INPUT=16,
23   parameter WIDTH.OUTPUT=16)
24 (
25   input clk ,
26   input rst ,
27   input ce ,
28   input in_valid ,                                // input
29   input signed [WIDTH.INPUT-1:0] in_r0 ,
30   input signed [WIDTH.INPUT-1:0] in_r1 ,
31   input signed [WIDTH.INPUT-1:0] in_r2 ,
32   input signed [WIDTH.INPUT-1:0] in_r3 ,
33   input signed [WIDTH.INPUT-1:0] in_i0 ,
34   input signed [WIDTH.INPUT-1:0] in_i1 ,
35   input signed [WIDTH.INPUT-1:0] in_i2 ,
36   input signed [WIDTH.INPUT-1:0] in_i3 ,
37   output signed [WIDTH.OUTPUT-1:0] out_r0 , // output
38   output signed [WIDTH.OUTPUT-1:0] out_r1 ,
39   output signed [WIDTH.OUTPUT-1:0] out_r2 ,
40   output signed [WIDTH.OUTPUT-1:0] out_r3 ,
41   output signed [WIDTH.OUTPUT-1:0] out_i0 ,
42   output signed [WIDTH.OUTPUT-1:0] out_i1 ,
43   output signed [WIDTH.OUTPUT-1:0] out_i2 ,
44   output signed [WIDTH.OUTPUT-1:0] out_i3 ,
45   output out_valid
46 );
47
48 // output of polyphase filter bank
49 parameter WIDTH.POLY_OUTPUT = 16;
50 wire signed [WIDTH.POLY_OUTPUT-1:0] out_poly_r0 ;
51 wire signed [WIDTH.POLY_OUTPUT-1:0] out_poly_r1 ;
52 wire signed [WIDTH.POLY_OUTPUT-1:0] out_poly_r2 ;
53 wire signed [WIDTH.POLY_OUTPUT-1:0] out_poly_r3 ;
54 wire signed [WIDTH.POLY_OUTPUT-1:0] out_poly_i0 ;
55 wire signed [WIDTH.POLY_OUTPUT-1:0] out_poly_i1 ;
56 wire signed [WIDTH.POLY_OUTPUT-1:0] out_poly_i2 ;
57 wire signed [WIDTH.POLY_OUTPUT-1:0] out_poly_i3 ;

```

```

59   wire out_valid_poly;
60
61 // output of 4-point FFT
62 parameter WIDTH_FFT_OUTPUT = 16;
63 wire signed [WIDTH_FFT_OUTPUT-1:0] out_fft_r0;
64 wire signed [WIDTH_FFT_OUTPUT-1:0] out_fft_r1;
65 wire signed [WIDTH_FFT_OUTPUT-1:0] out_fft_r2;
66 wire signed [WIDTH_FFT_OUTPUT-1:0] out_fft_r3;
67 wire signed [WIDTH_FFT_OUTPUT-1:0] out_fft_i0;
68 wire signed [WIDTH_FFT_OUTPUT-1:0] out_fft_i1;
69 wire signed [WIDTH_FFT_OUTPUT-1:0] out_fft_i2;
70 wire signed [WIDTH_FFT_OUTPUT-1:0] out_fft_i3;
71 wire out_valid_fft;
72
73 // polyphase filter bank
74 polyphase4 #(.WIDTH_INPUT(WIDTH_INPUT), .WIDTH_OUTPUT(WIDTH_POLY_OUTPUT))
75     polyphase4 (
76         .clk(clk),
77         .rst(rst),
78         .ce(ce),
79         .in_valid(in_valid),           // input
80         .in_r0(in_r0),
81         .in_r1(in_r1),
82         .in_r2(in_r2),
83         .in_r3(in_r3),
84         .in_i0(in_i0),
85         .in_i1(in_i1),
86         .in_i2(in_i2),
87         .in_i3(in_i3),
88         .out_r0(out_poly_r0),        // output
89         .out_r1(out_poly_r1),
90         .out_r2(out_poly_r2),
91         .out_r3(out_poly_r3),
92         .out_i0(out_poly_i0),
93         .out_i1(out_poly_i1),
94         .out_i2(out_poly_i2),
95         .out_i3(out_poly_i3),
96         .out_valid(out_valid_poly)
97 );
98
99 // FFT-4
100 fft4_complex #(.WIDTH_INPUT(WIDTH_POLY_OUTPUT)) fft4_complex (
101     .clk(clk),
102     .ce(ce),
103     .rst(rst),
104     .in_valid(out_valid_poly), // input
105     .in_r0(out_poly_r0),
106     .in_r1(out_poly_r1),
107     .in_r2(out_poly_r2),
108     .in_r3(out_poly_r3),
109     .in_i0(out_poly_i0),
110

```

```

110     .in_i1(out_poly_i1),
111     .in_i2(out_poly_i2),
112     .in_i3(out_poly_i3),
113     .out_r0(out_fft_r0),      // output
114     .out_r1(out_fft_r1),
115     .out_r2(out_fft_r2),
116     .out_r3(out_fft_r3),
117     .out_i0(out_fft_i0),
118     .out_i1(out_fft_i1),
119     .out_i2(out_fft_i2),
120     .out_i3(out_fft_i3),
121     .out_valid(out_valid_fft)
122 );
123
124 // complex mixer for downconversion
125 complex_mixer complex_mixer(
126     .clk(clk),
127     .ce(ce),
128     .rst(rst),
129     .in_valid(out_valid_fft), // input
130     .in_r0(out_fft_r0),
131     .in_r1(out_fft_r1),
132     .in_r2(out_fft_r2),
133     .in_r3(out_fft_r3),
134     .in_i0(out_fft_i0),
135     .in_i1(out_fft_i1),
136     .in_i2(out_fft_i2),
137     .in_i3(out_fft_i3),
138     .out_r0(out_r0),          // output
139     .out_r1(out_r1),
140     .out_r2(out_r2),
141     .out_r3(out_r3),
142     .out_i0(out_i0),
143     .out_i1(out_i1),
144     .out_i2(out_i2),
145     .out_i3(out_i3),
146     .out_valid(out_valid)
147 );
148

```

src/multi\_channel/channelizer4.v

## C.4 Polyphase Filter Bank

```

'timescale 1ns / 1ps
2 ///////////////////////////////////////////////////////////////////
// Company:      Wireless @ VT

```

```

4 // Engineer:      Jeong-O Jeong
5 //
6 // Create Date:   10:21:55 04/28/2012
7 // Design Name:
8 // Module Name:   polyphase4
9 // Project Name:
10 // Target Devices:
11 // Tool versions:
12 // Description:   polyphase filterbank with complex coefficients
13 //
14 // Dependencies:
15 //
16 // Revision:
17 // Revision 0.01 - File Created
18 // Additional Comments:
19 //
20 ///////////////////////////////
21 module polyphase4
22 #(parameter WIDTH.INPUT=16,
23  parameter WIDTH.OUTPUT=16,
24  parameter WIDTH.FIR.OUTPUT=30)
25 (
26   input clk ,
27   input rst ,
28   input ce ,
29   input in_valid ,
30   input signed [WIDTH.INPUT-1:0] in_r0 ,           // inputs that are already
31   commutated
32   input signed [WIDTH.INPUT-1:0] in_r1 ,
33   input signed [WIDTH.INPUT-1:0] in_r2 ,
34   input signed [WIDTH.INPUT-1:0] in_r3 ,
35   input signed [WIDTH.INPUT-1:0] in_i0 ,
36   input signed [WIDTH.INPUT-1:0] in_i1 ,
37   input signed [WIDTH.INPUT-1:0] in_i2 ,
38   input signed [WIDTH.INPUT-1:0] in_i3 ,
39   output out_valid ,
40   output signed [WIDTH.OUTPUT-1:0] out_r0 ,          // outputs of the filters
41   output signed [WIDTH.OUTPUT-1:0] out_r1 ,
42   output signed [WIDTH.OUTPUT-1:0] out_r2 ,
43   output signed [WIDTH.OUTPUT-1:0] out_r3 ,
44   output signed [WIDTH.OUTPUT-1:0] out_i0 ,          // outputs of the filters
45   output signed [WIDTH.OUTPUT-1:0] out_i1 ,
46   output signed [WIDTH.OUTPUT-1:0] out_i2 ,
47   output signed [WIDTH.OUTPUT-1:0] out_i3
48 );
49
50 // control signals for FIR filters
51 wire rdy_r0 , rdy_i0 ;
52 wire rdy_r1 , rdy_i1 ;
53 wire rdy_r2 , rdy_i2 ;
54 wire rdy_r3 , rdy_i3 ;

```

```

54   wire rfd_r0 , rfd_i0 ;
56   wire rfd_r1 , rfd_i1 ;
58   wire rfd_r2 , rfd_i2 ;
59   wire rfd_r3 , rfd_i3 ;

60 // output of FIR filters
61   wire signed [WIDTH_FIR_OUTPUT-1:0] dout_r0 , dout_i0 , dout_ir0 , dout_ri0 ;
62   wire signed [WIDTH_FIR_OUTPUT-1:0] dout_r1 , dout_i1 , dout_ir1 , dout_ri1 ;
63   wire signed [WIDTH_FIR_OUTPUT-1:0] dout_r2 , dout_i2 , dout_ir2 , dout_ri2 ;
64   wire signed [WIDTH_FIR_OUTPUT-1:0] dout_r3 , dout_i3 , dout_ir3 , dout_ri3 ;

66 // temporary registers for addition and subtraction
67   reg signed [WIDTH_FIR_OUTPUT:0] tmp_r0 , tmp_i0 ;
68   reg signed [WIDTH_FIR_OUTPUT:0] tmp_r1 , tmp_i1 ;
69   reg signed [WIDTH_FIR_OUTPUT:0] tmp_r2 , tmp_i2 ;
70   reg signed [WIDTH_FIR_OUTPUT:0] tmp_r3 , tmp_i3 ;
71   reg tmp_valid ;

72 // combine the output of filters this way since the filter coefficients are
73 // complex
74 always @(posedge clk) begin
75   if(rst) begin
76     tmp_r0 <= 0;
77     tmp_r1 <= 0;
78     tmp_r2 <= 0;
79     tmp_r3 <= 0;
80     tmp_i0 <= 0;
81     tmp_i1 <= 0;
82     tmp_i2 <= 0;
83     tmp_i3 <= 0;
84     tmp_valid <= 0;
85   end
86   else if(ce) begin
87     tmp_r0 <= dout_r0-dout_i0;
88     tmp_r1 <= dout_r1-dout_i1;
89     tmp_r2 <= dout_r2-dout_i2;
90     tmp_r3 <= dout_r3-dout_i3;
91     tmp_i0 <= dout_ir0+dout_ir0;
92     tmp_i1 <= dout_ir1+dout_ir1;
93     tmp_i2 <= dout_ir2+dout_ir2;
94     tmp_i3 <= dout_ir3+dout_ir3;
95     tmp_valid <= rdy_i3;
96   end
97 end
98
99 parameter OFFSET=0;
100
101 // output
102 assign out_r0 = tmp_r0[WIDTH_FIR_OUTPUT+1-1-OFFSET:WIDTH_FIR_OUTPUT+1-1-
103 WIDTH_OUTPUT+1-OFFSET];

```

```

assign out_r1 = tmp_r1[WIDTH_FIR_OUTPUT+1-1-OFFSET:WIDTH_FIR_OUTPUT+1-1-
    WIDTH_OUTPUT+1-OFFSET];
104 assign out_r2 = tmp_r2[WIDTH_FIR_OUTPUT+1-1-OFFSET:WIDTH_FIR_OUTPUT+1-1-
    WIDTH_OUTPUT+1-OFFSET];
assign out_r3 = tmp_r3[WIDTH_FIR_OUTPUT+1-1-OFFSET:WIDTH_FIR_OUTPUT+1-1-
    WIDTH_OUTPUT+1-OFFSET];
106
assign out_i0 = tmp_i0[WIDTH_FIR_OUTPUT+1-1-OFFSET:WIDTH_FIR_OUTPUT+1-1-
    WIDTH_OUTPUT+1-OFFSET];
108 assign out_i1 = tmp_i1[WIDTH_FIR_OUTPUT+1-1-OFFSET:WIDTH_FIR_OUTPUT+1-1-
    WIDTH_OUTPUT+1-OFFSET];
assign out_i2 = tmp_i2[WIDTH_FIR_OUTPUT+1-1-OFFSET:WIDTH_FIR_OUTPUT+1-1-
    WIDTH_OUTPUT+1-OFFSET];
110 assign out_i3 = tmp_i3[WIDTH_FIR_OUTPUT+1-1-OFFSET:WIDTH_FIR_OUTPUT+1-1-
    WIDTH_OUTPUT+1-OFFSET];

112 assign out_valid = tmp_valid;
//=====
114 // real coefficients , real inputs
//=====
116 fir_real fir_h0 (
    .clk(clk), // input clk
118    .ce(ce), // input ce
    .nd(in_valid), // input nd
120    .filter_sel(2'd0), // input [1 : 0] filter_sel
    .rfd(rfd_r0), // output rfd
122    .rdy(rdy_r0), // output rdy
    .din(in_r0), // input [15 : 0] din_1
124    .dout(dout_r0)); // output [29 : 0] dout_1

126 fir_real fir_h1 (
    .clk(clk), // input clk
128    .ce(ce), // input ce
    .nd(in_valid), // input nd
130    .filter_sel(2'd1), // input [2 : 0] filter_sel
    .rfd(rfd_r1), // output rfd
132    .rdy(rdy_r1), // output rdy
    .din(in_r1), // input [15 : 0] din_1
134    .dout(dout_r1)); // output [29 : 0] dout_1

136 fir_real fir_h2 (
    .clk(clk), // input clk
138    .ce(ce), // input ce
    .nd(in_valid), // input nd
140    .filter_sel(2'd2), // input [2 : 0] filter_sel
    .rfd(rfd_r2), // output rfd
142    .rdy(rdy_r2), // output rdy
    .din(in_r2), // input [15 : 0] din_1
144    .dout(dout_r2)); // output [29 : 0] dout_1

146 fir_real fir_h3 (

```

```

148 .clk(clk), // input clk
149 .ce(ce), // input ce
150 .nd(in_valid), // input nd
151 .filter_sel(2'd3), // input [2 : 0] filter_sel
152 .rfd(rfd_r3), // output rfd
153 .rdy(rdy_r3), // output rdy
154 .din(in_r3), // input [15 : 0] din_1
155 .dout(dout_r3)); // output [29 : 0] dout_1

156 //=====
157 // imaginary coefficients , imag inputs
158 //=====
159 fir_imag fir_imag_h0 (
160   .clk(clk), // input clk
161   .ce(ce), // input ce
162   .nd(in_valid), // input nd
163   .filter_sel(2'd0), // input [2 : 0] filter_sel
164   .rfd(rfd_i0), // output rfd
165   .rdy(rdy_i0), // output rdy
166   .din(in_i0), // input [15 : 0] din_1
167   .dout(dout_i0)); // output [29 : 0] dout_1

168 fir_imag fir_imag_h1 (
169   .clk(clk), // input clk
170   .ce(ce), // input ce
171   .nd(in_valid), // input nd
172   .filter_sel(2'd1), // input [2 : 0] filter_sel
173   .rfd(rfd_i1), // output rfd
174   .rdy(rdy_i1), // output rdy
175   .din(in_i1), // input [15 : 0] din_1
176   .dout(dout_i1)); // output [29 : 0] dout_1

177 fir_imag fir_imag_h2 (
178   .clk(clk), // input clk
179   .ce(ce), // input ce
180   .nd(in_valid), // input nd
181   .filter_sel(2'd2), // input [2 : 0] filter_sel
182   .rfd(rfd_i2), // output rfd
183   .rdy(rdy_i2), // output rdy
184   .din(in_i2), // input [15 : 0] din_1
185   .dout(dout_i2)); // output [29 : 0] dout_1

186 fir_imag fir_imag_h3 (
187   .clk(clk), // input clk
188   .ce(ce), // input ce
189   .nd(in_valid), // input nd
190   .filter_sel(2'd3), // input [2 : 0] filter_sel
191   .rfd(rfd_i3), // output rfd
192   .rdy(rdy_i3), // output rdy
193   .din(in_i3), // input [15 : 0] din_1
194   .dout(dout_i3)); // output [29 : 0] dout_1

```

```

198 //=====
200 // imaginary coefficients , real inputs
//=====
202 fir_imag fir_ir_h0 (
203     .clk(clk), // input clk
204     .ce(ce), // input ce
205     .nd(in_valid), // input nd
206     .filter_sel(2'd0), // input [2 : 0] filter_sel
207     .rfd(), // output rfd
208     .rdy(), // output rdy
209     .din(in_r0), // input [15 : 0] din_1
210     .dout(dout_ir0)); // output [29 : 0] dout_1

212 fir_imag fir_ir_h1 (
213     .clk(clk), // input clk
214     .ce(ce), // input ce
215     .nd(in_valid), // input nd
216     .filter_sel(2'd1), // input [2 : 0] filter_sel
217     .rfd(), // output rfd
218     .rdy(), // output rdy
219     .din(in_r1), // input [15 : 0] din_1
220     .dout(dout_ir1)); // output [29 : 0] dout_1

222 fir_imag fir_ir_h2 (
223     .clk(clk), // input clk
224     .ce(ce), // input ce
225     .nd(in_valid), // input nd
226     .filter_sel(2'd2), // input [2 : 0] filter_sel
227     .rfd(), // output rfd
228     .rdy(), // output rdy
229     .din(in_r2), // input [15 : 0] din_1
230     .dout(dout_ir2)); // output [29 : 0] dout_1

232 fir_imag fir_ir_h3 (
233     .clk(clk), // input clk
234     .ce(ce), // input ce
235     .nd(in_valid), // input nd
236     .filter_sel(2'd3), // input [2 : 0] filter_sel
237     .rfd(), // output rfd
238     .rdy(), // output rdy
239     .din(in_r3), // input [15 : 0] din_1
240     .dout(dout_ir3)); // output [29 : 0] dout_1

242 //=====
243 // real coefficients , imag inputs
//=====
244 fir_real fir_ri_h0 (
245     .clk(clk), // input clk
246     .ce(ce), // input ce
247     .nd(in_valid), // input nd

```

```

250    .filter_sel(2'd0), // input [1 : 0] filter_sel
251    .rfd(), // output rfd
252    .rdy(), // output rdy
253    .din(in_i0), // input [15 : 0] din_1
254    .dout(dout_ri0)); // output [29 : 0] dout_1

254
255    fir_real fir_ri_h1 (
256        .clk(clk), // input clk
257        .ce(ce), // input ce
258        .nd(in_valid), // input nd
259        .filter_sel(2'd1), // input [2 : 0] filter_sel
260        .rfd(), // output rfd
261        .rdy(), // output rdy
262        .din(in_i1), // input [15 : 0] din_1
263        .dout(dout_ri1)); // output [29 : 0] dout_1

264
265    fir_real fir_ri_h2 (
266        .clk(clk), // input clk
267        .ce(ce), // input ce
268        .nd(in_valid), // input nd
269        .filter_sel(2'd2), // input [2 : 0] filter_sel
270        .rfd(), // output rfd
271        .rdy(), // output rdy
272        .din(in_i2), // input [15 : 0] din_1
273        .dout(dout_ri2)); // output [29 : 0] dout_1

274
275    fir_real fir_ri_h3 (
276        .clk(clk), // input clk
277        .ce(ce), // input ce
278        .nd(in_valid), // input nd
279        .filter_sel(2'd3), // input [2 : 0] filter_sel
280        .rfd(), // output rfd
281        .rdy(), // output rdy
282        .din(in_i3), // input [15 : 0] din_1
283        .dout(dout_ri3)); // output [29 : 0] dout_1

284
endmodule

```

src/multi\_channel/polyphase4.v

## C.5 Four-point FFT

```

'timescale 1ns / 1ps
///////////////////////////////
// Company:      Wireless@VT
// Engineer:     Jeong-O Jeong
//
// Create Date:   15:08:58 04/27/2012

```

```

// Design Name:
8 // Module Name:      fft4_complex
// Project Name:
// Target Devices:
// Tool versions:
12 // Description:     Complex four-point FFT
//
14 // Dependencies:
//
16 // Revision:
// Revision 0.01 - File Created
18 // Additional Comments:
//
20 ///////////////////////////////
module fft4_complex
22 #(  parameter WIDTH.INPUT=16,
      parameter WIDTH.OUTPUT=16)
(
  input clk ,
  input rst ,
  input ce ,
  input in_valid ,
  input signed [WIDTH.INPUT-1:0] in_r0 ,           // inputs
30  input signed [WIDTH.INPUT-1:0] in_r1 ,
  input signed [WIDTH.INPUT-1:0] in_r2 ,
32  input signed [WIDTH.INPUT-1:0] in_r3 ,
  input signed [WIDTH.INPUT-1:0] in_i0 ,
34  input signed [WIDTH.INPUT-1:0] in_i1 ,
  input signed [WIDTH.INPUT-1:0] in_i2 ,
36  input signed [WIDTH.INPUT-1:0] in_i3 ,
  output reg signed [WIDTH.OUTPUT-1:0] out_r0 ,   // outputs
38  output reg signed [WIDTH.OUTPUT-1:0] out_r1 ,
  output reg signed [WIDTH.OUTPUT-1:0] out_r2 ,
40  output reg signed [WIDTH.OUTPUT-1:0] out_r3 ,
  output reg signed [WIDTH.OUTPUT-1:0] out_i0 ,
42  output reg signed [WIDTH.OUTPUT-1:0] out_i1 ,
  output reg signed [WIDTH.OUTPUT-1:0] out_i2 ,
44  output reg signed [WIDTH.OUTPUT-1:0] out_i3 ,
  output reg out_valid
46 );
//
48 // Radix 4 FFT reduces to additions and subtractions as the following
50 wire [18:0] sum_r0 = in_r0 + in_r1 + in_r2 + in_r3;
  wire [18:0] sum_r1 = in_r0 + in_i1 - in_r2 - in_i3;
52  wire [18:0] sum_r2 = in_r0 - in_r1 + in_r2 - in_r3;
  wire [18:0] sum_r3 = in_r0 - in_i1 - in_r2 + in_i3;
54  wire [18:0] sum_r4 = in_i0 + in_i1 + in_i2 + in_i3;
  wire [18:0] sum_r5 = in_i0 - in_r1 - in_i2 + in_r3;
56  wire [18:0] sum_r6 = in_i0 - in_i1 + in_i2 - in_i3;
  wire [18:0] sum_r7 = in_i0 + in_r1 - in_i2 - in_r3;

```

```

58 // Register outputs
59 always @(posedge clk) begin
60   if(rst) begin
61     out_r0 <= 0;
62     out_r1 <= 0;
63     out_r2 <= 0;
64     out_r3 <= 0;
65     out_i0 <= 0;
66     out_i1 <= 0;
67     out_i2 <= 0;
68     out_i3 <= 0;
69     out_valid <= 0;
70   end
71   else if(ce) begin
72     out_r0 <= sum_r0[18:3];
73     out_r1 <= sum_r1[18:3];
74     out_r2 <= sum_r2[18:3];
75     out_r3 <= sum_r3[18:3];
76     out_i0 <= sum_r4[18:3];
77     out_i1 <= sum_r5[18:3];
78     out_i2 <= sum_r6[18:3];
79     out_i3 <= sum_r7[18:3];
80     out_valid <= in_valid;
81   end
82 end
83
84 endmodule

```

src/multi\_channel/fft4\_complex.v

## C.6 Complex Mixer

```

1 'timescale 1ns / 1ps
2 ///////////////////////////////////////////////////////////////////
3 // Company:      Wireless @ VT
4 // Engineer:     Jeong-O Jeong
5 //
6 // Create Date:  11:08:59 05/02/2012
7 // Design Name:
8 // Module Name:  complex_mixer
9 // Project Name:
10 // Target Devices:
11 // Tool versions:
12 // Description:  Downconvert input by F=0.5
13 //
14 // Dependencies:
15 //

```

```

// Revision:
// Revision 0.01 - File Created
// Additional Comments:
//
////////////////////////////////////////////////////////////////
21 module complex_mixer
#( parameter WIDTH_INPUT=16,
23 parameter WIDTH_OUTPUT=16)
(
25   input clk ,
26   input rst ,
27   input ce ,
28   input in_valid , // input
29   input signed [WIDTH_INPUT-1:0] in_r0 ,
30   input signed [WIDTH_INPUT-1:0] in_r1 ,
31   input signed [WIDTH_INPUT-1:0] in_r2 ,
32   input signed [WIDTH_INPUT-1:0] in_r3 ,
33   input signed [WIDTH_INPUT-1:0] in_i0 ,
34   input signed [WIDTH_INPUT-1:0] in_i1 ,
35   input signed [WIDTH_INPUT-1:0] in_i2 ,
36   input signed [WIDTH_INPUT-1:0] in_i3 ,
37   output reg signed [WIDTH_OUTPUT-1:0] out_r0 , // output
38   output reg signed [WIDTH_OUTPUT-1:0] out_r1 ,
39   output reg signed [WIDTH_OUTPUT-1:0] out_r2 ,
40   output reg signed [WIDTH_OUTPUT-1:0] out_r3 ,
41   output reg signed [WIDTH_OUTPUT-1:0] out_i0 ,
42   output reg signed [WIDTH_OUTPUT-1:0] out_i1 ,
43   output reg signed [WIDTH_OUTPUT-1:0] out_i2 ,
44   output reg signed [WIDTH_OUTPUT-1:0] out_i3 ,
45   output reg out_valid
);
46
47 // toggle between 0 and 1
48 reg counter;
49
50 always @(posedge clk) begin
51   if(rst) begin
52     counter <= 0;
53   end
54   else if(ce & in_valid) begin
55     counter <= counter + 1'b1;
56   end
57 end
58
59 // Downconversion by F=0.5 is simply flipping the sign at every other sample
60 always @(posedge clk) begin
61   if(rst) begin
62     out_r0 <= 0;
63     out_r1 <= 0;
64     out_r2 <= 0;
65     out_r3 <= 0;

```

```

67     out_i0 <= 0;
69     out_i1 <= 0;
71     out_i2 <= 0;
73     out_i3 <= 0;
75     out_valid <= 0;
77   end
78   else if(ce) begin
79     out_valid <= in_valid;
80     if(counter) begin
81       out_r0 <= in_r0;
82       out_r1 <= in_r1;
83       out_r2 <= in_r2;
84       out_r3 <= in_r3;
85       out_i0 <= in_i0;
86       out_i1 <= in_i1;
87       out_i2 <= in_i2;
88       out_i3 <= in_i3;
89     end
90   else begin
91     out_r0 <= -in_r0;
92     out_r1 <= -in_r1;
93     out_r2 <= -in_r2;
94     out_r3 <= -in_r3;
95     out_i0 <= -in_i0;
96     out_i1 <= -in_i1;
97     out_i2 <= -in_i2;
98     out_i3 <= -in_i3;
99   end
100 end
101 end
102 endmodule

```

src/multi\_channel/complex\_mixer.v

## C.7 Energy Detector

```

'timescale 1ns / 1ps
1 ///////////////////////////////////////////////////////////////////
// Company:          Wireless@VT
2 // Engineer:        Jeong-O Jeong
3 //
4 // Create Date:    14:40:59 05/09/2012
5 // Design Name:
6 // Module Name:    energydetector
7 // Project Name:
8 // Target Devices:
9 // Tool versions:
10 // Description:   simple energy detector
11
12

```

```

//           outputs the channel with maximum energy based on 4
// consecutive samples
14 // Dependencies:
16 // Revision:
18 // Revision 0.01 - File Created
20 // Additional Comments:
22 /////////////////////////////////
23 module energydetector
24 #(parameter WIDTH=16)
(
25   input clk ,
26   input rst ,
27   input ce ,
28   input in_valid ,                                // input
29   input signed [WIDTH-1:0] in_r0 ,
30   input signed [WIDTH-1:0] in_r1 ,
31   input signed [WIDTH-1:0] in_r2 ,
32   input signed [WIDTH-1:0] in_r3 ,
33   input signed [WIDTH-1:0] in_i0 ,
34   input signed [WIDTH-1:0] in_i1 ,
35   input signed [WIDTH-1:0] in_i2 ,
36   input signed [WIDTH-1:0] in_i3 ,
37   output reg signed [WIDTH-1:0] out_r , // output
38   output reg signed [WIDTH-1:0] out_i ,
39   output reg signed out_valid
40 );
41
42 reg signed [2*WIDTH-1:0] prod_r0 , prod_r1 , prod_r2 , prod_r3 ;
43 reg signed [2*WIDTH-1:0] prod_i0 , prod_i1 , prod_i2 , prod_i3 ;
44 reg out_valid_prod ;
45
46 // square each real and imag
47 always @ (posedge clk) begin
48   if (rst) begin
49     prod_r0 <= 0;
50     prod_r1 <= 0;
51     prod_r2 <= 0;
52     prod_r3 <= 0;
53     prod_i0 <= 0;
54     prod_i1 <= 0;
55     prod_i2 <= 0;
56     prod_i3 <= 0;
57     out_valid_prod <= 0;
58   end
59   else if (ce) begin
60     if (in_valid) begin
61       prod_r0 <= in_r0 * in_r0 ;
62     end

```

```

64      prod_r1 <= in_r1*in_r1;
prod_r2 <= in_r2*in_r2;
prod_r3 <= in_r3*in_r3;
66      prod_i0 <= in_i0*in_i0;
prod_i1 <= in_i1*in_i1;
68      prod_i2 <= in_i2*in_i2;
prod_i3 <= in_i3*in_i3;
70  end
    out_valid_prod <= in_valid;
72 end
74

76 // pipeline four consecutive samples
reg signed [2*WIDTH:0] pipeline_0[0:3];
78 reg signed [2*WIDTH:0] pipeline_1[0:3];
reg signed [2*WIDTH:0] pipeline_2[0:3];
80 reg signed [2*WIDTH:0] pipeline_3[0:3];

82 always @(posedge clk) begin
    if(rst) begin
84        pipeline_0[0] <= 0;
        pipeline_0[1] <= 0;
86        pipeline_0[2] <= 0;
        pipeline_0[3] <= 0;
88        pipeline_1[0] <= 0;
        pipeline_1[1] <= 0;
90        pipeline_1[2] <= 0;
        pipeline_1[3] <= 0;
92        pipeline_2[0] <= 0;
        pipeline_2[1] <= 0;
94        pipeline_2[2] <= 0;
        pipeline_2[3] <= 0;
96        pipeline_3[0] <= 0;
        pipeline_3[1] <= 0;
98        pipeline_3[2] <= 0;
        pipeline_3[3] <= 0;
100    end
    else if(ce) begin
        if (out_valid_prod) begin
102            pipeline_0[0] <= prod_i0+prod_r0;
            pipeline_0[1] <= pipeline_0[0];
            pipeline_0[2] <= pipeline_0[1];
            pipeline_0[3] <= pipeline_0[2];
104            pipeline_1[0] <= prod_i1+prod_r1;
            pipeline_1[1] <= pipeline_1[0];
            pipeline_1[2] <= pipeline_1[1];
            pipeline_1[3] <= pipeline_1[2];
106            pipeline_2[0] <= prod_i2+prod_r2;
            pipeline_2[1] <= pipeline_2[0];
            pipeline_2[2] <= pipeline_2[1];
108
110
112

```

```

114     pipeline_2[3] <= pipeline_2[2];
116     pipeline_3[0] <= prod_i3+prod_r3;
118     pipeline_3[1] <= pipeline_3[0];
120     pipeline_3[2] <= pipeline_3[1];
122     pipeline_3[3] <= pipeline_3[2];
124   end
126 end

128 // sum pipeline values
130 reg signed [2*WIDTH+3:0] sum_0, sum_1, sum_2, sum_3;
132
134 always @(posedge clk) begin
136   if(rst) begin
138     sum_0 <= 0;
140     sum_1 <= 0;
142     sum_2 <= 0;
144     sum_3 <= 0;
146   end
148   else if(ce) begin
150     sum_0 <= pipeline_0[0][WIDTH-4:0] + pipeline_0[1][WIDTH-4:0] + pipeline_0
152       [2][WIDTH-4:0] + pipeline_0[3][WIDTH-4:0];
154     sum_1 <= pipeline_1[0][WIDTH-4:0] + pipeline_1[1][WIDTH-4:0] + pipeline_1
156       [2][WIDTH-4:0] + pipeline_1[3][WIDTH-4:0];
158     sum_2 <= pipeline_2[0][WIDTH-4:0] + pipeline_2[1][WIDTH-4:0] + pipeline_2
160       [2][WIDTH-4:0] + pipeline_2[3][WIDTH-4:0];
162     sum_3 <= pipeline_3[0][WIDTH-4:0] + pipeline_3[1][WIDTH-4:0] + pipeline_3
164       [2][WIDTH-4:0] + pipeline_3[3][WIDTH-4:0];
166   end
168 end

170 // pick the maximum channel
172 always @(posedge clk) begin
174   if(rst) begin
176     out_r <= 0;
178     out_i <= 0;
180     out_valid <= 0;
182   end
184   else if(ce) begin
186     if(in_valid) begin
188       if(sum_0 >= sum_1 & sum_0 >= sum_2 & sum_0 >= sum_3) begin
190         out_r <= in_r0;
192         out_i <= in_i0;
194       end
196       if(sum_1 >= sum_0 & sum_1 >= sum_2 & sum_1 >= sum_3) begin
198         out_r <= in_r1;
200         out_i <= in_i1;
202       end
204       if(sum_2 >= sum_1 & sum_2 >= sum_0 & sum_2 >= sum_3) begin
206         out_r <= in_r2;
208         out_i <= in_i2;
210       end
212       if(sum_3 >= sum_0 & sum_3 >= sum_1 & sum_3 >= sum_2) begin
214         out_r <= in_r3;
216         out_i <= in_i3;
218       end
220     end
222   end
224 end

```

```

162         out_r <= in_r2 ;
163         out_i <= in_i2 ;
164     end
165     if (sum_3 >= sum_1 & sum_3 >= sum_2 & sum_3 >= sum_0) begin
166         out_r <= in_r3 ;
167         out_i <= in_i3 ;
168     end
169     out_valid <= in_valid ;
170 end
171
172 endmodule

```

src/multi\_channel/energydetector.v

## C.8 Resampler 4/5

```

1  'timescale 1ns / 1ps
///////////////////////////////////////////////////////////////////
3 // Company:      Wireless @ VT
// Engineer:     Jeong-O Jeong
5 //
// Create Date:   13:50:14 05/04/2012
7 // Design Name:
// Module Name:   resampler_2_4
9 // Project Name:
// Target Devices:
// Tool versions:
// Description:   Arbitrary resampler, Max interpolation rate = 2
11 //
13 //
15 // Dependencies:
16 //
17 // Revision:
18 // Revision 0.01 - File Created
19 // Additional Comments:
20 //
21 module resampler_2_4(
22     input clk ,
23     input ce ,
24     input rst ,
25     input in_valid ,           // input
26     input [15:0] in_i ,

```

```

27   input [15:0] in_q ,
28   output reg out_valid , // output
29   output reg [15:0] out_i ,
30   output reg [15:0] out_q
31 );
32
33 // output of accum_overflow
34 wire overflow;
35 wire index;
36
37 // fifo control
38 wire full;
39 wire empty;
40 wire valid;
41 wire wr_en = in_valid & ~full;
42 wire rd_en = overflow & ~empty;
43 wire [31:0] dout_fifo;
44
45 // FIFO to store incoming samples
46 fifo_resampler fifo_resampler(
47   .clk(clk), // input clk
48   .rst(rst), // input rst
49   .din({in_i, in_q}), // input [31 : 0] din
50   .wr_en(wr_en), // input wr_en
51   .rd_en(rd_en), // input rd_en
52   .dout(dout_fifo), // output [31 : 0] dout
53   .full(full), // output full
54   .empty(empty), // output empty
55   .valid(valid) // output valid
56 );
57
58 // Clock enable
59 reg ce_internal;
60 always @(posedge clk) begin
61   if(rst)
62     ce_internal <= 0;
63   else
64     // If FIFO is empty, stop until FIFO is not empty
65     ce_internal <= ce & ~empty;
66 end
67
68
69 wire rfd0, rdy0;
70 wire rfd1, rdy1;
71
72 wire [29:0] dout0_i, dout0_q;
73 wire [29:0] dout1_i, dout1_q;
74
75 // Accumulator and overflow detector
76 accum_overflow #(.DELTA(5<<7), .NBANK(2<<8), .RATE(10), .FRACTIONAL(8),
77   .WIDTHINDEX(1)) accum_overflow(

```

```

77  // K = Resample rate
78  // DELTA = NBANK/K = 2.5
79  // RATE has to be the same as FIR clock cycles
80  // and FIR clock cycles have to be half of input strobe rate?
81  .clk(clk),
82  .ce(ce_internal),
83  .rst(rst),
84  .overflow(overflow),
85  .index(index)
86 );
87
88 // Pick output of the filter bank
89 always @(posedge clk) begin
90   if(rst) begin
91     out_valid <= 0;
92     out_i <= 0;
93     out_q <= 0;
94   end
95   // Output of filter 0
96   else if(ce_internal & rdy0 & ~valid & index==0) begin
97     out_valid <= 1'b1;
98     out_i <= dout0_i[29:29-15];
99     out_q <= dout0_q[29:29-15];
100  end
101  // Output of filter 1
102  else if(ce_internal & rdy0 & ~valid & index==1) begin
103    out_valid <= 1'b1;
104    out_i <= dout1_i[29:29-15];
105    out_q <= dout1_q[29:29-15];
106  end
107  else
108    out_valid <= 0;
109 end
110
111 // Filter banks
112 fir_resampler_2_4 fir_resampler0(
113   .clk(clk), // input clk
114   .ce(ce_internal), // input ce
115   .nd(valid), // input nd
116   .filter_sel(1'd0), // input [1 : 0] filter_sel
117   .rfd(rfd0), // ouput rfd
118   .rdy(rdy0), // ouput rdy
119   .din_1(dout_fifo[31:16]), // input [15 : 0] din_1
120   .din_2(dout_fifo[15:0]), // input [15 : 0] din_2
121   .dout_1(dout0_i), // ouput [29 : 0] dout_1
122   .dout_2(dout0_q)); // ouput [29 : 0] dout_2
123
124 fir_resampler_2_4 fir_resampler1(
125   .clk(clk), // input clk
126   .ce(ce_internal), // input ce
127   .nd(valid), // input nd

```

```

129 .filter_sel(1'd1), // input [1 : 0] filter_sel
130 .rfd(rfd1), // output rfd
131 .rdy(rdy1), // output rdy
132 .din_1(dout_fifo[31:16]), // input [15 : 0] din_1
133 .din_2(dout_fifo[15:0]), // input [15 : 0] din_2
134 .dout_1(dout1_i), // output [29 : 0] dout_1
135 .dout_2(dout1_q)); // output [29 : 0] dout_2
endmodule

```

src/multi\_channel/resampler\_2\_4.v

## C.9 Accumulator and Overflow Detector

```

'timescale 1ns / 1ps
////////////////////////////////////////////////////////////////
// Company:      Wireless @ VT
// Engineer:     Jeong-O Jeong
//
// Create Date:   11:26:51 05/03/2012
// Design Name:
// Module Name:   accum_overflow
// Project Name:
// Target Devices:
// Tool versions:
// Description:   accumulator and overflow detector
//
// Dependencies:
//
// Revision:
// Revision 0.01 – File Created
// Additional Comments:
//
////////////////////////////////////////////////////////////////
module accum_overflow
#(
    parameter DELTA = 5,           //delta = n_bank/K = 4/(4/5) = 5;
    parameter N_BANK = 4,
    parameter RATE = 20,
    parameter FRACTIONAL = 0,
    parameter WIDTH_INDEX = 2
)
(
    input clk ,
    input ce ,
    input rst ,
    output overflow ,           // high when overflow
    output [WIDTH_INDEX-1:0] index // index of filter bank
);

```

```

36 // 16 bit delta, 8 bit fractional
37 localparam WIDTHACCUM = 16; // 5 for 16 banks, 57 for fractional
38
39 reg signed [WIDTHACCUM-1:0] accum;
40 wire strobe;
41 reg strobe_delay;
42
43 // delay strobe
44 always @(posedge clk) begin
45   if(rst) begin
46     strobe_delay <= 0;
47   end
48   else if(ce) begin
49     strobe_delay <= strobe;
50   end
51 end
52
53 // strober
54 cic_strober_ce #(.WIDTH(8)) cic_strober_5mhz(
55   .clock(clk),
56   .reset(rst),
57   .ce(ce),
58   .enable(1'b1),
59   .rate(RATE),
60   .strobe_fast(1'b1),
61   .strobe_slow(strobe)
62 );
63
64 // accumulator
65 always @(posedge clk) begin
66   if(rst) begin
67     accum <= 0;
68   end
69   else if(ce) begin
70     if(strobe & (accum >= NBANK)) begin
71       accum <= accum - NBANK;
72     end
73     else if(strobe & (accum < NBANK)) begin
74       accum <= accum + DELTA;
75     end
76   end
77 end
78
79 // overflow detector
80 assign overflow = ce & strobe_delay & (accum >= NBANK);
81
82 // choose filter
83 assign index = accum[WIDTHINDEX+FRACTIONAL-1:FRACTIONAL];
84
85 endmodule

```

---

src/multi\_channel/accum\_overflow.v

# Appendix D

## C++ Source Code for GNU Radio Blocks

### D.1 GNU Radio Transmitter.h and .cc

```
1 /* -*- c++ -*- */
2 /*
3 * Copyright 2004 Free Software Foundation, Inc.
4 *
5 * This file is part of GNU Radio
6 *
7 * GNU Radio is free software; you can redistribute it and/or modify
8 * it under the terms of the GNU General Public License as published by
9 * the Free Software Foundation; either version 3, or (at your option)
10 * any later version.
11 *
12 * GNU Radio is distributed in the hope that it will be useful,
13 * but WITHOUT ANY WARRANTY; without even the implied warranty of
14 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
15 * GNU General Public License for more details.
16 *
17 * You should have received a copy of the GNU General Public License
18 * along with GNU Radio; see the file COPYING. If not, write to
19 * the Free Software Foundation, Inc., 51 Franklin Street,
20 * Boston, MA 02110-1301, USA.
21 */
22
23 // WARNING: this file is machine generated. Edits will be over written
24
25 #ifndef INCLUDED_ZIGBEE_SIG_SOURCE_C_H
26 #define INCLUDED_ZIGBEE_SIG_SOURCE_C_H
27
28 #include <gr_sync_block.h>
29
30 class zigbee_sig_source_c;
```

```

31 typedef boost::shared_ptr<zigbee_sig_source_c> zigbee_sig_source_c_sptr;
33 /*!
34 * \brief signal generator with zigbee_complex output.
35 * \ingroup source_blk
36 */
37
38 class zigbee_sig_source_c : public gr_sync_block {
39     friend zigbee_sig_source_c_sptr
40     zigbee_make_sig_source_c (double ampl, gr_complex offset);
41
42     double d_ampl;
43     gr_complex d_offset;
44     int d_cntr;           // counter to keep track of state
45     int d_N;
46     int d_sent;
47     double d_factor;
48
49     zigbee_sig_source_c (double ampl, gr_complex offset);
50
51 public:
52     virtual int work (int noutput_items,
53                         gr_vector_const_void_star &input_items,
54                         gr_vector_void_star &output_items);
55
56     // ACCESSORS
57     double amplitude () const { return d_ampl; }
58     gr_complex offset () const { return d_offset; }
59
60     // MANIPULATORS
61     void set_amplitude (double ampl);
62     void set_offset (gr_complex offset);
63     char generate_crc(char msg[], int index);
64 };
65
66 zigbee_sig_source_c_sptr
67 zigbee_make_sig_source_c (double ampl, gr_complex offset = 0);
68
69 #endif

```

src/gr-zigbee/zigbee\_sig\_source.c.h

```

1 /* -- c++ -- */
2 /*
3 * Copyright 2004,2010 Free Software Foundation, Inc.
4 *
5 * This file is part of GNU Radio
6 *
7 * GNU Radio is free software; you can redistribute it and/or modify
8 * it under the terms of the GNU General Public License as published by
9 * the Free Software Foundation; either version 3, or (at your option)

```

```

* any later version.
11 *
* GNU Radio is distributed in the hope that it will be useful,
13 * but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
15 * GNU General Public License for more details.
*
17 * You should have received a copy of the GNU General Public License
* along with GNU Radio; see the file COPYING. If not, write to
19 * the Free Software Foundation, Inc., 51 Franklin Street,
* Boston, MA 02110-1301, USA.
21 */

23 // WARNING: this file is machine generated. Edits will be over written

25 #ifdef HAVE_CONFIG_H
#include <config.h>
27 #endif
#include <zigbee_sig_source_c.h>
29 #include <algorithm>
#include <gr_io_signature.h>
31 #include <stdexcept>
#include <gr_complex.h>
33 #include <iostream>
//char msg[]={0x14, 0x61, 0x88, 0x5E, 0x32, 0x33, 0x00, 0x00, 0x00, 0x00,
34 , 0x00, 'H', 'E', 'L', 'O', '!', '!', '!'};
35 //char msg[]={0x61, 0x88, 0xc0, 0x32, 0x33, 0x0, 0x0, 0x0, 0xa2, 0x0, 0
36 x48, 0x45, 0x4c, 0x4f};
// message to be transmitted
37 char msg[]={0x41, 0xcc, 0xe5, 0x78, 0x56, 0x22, 0x5a, 0x63, 0x40, 0x00, 0xa2,
38 0x13, 0x00, 0x70, 0x5a, 0x63, 0x40, 0x00, 0xa2, 0x13, 0x00, 'M', 'E', 'S', 'S',
39 'A', 'G', 'E', ' ', 'C', 'O', 'M', 'I', 'N', 'G', ' ', 'F', 'R', 'O', 'M', ' ', 'U', 'S',
40 'R', 'P', ' ', 'F', 'P', 'G', 'A',0xd};
char chk[]={0x1f, 0xce};//{0xE7, 0x3F};

41 zigbee_sig_source_c::zigbee_sig_source_c (double ampl, gr_complex offset)
42 : gr_sync_block ("sig_source_c",
43     gr_make_io_signature (0, 0, 0),
44     gr_make_io_signature (1, 1, sizeof (gr_complex))),
45     d_ampl (ampl), d_offset (offset)
46 {
47     d_ampl=0;
48     d_cntr=0;
49     d_N=53;           // length of packet
50     d_factor=32768.0;
51     d_sent=0;
52     chk[0]=generate_crc(msg, 0);
53     chk[1]=generate_crc(msg, 1);
54     std :: cout << "======" << std :: endl;
55     std :: cout << (unsigned short)(chk[0]) << std :: endl;
      std :: cout << (unsigned short)(chk[1]) << std :: endl;

```

```

57     std :: cout << "======" << std :: endl;
58 }

59     zigbee_sig_source_c_sptr
60     zigbee_make_sig_source_c (double ampl, gr_complex offset)
61 {
62     return gnuradio::get_initial_sptr(new zigbee_sig_source_c (ampl, offset));
63 }

65 // Generate checksum
66 char zigbee_sig_source_c::generate_crc(char msg[], int index)
67 {
68     char checksum[2]={0x00,0x00};
69     char FCS[16];
70     // initialize FCS vector
71     for(int i=0; i<16; i++){
72         FCS[i]=0;
73     }
74     int j=0;
75     char input;
76     char s0, s1, s2;
77     for(int i=0; i<(d_N-2)*8; i++)
78     {
79         input = 0x01 & ((char)msg[i/8] >> j);
80         s0 = input ^ FCS[0];
81         s1 = s0 ^ FCS[4];
82         s2 = s0 ^ FCS[11];
83         FCS[0] = FCS[1];
84         FCS[1] = FCS[2];
85         FCS[2] = FCS[3];
86         FCS[3] = s1;
87         FCS[4] = FCS[5];
88         FCS[5] = FCS[6];
89         FCS[6] = FCS[7];
90         FCS[7] = FCS[8];
91         FCS[8] = FCS[9];
92         FCS[9] = FCS[10];
93         FCS[10] = s2;
94         FCS[11] = FCS[12];
95         FCS[12] = FCS[13];
96         FCS[13] = FCS[14];
97         FCS[14] = FCS[15];
98         FCS[15] = s0;
99         j=(j+1) % 8;
100    }
101
102    for(int i=0; i<8; i++)
103    {
104        checksum[0]=checksum[0] | ((FCS[i] << i));
105    }
106    for(int i=8; i<16; i++)
107    {
108        checksum[1]=checksum[1] | ((FCS[i] << (i-8)));
109    }
110
111    return checksum;
112 }

```

```

107     {
108         checksum[1]=checksum[1] | ((FCS[i] << (i-8)));
109     }
110 /*std::cout << "======" << std::endl;
111    std::cout << (unsigned short)(0xFF & checksum[0]) << std::endl;
112    std::cout << (unsigned short)(0xFF & checksum[1]) << std::endl;
113    std::cout << "======" << std::endl; */
114     return checksum[index];
115 }

116 int
117 zigbee_sig_source_c::work (int noutput_items,
118     gr_vector_const_void_star &input_items,
119     gr_vector_void_star &output_items)
120 {
121     gr_complex *optr = (gr_complex *) output_items[0];
122     gr_complex t;

123     t = (gr_complex) d_ampl + d_offset;
124 //=====
125 // d_ampl can be between 0 and 255
126 //=====
127     std::cout << "noutput_items: " << noutput_items << std::flush << std::endl;
128     for (int i = 0; i < noutput_items; i++)
129     {
130         // preamble      (4 bytes)
131         // SFD          (1 bytes)
132         // framelength  (1 bytes)
133         // payload       (N bytes)
134         // FCS          (2 bytes)

135         if(d_cntr < 4)           // preamble
136             d_ampl = 0;
137         else if(d_cntr < 5)       // SFD
138             d_ampl = 167;
139         else if(d_cntr < 6)       // framelength
140             d_ampl = d_N;
141         else if(d_cntr < 6+d_N-2) // payload
142             d_ampl=msg[d_cntr-6];
143         else if(d_cntr < 6+d_N-1) // FCS
144             d_ampl=chk[0];
145         else if(d_cntr < 6+d_N)   // FCS
146         {
147             d_ampl=chk[1];
148         }

149         // convert char into complex short
150         optr[i] = gr_complex(d_ampl/d_factor, 1.0/d_factor);

151         // reset if finished encoding a packet
152         if(d_cntr >= 6+d_N+2)

```

```

159     d_cntr = 0;
160     else
161         d_cntr++;
162     }
163     std::cout << "Generating packets .." << std::flush << std::endl;
164     return noutput_items;
165 }

166     void
167 zigbee_sig_source_c::set_amplitude (double ampl)
168 {
169     // d_ampl = ampl;
170 }

171     void
172 zigbee_sig_source_c::set_offset (gr_complex offset)
173 {
174     d_offset = offset;
175 }

```

src/gr-zigbee/zigbee\_sig\_source\_c.cc

## D.2 GNU Radio Receiver .h and .cc

```

1 #ifndef INCLUDED_ZIGBEE_RX_MOD_SS_H
2 #define INCLUDED_ZIGBEE_RX_MOD_SS_H

4 #include <gr_block.h>
5 #include <fstream>

6 class zigbee_rx_mod_ss;

8 typedef boost::shared_ptr<zigbee_rx_mod_ss> zigbee_rx_mod_ss_sptr;
9
10 // public interface for creating new instances
11 // howto_square_ff's constructor is private
12 zigbee_rx_mod_ss_sptr zigbee_make_rx_mod_ss(size_t itemsize);

13 class zigbee_rx_mod_ss : public gr_block
14 {
15     private:
16         // allow howto_make_square_ff to access the private constructor
17         // "friend" – once a non-member function is declared as a friend, it can
18         // access private data of the class
19         friend zigbee_rx_mod_ss_sptr zigbee_make_rx_mod_ss(size_t itemsize);

20         zigbee_rx_mod_ss(size_t itemsize); // private constructor
21
22

```

```

24     //enum state_t {STATE_FRAMELENGTH, STATE_PAYLOAD, STATE_CRC}
25     //int d_state;                                // FRAMELENGTH, PAYLOAD, CRC
26     int d_counter_detected;
27     int d_counter64;
28     int d_byte_counter;
29     int d_total_crc_correct;
30     int d_total_bits;
31     int d_error_bits;
32     std::ofstream d_ofile;
33     std::ofstream d_preamble_file;
34
35 public:
36     ~zigbee_rx_mod_ss(); // public destructor
37
38     int general_work (int noutput_items,
39                       gr_vector_int &ninput_items,
40                       gr_vector_const_void_star &input_items,
41                       gr_vector_void_star &output_items);
42
43 };
44
45 #endif /* INCLUDED_HOWTO_SQUARE_FF_H */

```

src/gr-zigbee/zigbee\_rx\_mod\_ss.h

```

1 #ifdef HAVE_CONFIG_H
2 #include "config.h"
3 #endif
4
5 #include <zigbee_rx_mod_ss.h>
6 #include <gr_io_signature.h>
7 #include <iostream>
8 #include <iomanip>
9 #include <cstring>
10 #include <stdio.h>
11 #include <fstream>
12
13 std::ofstream output_file;
14
15 zigbee_rx_mod_ss_sptr zigbee_make_rx_mod_ss(size_t itemsize)
16 {
17     return zigbee_rx_mod_ss_sptr (new zigbee_rx_mod_ss (itemsize));
18 }
19
20 static const int MIN_IN = 1;
21 static const int MAX_IN = 1;
22 static const int MIN_OUT = 0;
23 static const int MAXOUT = 0;
24
25 // Private constructor

```

```

26 zigbee_rx_mod_ss::zigbee_rx_mod_ss(size_t itemsize) : gr_block(""
27   zigbee_rx_mod_ss",
28   gr_make_io_signature(MIN_IN, MAX_IN, itemsize),
29   gr_make_io_signature(MIN_OUT, MAX_OUT, sizeof(short)))
30 {
31   d_counter64=0;
32   d_byte_counter=0;
33   d_total_crc_correct=0;
34   d_total_bits=0;
35   d_error_bits=0;
36   output_file.open("output_zigbee.txt");
37
38 // Virtual destructor
39 zigbee_rx_mod_ss::~zigbee_rx_mod_ss()
40 {
41   output_file.close();
42   // d_ofile.close();
43   // d_preamble_file.close();
44 }
45
46 //char message[]="0123456789:<=>?@ABCDEFGHIJKLMNO";
47 int LEN=40; // length of transmitted payload
48 // Transmitted message to be compared against the received message for BER
49 // calculation
50 //char message[]="0123456789:<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789:<=>?
51 // @ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
52 char message[]="0123456789012345678901234567890123456789";
53
54 // count the number of bits that are one
55 int bitcount(char n)
56 {
57   int tot=0;
58   int i;
59   for (i=1; i<=128; i=i*2){
60     if(n & (0xFF & i))
61       ++tot;
62   }
63   return tot;
64 }
65
66 // compare bits
67 int compare_bits(char received, char ref)
68 {
69   return bitcount(0xFF & ((ref^received)));
70 }
71
72 int zigbee_rx_mod_ss::general_work(int noutput_items,
73   gr_vector_int &ninput_items,
74   gr_vector_const_void_star &input_items,

```

```

74     gr_vector_void_star &output_items)
75 {
76     int MASK = 0x000000FF;
77     const char *in = (const char *) input_items[0];
78     short *out = (short *) output_items[0];
79
80     char *word = (char *)malloc(4);
81     int state;
82     int prev_state=0;
83     int strobe_byte;
84     char byte;
85     int count_byte=0;
86     int crc_correct=0;
87
88     for(int i=0; i<noutput_items; i++){
89         std::memcpy(word, in, 4);
90
91         // parse 32-bit input
92         state = (0x0F & word[0]);
93         crc_correct = ((0x10 & word[1]) >> 4);
94         strobe_byte = ((0x20 & word[1]) >> 5);
95         byte = (0xFF & word[2]);
96
97         // reset counter if at DECODEFRAMELENGTH or before
98         if(state < 5)
99         {
100             d_byte_counter = 0;
101         }
102         // if at DECODESYMBOLS state
103         if(state == 5 && strobe_byte == 1)
104         {
105             if(d_byte_counter == 0)
106                 std::cout << "PACKET DETECTED " << ++d_counter64 << std::flush << std::endl;
107
108             // if passed header information start printng payload
109             if(d_byte_counter > 9)
110             {
111                 std::cout << byte << std::flush;
112                 // start couting bits in error
113                 if(d_byte_counter-10 < LEN){
114                     d_error_bits = d_error_bits + compare_bits(byte, message[
115                         d_byte_counter-10]);
116                     d_total_bits += 8;
117                 }
118                 // count the number of bits received
119                 d_byte_counter++;
120             }
121
122             // if in CHECKCRC state

```

```

124     if(state == 8 && strobe_byte ==1){
125         std::cout << "\nCRC:" << crc_correct << std::endl;
126         if(crc_correct == 1){
127             d_total_crc_correct++; // count the number of packets that have
128             correct CRC
129             // calculate and print BER
130             std::cout << "Total CRC: " << d_total_crc_correct << std::flush << std
131             ::endl;
132             std::cout << "BER : " << (1.0*(double)d_error_bits) / (1.0*(double)
133                 d_total_bits) << std::flush << std::endl;
134             std::cout << "errors : " << (d_error_bits) << std::flush << std::endl;
135             std::cout << "total : " << d_total_bits << std::flush << std::endl;
136         }
137         else{
138             std::cout << "Total CRC: " << d_total_crc_correct << std::flush << std
139             ::endl;
140             std::cout << "BER : " << (1.0*(double)d_error_bits) / (1.0*(double)
141                 d_total_bits) << std::flush << std::endl;
142             std::cout << "errors : " << (d_error_bits) << std::flush << std::endl;
143             std::cout << "total : " << d_total_bits << std::flush << std::endl;
144         }
145         // store previous state
146         prev_state = state;
147         // increment pointer
148         in+=4;
149     }
150     std::free(word);
151     // tell the scheduler how many items were consumed
152     consume_each(noutput_items);
153     // number of items consumed
154     return noutput_items;
155 }
```

src/gr-zigbee/zigbee\_rx\_mod\_ss.cc