

# Front-End Signal Processing for Red Pitaya Spectrum Analyzer

Bachelor Thesis



**Degree Program:** Electrical Engineering and Information Technology

**Course:** Project 6

**Coaches:** Prof. Dr. Richard Gut, Michael Pichler

**External Expert:** Dr. Jürg Stettbacher

**Team:** Raphael Frey, Noah Hüsser

**Date:** August 23, 2017

**Revision:** 1.0.2

Content © 2017 Raphael Frey  
Noah Hüsser

Design © 2017 Raphael Frey

This bachelor thesis was created in the spring semester of 2017 at the FHNW University of Applied Sciences and Arts Northwestern Switzerland in the degree program *Electrical and Information Technology*.

This is the electronic version of the document, with colored links and color gradients for decorations. The color gradients in this version might pose problems when printing; the print version is recommended in that case. It can be found at <https://github.com/alpenwasser/pitaya/tree/master/doc/report/snapshots> or requested by email from [rmfre@runbox.com](mailto:rmfre@runbox.com).

This document has been compiled at least 7525 times across 4 hosts so far.

#### Version History

|               |                 |                                                                                                                                                                                                                                                                                                                                              |
|---------------|-----------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Version 1.0.0 | August 18, 2017 | Release as handed in for official review                                                                                                                                                                                                                                                                                                     |
| Version 1.0.1 |                 | Skipped to keep in sync with global repository                                                                                                                                                                                                                                                                                               |
| Version 1.0.2 | August 23, 2017 | Removes <i>Media</i> appendix, adds edition notice (this page), tweaks plots in <i>Verification</i> chapter for clarity and consistency, adjusts height of axes in Figure 6.5, adjusts x axis labels in Figures 1.10 and 1.11 (bottom), fixes y axis label in Figure 1.11, adds vertical space between chapters in <i>List of Listings</i> . |

# Abstract

The objective of this project is to realize a system for measuring analog signals in the kilohertz to low megahertz range with a digital signal processing system in real time, offering an affordable alternative to expensive oscilloscopes and spectrum analyzers. The basic components of the system are a Red Pitaya STEMlab for signal acquisition and processing and a personal computer or mobile device for data visualization and further analysis.

To allow transmission over an Ethernet connection and to improve signal quality, the high-rate data stream coming from the STEMlab's analog-to-digital converter is decimated to a lower-rate signal using the integrated FPGA. During decimation, the signal passes through one of six filter chains to attenuate aliasing effects.

The chains run on the STEMlab's FPGA. They are based on a combination of FIR and CIC filters, and decimate the incoming 125 MHz signal to output frequencies between 50 kHz and 25 MHz, depending on the chain. They achieve an aliasing attenuation of 60 dB and exhibit negligible passband droop. A signal-to-noise ratio of up to 84 dB has been measured.

The decimated signal is then passed on to an embedded GNU/Linux and sent to a client over a network connection. A newly developed oscilloscope application allows observation of the signal from a client device. The application is based on JavaScript, allowing for easy deployment across any platform with a modern browser available.

All components created for this project are provided under the MIT license. A comprehensive toolchain covering filter design, FPGA tools, embedded GNU/Linux and front end development, along with documentation, allows anyone to extend and modify the system and to tailor it to their needs.

**Key Words:** Red Pitaya, STEMlab, Xilinx, FPGA, Oscilloscope, Spectrum Analyzer, DSP, Digital Signal Processing, Digital Filters, FIR Filter, CIC Filter, Lowpass, Sampling, Down-sampling, Decimation, Aliasing, Attenuation, FFT, JavaScript, Web Application, VHDL, C, C++, Open Source, MIT License.

# Thanks

Many thanks go to all the people who have supported us during our studies, and this project in particular:

*Friends and families*, for their patience when we were absent from their lives.

*Richard Gut*, for showing much patience in explaining things again and again when our minds proved resistant to digital signal processing insight.

*Michael Pichler*, for his support when debuggin FPGA problems.

*Hanspeter Schmid*, for his help with understanding the magic of CIC filters and SNR calculations.

*Hans Buchmann*, for his avid enthusiasm in helping us to figure out device trees on GNU/Linux.

*Pavel Demin*, for his ADC core and prompt technical support in helping us to understand the innards of the STEMlab.

# Short Contents

|                                 |           |
|---------------------------------|-----------|
| <b>Introduction</b>             | <b>1</b>  |
| <b>I Project Report</b>         | <b>3</b>  |
| <b>1 Theoretical Background</b> | <b>4</b>  |
| <b>2 Mission</b>                | <b>28</b> |
| <b>3 Filter Design</b>          | <b>37</b> |
| <b>4 FPGA</b>                   | <b>43</b> |
| <b>5 Server</b>                 | <b>49</b> |
| <b>6 Oscilloscope</b>           | <b>52</b> |
| <b>7 Verification</b>           | <b>62</b> |
| <b>8 Conclusions</b>            | <b>71</b> |
| <b>Bibliography</b>             | <b>72</b> |
| <b>II Developer Guide</b>       | <b>75</b> |
| <b>9 Project Structure</b>      | <b>76</b> |
| <b>10 FPGA/SoC Toolchain</b>    | <b>78</b> |
| <b>11 Filter Toolchain</b>      | <b>81</b> |
| <b>12 Server</b>                | <b>86</b> |
| <b>13 Scope</b>                 | <b>94</b> |
| <b>III User Guide</b>           | <b>95</b> |
| <b>14 Setup</b>                 | <b>96</b> |
| <b>15 Operation</b>             | <b>99</b> |

|                                 |            |
|---------------------------------|------------|
| <b>Appendices</b>               | <b>101</b> |
| <b>A Theoretical Background</b> | <b>102</b> |
| <b>B Filter Design</b>          | <b>109</b> |
| <b>C Oscilloscope</b>           | <b>120</b> |
| <b>D Licenses</b>               | <b>126</b> |
| <b>E Problem Statement</b>      | <b>127</b> |

# Contents

|                                                                      |           |
|----------------------------------------------------------------------|-----------|
| <b>Introduction</b>                                                  | <b>1</b>  |
| <b>I Project Report</b>                                              | <b>3</b>  |
| <b>1 Theoretical Background</b>                                      | <b>4</b>  |
| 1.1 The Digital Signal Processing Chain . . . . .                    | 4         |
| 1.2 Digital Filters . . . . .                                        | 9         |
| 1.2.1 IIR Filters . . . . .                                          | 9         |
| 1.2.2 FIR Filters . . . . .                                          | 10        |
| 1.2.3 CIC Filters . . . . .                                          | 14        |
| 1.2.3.1 General Description . . . . .                                | 14        |
| 1.2.3.2 Frequency Characteristics . . . . .                          | 15        |
| 1.2.3.3 Compensators . . . . .                                       | 19        |
| 1.2.3.4 Register Growth . . . . .                                    | 19        |
| 1.2.3.5 Errors Due to Truncation and Rounding . . . . .              | 19        |
| 1.2.3.6 Summary . . . . .                                            | 23        |
| 1.3 Multi-Stage Filter Designs . . . . .                             | 23        |
| <b>2 Mission</b>                                                     | <b>28</b> |
| 2.1 The Red Pitaya STEMlab 125-14 . . . . .                          | 28        |
| 2.1.1 Hardware Overview . . . . .                                    | 28        |
| 2.1.2 Downsampling on the STEMlab With Stock Configuration . . . . . | 29        |
| 2.2 Possible Solutions . . . . .                                     | 32        |
| 2.3 Concept . . . . .                                                | 34        |
| 2.3.1 FPGA Components . . . . .                                      | 35        |
| 2.3.2 Interfacing Layer . . . . .                                    | 35        |
| 2.3.3 Oscilloscope . . . . .                                         | 36        |
| <b>3 Filter Design</b>                                               | <b>37</b> |
| 3.1 Requirements . . . . .                                           | 37        |
| 3.2 Cascade Concept . . . . .                                        | 38        |
| 3.3 Filter Specifications . . . . .                                  | 39        |
| <b>4 FPGA</b>                                                        | <b>43</b> |
| 4.1 The Xilinx Toolchain . . . . .                                   | 43        |
| 4.2 The ADC Core . . . . .                                           | 44        |
| 4.3 The Logger Core . . . . .                                        | 44        |
| 4.4 The Filter Chains . . . . .                                      | 45        |
| 4.4.1 Filter Compilers . . . . .                                     | 45        |

|                                                            |           |
|------------------------------------------------------------|-----------|
| 4.4.2 Bit Propagation Through the Filter Chains . . . . .  | 45        |
| 4.4.3 Ensuring Maximum Dynamic Range . . . . .             | 46        |
| 4.4.4 Errors Due to Truncation in the CIC Filter . . . . . | 47        |
| <b>5 Server</b>                                            | <b>49</b> |
| 5.1 Requirements . . . . .                                 | 49        |
| 5.2 Design Choices . . . . .                               | 49        |
| 5.3 Implementation . . . . .                               | 50        |
| <b>6 Oscilloscope</b>                                      | <b>52</b> |
| 6.1 Requirements . . . . .                                 | 52        |
| 6.2 Design Choices . . . . .                               | 52        |
| 6.2.1 Networking . . . . .                                 | 54        |
| 6.3 Product . . . . .                                      | 55        |
| 6.3.1 Application Structure . . . . .                      | 55        |
| 6.3.2 Graphics . . . . .                                   | 58        |
| 6.3.3 Power Calculation . . . . .                          | 58        |
| 6.3.4 SNR Autodetection . . . . .                          | 59        |
| <b>7 Verification</b>                                      | <b>62</b> |
| 7.1 RMS Frequency Response . . . . .                       | 63        |
| 7.2 Mean Frequency Response . . . . .                      | 63        |
| 7.3 SNR Frequency Response . . . . .                       | 65        |
| 7.4 Stopband Attenuation . . . . .                         | 66        |
| 7.5 Summary . . . . .                                      | 70        |
| <b>8 Conclusions</b>                                       | <b>71</b> |
| <b>Bibliography</b>                                        | <b>72</b> |
| <b>II Developer Guide</b>                                  | <b>75</b> |
| <b>9 Project Structure</b>                                 | <b>76</b> |
| <b>10 FPGA/SoC Toolchain</b>                               | <b>78</b> |
| 10.1 Setting Up the Build Box . . . . .                    | 78        |
| 10.2 Setting up Vivado . . . . .                           | 79        |
| 10.3 Building a Linux . . . . .                            | 79        |
| <b>11 Filter Toolchain</b>                                 | <b>81</b> |
| 11.1 Toolchain Structure . . . . .                         | 81        |
| 11.2 Usage . . . . .                                       | 83        |
| <b>12 Server</b>                                           | <b>86</b> |
| 12.1 Building the Server . . . . .                         | 86        |
| 12.2 Manually Starting the Server . . . . .                | 86        |
| 12.2.1 onHttpRequest . . . . .                             | 87        |
| 12.2.2 onMessage . . . . .                                 | 88        |
| 12.3 Instruction Set . . . . .                             | 88        |
| 12.3.1 Forcing a New Trigger Event . . . . .               | 88        |
| 12.3.2 Configuring the Frame Sent by the Server . . . . .  | 89        |

|                                                                         |            |
|-------------------------------------------------------------------------|------------|
| 12.3.3 Setting the Number of Logged Channels . . . . .                  | 90         |
| 12.3.4 Reading the Currently Stored Frame . . . . .                     | 90         |
| 12.3.5 Requesting a New Frame and Reading It When It Is Ready . . . . . | 91         |
| 12.3.6 Setting the Sampling Rate . . . . .                              | 91         |
| 12.3.7 Polling the Status of the Logger . . . . .                       | 92         |
| 12.3.8 Configuring the Trigger . . . . .                                | 93         |
| <b>13 Scope</b>                                                         | <b>94</b>  |
| <b>III User Guide</b>                                                   | <b>95</b>  |
| <b>14 Setup</b>                                                         | <b>96</b>  |
| <b>15 Operation</b>                                                     | <b>99</b>  |
| <b>Appendices</b>                                                       | <b>101</b> |
| <b>A Theoretical Background</b>                                         | <b>102</b> |
| A.1 Internal Behavior of a CIC Filter . . . . .                         | 102        |
| A.2 CIC Filter Tables . . . . .                                         | 108        |
| <b>B Filter Design</b>                                                  | <b>109</b> |
| B.1 Decimation of 625: Variants . . . . .                               | 109        |
| B.2 Resource Usage for FIR Filters on the FPGA . . . . .                | 111        |
| B.3 Filter Frequency Responses . . . . .                                | 112        |
| B.3.1 5steep . . . . .                                                  | 112        |
| B.3.2 5flat . . . . .                                                   | 112        |
| B.3.3 2steep . . . . .                                                  | 113        |
| B.3.4 CIC25 . . . . .                                                   | 113        |
| B.3.5 CFIR25 . . . . .                                                  | 113        |
| B.3.6 CIC125 . . . . .                                                  | 114        |
| B.3.7 CFIR125 . . . . .                                                 | 114        |
| B.3.8 Chain for R=25 . . . . .                                          | 115        |
| B.3.9 Chain for R=125 . . . . .                                         | 116        |
| B.3.10 Chain for R=625 . . . . .                                        | 117        |
| B.3.11 Chain for R=1250 . . . . .                                       | 118        |
| B.3.12 Chain for R=2500 . . . . .                                       | 119        |
| <b>C Oscilloscope</b>                                                   | <b>120</b> |
| C.1 WebSockets . . . . .                                                | 120        |
| C.2 State Tree of Oscilloscope . . . . .                                | 122        |
| C.3 mithril.js . . . . .                                                | 123        |
| C.4 WebGL . . . . .                                                     | 124        |
| C.5 FFT Windowing Parameters . . . . .                                  | 125        |
| <b>D Licenses</b>                                                       | <b>126</b> |
| D.1 The MIT License . . . . .                                           | 126        |
| <b>E Problem Statement</b>                                              | <b>127</b> |

# List of Figures

|      |                                                                      |    |
|------|----------------------------------------------------------------------|----|
| 1    | System Overview . . . . .                                            | 1  |
| 1.1  | The DSP Chain . . . . .                                              | 4  |
| 1.2  | Signals Passing Through the DSP Chain (Simplified) . . . . .         | 6  |
| 1.3  | Aliasing Illustrated via Signal Frequency Band . . . . .             | 6  |
| 1.4  | Aliasing With Harmonic Signals . . . . .                             | 7  |
| 1.5  | Folding Back of Stopband Components Into Passband . . . . .          | 8  |
| 1.6  | IIR Filter: Biquad . . . . .                                         | 9  |
| 1.7  | FIR Filter Topology Example . . . . .                                | 10 |
| 1.8  | Brick Wall Filter vs. FIR Filter (simplified) . . . . .              | 11 |
| 1.9  | Specifying FIR Filter Constraints . . . . .                          | 12 |
| 1.10 | Impulse Response of a FIR Filter . . . . .                           | 12 |
| 1.11 | Half-band Filter Frequency Response . . . . .                        | 13 |
| 1.12 | Integrator Stage . . . . .                                           | 14 |
| 1.13 | Comb Stage . . . . .                                                 | 14 |
| 1.14 | CIC Filter Topology . . . . .                                        | 14 |
| 1.15 | Frequency Responses for Integrators, Combs and CIC Filters . . . . . | 16 |
| 1.16 | CIC Filter: Passband and Aliasing Attenuation . . . . .              | 17 |
| 1.17 | CIC Filter: Passband and Aliasing Attenuation . . . . .              | 18 |
| 1.18 | Influence of Design Parameters on Frequency Response . . . . .       | 20 |
| 1.19 | CIC Compensator . . . . .                                            | 21 |
| 1.20 | Frequency Response of Multi-Stage Vs. Single-Stage Design . . . . .  | 24 |
| 1.21 | Cascade: Transition Band Overlap . . . . .                           | 26 |
| 1.22 | Cascade: Stopband Attenuation . . . . .                              | 27 |
| 2.1  | STEMlab Photo . . . . .                                              | 29 |
| 2.2  | STEMlab Block Diagram . . . . .                                      | 30 |
| 2.3  | Frequency Response of Moving Averager: Example . . . . .             | 31 |
| 3.1  | Filter Chain Concept . . . . .                                       | 39 |
| 4.1  | System Schematic . . . . .                                           | 43 |
| 4.2  | Bit Flow in Filter Chain . . . . .                                   | 46 |
| 4.3  | Good Vs. Bad Use Of Dynamic Range . . . . .                          | 48 |
| 5.1  | Server Event Structure . . . . .                                     | 51 |
| 6.1  | The Scope Application . . . . .                                      | 55 |
| 6.2  | Scope Event Structure . . . . .                                      | 56 |
| 6.3  | The Scope Structure . . . . .                                        | 57 |
| 6.4  | FFT Comparison . . . . .                                             | 58 |

|                                                                         |     |
|-------------------------------------------------------------------------|-----|
| 6.5 SNR comparison . . . . .                                            | 61  |
| 7.1 RMS at Filter Chain Output . . . . .                                | 64  |
| 7.2 Mean at Filter Outputs . . . . .                                    | 64  |
| 7.3 SNR at Filter Outputs . . . . .                                     | 65  |
| 7.4 Attenuation in Passband, Edge and Stopband for $R = 5$ . . . . .    | 67  |
| 7.5 Attenuation in Passband, Edge and Stopband for $R = 25$ . . . . .   | 67  |
| 7.6 Attenuation in Passband, Edge and Stopband for $R = 125$ . . . . .  | 68  |
| 7.7 Attenuation in Passband, Edge and Stopband for $R = 625$ . . . . .  | 68  |
| 7.8 Attenuation in Passband, Edge and Stopband for $R = 1250$ . . . . . | 69  |
| 7.9 Attenuation in Passband, Edge and Stopband for $R = 2500$ . . . . . | 69  |
| 9.1 Project Structure Tree . . . . .                                    | 77  |
| 14.1 Entering the URL . . . . .                                         | 97  |
| 14.2 The popup warn popup . . . . .                                     | 97  |
| 14.3 Accept popups in the future . . . . .                              | 98  |
| 14.4 Running the scope . . . . .                                        | 98  |
| 15.1 The Scope Application . . . . .                                    | 100 |
| A.1 Topology of Example Filter . . . . .                                | 102 |
| A.2 Frequency Respose of Example CIC Filter . . . . .                   | 103 |
| A.3 Two's Complement Circle . . . . .                                   | 104 |
| A.4 Simulink Filter Model . . . . .                                     | 105 |
| A.5 Simulink Simulation Results . . . . .                               | 106 |
| B.1 Decimation Chain Variants for Rate of 625 . . . . .                 | 110 |
| B.2 Usage Report FIR Compiler . . . . .                                 | 111 |

# List of Tables

|     |                                                                       |     |
|-----|-----------------------------------------------------------------------|-----|
| 2.1 | Measurements of STEMlab 125-14 Stock Configuration . . . . .          | 32  |
| 2.2 | Decision Matrix of Possible System Design Choices . . . . .           | 34  |
| 3.1 | Downsampling Ratios, Decompositions, and Target Frequencies . . . . . | 38  |
| 3.2 | Summary of Filter Specifications . . . . .                            | 40  |
| 3.3 | Transition Band Widths . . . . .                                      | 42  |
| 6.1 | Comparison of Programming Languages . . . . .                         | 53  |
| 6.2 | Window Correction Factors . . . . .                                   | 59  |
| 7.1 | SNR Referenced Against $50\Omega$ . . . . .                           | 66  |
| 7.2 | Mean Metrics for All Filter Chains . . . . .                          | 70  |
| A.1 | CIC Filter Example: States for 16 Cycles . . . . .                    | 105 |
| A.2 | CIC Filter Example: Output Out of Range . . . . .                     | 107 |
| A.3 | CIC Filter Passband Attenuations . . . . .                            | 108 |
| A.4 | CIC Filter Passband Aliasing Attenuation . . . . .                    | 108 |
| B.1 | FIR Compiler Parameters . . . . .                                     | 111 |
| C.1 | FFT Windowing Parameters . . . . .                                    | 125 |

# List of Listings

|                                                                                                |     |
|------------------------------------------------------------------------------------------------|-----|
| 11.1 Using <code>cliDispatcher.m</code> . . . . .                                              | 83  |
| 11.2 Calling <code>cliDispatcher.m</code> from Matlab's Command-line Interface . . . . .       | 83  |
| 11.3 Creating an Entry for <code>cliDispatcher.m</code> in <code>guiWrapper.m</code> . . . . . | 84  |
| 11.4 Designing two FIR Filters . . . . .                                                       | 84  |
| 11.5 Cell Array with Two FIR Filters . . . . .                                                 | 84  |
| 11.6 Designing a CIC Filter and Its Compensator . . . . .                                      | 85  |
| 11.7 Cascading Two Filter Cell Arrays . . . . .                                                | 85  |
|                                                                                                |     |
| 12.1 Forcing a New Trigger Event . . . . .                                                     | 88  |
| 12.2 Configuring the Frame Sent by the Server . . . . .                                        | 89  |
| 12.3 Setting the Number of Logged Channels . . . . .                                           | 90  |
| 12.4 Reading the Currently Stored Frame . . . . .                                              | 90  |
| 12.5 Requesting a New Frame and Read It When It Is Ready . . . . .                             | 91  |
| 12.6 Setting the sampling Rate . . . . .                                                       | 91  |
| 12.7 Polling the Status of the Logger . . . . .                                                | 92  |
| 12.8 Configuring the Trigger . . . . .                                                         | 93  |
|                                                                                                |     |
| C.1 Using Websockets in JavaScript . . . . .                                                   | 121 |
| C.2 Scope State Tree . . . . .                                                                 | 122 |
| C.3 Basic Usage of <code>mithril</code> Components . . . . .                                   | 123 |
| C.4 Drawing on Canvas in JavaScript . . . . .                                                  | 124 |
| C.5 Usage of <code>requestAnimationFrame</code> Callback . . . . .                             | 125 |

# Introduction

Electronic measuring equipment has historically tended to be very pricey, with specialized appliances sometimes costing as much as a middle-class car, or even more. While high-performance solutions are unlikely to be replaced with something radically different in the foreseeable future, modern, affordable FPGAs are a viable alternative in many use cases nowadays. By keeping performance requirements within reasonable bounds and pairing the FPGA with an appropriate front-end, sufficient performance for many applications can be achieved at a very competitive price point. Additionally, an FPGA offers vastly superior flexibility over the fixed silicon of conventional signal processing chips, since its hardware capabilities can be altered even after deployment.

A suitable front-end for an FPGA usually comprises dedicated ADCs and DACs and analog filters. Along with FPGA chips themselves, ADC and DAC chips have become much more economical in recent years, resulting in a product which may cost as little as a few hundred Swiss Francs and which can replace an apparatus several times as expensive.

This project aims to equip such an FPGA board with logic that can record, filter and store electrical signals with adjustable sampling rates up to 125 MHz. To complement the hardware subsystem, a software component is provided, consisting of two applications: One runs on an embedded GNU/Linux on the board itself, while the other runs on a user's computer. The application on the board (the *server*) is responsible for transmitting the recorded data over a network connection, while the software running on the user's computer (the oscilloscope, or *scope* for short) serves to visualize and process the measurement results. This concept is depicted in Figure 1.

The objective of this project is to provide a device which enables students and hobbyists to analyze signals encompassing the region from audio frequencies to the low megahertz range. Compared to the frequencies which modern hardware can handle (dozens to hundreds of megahertz), the sampling rates required to process such signals can be kept within the limits suitable for an FPGA.

A RedPitaya STEMlab 125-14 board is used as the basis for the hardware. It easily offers sufficient performance to process the signals in the desired range, having an ADC which provides a 14 bit signal at 125 MHz on two channels. Indeed, downsampling this signal is



**Figure 1:** An overview of the main system components. An analog signal (left) is measured, the data goes into the server and is then transmitted via network to a computer running the scope.

*necessary* if it is to be transmitted over a network connection, since its data rate far exceeds the available bandwidth.

Thus, two primary objectives can be defined:

1. The signal coming out of the ADC must be decimated.
2. The resulting data stream must be visualized for the end user.

Downsampling the data provided by the ADC is performed on the FPGA. Because downsampling introduces unwanted frequencies into the signal's spectrum, it needs to be processed by filters which attenuate those components. While the STEMlab offers some limited capability in this area in its stock configuration, this was deemed insufficient and has been improved in this project. Several filter chains are implemented to allow decimation rates between 5 and 2500, corresponding to output frequencies between 25 MHz and 50 kHz, respectively. The server application then transmits the filtered data stream over Ethernet to a client.

The filter chains have achieved a signal-to-noise ratio of up to 84 dB in tests, depending on the chosen chain and input frequency. Additionally, the 14 bit signal from the ADC is improved by 1.2 bit to 1.8 bit, again depending on the chain. Passband shapes show negligible droop, and stopband aliasing is 60 dB.

Cost-effective FPGA boards like the STEMlab usually come without a physical user interface with displays and buttons. This keeps the device compact and its cost low. Since fast personal computers are ubiquitous these days, it seems obvious to exploit that and run an oscilloscope application on such a device. Due to the fragmentation of the modern computing world into various ecosystems (Windows, GNU/Linux, macOS, Android, ...), web technologies form the foundation of the scope; the application can be accessed from any modern browser. This makes life easier both for the developers and for the end users.

**This document is split into four main parts:** Part I, contains the primary project report. Part II is the developer guide, and begins on page 76. A short user guide is provided in Part III from page 96 onwards. Appendices are located on page 109 and onwards.

The project report covers information relevant to the design and implementation of the final product. Its first chapter starts on presents some relevant theoretical background on digital signal processing, with an emphasis on digital filters and CIC filters in particular. The next chapter outlines the process leading to the concept for our product, and compares a few alternative choices against it, beginning on page 28.

The design and implementation of the product is detailed in Chapters 3, 4, 5 and 6. Chapter 3 documents the filter design; both requirements and specifications are documented. An overview of the FPGA implementation is given in Chapter 4, highlighting some key points which proved challenging during development. The transmission of data between STEMlab and client is handled by a server application, documented in Chapter 5. Finally, the concepts and design choices underpinning the scope are explained in Chapter 6. The chapters begin on pages 37, 43, 49 and 52, respectively.

The product's performance is assessed in *Verification* from page 62 onwards, and Chapter 8 contains some concluding remarks on the overall result and possible future steps.

The Developer and User Guides are mostly self-contained. The Developer Guide is intended for people who wish to use our product, or parts of it, to implement a system of their own. The User Guide is intended for end-users who wish to perform measurements with our product.

All components specifically developed for this project fall under the MIT license, a copy of which is located in Appendix D.

**Part I**

**Project Report**

# 1

## CHAPTER

# Theoretical Background

Before delving into the design process, some aspects of the underlying theory should be understood. While this chapter is not intended to be a comprehensive treatise on the subject, it shall serve as a short refresher. At its end, the reader should have sufficient insight to understand the basic motivation of our project from a theoretical point of view.

## 1.1 The Digital Signal Processing Chain

Digitally acquiring a signal generally requires at least the following steps:

- Passing the signal through an analog low-pass filter.
- Sampling and quantizing the filtered signal.

The resulting sequence of values can then be further digitally processed. The necessary building blocks for this process are portrayed in Figure 1.1.

Of particular interest for our application is what happens in the ADC. The quantization process converts a value-continuous signal into a value-discrete one, with its resolution being a specification of the ADC. As an example, the ADC in our system has a resolution of 14 bits, meaning it can divide its valid input range into 16 384 values. Given an input range of  $2 \text{ V}_{\text{PP}}$ , this equates to a resolution of roughly  $122 \mu\text{V}$  (in theory). This quantization process is the source of what is generally known as *quantization noise*.

Besides the quantization, the other step happening in the ADC is sampling; a time-continuous signal is converted into a series of time-discrete values. The time between those values is known as *sampling time*, its inverse is the *sampling frequency*. Note that usually these are constant, at least during the time where the signal is measured. This need not strictly be the case in theory though. In our system, this sampling frequency is a fixed property of the ADC, and is 125 MHz.



**Figure 1.1:** The basic building blocks of the DSP chain from its analog input to its digitally processed output. From left to right: The analog low-pass filter (LP), the analog-to-digital converter (ADC), and an arbitrary digital signal processing system for further processing of the ADC's output (DSP).

The sampling step lies at the core of the problem our project intends to address: *aliasing*. Therefore, we will take a closer look at a few consequences of the sampling process, and how they are relevant to this project.

Descriptively, the sampling process can be thought of as looking at a signal at specific points in time and capturing its value at those points. Mathematically, this amounts to multiplying the signal with a series of Dirac pulses in the time domain, and convolving with a series of Dirac pulses in the frequency domain<sup>1</sup>. This convolution in the frequency domain lies at the heart of the problem of aliasing, because it results in the incoming signal's spectrum being repeated at intervals of  $f_s$  (see stage 3 in Figure 1.2). This is no problem as long as the spectrum of the incoming signal fits within the boundaries set by this repetition. But if the spectrum of the incoming signal is too broad, two or more recurrences of the spectrum will overlap. This effect is highlighted in Figure 1.3.

This overlap results in two primary problems:

- The digital signal may not be unambiguously reconstructable into an analog signal, if that is intended.
- Frequencies may occur in the digital signal stream which are not actually present in the original signal. This problem is often referred to as the *folding back* of frequency components. See Figure 1.4 for an illustration of how this might look. This problem is of particular interest to our application, as we will see later.

Once a signal has left the ADC and is handed down the DSP chain for further processing, the primary problem becomes one of resources, particularly in real-time applications. In most systems, the available hardware is a fixed constraint and depending on what sort of processing is to be conducted on the digital data stream, the available resources may or may not suffice. If available resources are found to be insufficient for real-time processing of the data stream, one may choose to

- not process the data in real time,
- reduce the complexity of the computations, or
- reduce the amount of data to be processed through *downsampling* of the signal.

The last case is the route which is chosen in our application. The main constraint on the STEMlab is that the data being generated cannot be moved off the device in real time. Therefore the amount of data must be reduced before it can be moved off the device to a computer for viewing or further processing (see Section 3.1).

Because downsampling a signal is in essence nothing more than the sampling of a signal which has already been sampled, a lot of the considerations which are valid for the step from an analog to a digital signal as outlined above are either very similar or even identical. Specifically, the same considerations for aliasing still apply: If the signal which is to be downsampled has frequency components above  $f_{s,\text{downsampled}}/2$ , aliasing will occur. And since the signal coming out of the ADC has the analog signal's spectrum (filtered by the analog lowpass before the ADC) recurring at intervals of the ADC's sampling frequency, this is always the case.

Therefore, the sampled signal must be filtered through a low-pass filter before being downsampled, just as the original analog signal was low-pass filtered before being passed into the ADC. In light of the signal to be downsampled being a *digital* signal instead of an analog one, that low-pass filter must naturally be a digital filter as well. Designing such digital low-pass filters is the core mission of this project.

The key properties of such filters which are relevant to our application are its transition band width (filter steepness) and its aliasing attenuation. The aliasing attenuation refers to

---

<sup>1</sup>Pro memoria: A series of Dirac pulses in the time domain has as its spectrum a series of Dirac pulses as well.



**Figure 1.2:** Simplified time-domain (top) and frequency-domain (bottom) view of the signal at different stages on its way through the DSP chain. The circled numbers correspond to the stages as outlined in Figure 1.1. Stage 1 is the signal before passing through the input low-pass filter, with a significant amount of high-frequency noise. The low-pass filter removes any frequency components above  $f_s/2$  in an ideal scenario (in reality, it merely attenuates them, as we will see later), resulting in the signal at stage 2.

After having been filtered, the ADC samples and quantizes the signal, yielding a sequence of values, schematically portrayed in the rightmost picture for stage 3. Note that due to the sampling process, the spectrum of the filtered signal is repeated at intervals of  $f_s$ . This is the source of the issue of aliasing.



**Figure 1.3:** Simplified view of a signal which does not produce aliasing between its recurrences in the frequency spectrum (top), contrasted with a signal whose frequency band has components above half the sampling frequency, resulting in aliasing; its spectral copies overlap (highlighted areas in the bottom plot).



**Figure 1.4:** Example of two harmonic signals being sampled. In the top plot, the signal's frequency is below half the sampling frequency and there is no aliasing. The signal can be reconstructed without error. In the bottom plot, the signal's frequency is above half the sampling frequency. Consequently, the copies of the signal's frequency spectrum centered around the sampling frequency and its negative alias back into the band between  $-f_s/2$  and  $f_s/2$ . If this signal is reconstructed, the resulting signal would have a frequency of  $f_s - f_{sig}$  instead of  $f_{sig}$ .

the fact that when a filter is being used for downsampling, copies of its frequency response will be created at intervals of the lower sampling rate (analogous to the sampling process producing spectral copies of a signal when sampling an analog signal).

The stopband components of these copies overlap with the intended passband, leading to aliasing (it should be noted that this phenomenon is also present in the case of the analog input filter for the DSP chain at the ADC's output). This effect is portrayed in Figure 1.5. The top plot shows the filter's frequency response along with four copies to illustrate the overlap effect. The bottom plot shows the aliasing effect more clearly by removing the spectral copies and retaining the aliased components.

The overlapping parts of the spectrum are composed of spectral copies both to the right and left side of the original. Therefore, the aliased regions are alternately flipped around the vertical axis. This creates in essence the same effect as if the paper were folded along multiples of the lower sampling rate over the frequency range of the central copy (in the case of our example:  $0.2f_s$ ,  $0.4f_s$ ,  $0.6f_s$  and  $0.8f_s$ ) like an accordion. This is where the term *folding back* originates.



**Figure 1.5:** The phenomenon of folding back when downsampling, illustrated for a lowpass IIR filter with a cutoff frequency of  $0.2 \cdot f_s$  for a downsampling ratio of  $R = 5$ . The downsampling process produces copies of the filter's frequency response at intervals of the lower sampling frequency, visible in the top plot. The stopbands of these copies then overlap with the intended passband. The bottom plot shows a close-up view with the spectral copies for clarity.



**Figure 1.6:** Example of an IIR filter topology for a biquad

## 1.2 Digital Filters

Ahead of studying any potential details of digital filters, it is helpful to have an overview of the major types of filters and some of their key properties. Digital filters can be distinguished by several characteristics; common ways to categorize them are by topology, impulse response and their frequency response. There are two to three commonly used types of digital filters, depending on how they are grouped: Infinite impulse response (IIR) filters and finite impulse response (FIR) filters. A third group, cascaded integrator-comb (CIC) filters, is variously thought of as a special type of FIR filter (similarity of its transfer functions), a subgroup of IIR filters (it contains feedback and feedforward paths) or a new type of filter altogether, depending on which source one chooses to consult. While our system uses FIR and CIC filters, a brief overview of IIR filters is still presented here, for the sake of completeness and some context.

### 1.2.1 IIR Filters

Infinite impulse response filters are so named because their impulse response continues into perpetuity, never reaching zero. In practice, the response usually comes sufficiently close to zero at a certain point that it can be considered zero for most intents and purposes.

IIR filters have feedback paths, resulting in a filter response equation with non-zero denominator components. Their basic building blocks are delay elements, multipliers and adders, as seen in the example of Figure 1.6. Their transfer functions have the form

$$H(z) = \frac{\sum_{k=0}^N b_k \cdot z^{-k}}{1 + \sum_{i=0}^M a_i \cdot z^{-i}} \quad (1.1)$$

IIR filters generally require a lower order (and therefore fewer resources) to approximate a certain frequency response specification than FIR filters do (particularly the constraint of a narrow transition band), but this comes at a cost: IIR filters have a non-linear phase response; linear-phase responses can only be approximated. Furthermore, IIR filters are not guaranteed to be BIBO stable due to their feedback paths.

Some of the generally used types of IIR filters are:

- Butterworth filter: Named after the British engineer and physicist Stephen Butterworth (1885 – 1958), who first described it in 1930. Characterized by a very flat passband (no passband ripple).

- Chebyshev filter (type I and II): Named after Russian mathematician Pafnuty Chebyshev (1821 – 1894). They are steeper than Butterworth filters, at the cost of suffering from ripple in the passband (type I) or stopband (type II).
- Bessel filter: Named for the German mathematician Friedrich Bessel (1784 – 1846). Optimized to have a maximally linear phase response in order to minimize the distortion of signals passing through the filter.
- Elliptical filters: Also known as Cauer filters, after the German mathematician Wilhelm Cauer (1900 – 1945), or Zolotarev filter, after Russian mathematician Yegor Zolotarev (1847 – 1878). Characterized by equiripple in the bassband and stopband and a very narrow transition band compared to other filters of the same order.

### 1.2.2 FIR Filters

FIR filters are characterized by an impulse response which decays to zero in finite time (see Figure 1.10, unlike IIR filters. The filter response is characterized by Equation 1.2:

$$H(z) = \sum_{k=0}^N b_k \cdot z^{-k} \quad (1.2)$$

Figure 1.7 shows one possible topology for implementing a FIR filter, the so-called direct form. As can be seen, the basic building blocks of a FIR filter are delay elements, multipliers and adders, same as for IIR filters.

FIR filters have several advantages:

- They are inherently BIBO stable because they lack feedback paths.
- They can be easily designed to have a linear phase response, preventing signal distortion due to different group delays for signal components of different frequencies.
- The shape of their frequency response can be very finely tuned. This makes them ideally suited for certain purposes, such as compensation filters (see Section 1.2.3).
- Implementation is usually rather straightforward.

Their main disadvantage is that due to the lack of feedback, they generally require comparatively high filter orders for narrow transition band widths. Illustratively, this can be understood by the following considerations:

- The frequency response of an ideal low-pass filter is a rectangle.
- The inverse Fourier transform of a rectangle is an infinitely long *sinc* function.
- Therefore, the impulse response of the ideal brick wall filter (rectangular) would have an infinite number of taps.
- Truncation of the number of taps leads to a deviation of the filter's frequency response from the brick wall filter. As the number of taps (and therefore the FIR filter's impulse response) is reduced, its frequency response deviates more and more from



**Figure 1.7:** One possible topology for a FIR filter (direct form)



**Figure 1.8:** The effect of truncating a *sinc* function in the time domain on its spectrum (simplified)

the brick wall filter, resulting in a flatter transition between the passband and the stopband as well as the introduction of ripple.

This process is illustrated in simplified form in Figure 1.8. The FIR filter's transition band width is particularly important for our application in order to reduce aliasing effects, as will be shown later.

Designing FIR filters is usually performed by specifying certain desired characteristics of the filter's frequency response. Figure 1.9 shows one possible way of doing this for FIR filters by specifying four parameters:

- pass band ripple:  $A_p$
- stop band attenuation:  $A_{st}$
- pass band edge frequency:  $F_p$
- stop band edge frequency:  $F_{st}$

The resulting transition band width  $F_{tb}$  is the difference between the passband edge frequency and the stopband edge frequency, and serves as a useful indicator of how many coefficients (i.e. resources) the filter will end up using. Narrower transition bands tend to require a higher filter order, and therefore more resources. Coefficient counts of several hundred are not uncommon for steep FIR filters. Other sets of parameters can be used to design filters, but these are the ones used in this project, therefore the emphasis on them.

Figure 1.10 shows the resulting impulse response (coefficient set) for a FIR filter designed by using the four above mentioned parameters, with values given by Equations 1.3 through 1.6 handed to one of Matlab's FIR filter design algorithms.

$$A_p = 2 \text{ dB} \quad (1.3)$$

$$A_{st} = 60 \text{ dB} \quad (1.4)$$

$$F_p = 0.3 \cdot f_s \quad (1.5)$$

$$F_{st} = 0.4 \cdot f_s \quad (1.6)$$



**Figure 1.9:** Specifications in the frequency domain and the resulting filter's frequency response as designed by Matlab.



**Figure 1.10:** Impulse response (coefficients) for the filter from Figure 1.9 with the parameters as given by Equations 1.3 through 1.6 passed to one of Matlab's FIR filter design algorithms, resulting in a set of 39 coefficients. Note that the coefficients to the left and right of these values are zero, hence *finite impulse response filters*.



**Figure 1.12:** A single integrator stage**Figure 1.13:** A single comb stage in feedforward form**Figure 1.14:** CIC decimation filter topology with three integrator and comb stages

### 1.2.3 CIC Filters

CIC filters offer a very computationally efficient way to decimate or interpolate, though their design parameters are limited. They were first introduced in 1981 in [1] by Eugene B. Hogenauer. They can be implemented both as decimation filters (reduction in sampling rate) and interpolation filters (increase in sampling rate).

#### 1.2.3.1 General Description

A CIC filter is a cascade of integrator and comb stages, with either a sampling rate compressor (in case of a decimator) or a sampling rate expander (in case of an interpolator) between the integrator and comb sections. A single integrator stage is shown in Figure 1.12, and Figure 1.13 shows a single comb stage in feedforward form. Figure 1.14 depicts a complete CIC decimator with three stages.

The integrator stages run at  $f_s$  and have a transfer function of

$$H_I(z) = \frac{1}{1 - z^{-1}}, \quad (1.7)$$

while the comb stages run at the reduced frequency  $f_s/R$  and have the transfer function

$$H_C(z) = 1 - z^{-RM} \quad (1.8)$$

where  $M$  is the *differential delay*, one of the filter's design parameters.

The transfer function of a complete CIC filter (referenced to the high sampling rate  $f_s$ ) consisting of  $N$  stages is deduced by multiplying the transfer functions of the  $N$  cascaded integrator and comb stages, yielding

$$H_{\text{CIC}}(z) = H_I^N(z) \cdot H_C^N(z) = \frac{(1 - z^{-RM})^N}{(1 - z^{-1})^N} = \left[ \sum_{k=0}^{RM-1} z^{-k} \right]^N \quad (1.9)$$

Looking at the last form of the CIC filter's transfer function, it becomes evident that it can also be thought of as a FIR filter with unitary coefficients. Of particular note is the fact that this is so despite each stage having feedback or feedforward paths and the integrator stages having poles at  $f = 0$  (i.e. the integrators by themselves are not in fact BIBO stable, even though the complete system is). The fact that the resulting filter has no poles can be

intuitively understood by looking at the frequency responses of the integrator and comb stages, and finally their cascade (see Section 1.2.3.2).

CIC filters are well-suited to large reductions in sampling rates because they are very economical in their resource usage. This economy is based on six primary factors [1]:

- The filter requires no multipliers.
- There are no filter coefficients to store.
- The amount of storage needed for intermediate results is reduced by running the comb stages at a lower sampling rate. A conventional FIR filter topology implementing the same transfer function would require more resources for storing its intermediate results because the entire filter would run at the incoming sampling rate.
- The topology of the filter has a high degree of regularity; consisting of two primary building blocks. This lends itself well to optimization.
- The control logic can be kept simple.
- The same filter design can be used for a large range of rate change factors  $R$ , requiring minimal adaption in circuitry. This effect can be seen in the frequency responses plotted in the top plot of Figure 1.18.

However, CIC filters do suffer from some drawbacks. The two primary ones are:

- For large rate change factors  $R$ , the register growth of the filter can become very high. See Section 1.2.3.4.
- A CIC filter has only three design parameters determining its frequency response: Rate change factor  $R$ , differential delay  $M$ , and the number of stages  $N$ . The amount of fine-tuning which can be conducted on the filter's frequency response is therefore extremely limited (more in Section 1.2.3.2).

As can be seen in Equation 1.7, the integrator stages have unity feedback coefficients. In the case of CIC decimators, the registers of the integrators will therefore suffer from register overflow. This causes no harm as long as two conditions are fulfilled:

- The filter's implementation is based on two's complement or another number system allowing wrap-around between its most positive and most negative numbers.
- The maximum magnitude which is expected at the output is within the range of that number system.

A numerical example to demonstrate this effect and better explain the inner workings of a CIC filter can be found in Appendix A.1, starting on page 102.

### 1.2.3.2 Frequency Characteristics

This section presents some of the more important frequency characteristics of the CIC filter. We will start with some considerations about how the integrators and comb sections interact in the frequency domain to create the CIC filter's frequency response.

As shown in the top plot in Figure 1.15, an integrator is in essence a lowpass filter, with a pole at  $f = 0$ . A comb filter is a filter which attenuates one specific frequency component along with its multiples (in a notch comb filter; there is also the inverse concept of a peak filter which only lets a certain frequency and multiples of it pass). It is also evident that comb filters have no poles (a fact which can be deduced from Equation 1.8 as well, of course).

The bottom plot from Figure 1.15 is the result of cascading integrators and combs. The integrator's pole at  $f = 0$  compensates for the comb section's zero at the same location, leading to a significant, but finite, DC gain of the CIC filter.



**Figure 1.15:** Frequency responses for integrators, combs and their combination into a three-stage CIC filter with a rate change factor of 9 and a differential delay of 1. Note that 4.5 lobes fit into the plot for the comb filter, due to  $R \cdot M = 9$  (the order of the comb filter).



**Figure 1.16:** Passband attenuation for two CIC filters with  $R = 9$ ,  $N = 4$  and  $M = 1$  and  $M = 2$ , respectively. The attenuation is identical for the bandwidth-differential delay product, which is  $1/8$  for the two configurations. The attenuation is  $-3.65$  dB in both cases; the value can be found in Table A.3 on page 108.

One drawback of CIC filters is that they have no clearly defined passband as such. Rather, their frequency response starts dropping off right as the frequency axis goes beyond zero. This effect (also referred to as *passband droop* or *passband attenuation*) is visible in the bottom plot of Figure 1.15 and in Figure 1.16 in more detail. Since CIC filters lack a clearly defined transition band edge, defining the frequency band which is to be used, i.e. the actual passband, is a design decision and can vary even when using the same filter, depending on the application.

The amount of passband droop is constant for a given product of the differential delay  $M$  and the cutoff frequency  $f_c$ , where  $f_c$  is a fraction of the lower sampling rate (i.e. a fraction of the first lobe's width). Figure 1.16 highlights this effect for two different filters. Table A.3 in Appendix A.2 on page 108 contains a list with more values for some common configurations.

Because of the passband droop, a CIC filter by itself is rarely a viable solution. Rather, it is generally deployed as the first element in a chain of filters, where the later stages are FIR filters. Due to the CIC filter's frugality in terms of resource usage, it is ideally suited as an initial stage where the most samples per time need to be processed. The fact that FIR filters need to perform many more computations (and more complex ones) per sample is then no longer as much of a problem, since the FIR filters run at lower sampling frequencies and therefore have many more clock cycles available to compute each output. Also, because the frequency response of a FIR filter can be very finely tuned to a desired profile, they can be used to compensate for the CIC filter's passband droop; this is generally known as a *CIC compensation filter*. More on the topic in Section 1.2.3.3.

Another effect which must be taken into consideration when designing CIC filters is the amount of aliasing which occurs from the stopband into the passband. A region of width  $f_c$  above and below each  $M$ th null is folded back into the filter's passband. This effect is highlighted in Figure 1.17. The gravity of this effect depends on the width of the cutoff frequency  $f_c$  as well as the differential delay  $M$ . Table A.3 in Appendix A.2 contains some values for common ranges for  $M$  and  $f_c$ .



**Figure 1.17:** Passband aliasing for a CIC filter with  $R = 9$ ,  $N = 4$  and  $M = 1$  and a cutoff frequency of  $f_c = 0.25$ , referenced to the lower sampling frequency  $f_{s,\text{low}}$ . The region of width  $f_c$  around every  $M$ th null is folded back into the passband. The regions beyond that are of course folded back as well, but are cut off by the compensator, see Figure 1.19. The resulting passband aliasing attenuation is 41.8 dB, as indicated in Table A.4 on page 108.

As mentioned, the CIC filter has only three design parameters: Its rate change factor  $R$ , the differential delay  $M$  and the number of stages  $N$ . The influence of these parameters on the CIC filter's frequency response is portrayed in Figure 1.18. Some things of note are:

- Increasing  $R$  increases the amount of nulls and the filter's overall gain.
- Increasing  $M$  also increases the number of nulls as well as the filter's gain. Note that for CIC decimators, the region around every  $M$ th null is folded back into the passband. For practical purposes,  $M$  is usually set to 1 or 2, see [1].
- Adding more stages leads to a high increase in filter gain, since  $N$  occurs in the exponent of the filter's transfer function. It does not, however, change the number or placement of the nulls.

### 1.2.3.3 Compensators

In order to achieve a flat passband, the CIC filter's attenuation in the frequency range observed in Figure 1.16 can be compensated with a filter whose frequency response has the opposite shape. Operated in a cascade, the two filters create a frequency response with a flat passband and a sharp drop-off into the stopband. Figure 1.19 shows an example of such a system, with frequency responses of a CIC filter, its compensator, and the resulting cascade.

The compensation filter not only serves to compensate for the passband, but is also responsible for the transition band width of the cascade. Due to their flexibility, FIR filters are generally employed for this purpose. As mentioned in the previous section, the sharpness of their transition band can be controlled by adjusting their kernel size. If more filters follow after the compensator in the filter chain, the compensator's transition band need not be very narrow. A filter with a few dozen coefficients in size is often sufficient in such cases (the filter used for the example in Figure 1.19 has 50 coefficients).

The design of CIC compensators is usually left up to software algorithms, for example with Matlab. For a more detailed introduction to the topic, including example code and more elaborate explanations, Altera's Application Note from [2] is warmly recommended.

### 1.2.3.4 Register Growth

As shown by Hogenauer in [1], the maximum register growth is

$$G_{\max} = (R \cdot M)^N \quad (1.10)$$

The most significant bit  $B_{\max}$  of the output register as well as for all stages (both the integrators and the comb stages) of the filter is determined to be

$$B_{\max} = \lceil N \log_2 RM + B_{\text{in}} - 1 \rceil \quad (1.11)$$

where  $B_{\text{in}}$  is the bit width of the input register. For high rate change factors, these values can become very large. A filter with three stages, a differential delay of 1, a rate change of 128 and an input width of 16 bits yields 36 bits output width at full precision.

### 1.2.3.5 Errors Due to Truncation and Rounding

In practical cases, it is often not feasible to retain full precision; in such situations, either truncation or rounding may be used at each filter stage to reduce register widths and keep



**Figure 1.18:** The influence of the design parameters  $R$ ,  $M$  and  $N$  on a CIC filter' frequency response. Increasing  $R$  and  $M$  leads to an increased number of nulls, as visible in the top two plots, as well as an increase in the DC gain. Adding more stages does not change the location of the nulls, but does add significant DC gain.



**Figure 1.19:** Frequency behavior of a CIC filter, its compensator, and the cascade of the two. Note the spectral copies of the compensator around the nulls of the CIC filter, i.e. the multiples of its outgoing sampling rate.

resource usage within certain limits. For this purpose, it is necessary to know the system function from the  $j$ th stage up to and including the last:

$$H_j(z) = \begin{cases} H_I^{N-j+1} H_C^N = \sum_{k=0}^{(RM-1)N+j-1} h_j[k]z^{-k} & j = 1, 2, \dots, N \\ H_C^{j-N} = \sum_{k=0}^{2N+1-j} h_j[k]z^{-kRM} & j = N+1, \dots, 2N \end{cases} \quad (1.12)$$

where

$$h_j[k] = \begin{cases} \sum_{l=0}^{\lfloor k/(RM) \rfloor} (-1)^l \binom{N}{l} \binom{N-j+k-RMl}{k=RMI} j & = 1, 2, \dots, N \\ (-1)^k \binom{2N+1-j}{k} j & = N+1, \dots, 2N \end{cases} \quad (1.13)$$

are the impulse response coefficients. These functions are derived by Hogenauer in [1].

In a filter with  $N$  stages, there are  $2N+1$  error sources in the case of limited precision: Each stage, and the output register. Each error source is presumed to have white noise characteristics, i.e. its noise is uncorrelated to its input as well as other error sources. The error at the  $j$ th source is assumed to have a uniform probability distribution with a width of

$$E_j = \begin{cases} 0 & \text{without truncation or rounding} \\ 2^{B_j} & \text{otherwise} \end{cases} \quad (1.14)$$

where the number of bits discarded at the  $j$ th error source is  $B_j$ . The mean of this error is

$$\mu_j = \begin{cases} \frac{1}{2} E_j & \text{for truncation} \\ 0 & \text{otherwise} \end{cases} \quad (1.15)$$

and the variance comes out to

$$\sigma_j^2 = \frac{1}{12} E_j^2. \quad (1.16)$$

The total mean error at the filter's output due to the  $j$ th stage is

$$\mu_{T_j} = \mu_j D_j \quad (1.17)$$

where

$$D_j = \begin{cases} (RM)^N & j = 1 \\ 0 & j = 2, 3, \dots, 2N \\ 1 & j = 2N+1 \end{cases} \quad (1.18)$$

is the *mean error gain* for the  $j$ th error source. Note that only the first and the last error source contribute to the filter's mean error at the output. This is because the sum of the impulse response coefficients is zero for all other stages. Consequently, whether one chooses to truncate or round is without consequence except in the case of the first and last error sources. In an analogous manner, the total variance computes to

$$\sigma_{T_j}^2 = \sigma_j^2 F_j^2 \quad (1.19)$$

where

$$F_j = \begin{cases} \sum_k h_j^2[k] & j = 1, 2, \dots, 2N \\ 1 & j = 2N+1 \end{cases} \quad (1.20)$$

is called the *variance error gain* for the  $j$ th error source. We can now compute the global mean error and variance of the filter:

$$\mu_T = \sum_{j=1}^{2N+1} \mu_{T_j} = \mu_{T_1} + \mu_{T_{2N+1}} \quad (1.21)$$

$$\sigma_T^2 = \sum_{j=1}^{2N+1} \sigma_{T_j}^2 \quad (1.22)$$

These equations are used to calculate the properties of the CIC filter as deployed in our design in Section 4.4.4.

### 1.2.3.6 Summary

In conclusion, the key properties of CIC filters are:

- They can be implemented both as decimators and interpolators.
- Neither multipliers nor storage for coefficients are needed.
- CIC decimation filters have a high gain, leading to significant register growth. Truncation or rounding can be used to limit the resource usage, both at the filter's output and internally.
- The three design parameters are the rate change  $R$ , the differential delay  $M$  and the number of stages  $N$ .
- The presence of passband droop requires a compensation filter to achieve a flat passband response.

## 1.3 Multi-Stage Filter Designs

At first sight, the most obvious way to implement a downsampling system might appear to be to design one filter for each desired rate change factor. However, this would be highly impractical. Instead, multi-stage designs are usually used in practice. An in-depth discussion of their advantages and drawbacks was offered by Crochiere and Rabiner in [3]. A few aspects of multi-stage filter design which are relevant to our application shall be presented here.

To reduce aliasing effects in the passband, it is generally desirable to keep the width of the transition band roughly constant in relation to the width of the passband (visible in the filter's flank in Figure 1.5). As the downsampling ratio increases and the passband width decreases, the transition band therefore becomes progressively narrower, necessitating higher filter orders in a single-stage design.

This effect is illustrated in Figure 1.20, comparing two filters with a transition band 1/5 as wide as the passband for downsampling ratios of 2 and 4. The filter for  $R = 4$  requires 60 coefficients, compared to 30 coefficients for the filter designed for  $R = 2$ . The other specifications (passband ripple, stop band attenuation) are identical. As an extreme case, a filter designed by Matlab with the same parameters for a downsampling ratio of  $R = 625$  is 8860 coefficients in size.

Using multi-stage designs helps to avoid the need to implement filters with such large kernels. When cascading filters, it is the last stage of the chain which defines the overall passband and transition band width. The same overall transition band in absolute terms can be achieved with smaller filters in multi-stage designs, because a filter's transition band width is relative to the sampling rate at which it is running. This is shown in the bottom plot in Figure 1.20.

Since cascading multiple filters does increase coefficient count (and storage), one might be inclined to think that not much has been won. Indeed, the overall number of coefficients is identical for both filters in Figure 1.20 (though, this obviously need not be so in other examples). However: Merely 30 multipliers and 29 adders (those of the first filter in the cascade) run at the incoming sampling frequency in the case of the cascade, while the components of the second filter run at half that. In the case of the single-stage filter, all its 60 multipliers and 59 adders run at the full sampling frequency. Distributing the calculations over two stages has therefore yielded an overall reduction in needed computation



**Figure 1.20:** Two filters are used here: Both have a transition band 1/5 as wide as their passbands. The top filter is designed for a downsampling ratio  $R = 2$  and has a coefficient count of  $N = 30$ . The second filter is designed for  $R = 4$  and has 60 coefficients. Cascading two of the top filters into a two-stage design, depicted in the bottom plot, results in the same overall passband and transition band width as the single-stage design, but with fewer computations per time.

power. As rate change factors increase, the benefits of multi-stage designs become even more pronounced [3].

In general, the earlier stages in a cascade tend to have higher downsampling ratios but wider transition bands, and the later stages have lower downsampling ratios and narrower transition bands (normalized to their respective sampling frequencies). This minimizes computations across the overall design by using smaller filters for high-frequency calculations, and giving the larger filters more time to compute their outputs. Multiple stages having the same downsampling ratios is also a common approach, but earlier stages having lower downsampling ratios than later stages is uncommon.

A further advantage of cascading filters is that successive stages can be used to shape the overall frequency response, as seen in the case of the CIC compensator in Section 1.2.3.3. A drawback of cascades is that the ripple in their passband shows additive behavior, so the stages in a cascade of filters have more stringent ripple requirements in the passband than a single-stage. However, the cost for this is usually offset by the advantages of multi-stage designs.

When designing multi-stage filters, it can happen that the transition band of an earlier stage overlaps with the spectral copy of a later stage running at a reduced sampling rate. In that case, the stopband response of the cascade can have peaks exceeding the desired overall stopband attenuation. To prevent this, the following condition must be satisfied:

$$f_{st,1} < \frac{f_{s,1} - f_{st,2}}{R_1} \quad (1.23)$$

Where:

$f_{s,1}$  : high sampling rate

$f_{st,1}$  : stopband frequency of first filter

$f_{st,2}$  : stopband frequency of second filter

$R_1$  : rate reduction in first filter

Figure 1.21 shows some examples for this condition being broken or fulfilled.

One last effect of note when cascading filters concerns stopband attenuation: When cascading two filters with different stopband attenuations, two things can happen:

- The second filter attenuates more strongly than the first one. This results in peaks above the second filter's stopband attenuation in the regions where the spectral copies of the second filter's passband are located. This can be seen in the top plot in Figure 1.22.
- The first filter attenuates more strongly than the second one. In that case, the stopband region of the cascade right next to its transition band is less strongly attenuated than the stopband regions farther away from the edge. This case is shown in the middle plot in Figure 1.22.

Neither of these two effects is usually desired. In both cases, the resources invested into the steeper filter's stronger attenuation are wasted by the other filter's weaker stopband attenuation. It therefore makes more sense for the various stages in a cascade to have the same stopband attenuation.



**Figure 1.21:** Comparison of two cascades: The first cascade (blue) has sufficient distance between the start of its stopband ( $0.65 \cdot f_s1$ ) and the start of the transition band of the second stage's first copy around  $f_{s2}$  ( $0.675 \cdot f_s1$ ). The second cascade (orange) has a peak in its stopband because the transition band of its first stage overlaps with the copy of the second stage ( $0.8 \cdot f_s1$  vs.  $0.6 \cdot f_s1$ ). Note: All frequencies are normalized to the high sampling rate  $f_s1$ .



**Figure 1.22:** Cascading two filters with different stopband attenuations: If a filter with stronger stopband attenuation is cascaded after a filter with weaker attenuation, the resulting cascade has peaks above the second filter's stopband attenuation (top plot). If the stronger filter is first in the cascade, the drop-off in the stopband right next to the transition band is weaker. The two filters by themselves are shown in the bottom plots.



# 2

CHAPTER

## Mission

Moving from the theoretical domain to the practical, the overarching question becomes

*What do we intend to do, why, and how?*

To answer that question is the purpose of this chapter. A general overview of the STEMlab hardware is presented first to give the reader a broad impression of what the hardware component of the project actually is. Based on that information, the objectives of this project are defined. Possible approaches to reach those objectives are presented and evaluated, and a decision is reached on how to achieve this project's goals. The project's problem statement is included in Appendix E.

### 2.1 The Red Pitaya STEMlab 125-14

In order to have a rough idea of what sort of device this project is based on, some key information of the STEMlab is given first. With that in mind, some of the shortcomings of its stock configuration are discussed, which leads directly to the main objective of this project.

#### 2.1.1 Hardware Overview

The device on which this project is based is the Pitaya STEMlab 125-14, pictured in Figure 2.1. STEMlab is a compact measurement instrument (it fits into the palm of a hand) which can replace more expensive devices like oscilloscopes by using a computer for data storage, processing and presentation.

Some of its key specifications relevant to us are:

- two high-speed analog inputs and outputs via coaxial connectors
- Uses a Linear Technology LTC2145-14 converter [4] on those inputs: 14 bits resolution at  $125 \text{ MS}\cdot\text{s}^{-1}$  per channel.
- Xilinx SoC with an FPGA component for data processing on the device itself and two ARM Cortex9 cores for general-purpose tasks
- Ethernet and USB physical interfacing chips for data transmission and device control
- Has its own operating system, a GNU/Linux distribution (Ubuntu is used in our project), running on the ARM cores.
- The software used is open-source and available under [5].



**Figure 2.1:** Photo of the STEMlab [7]

More comprehensive documentation can be found at [6]. A block diagram with the system's key components is shown in Figure 2.2.

### 2.1.2 Downsampling on the STEMlab With Stock Configuration

The STEMlab enables downsampling by powers of 2 in its stock configuration, in a range between  $1 = 2^0$  and  $65536 = 2^{16}$ . The exact manner in which this downsampling process is performed is not easily deducible from public information; the documentation on the internals of the FPGA codebase is rather sparse. However, work conducted by our predecessors has shown that the STEMlab uses a moving averaging filter for this process [9].

The moving averager's transfer function is

$$H(z) = \frac{1}{N+1} \sum_{k=0}^N z^{-k}. \quad (2.1)$$

A high degree of similarity is immediately recognizable when comparing this to the transfer function of a CIC filter in Equation 1.9 (page 14). Indeed, a cascade of moving averagers would almost yield a CIC filter, save for the gain, which could be easily adjusted after the fact if needed.

A moving averager as a decimation filter is relatively cheap to implement when using only decimation rates of powers of two, as is the case for the stock software of the STEMlab. The weight for each coefficient of  $z^{-k}$  will be  $\frac{1}{N+1} = \frac{1}{R} = 2^{-m}, m \in \mathbb{N}_0$ , meaning the computations can be performed without multipliers by performing a bit-shift operation to the right. The primary disadvantage which results from these rate reduction factors is that



Figure 2.2: Block diagram of the Red Pitaya STEMlab [8]

the resulting reduced sampling rates are not very “nice” numbers, so to speak, since the incoming sampling rate is 125 MHz, and therefore not a power of 2.

Just like a CIC filter by itself, a moving averager also makes for a rather poor lowpass filter, for two reasons: Passband droop and poor stopband attenuation (equivalent to a single-stage CIC filter). Figure 2.3 shows the frequency response for the case of  $R = 8$  and a signal of 5 MHz. The incoming sampling rate is 125 MHz, the reduced sampling rate is  $125 \text{ MHz} \div 8 = 15.625 \text{ MHz}$ . Therefore, anything above half that frequency is aliased back into the region below  $15.625 \text{ MHz} \div 2 = 7.8125 \text{ MHz}$ . In the case of a signal at 5 MHz, this results in an aliasing attenuation of a modest 8.06 dB. Combined with the passband droop of 1.48 dB (a signal attenuation of almost 16 %) at that frequency, this makes for a margin of a mere 6.58 dB between the attenuated signal and the aliasing attenuation!

As a benchmark for our own solution, we will use measurements conducted by our predecessors with the STEMlab’s stock configuration, listed in Table 2.1. The measurements do not actually suggest a very bad performance of the filter. This is because they were conducted at specific harmonic frequencies. The primary issue with a moving averager is less SNR for specific frequencies, but aliasing when measuring a non-harmonic signal which has frequency components above  $f_{s,\text{low}}/2$ . In that case, the considerations from Figure 2.3 become crucial to understanding the aliasing issue. This behavior was also confirmed in [9]. Our primary aim is therefore more to reduce these aliasing effects rather than purely trying to improve SNR for specific single frequencies, although improving



**Figure 2.3:** Frequency response, passband droop and aliasing attenuation for a moving average filter of order 7 for a decimation by a factor of 8. The incoming sampling frequency is 125 MHz, the outgoing sampling frequency is 15.625 MHz, the measured signal has a frequency of 5 MHz.

**Table 2.1:** Measurement results for STEMlab 125-14 from [9]. SNR was determined for a specific harmonic frequency signal for each sampling rate.

| $R$      |   |        | $f_{s,\text{low}}$ (kHz) | $f_{\text{signal}}$ (kHz) | SNR     |
|----------|---|--------|--------------------------|---------------------------|---------|
| $2^0$    | = | 1      | 125 000                  | 10 000                    | 62.8 dB |
| $2^3$    | = | 8      | 15 625                   | 5 000                     | 69.8 dB |
| $2^6$    | = | 64     | 1 953                    | 500                       | 76.0 dB |
| $2^{10}$ | = | 1 024  | 122.1                    | 40                        | 76.4 dB |
| $2^{13}$ | = | 8 192  | 15.26                    | 5                         | 82.4 dB |
| $2^{16}$ | = | 65 536 | 1.907                    | 0.5                       | 83.5 dB |

SNR is certainly a goal too.

In conclusion, we can formulate three main objectives:

- Reduce aliasing of out-of-range frequency components into the passband.
- Improve SNR
- Have a nicer set of reduced sampling rates.

## 2.2 Possible Solutions

Based on the previous section, it is clear that the downsampling filter from the STEMlab's stock configuration must be replaced if better aliasing attenuation is to be attained. Furthermore, since a new filtering system is being implemented anyway, one may wish to change the rate change factors in order to achieve a set of nicer-looking reduced sampling rates. The question now becomes: *How?*

Problematic is that as of spring 2017, the official code by Red Pitaya base has been split into two: There is a new development branch bringing major changes, while the old code (which has been used in the projects preceding this one) is no longer being maintained, as explained by one of the developers in [10]:

Current development is done on the `mercury` branch on the FPGA subproject of the same name. The code is still under heavy development and not stable. All other FPGA subprojects will not be developed further.

Essentially, this presents us with three options on how to proceed:

- Use the old Red Pitaya codebase, and implement our new filtering system based on it.
- Use the new Red Pitaya codebase. Due to this codebase being unstable according to the developers themselves as of the time of this project, we consider this to be an unviable option.
- Use an alternative ecosystem, either by a third party (if one can be found) or developed by ourselves.

In order to have a somewhat objective measure for comparing the first and last option, a decision matrix is used; see Table 2.2. In the following paragraphs, the criteria and weights in the table are elaborated upon. The table uses a scale of 1 through 6, with 1 being the worst and 6 being the best score. At the end, totals are tallied and compared.

**Reliance upon others:** Using the existing Red Pitaya ecosystem would also couple us to any problems inherent to that platform and to the solutions developed by preceding projects based on it. Any bugs encountered in the Red Pitaya ecosystem would either require a bugfix by the manufacturer or a workaround on our part. Since the old Red Pitaya codebase is no longer being maintained by the developers, bugfixes will not be available, leaving us to clean up any potential issues inherent to the codebase. If we were to implement our own solution instead, we would be less reliant upon others to fix any inherent problems to the code. These factors lead us to give the existing codebase a rather weak rating compared to the choice of pursuing our own solution.

**Flexibility:** Obviously, implementing our own system would give us much higher flexibility than using the existing ecosystem. The only two true limitations would be the time and hardware resources available to us. The official codebase would limit us to its capabilities. Adding new functionality to the existing codebase would be possible, but adapting software for purposes for which it was not originally intended does tend to be a time-consuming procedure in our experience. Therefore, the option of developing our own solution wins out again here.

**Complexity of complete system:** This refers not just to the complexity of the components we work on, but of the entire platform and ecosystem. As anyone who bothers to peruse the Red Pitaya codebase [5] can deduce, it is a large ecosystem with many features and capabilities, most of which are not relevant to our needs. A custom system developed for the specific requirements of this project would be much leaner and have fewer points of failure.

**Labor costs:** Here is where using the existing codebase would be beneficial in our view. While understanding its inner workings would doubtlessly be required and would take a significant amount of time (particularly in view of the sparse documentation at this point), developing a completely new system from scratch would be a much costlier undertaking in terms of required man-hours we our estimate.

**Chances of success:** Basing our work on the existing codebase would unburden us of the legwork needed to get basic functionality up and running. We could exchange the filter components of the existing system with our own, but leave most of the remaining system untouched. The challenge would be to understand the system well enough to do this. Building our own system would require re-implementing more components of the Red Pitaya platform than just the filters. As each additional task increases the risk of failure, this is not without risk to the project's success. Overall, we consider this factor to be roughly even between the two choices.

**Robustness of third-party components:** This criterion takes into account our assessment of the reliability and dependability of any component we use which is not developed by ourselves, along with its documentation and manufacturer support. In the case of the Red Pitaya codebase, this would primarily comprise the codebase itself as well as any components for it developed by other parties. If we were to develop our own solution, the building blocks would primarily be the Xilinx toolchain and libraries for the SoC.

Because the Red Pitaya codebase is rather large, not well documented, has not had many years to mature and is no longer being maintained, and the company is small and more likely to be stretched thin, we do not score the Red Pitaya codebase highly here. The Xilinx toolchain and FPGA libraries, while undoubtedly not without bugs, have had a lot

**Table 2.2:** Decision matrix comparing the usage of the existing Red Pitaya ecosystem against building our own data acquisition system. Weighing: Scale of 1 (worst) to 6 (best). More total points is better.

|                                                  | RP | Custom |
|--------------------------------------------------|----|--------|
| Reliance on Others                               | 2  | 5      |
| Flexibility                                      | 2  | 5      |
| Complexity of Complete System                    | 1  | 4      |
| Labor Costs                                      | 4  | 2      |
| Chances of Success                               | 4  | 4      |
| Available Documentation for Used Building Blocks | 3  | 5      |
| Robustness of Third-Party Components             | 2  | 5      |
| Reparability                                     | 2  | 5      |
| Long-term Viability                              | 2  | 4      |
| Total                                            | 21 | 37     |

of time to mature on the other hand. Also, Xilinx is a big company with vast resources, so any bugs which are encountered in their products are more likely to be addressed in our view. For these reasons, we score developing our own solution higher than using the existing codebase.

**Reparability:** If any issues are found in the resulting product, we are more likely to be able to fix them if it is our own codebase rather than that by a third party.

**Long-term viability:** Developing our own solution would allow us (or anyone else wishing to base their work off of ours) to address future needs relatively easily. Using the existing code base would make this more difficult in our view.

**Conclusion:** Developing our own solution does carry a significant risk and is likely to require more work than basing our application on existing work. But in light of the mentioned drawbacks of the latter approach, we still find that it is the preferable approach and is more likely to lead to a successful outcome.

### 2.3 Concept

Having concluded that we shall develop our own solution from scratch, we can now devise a concept for how that solution will look. On the most fundamental level, it will require the following components:

- a custom FPGA firmware for data acquisition and filtering
- a new scope application for data visualization
- an interfacing layer between the FPGA and the scope
- optionally, the possibility to connect to other applications like Matlab

The following sections elaborate on the general shape of our solution. Chapters 4 through 6 explain the components in detail, while Chapter 3 documents the filter design process, which is rather separate from the firmware and software and therefore a dedicated chapter.

### 2.3.1 FPGA Components

On the most abstract level, the FPGA part of our system will need to be able to

- acquire data from the ADC,
- decimate and filter it,
- and pass it on for further processing.

Due to the platform's open nature, there exist some projects for the STEMlab by parties other than Red Pitaya themselves. One of these is *Red Pitaya Notes* by Pavel Demin, available at [11]. As part of that project, an ADC core for the FPGA which acquires data from the ADC and passes it on via an AXI4 streaming interface<sup>1</sup>. has been developed. Since this project is open-source and has a permissive license (MIT license [12], see also Appendix D.1 on page 126) and fulfills our main technical requirement (easily to interface with), it is used in our project to interface the FPGA with the ADC.

For filtering, one can either implement custom filter topologies from basic FPGA building blocks (adders, multipliers, etc.), write completely custom VHDL or Verilog code, or use ready-made blocks, if available. Xilinx provides such FPGA blocks for FIR and CIC filters, both of which come with excellent documentation [13], [14]. In order to avoid re-inventing the wheel and to take advantage of some of the advanced features offered by these two blocks, we use them in our design.

Interfacing between the FPGA and the outside world requires a component which can take data from the filters and hand it off to the GNU/Linux running on the ARM cores. Additionally, the user must be able to trigger measurements from the operating system. Luckily, such a data logging core for a Xilinx FPGA has already been developed in an earlier project [15] under a compatible license. It comes with a kernel module to interface with GNU/Linux. With some minor adaption work, this core should suit our needs nicely.

The last component needed for the FPGA is a control logic to set the decimation rate and enable or disable specific components of the data processing chain. This should be fairly straightforward, and will be implemented by ourselves as a custom block.

In summary, the concept for the FPGA data processing system looks as follows:

- The ADC is accessed via the ADC core from [11].
- Filtering the data is conducted with Xilinx's CIC and FIR compilers.
- For interfacing with GNU/Linux, the data logger from [16] is used.
- A custom control logic configures the data processing chain on-the-fly through user input.

### 2.3.2 Interfacing Layer

The interfacing layer is responsible for sending the data from the STEMlab to the user application running on a computer. This is done via a Gigabit Ethernet connection. As is easily apparent, this connection is far too slow for moving all data off the device which is generated by the ADC (about  $3.7 \text{ GiBs}^{-1}$ ), hence the need for decimation (among other reasons).

For the interfacing layer, a server application to which a client can connect is run on the STEMlab. The server takes data from the logger's kernel module, processes and packages it as necessary, and sends it out over Ethernet. The application is documented in detail in Chapter 5, starting on page 49.

---

<sup>1</sup>Xilinx's proprietary general-purpose interconnect for moving large amounts of data around an FPGA

### 2.3.3 Oscilloscope

The scope is the main interface between the end user and the STEMlab in our system. Through it, data can be visualized and analyzed in both the time and frequency domain. Furthermore, it is capable of calculating certain metrics such as the power in a given frequency range or SNR via the autodetection of signal and noise components.

In order to ensure maximum portability, the scope is implemented as a web application rather than a custom binary. This allows it to function on any operating system with a reasonably modern browser. A thorough documentation of its capabilities and implementation is available in Chapter 6, beginning on page 6.



# 3

## CHAPTER

# Filter Design

Some key points underlying the theory of filters have been treated in Chapter 1, and Chapter 2 defines the overall objectives of this project; implementing a custom filtering system on the FPGA among them. This chapter develops a concrete concept for that filtering system, addresses some of the issues encountered when moving from the theory of filters to the practice of designing them, and specifies the filters which are to be used. The implementation of those filters on the FPGA is addressed in the next chapter, beginning on page 43.

### 3.1 Requirements

The overarching objective is to downsample the signal coming out of the ADC. In this section, we derive upper and lower boundaries for the downsampled frequency range, and then define the specific downsampling ratios to be used.

The upper boundary for the resulting sampling rate is set by the STEMlab's network connection, which has a capacity of  $1000 \text{ Mbit s}^{-1}$ . The total data rate from the ADC is

$$\begin{aligned} S &= 125 \text{ Msample} \cdot \text{s}^{-1} \\ N_{\text{ch}} &= 2 \\ B_{\text{ch}} &= 14 \text{ bit} \cdot \text{sample}^{-1} \\ B_{\text{ch,pad}} &= 2 \text{ bit} \cdot \text{sample}^{-1} \\ B_{\text{ADC}} &= N_{\text{ch}} \cdot (B_{\text{ch}} + B_{\text{ch,pad}}) = 32 \text{ bit} \cdot \text{sample}^{-1} \\ R &= S \cdot B_{\text{ADC}} = 4 \text{ Gbit} \cdot \text{s}^{-1} \end{aligned} \tag{3.1}$$

Where:

- $S$  : sampling rate
- $N_{\text{ch}}$  : number of channels
- $B_{\text{ch}}$  : channel width
- $B_{\text{ch,pad}}$  : padding per channel
- $B_{\text{ADC}}$  : total width of bit stream out of ADC
- $R$  : total data rate out of ADC in bit

Comparing the ADC's data rate from Equation 3.1 to the network's available capacity, a

**Table 3.1:** The chosen downsampling ratios, their prime factor decompositions, the downsampling ratios distributed across stages, and the resultant sampling rates

| R    | Decomposition                                               | Stages                                          | $f_s$ (kHz) |
|------|-------------------------------------------------------------|-------------------------------------------------|-------------|
| 5    | $5 = 5^1$                                                   | 5                                               | 25000       |
| 25   | $5 \cdot 5 = 5^2$                                           | $5 \rightarrow 5$                               | 5000        |
| 125  | $5 \cdot 5 \cdot 5 = 5^3$                                   | $25 \rightarrow 5$                              | 1000        |
| 625  | $5 \cdot 5 \cdot 5 \cdot 5 = 5^4$                           | $25 \rightarrow 5 \rightarrow 5$                | 200         |
| 1250 | $2 \cdot 5 \cdot 5 \cdot 5 \cdot 5 = 2^1 \cdot 5^4$         | $125 \rightarrow 5 \rightarrow 2$               | 100         |
| 2500 | $2 \cdot 2 \cdot 5 \cdot 5 \cdot 5 \cdot 5 = 2^1 \cdot 5^4$ | $125 \rightarrow 5 \rightarrow 2 \rightarrow 2$ | 50          |

downsampling factor of at least 4 is required for real-time data transmission. Because 125 is not divisible by 4 and some room for protocol overhead is needed as well, a factor of 5 is chosen instead. This makes for a resulting data rate of  $800 \text{ Mbit} \cdot \text{s}^{-1}$ .

On the lower end of the spectrum, the system should still be able to process audio signals. Common sampling frequencies for audio are 44.1 kHz for Compact Discs, and 48 kHz for the audio component of audio-visual applications. Neither of these frequencies fit nicely into 125 MHz (requiring large prime factor for the rate change), so the lower boundary is specified as 50 kHz, corresponding to a downsampling ratio of 2500.

To cover a wider range of use cases, additional sampling frequencies between these two boundaries are specified. Table 3.1 contains the complete list of downsampling ratios, along with the corresponding sampling frequencies.

## 3.2 Cascade Concept

Based on the downsampling factors from Table 3.1, a general concept for the filter cascades which implement those rate change factors is presented next.

As discussed in Section 1.3, implementing high downsampling ratios in a single stage is generally not a sound design choice. Consequently, the downsampling ratios must be decomposed into smaller factors. Table 3.1 contains the prime factors for all ratios, and then recombines the primes into factors which can be used as downsampling ratios in a multi-stage design. These factors must fulfill the following criteria:

- Filters for different stages should be re-usable across multiple downsampling ratios in order to save resources.
- The factors for the individual stages should be large enough to be of utility, but small enough so as not to make the resulting filter impractically narrow and large.
- No stage should be preceded by a stage with a smaller downsampling ratio than it has. The cascades go from larger ratios to smaller ones, as outlined in Section 1.3.

CIC filters are well-suited for large rate changes, but are not an optimal solution for smaller ones. As an example for such a low-rate change CIC filter, Figure A.2 on page 103 in Appendix A.1 shows the frequency response of a CIC filter with a rate change of 2. Based on that observation, it is reasonable to implement the lower rate change factors without CIC filters, while using CIC filters as the first element in the filter chain for the higher rate changes. This allows taking advantage of the CIC filter's high computational efficiency for large downsampling rates, while still having good frequency response behavior for the lower rate changes.

Other choices are of course possible. Particularly in the case of  $R = 625$ , one may choose to implement a chain of  $125 \rightarrow 5$  instead of  $25 \rightarrow 5 \rightarrow 5$ . The two implementations are com-



Figure 3.1: The concept for the filter chains

pared in Figure B.1 on 110 in Appendix B.1. While the  $125 \rightarrow 5$  chain would offer better stopband attenuation behavior over certain frequency ranges and therefore improved SNR, the  $25 \rightarrow 5 \rightarrow 5$  chain offers the advantage that if the design is ever changed and only the higher sampling rates are implemented (removing the chains for  $R = 1250$  and  $R = 2500$ ), it can re-use the elements from the higher chains. The  $25 \rightarrow 5 \rightarrow 5$  chain is therefore chosen.

With the above considerations in mind, an overall concept for the filter cascades can be devised; Figure 3.1 shows the result. In total, there are seven different filters: Two CIC filters, their compensators (one of which is used to downsample as well), a steep and a flat FIR filter, and one half-band filter. It would be possible to use a second, less steep half-band filter in the chain for  $R = 2500$ , but because those filters run at very low sampling frequencies anyway, using a steep filter twice carries no significant penalty over using a second, flatter filter.

### 3.3 Filter Specifications

Having determined the needed filters, the constraints imposed upon them and the specifications used for the filter design process based on those constraints are documented in the following paragraphs. For this purpose, it is no longer sufficient to merely consider the filters in the mathematical sense and the requirements derived from that; resource usage on the hardware must be taken into account. The hardware places two main constraints on the design:

- Number of available LUTs: 17 600
- Number of available DSP slices: 80

The number of LUTs is relevant for storage (filter coefficients), the CIC filters<sup>1</sup>, the rest of the processing system, and control logic. The DSP slices can therefore be reserved for the FIR filters. Because the device has two channels, only 40 slices may be used per channel.

To have a realistic gauge for resource usage of the FIR filters, it is necessary to keep in mind the two factors which primarily influence resource usage:

- the frequency at which the filter runs (its incoming sampling rate) and

---

<sup>1</sup>The CIC compiler block by Xilinx can be configured to utilize LUTs instead of DSP slices for its computations. The FIR compiler can only use DSP slices for its computations.

**Table 3.2:** The target filter specifications. These parameters are based both on the desired frequency domain behavior of the filters as well as the feasibility of implementation in terms of resource usage. For resource considerations, the results from Appendix B.2 are used as a guideline.

| Filter                                                                        | Passband Edge<br>( $\times \pi \text{rad} \cdot \text{sample}^{-1}$ ) | Stopband Edge<br>( $\times \pi \text{rad} \cdot \text{sample}^{-1}$ ) | Passband Ripple<br>(dB)      | Stopband<br>Attenuation (dB) |
|-------------------------------------------------------------------------------|-----------------------------------------------------------------------|-----------------------------------------------------------------------|------------------------------|------------------------------|
| 5steep                                                                        | 0.2                                                                   | 0.225                                                                 | 0.2                          | 60                           |
| 5flat                                                                         | 0.2                                                                   | 0.3                                                                   | 0.05                         | 60                           |
| CIC25                                                                         | 0.008                                                                 | N/A                                                                   | N/A                          | 60                           |
| CFIR25                                                                        | 0.008                                                                 | 0.016                                                                 | 0.05                         | 60                           |
| CIC125                                                                        | 0.0016                                                                | N/A                                                                   | N/A                          | 60                           |
| CFIR125                                                                       | 0.0016                                                                | 0.0024                                                                | 0.05                         | 60                           |
| Transition Band Width<br>( $\times \pi \text{rad} \cdot \text{sample}^{-1}$ ) |                                                                       |                                                                       | Stopband<br>Attenuation (dB) |                              |
| 2steep                                                                        | 0.004                                                                 | N/A                                                                   | N/A                          | 60                           |

- the number of coefficients, and therefore, adders and multipliers.

Because the final filter in a cascade is the one which determines the overall transition band (see Section 1.3), it is desirable to have maximally steep output filters in a cascade. Consequently, the filter 5steep should be as sharp as possible. Since that filter is not just used as the final stage in some cascades, but also as the single filter for the  $R = 5$  chain, it runs at the highest sampling frequency<sup>2</sup>. 5steep is therefore the most critical filter in terms of resource usage. The filter 5flat also runs at the highest sampling frequency in the  $R = 25$  chain, but because it is not the final filter in that chain, it need not be as steep.

While it is possible to estimate the needed resources of a given filter design based on its specifications, reliable figures are best obtained by way of experiment. The FPGA toolchain might make optimizations which are hard to take into account when performing estimates by hand. The results of these measurements are available in Appendix B.2 on page 111. Based on those figures, a filter size of around 250 is determined to be the upper boundary for 5steep; a filter of that size uses about 25 DSP slices, leaving 15 slices for other filters (per channel). Of these, 5flat is the most critical, because it must also be able to run at the full incoming sampling frequency. 2steep runs at a much lower sampling rate and can therefore be of significant size without a notable penalty in resource usage. The final implementation of 5steep is actually smaller, at a length of 204 coefficients and a DSP slice count of 22 per channel, to allow some flexibility for FPGA design changes.

Based on these findings and the measurements of the STEMlab's stock configuration from Section 2.1.2, it is possible to define performance specifications for the filters without needing to prod in the dark, so to speak. The following paragraphs explain the considerations which lead to the final filter specifications. The results are summarized in Table 3.2.

---

<sup>2</sup>It should be noted at this point that a filter which is configured to run at a high sampling rate can be re-used at lower sampling rates in the Xilinx toolchain. The filter's behavior in that case is correct, even when being run at a lower rate than maximally possible.

**Requirements for 5steep:** Based on the results of the moving averager used in the STEM-lab's stock configuration, even a FIR filter with moderate performance characteristics should already offer significant improvements. In order to achieve notable gains over the default configuration, the following performance goals for 5steep are specified, according to the pattern explained in Section 1.2.2:

- Passband ripple: better than 0.25 dB
- Stopband attenuation: 60 dB or better
- Transition band width:  $0.05 \cdot f_s/2$  or better

The stopband attenuation criterion is also applied to all other filters. Anything else would be a waste of resources, as shown in Figure 1.22 in Section 1.3.

**Requirements for 5flat:** This filter need not have a drop-off as sharp as 5steep, as long as the end of its transition band does not overlap with the first spectral copy of 5steep (see Figure 1.21 in Section 1.3). However, because it is in a cascade with 5steep, a sharper requirement on its passband ripple is imposed, in order not to worsen overall passband ripple behavior of the cascade too much.

**Requirements for 2steep:** Due to the low frequency at which the half-band filters run, they use very few resources (1 DSP slice per channel). Consequently, only a single filter needs to be specified, as mentioned in the previous section.

**Requirements for CIC25:** The relevant design criteria for the CIC filter are its stopband attenuation, its decimation rate, and the cutoff frequency/desired passband width (see Figure 1.16 in Section 1.2.3.2). The cutoff frequency is chosen such that it matches the frequency band which is of interest at the end of the filter chain for  $R = 125$ , i.e.  $f_p = f_{s,\text{high}}/R = 0.008$ . This means that the passband of CIC25 and CFIR25 combined is too wide by a factor of 5 for the  $R = 625$  chain, but this is of no concern because it will be cut off by 5flat as the last stage in that case. This allows the re-use of the CIC and its compensator across two chains without changing their design parameters.

**Requirements for CIC125:** The same considerations as for the other CIC filter apply. The filter and its compensator are specified for a rate change of 125 and 5 instead of 25 and 1, respectively, and the cutoff frequency is set to match the filter chain of  $R = 1250$ . This makes it twice as wide as it needs to be for  $R = 2500$ , which is corrected by a second half-band filter.

**Requirements for compensators:** The compensators are specified according to the considerations from Section 1.2.3.3, with the added feature of CIC125 also being used as a decimator.

**Summary:** With the above considerations and the experimental results for resource usage from Appendix B.2, it is possible to formulate a complete set of specifications for the filters. They are compiled in Table 3.2. Translating the specifications from Table 3.2 into absolute frequencies results in the values from Table 3.3. The frequency responses of all filters and filter chains are listed in Appendix B.3, starting on page 112.

**Table 3.3:** The expected relative and absolute transition band widths of the various filter chains, based on the specifications from Table 3.2.

| Chain | Relative TB Width<br>of Final Filter<br>$(\times \pi \text{rad} \cdot \text{sample}^{-1})$ | Absolute TB Width<br>of Chain<br>(kHz) |
|-------|--------------------------------------------------------------------------------------------|----------------------------------------|
| 5     | 0.025                                                                                      | 1562.5                                 |
| 25    | 0.025                                                                                      | 312.5                                  |
| 125   | 0.025                                                                                      | 62.5                                   |
| 625   | 0.025                                                                                      | 12.5                                   |
| 1250  | 0.040                                                                                      | 4.0                                    |
| 2500  | 0.040                                                                                      | 2.0                                    |

# CHAPTER

# 4

## FPGA

This chapter first presents a rough outline of the FPGA toolchain, and then provides more specific information on each of the three FPGA subsystems in Sections 4.2, 4.3 and 4.4, respectively. Those subsystems are the ADC control logic, a data acquisition core which writes data to RAM and is responsible for triggering, and the filter chains that connect the two. Figure 4.1 shows a schematic of how these components fit together, and how they are related to the overall STEMlab system.

### 4.1 The Xilinx Toolchain

The bitstream is compiled using Vivado, Xilinx's own IDE, a tool that can do everything around Xilinx FPGAs. It does the crucial parts right and can be interfaced with using Tcl.



**Figure 4.1:** Schematic of the STEMlab with the three main FPGA subsystems highlighted in yellow.

This is very convenient, as a project can be replicated idempotently<sup>1</sup> whenever a rebuild is needed.

Whilst Vivado offers a GUI to build block designs, this process can be a bit frustrating to the user due to various “eccentricities” of the application. Therefore, we choose to use its Tcl API to write scripts that create a new project and apply and connect all necessary blocks. This avoids a lot of errors as a bug in Vivado’s user interface won’t tamper with the project. It also enables us to use version control tools for the project as Vivado projects create a lot of files which often clash in very simple versioning operations. Using Tcl scripts which create and configure the project, and leaving everything else out of the repository, avoids this hassle. Tcl also allows the creation of sub-blocks: One can group blocks together and insert them multiple times with little effort; a feature which Vivado’s graphical front-end apparently does not offer. More on the Tcl API and Tcl itself can be found in [18], [19], and [20].

The final advantage of the Tcl API to be mentioned here is that it allows the creation of the entire project, the block design, perform the synthesis and implementation, build a bitstream, as well as the board support package and first stage bootloader, all in a single sequence of automated tasks without the need for manual intervention. Since this tends to take quite a lot of time, that is a significant advantage.

## 4.2 The ADC Core

The ADC core is a simple piece of logic that interfaces with the FPGA pins which are connected to the STEMlab’s ADC. It reads the ADC’s unsigned 14 bit values and converts them to 16 bit signed format by adding an offset of  $2^{13}$  and performing a 2 bit sign extension. The resulting numbers are then provided over an AXI Stream bus interface, which is also used by all the filters. The core is used from the git repository provided by Pavel Demin [11]. More on his repository and project can be read in Section 2.3.1.

## 4.3 The Logger Core

The logger core (logger in further text) is a piece of VHDL code that stores samples it gets from a source into a ringbuffer in the RAM. It is packaged as a Vivado IP core and can be seamlessly integrated into the project. The logger originated from an earlier project [16]. In addition to logging data to RAM, it can also be programmed with various triggers. It reads instructions from a BRAM on the FPGA and iterates over them. Having reached the last one, it issues an IRQ signal, signaling the end of the transcription.

The logger’s original implementation features eight channels with a width of 14 bit, padded to 16 bit in order to simplify data transmission in byte-sized chunks. Each two channels require one clock cycle to store a sample.

In order to take advantage of the fact that additional bits can be “won” by oversampling a signal, this project implements a new configuration, which can process full 16 bit values, a gain in two bits over the ADC’s output. The penalty for this is one additional clock cycle of delay, since the adders and comparators cannot match the timing requirements with two additional carries. Since this project aims to optimize for lower-frequency signals, the resulting additional delay of 8 ns is acceptable and will not be an issue in practice. The logger core comes with a kernel module that provides a convenient interface from the ARM core. This avoids having to manually program the logger core.

---

<sup>1</sup>Idempotence [...] is the property of certain operations in mathematics and computer science, that can be applied multiple times without changing the result beyond the initial application [17].

## 4.4 The Filter Chains

The filter chains are the most crucial part of the project and also the most delicate one as simple mistakes can cost several decibels of SNR and create a worse signal at the filter chain's output than at its input, instead of an improved one. The logical structure of the chains can be seen in Figure 3.1 and the rationale behind it is explained in the respective Section 3.3. For the detailed implementation, it is advised to look at the project in Vivado itself, as the block design is impractically large to be put onto paper, thus it is omitted in this report.

This section first gives a few notes on the two most important building blocks in the FPGA design: The CIC and FIR compilers by Xilinx. After that, two key points which are particularly challenging when implementing filter chains are elaborated upon: Propagating the correct bits through the cascade, and adjusting the gain correctly in order to exploit the available bits for maximum dynamic range.

### 4.4.1 Filter Compilers

The basic building blocks of the filter chains are FIR and CIC filters, which are based on ready-to-use blocks by Xilinx. Vivado's CIC and FIR compilers natively utilize the DSP slices to a maximum extent and make it very easy (at least in theory) to implement a Matlab-designed filter in hardware. Those IPs are described very thoroughly in the official documentation [14], [13].

The FIR filter compilers are configured using a set of coefficients in double format (exported from Matlab, or any other filter design tool of choice). The compiler quantizes the coefficients with maximum precision using a 16 bit fixed point number (this can be changed from the default but should be left to the compiler for best results). It does so by determining the index of the MSB required to represent the biggest coefficient in the set using 16 bits downwards.

As an example, take the biggest coefficient to be  $c_{\max} = 0.23$ . The bit at index  $-2^2$  becomes the sign as its value (0.25) is not needed to display  $c_{\max}$ . Thus the number is said to have 16bits overall and 17 fractional bits. While this might seem a bit counterintuitive at first, it simply means that the LSB is the one at index  $-17$  and it has 16 bits, meaning the sign is at the bit index  $-2$ .

The compiler then also takes the specified input bit configuration and determines the required output configuration to guarantee no overflows and achieve maximum performance. What is important here is that the output bit width is the same that is needed to guarantee no overflows inside the filter. This means that the user has to be aware of the maximum gain of the designed filter and determine on their own which bits are important at the output.

### 4.4.2 Bit Propagation Through the Filter Chains

For this application, only 16 bit values are stored. However, the filters generate far wider numbers at their outputs. This means that many bits are discarded at the output. To make sure that no important bits are truncated, the chosen input format for the filters is 17.7, resulting in 24 total bits. The MSB should always remain just a sign extend of the sign actually residing at bit 15. The 7 fractional bits are cut off at the end of the filter chains but are still important for more precision so less rounding and/or truncation errors are introduced inside the filter chain.

---

<sup>2</sup>The bit at index  $-n$  is the  $2^{-n}$  valued bit.



**Figure 4.2:** The flow of the bits in a filter chain. The bits are shifted through horizontally. Every bit that does not fit into the numerical width of the next stage will be cut off, starting with those furthest away from the sign. At the sign extend, the bit is replicated 4 times and handed through to the next stage. The ZERO stage simply holds 7 '0' bits to pad the number coming from the ADC on its lower end before it goes into the first filter. Inside the filter the bits can grow, so those are not shifted and cut off but rather resized towards the output.

As values can use up significantly more than 24 bits inside the filter due to bit growth, it has to be ensured that no overflows happen, resulting in greater bit widths at the output of the filter. It is important that the location of the decimal point is always tracked and remains in its right place. Figure 4.2 depicts the flow through an example chain but represents the general case in out block design.

#### 4.4.3 Ensuring Maximum Dynamic Range

The challenge of optimally using the available dynamic range is explained here on a simple example of a sampled sine wave coming into the filtering system. It is important not to discard any MSBs (or signs) because otherwise the signal will clip or, even worse, overflow and wrap around. The same applies to the case where too few bits are cut off. If the bit count is increased by one to guarantee no overflow inside the filter, but that bit is not set at the output due to unity gain, it will effectively be lost as it will never be used. This reduces the maximum theoretically achievable SNR by 6 dB, which is obviously highly undesirable.

To make sure neither of these faults happens, it is important to have the highest possible filter gain at  $G \leq 1$ . Furthermore, at the end of each filter chain, the additional bits must not be carried over, but rather the initial 14 bits before the decimal point, along with two fractional bits after the decimal point. This yields the desired 16 bit value and ensure no "empty" bits.

The FIR compiler can avoid those empty bits by normalizing the coefficient such that the highest gain (i.e. the top peaks of its passband ripple) is at exactly 1. This is called *maximizing the dynamic range*. Figure 4.3 illustrates the issue of losing one bit.

One can observe that the sine in Case 1 (top plot) uses the dynamic range to a perfect extent as the full-scale sine has an amplitude of 31, the maximum value a 6 bit int can hold. Case 2 (middle plot) shows a sine that has been scaled to 34 and thus requires an additional bit. But because a 7 bit int can hold values up to 63, most of the time the MSB ends up not being used. With 7 bit, the highest SNR<sub>max</sub> possible would be

$$\begin{aligned} \text{ENOB} &= \log_2(34) = 5.09 \\ \text{SNR}_{\max} &= 1.76 \text{ dB} + \text{ENOB} \cdot 6.02 \text{ dB} = 32.39 \end{aligned} \quad (4.1)$$

In Case 3, the dynamic range is used well and the SNR<sub>max</sub> with a 6 bit int is

$$\begin{aligned} \text{ENOB} &= \log_2(30) = 4.91 \\ \text{SNR}_{\max} &= 1.76 \text{ dB} + \text{ENOB} \cdot 6.02 \text{ dB} = 31.30 \end{aligned} \quad (4.2)$$

This yields a difference of only 1.09 dB<sup>3</sup>. This example shows that it is well advised to scale the coefficient such that they don't ripple around 1, but rather that the maximum ripple is exactly 1 and not more.

Because if it can be ensured that no additional MSB is used which is empty most of the time, it is possible use an additional LSB which is always well used and to effectively win a bit. So in most cases when the coefficients are designed to have a unity gain, close to 6.02 dB can be won by rescaling the coefficients.

#### 4.4.4 Errors Due to Truncation in the CIC Filter

As detailed in Section 1.2.3.4, the high gain of CIC filters generally requires discarding bits at the filter's output. In our implementation, we discard 17 bit at the output of the  $R = 25$  CIC filter, and 26 bit at the output of the  $R = 125$  filter, in both cases through truncation. No bits are discarded at the filter's input, and the CIC compiler ensures that its internal widths are always sufficient for full precision [13]. Therefore, only the last stage introduces an error.

Hogenauer's formulas are used to determine the mean error and variance of both CIC filters. Interestingly, the result is identical for both filters. But given that the output precision is 16 bit in both cases, this actually does make sense. The results are:

$$\mu_{\text{CIC}25} = 0.5 \quad (4.3)$$

$$\sigma_{\text{CIC}25}^2 = 0.289 \quad (4.4)$$

$$\mu_{\text{CIC}125} = 0.5 \quad (4.5)$$

$$\sigma_{\text{CIC}125}^2 = 0.289 \quad (4.6)$$

The calculations are performed by a Matlab script and are therefore not further explained.

---

<sup>3</sup>While 1.09 dB might still seem rather large, keep in mind that with the wider numbers running in the actual filter chains, the result is significantly better than in this illustrative example.



**Figure 4.3:** An illustration of good and bad use of dynamic range



# 5

## CHAPTER

# Server

Once the FPGA has recorded data, that data has to be transmitted over the network. Since implementing networking in hardware is not feasible in most cases, this is done via the ARM Cortex A9 core that already has a PHY<sup>1</sup>. To control all the hardware of the SoC, an embedded Ubuntu GNU/Linux is running on the ARM core which can control all hardware components, including the logger running on the FPGA. An application is then needed that reads the necessary data from the RAM and sends it to the network. This part of the overall product is designated as the *server*. This section explains the design choices and internal structure of the server application.

## 5.1 Requirements

The basic functional requirements of the server application are:

- Read the system status and transmit it over the network.
- Receive commands over the network, translate them where needed and relay them to the FPGA IP.
- Read data from the RAM and transmit it over the network.

## 5.2 Design Choices

As the **ZYNQ Logger** comes with a kernel module that has to be interfaced via IOCTL calls, it is recommended to write the application in C or C++. This is due to the nature of Linux, which still requires mostly C for interfacing. There are some IOCTL interfaces in Python and Rust, for example, but those bring additional problems on ARM Linux since not all libraries and features are available.

Since the server application is rather complex, C++ is a good choice, eliminating some of the caveats which C has. Additionally, the entire feature set of C can be used in C++, so the choice carries no penalties in terms of features.

The **WebSockets protocol** is mandatory, due to the choice of JavaScript on the client side. This is less of a design choice on the server side, and more of an inherited requirement from

---

<sup>1</sup> A chip implementing the physical layer of the OSI model

the front-end part of the project. For this purpose, the uWebSockets (*uWS*) library is chosen. It carries a very small footprint and offers good performance, though documentation is somewhat lacking. Its performance is high enough to ensure that in any given scenario, the server application will not be the bottleneck of the overall data pipeline; the network connection will choke before *uWS* reaches the limits of its capabilities.

*uWS* is based on epoll, libuv or boost::asio depending on the user's choice. All of them are asynchronous libraries (more on this topic at [21]), which makes networking very convenient. *uWS* comes with callbacks that can be registered for each WebSockets event (see Section C.1). Furthermore, the user can hook into the event loop and register other events such as a reoccurring timeout (timer).

**JSON** is used as a data format for settings and statistics. It is our format of choice because it has by far the largest user base of all the available formats, the specification is simple and JavaScript can parse it natively into a JavaScript object. This also another point where the choice of C++ offers significant benefits over C, since it has a number of high level (i.e. easy to use) libraries that can serialize and, more importantly, deserialize JSON objects.

### 5.3 Implementation

While the server's overall task is complex, its implementation has been kept as simple as possible. Indeed, the application fits into a single file (not counting the loaded libraries) of a few hundred lines, and consists mainly of a single asynchronous event loop.

The overall application state is stored in a struct. This allows any event to access any information it might need, such as the socket handle, the current sampling rate, or the requested number of bytes. Most of the application's functionality can be deduced from its event loop, which is shown in Figure 5.1.

**Figure 5.1:** The server's event structure



# CHAPTER

# 6

# Oscilloscope

Because the FPGA and Linux side are newly implemented, and because the new transmission protocol is not the same as the old one, existing applications reading data from the STEMlab no longer function. A new front-end is therefore required. This front-end is a web application whose functionality is broadly modeled on conventional oscilloscopes; therefore, it is generally referred to as *scope* in this report. This chapter lays out the requirements for the scope, explains the design choices on which it is built, summarizes the major implementation details, and presents the final product.

## 6.1 Requirements

The requirements for the scope are partially defined by a Java application from previous projects [22] whose core capabilities are to be replicated. More functionality may be added where sensible and possible. The main requirements are:

- Receive data in configurable size over the network.
- Display received data both in the time and frequency domain.
- Calculate RMS power density in the signal.
- Calculate THD of the signal.

## 6.2 Design Choices

There is a wealth of programming languages to choose from, with countless libraries to go with them. The following sections explain why JavaScript and web technologies are used to implement the graphical user interface (GUI) and the mathematical functions of the scope.

For this purpose, a group of programming languages are compared and weighed against each other in various aspects; the results are summarized in Table 6.1. The general attributes listed in Table 6.1 are explained in the subsequent paragraphs to enable the reader to understand how they apply to our decision. Due to its importance, there is an additional section dedicated to the topic of networking (Section 6.2.1).

**Table 6.1:** A comparison of a few programming languages which might be used to implement a graphical front-end like the oscilloscope from this project. The scale goes from 1 (worst) to 6 (best).

|                               | Rust | C <sup>+</sup> | Java | Python | JavaScript |
|-------------------------------|------|----------------|------|--------|------------|
| Open Standard                 | 6    | 6              | 1    | 6      | 6          |
| Networking                    | 6    | 6              | 6    | 6      | 4          |
| Graphics                      | 2    | 5              | 5    | 5      | 6          |
| Prevalence                    | 3    | 6              | 6    | 5      | 6          |
| Ease of Development           | 5    | 5              | 5    | 5      | 6          |
| Ease of Deployment            | 3    | 4              | 5    | 6      | 6          |
| Familiarity With the Language | 3    | 3              | 4    | 6      | 6          |
| Total                         | 28   | 35             | 32   | 39     | 40         |

**Open Standard:** Since this is a university project meant, among other things, for educational purposes, it is crucial to make all source code available to the public under a flexible license. Thus, it is desirable to use a technology which is independent of any one company and their corporate policies (avoiding vendor lock-in). Some programming languages are managed openly and accept contributions from the public, others not.

**Networking:** The two criteria which the data transfer needs to fulfill are speed and data integrity. Since networking is a highly complex topic, it is important that the language not only has libraries for good networking protocols, but that those libraries are also easy to use. More information on the topic is contained in Section 6.2.1.

**Graphics:** An oscilloscope is quite a demanding application when it comes to graphics; drawing an image stream which looks fluid to the human eye on modern high resolution display uses a lot of processing power. Using an interface such as OpenGL which can utilize dedicated graphics resources on a computer is therefore necessary.

Additionally, creating a sensible user interface with basic drawing commands such as rectangles and lines is impractical. Instead, a GUI toolkit to speed up the design process is needed.

**Prevalence:** Using a technology which is widespread makes it more likely that good tutorials are available. It also facilitates troubleshooting, since a larger user base means that there is a higher chance of savvy users being able to provide support if needed.

**Ease of Development:** Some solutions require large IDEs and unwieldy toolchains for development. Others can be used with a more lightweight setup. Because both team members come from a Linux background, the latter is preferred. Since added complexity always also means added probability of errors and failures, using leaner tools also decreases the chances of having to fight with the development tools instead of tackling the actual challenges which are to be solved.

**Ease of Deployment:** This takes into account how easy or difficult it is for an end-user to install the scope and get it up and running. Having a toolchain which allows the effortless creation of stable binaries is important here.

**Familiarity With the Language:** The best toolkits do not matter if none of the involved programmers have ever used them and will struggle with even the basics for a major part of the project's duration. Thus it is inevitable that personal preferences also flow into the decision process.

**Summary:** As the total scores in Table 6.1 show, JavaScript fits the priorities set in this project best. As a scripting language which can be run in any modern browser, it can provide a website from the STEMlab board and which can be accessed from a client via a browser. Since no special programs need to be installed on the client computer, and every major operating system today has at least one reasonably modern browser, this makes the solution both highly portable and easy to deploy from an end-user perspective.

JavaScript's popularity ensures that there is no danger of the underlying technology of the scope becoming obsolete any time soon. Furthermore, with its support for WebGL and the WebSockets protocol, it provides two high-performance and easy-to-use technologies to implement graphics and networking, respectively. Its primary downside is a heavy memory footprint, but since that is rarely a concern on modern computers, it is not considered a relevant factor for our decision.

### 6.2.1 Networking

For networking, two primary protocols are available: UDP and TCP. TCP has been chosen in this project; the following paragraphs explain why.

To ensure a fluid stream of data, minimizing protocol overhead is key. In situations where data integrity is not essential, UDP is generally used. It carries no overhead for guaranteeing completeness and correct order of packages.

But if data integrity is vital, TCP is generally the protocol of choice. It has mechanisms for guaranteeing both the completeness and correct order of packages. This comes at the cost of some overhead, but in most applications, this is negligible and well worth the cost. Another key feature of TCP is that it can perform congestion control. TCP will send no more packages if previous packages have gone missing (i.e. if their reception has not been confirmed). Where UDP will happily flood the network with as much data as it is fed, TCP ensures that the network is not flooded. The result is that in case of a bad or slow network connection, the amount of transmitted data is automatically adapted to the network, and only as much data is sent as the client can actually receive and process.

When deciding how to deploy TCP, one can choose to implement one's own sub-protocol, or use one of the existing two: HTTP or WebSockets. The WebSockets sub-protocol, being intended for data streaming and having mature JavaScript support, fits the requirements of our application perfectly. Some additional notes on WebSockets can be found in Appendix C.1 on page C.1.



**Figure 6.1:** The scope application in it's current state, displaying time and FFT data.

### 6.3 Product

The oscilloscope is a web application that can be directly loaded from the server running on the STEMlab. It has the following capabilities in its current implementation:

- Receive data over the network for two channels (this is only limited by the physical channels of the STEMlab).
- Manage triggering set a trigger type, level and the number of samples that have to be recorded before and after the trigger is activated.
- Calculate and display the power density spectrum.
- Calculate the SNR both by automatically detecting the signal and by being told manually where the signal is.
- Export data to an array-string.
- Export and load the scope configuration to and from JSON strings.

The following pages explain how the application is structured, along with the implementation of some of its key features. A screenshot in Figure 6.1 shows the overall layout and design of the scope.

#### 6.3.1 Application Structure

The entire application consists of a single state tree. This makes it very easy to import and export settings and presents a better overview of the application state than scattering state information across various objects. Listing C.2 in Appendix C.2 on page 122 shows an extract of the tree structure code. Like any JavaScript application, the oscilloscope runs asynchronously. Its event loop structure is shown in Figure 6.2.

All of the values that can be controlled through the GUI – and many more – are also controllable directly through the state tree. Upon initialization of the application, the entire state tree is loaded and references to parts of it are passed to the controller objects. The structure of the application is hierarchical; the most important relations are depicted



Figure 6.2: The scope's event structure

in Figure 6.3. In the following sections, some of the more important prototypes and functions from Figure 6.3 are explained.

**The Oscilloscope prototype** is the top level controller which contains exactly one source. It is responsible for handling all mouse events and reacting accordingly, such as moving the trigger level or zooming and panning. The Oscilloscope draw call is responsible for drawing general information which is not part of a specific trace onto the canvas. Oscilloscope is also the caller for the Trace draw call. Oscilloscope manages general information and is responsible for rescaling the canvas and initiating the draw call chain.

**The Source controller** manages all calls to and from the server. It contains a lot of helper calls to set a trigger or issue a new frame. Those helpers are called by the Oscilloscope controller or GUI elements. It also contains the important callbacks to send and receive data from the server. They are explained in more detail in Appendix C.1.

Source stores all received frames in itself. The frames will then later be copied and processed by the trace controller. Each frame received is always overwritten by the next one, preventing memory leaks and ensuring that the data is current. Once the Source controller receives new data, it starts the calc() call for each trace to trigger an update of the trace with new data.



Figure 6.3: The structure of the scope application with its major relations

**The Trace prototype** is in charge of drawing and calculating traces. It has two derived prototypes, `Timetrace` and `FFTrace`. While the `Timetrace` prototype just returns untouched data to display the time domain of a signal, the `FFTrace` prototype calculates some signal metrics before displaying its results, i.e. spectral power density and SNR. The structure of the application allows for easy extension with new trace types with arbitrary math functions (e.g. differential pair subtraction for noise cancelling).

**One instance of a PrefPane** is created for each trace. A `PrefPane` component is a mithril component that creates a vnode that exposes all the necessary controls for its corresponding trace in the GUI. It is also responsible for displaying processed data for a trace such as the SNR for an `FFTrace`. There is also a general `PrefPane` that exposes common controls such as switching modes, the trigger trace or the active trace. The `PrefPanes` are built with `mithril.js`. Appendix C.3 on page 123 provides some general information on `mithril.js` and example code relevant to our application.



**Figure 6.4:** The used JavaScript FFT compared to Matlab's FFT

### 6.3.2 Graphics

Having a robust graphics layer is paramount to providing a user-friendly interface. It must process data quickly for smooth visualization, display signal metrics, and offer sensible controls for the user to interact with. Taking advantage of a GPU in order to speed up data processing and rendering is highly desirable. Due to the open-source nature of this project, using a technology based on OpenGL makes sense; JavaScript offers WebGL for this purpose.

With JavaScript and HTML there come a large variety of libraries that enable the easy creation of graphical applications, and CSS offers great flexibility for styling the resulting structure. `mithril.js` has been chosen to build the controls of the scope. It offers an exceptionally low footprint and high speed when recalculating the DOM. Thus, the user need not download a heavy library into their browser's cache, and the interface updates smoothly and without lag.

For data plotting, existing libraries such as `plotly.js` or `chart.js` could be used. While they provide a lot of built-in functionality like logarithmic plots or automated axis labeling, they also carry significant overhead and lack the performance required for real-time data plotting at high data rates. Therefore, our application draws directly onto an HTML canvas via WebGL draw calls. Some more general information about WebGL and its usage is provided in Appendix C.4 on pages 124f.

### 6.3.3 Power Calculation

Calculating the spectral power of a signal is accomplished via the FFT algorithm first presented by Cooley and Tukey in 1965 [23]. Kevin Kwok has written a compact and fast implementation of this algorithm in JavaScript [24], which is provided under a permissive license [25] and therefore nicely suits the requirements of this project. It has been verified against Matlab's FFT algorithms as a reference; the results line up nicely and are shown in Figure 6.4.

**Table 6.2:** Correction factors for the different window types used in the scope application as seen in [26].

| Window          | CG     | NG     |
|-----------------|--------|--------|
| Rectangular     | 1      | 1      |
| Hamming         | 0.54   | 0.3974 |
| Hanning         | 0.5    | 0.375  |
| Blackman-Harris | 0.3587 | 0.258  |
| Flat Top        | 0.2156 | 0.1752 |

The following equations detail the calculations. For an in-depth explanation, see [26]. The two-sided spectrum of an input signal  $x$  is obtained with

$$Y[i] = \frac{1}{N} FFT(x[i]w[i]). \quad (6.1)$$

From that, the one-sided power spectrum can be calculated via

$$\begin{aligned} P_{yy}[0] &= \frac{Y[0] \cdot Y[0]^*}{NG} \\ P_{yy}[i] &= 2 \cdot \frac{Y[i] \cdot Y[i]^*}{NG} \quad \text{for } 0 < i \leq N/2. \end{aligned} \quad (6.2)$$

Where  $NG$  is the noise gain, and depends on the window being used. Values for common cases are listed in Table 6.2. Finally, the power of a spectral region between two frequencies can be found by integrating over that range of frequencies, or in the discrete case like here, by summing all the bins:

$$P_{1,2} = \int_{f_1}^{f_2} P'_{yy}(f) df \approx \sum_{i_1}^{i_2} P_{yy}[i] \quad (6.3)$$

### 6.3.4 SNR Autodetection

Calculating the signal-to-noise ratio requires defining which components of captured data are signal, and which parts are noise. The power contained in each frequency range can then be computed with Equation 6.3, and the SNR calculated as the following ratio:

$$\text{SNR} = 10 \log_{10}\left(\frac{P_s}{P_n}\right) \quad (6.4)$$

Automatically determining which parts of a spectrum constitute the signal and which parts are noise is the purpose of SNR autodetection. This requires the detection of the actual signal components as well as additional spectral lines introduced in the FFT process. The phenomenon of these spurious spectral components is commonly known as *spectral leakage*. Additionally, one may wish to remove any spurious harmonics originating from the signal source (e.g. a function generator which does not generate a flawless sine wave).

Depending on how the signal is windowed for the FFT, different spectral lines will occur with varying magnitudes. Table C.1 in Appendix C.5 contains the primary characteristics for some commonly used windows. The *Number of Lines per Bundle* entry from that table can be used for a static approach to SNR autodetection:

- Find the peak in the spectrum.
- Presume that peak and  $\frac{n}{2}$  spectral lines to its left and right to be the actual signal, where  $n$  corresponds to the *Number of Lines per Bundle* column in Table C.1.
- Regard the rest of the spectrum, except any potential DC offset and the next  $\frac{n}{2}$  spectral lines above it, as noise.
- Integrate the power over the defined frequency ranges and calculate SNR according to Equation 6.4.

However, this method does not give optimal results. An example data stream similar to the one in Figure 6.5 processed by Matlab's `snr()` function yields about 78 dB. Calculating SNR with the static method as outlined above results in only about 33 dB for such a signal. Even accounting for Matlab's harmonics cancellation, a result no better than 43 dB is achieved (10 harmonics canceled out).

Improving this SNR detection method can be accomplished by using an iterative algorithm. The number of lines below the peak lobe which are still considered signal is increased until the SNR no longer drastically changes. In our implementation, this is defined as a change no greater than 0.05 dB. Algorithm 1 depicts this process systematically. It can be repeated for any number of spurious harmonics to cancel them out as well. The two methods are compared in Figure 6.5. The example also shows highlighted harmonics for elimination in the SNR calculations. The result is in line with Matlab's.

---

**Algorithm 1** Algorithm to iteratively determine the SNR of a spectrum

---

```

 $l \leftarrow 1$ 
 $max_i \leftarrow 0$ 
 $SNR_p \leftarrow 0$ 
 $SNR_n \leftarrow 0$ 
while  $|SNR_p - nextSNR| > 0.05$  do
     $SNR_p \leftarrow SNR_n$ 
     $P_s \leftarrow power(spectrum[max_i - l : max_i + l + 1])$ 
     $P_n \leftarrow power([l : max_i - l] + power(spectrum[max_i + l + 1 :]$ 
     $SNR_n \leftarrow 10 \cdot log_{10} \left( \frac{P_s}{P_n} \right)$ 
     $l \leftarrow l + 1$ 
end while
 $SNR \leftarrow SNR_p$ 

```

---



**Figure 6.5:** Comparison of methods to determine lobe width for SNR calculation. The top plot shows the static method. As can be seen, only a very narrow section of the peak lobe is taken to be the actual signal, while most of the surface beneath it is considered noise.

In the bottom plot, the iterative method to determine the width of the lobe components considered to be signal has been used. As is easily visible, the resulting lobe region is significantly wider. Additionally, spurious harmonics from the signal source have been highlighted with the same process, allowing for their elimination in the SNR calculations.

# Verification

Designing filters in Matlab and running FPGA simulations in Vivado is all nice and well, but in the end, what counts is how the system performs in practice. This chapter outlines which sorts of measurements have been performed, the rationale behind them, and summarizes the results.

All the metrics presented in this chapter are based on the same set of measurements, unless otherwise mentioned. For each filter chain, 2000 measurements have been made, each with a size of 8192 samples in the time domain at one frequency. Between each of the 2000 runs for a single chain, the signal frequency increases equidistantly from 0 to five times the filter's outgoing sampling frequency. Covering 2000 frequencies allows to cover the passband ripple and the transition band in sufficient resolution to detect possible issues.

For the filter chains with  $R = 5$  decimators at their ends (the chains for decimation factors by 5, 25, 125 and 625), this means that the output filter's passband is first traversed, and then its stopband up until half its incoming sampling rate. This enables us to measure the passband and stopband magnitude response. Furthermore, by performing an FFT of the filter chain's output at specific frequencies, we can verify how strongly the stopband aliases back into the passband. This is the problem illustrated in Figures 1.5, 1.17, and 2.3.

For the filter chains with half-band filters as their final stages (1250 and 2500), this frequency range is actually too large, since we would only need to measure up to twice the outgoing sampling rate in order to cover the final filter's magnitude response range. However, there is no real harm in going higher (a bit of resolution is lost), and it simplifies the measuring process, so no particular distinction is made here.

Note that it is generally not necessary to measure at even higher frequencies. As can be seen in Section 1.3 and some of the frequency responses for the filter cascades which are being used (e.g. the chain for  $R = 25$ , see page 115), it is the final filter in a cascade which determines the chain's overall behavior in the relevant frequency regions (assuming the individual filters in the chains have been sensibly designed to match each other's characteristics, also explained in the section *Multi-Stage Filter Designs*). Increasing the input frequency further simply moves into regions which are ever more strongly attenuated, so unless serious issues are encountered, the higher spectral ranges are not of interest in verifying the filter chain's functionality. The measurements for the 1250 and 2500 chains, which do indeed cover more than the necessary frequency range, confirm this.

All measurements are adjusted to have the right scaling with Equation 7.1 where  $x_{\text{unsigned}}$

is the measured sample.

$$x_{i,\text{signed}} = (x_{i,\text{unsigned}} - 2^{15}) \cdot 2^{-15} \quad (7.1)$$

$$x_{i,V} = x_{i,\text{signed}} \cdot 1.1 \text{ V} \quad (7.2)$$

## 7.1 RMS Frequency Response

To verify that the filters perform as expected, the RMS of all samples is calculated and plotted against their respective input frequency, as seen in Figure 7.1. The RMS looks good and the passband is practically flat. Two things stand out as not too positive:

- The passband frequency response has a slight slope upwards. It is at no point higher than 20 mV, effectively having an error below 2 % in all of the filter chains except the  $R = 5$  chain. This is mostly in line with the case in the filter performance as predicted by Matlab; a ripple in the cascade's passband between 0.67 % and 2.8 % is predicted. What is peculiar is that the overall shape of the passband doesn't quite align with Matlab's designs. No behavior occurs which we consider to be deal-breaking though.
- The passband of  $R = 5$  chain looks deformed, on the other hand, and not at all what is expected. Something has gone horribly wrong here. The previous statements do not apply for this case.

On the other hand, these results are as expected:

- The Passband has a slight ripple of  $\pm 1\%$  in all filter chains. This is well within the filter design specification in theory. Matlab predicts a ripple in the cascade's passband between 0.67 % and 2.8 %, depending on the chain.
- The passband shows a gain loss. If the optimum case of a 2 bit win were achieved, all of the amplitudes would be at  $\frac{1}{\sqrt{2}} = 0.707 \text{ V}$  (since input is 2 V<sub>PP</sub>). Since not all filter chains win the same amount of bits, this sadly is not the case. Table 7.2 lists the actual number of bits won, along with correction factors which need to be applied to scale the output correctly.

Overall, these results are mostly acceptable in our view, particularly considering that this system is in essence a first prototype which has not gone through any performance tuning yet. However, there is certainly room for further investigation and improvement.

## 7.2 Mean Frequency Response

Because it became apparent during tests that the device has an offset (even without filters!), the mean of each sample has been calculated too such that at later stages the offset can be calibrated out in the scoping application. The results are shown in Figure 7.2.



**Figure 7.1:** RMS at the output of each filter chain over a range of frequencies. Input is a sine wave with 2 Vpp.



**Figure 7.2:** The mean at the output of each filter stage over a range of frequencies. Since the input signal is a sine wave and the function generator which provides it has been verified to have no (meaningful) offset, these lines should be located at zero volts.



**Figure 7.3:** The SNR at the output of each filter stage over a range of frequencies. Note the downward spikes in the passband. These are caused by Matlab's algorithms during post-processing and are not an actual system issue.

### 7.3 SNR Frequency Response

Two series of measurements are used for SNR: An automated one by us which is evaluated with Matlab's SNR algorithms. Its results are shown in Figure 7.3. SNR for a signal is determined by separating signal and noise components, calculating the respective power and the ratio of signal power to noise power components<sup>1</sup>. This series of measurements allows us to get a good idea of the shape of the system's overall frequency response for the SNR.

However, Matlab's SNR algorithms are not perfect; the automated separation of signal from noise components is prone to errors. Consequently, the results achieved by this method are below the system's true capabilities. To assess actual system performance, SNR measurements referenced against a measured noise floor have also been performed by Mr. Gut. For this, the noise floor is measured by terminating the device with a  $50\Omega$  resistor, instead of extracting it from a signal. Afterwards, a signal is put through the system, the power in the signal is measured, and its power compared to the measured noise floor. This method is more precise than the automated method, and also happens to give better results. Its findings are summarized in Table 7.1.

When comparing our new system against the measurements of the STEMlab's stock configuration by our predecessors<sup>2</sup>, our system achieves an SNR of 84 dB for an input frequency of 3 kHz at a sampling rate of 50 kHz! The stock configuration needs to downsample by a factor of 65536 (corresponding to a sampling frequency of 1.9 kHz in order to achieve an SNR of 83.5 dB<sup>3</sup>.

<sup>1</sup>This is similar to what our scope does.

<sup>2</sup>Those measurements are also referenced against a  $50\Omega$  noise floor.

<sup>3</sup>The higher the downsampling ratio, the better the achievable SNR tends to be.

**Table 7.1:** SNR measurements referenced against a true  $50\Omega$  noise floor, measured by Mr. Gut. The results constitute a significant improvement over the stock system's SNR capabilities (see Table 2.1). Only the chain for  $f_s = 25\text{MHz}$  lies outside the expected pattern; this has already been seen in its magnitude frequency response in Section 7.1.

| $f_s$<br>(kHz) | $f_{\text{signal}}$<br>(kHz) | $V_{\text{sine,RMS}}$<br>(mV) | $V_{\text{noise,RMS}}$<br>(mV) | $\text{SNR}_{50\Omega}$<br>(dB) | $\text{SNR}_{\text{autom.}}$<br>(dB) |
|----------------|------------------------------|-------------------------------|--------------------------------|---------------------------------|--------------------------------------|
| 50             | 1                            | 707                           | $40 \times 10^{-6}$            | 84                              | 71                                   |
| 100            | 25                           | 707                           | $55 \times 10^{-6}$            | 82                              | 71                                   |
| 200            | 50                           | 708                           | $59 \times 10^{-6}$            | 81.5                            | 69                                   |
| 1000           | 250                          | 708                           | $75 \times 10^{-6}$            | 79.5                            | 69                                   |
| 5000           | 1250                         | 707                           | $130 \times 10^{-3}$           | 79.5                            | 69                                   |
| 25 000         | 6250                         | 707                           | $260 \times 10^{-6}$           | 62                              | 69                                   |

## 7.4 Stopband Attenuation

Measuring stopband attenuation is relevant in order to verify whether or not the aliasing effect of the stopband into the passband during downsampling is attenuated to the specified degree of 60 dB. See Figures 1.5, 1.17 and 2.3 for illustrations of the phenomenon.

For this purpose, the power density spectrum is plotted properly scaled and adjusted as seen in Equation 7.3 through 7.7:

$$x_{i,\text{corrected}} = x_{i,V} \cdot \sqrt{\frac{1}{2f_s N}} \quad (7.3)$$

$$X = \text{FFT}(x_{i,\text{corrected}}) \quad (7.4)$$

$$X_{i,\text{one}} = X_i \cdot 2, i < \frac{N}{2} + 1 \quad (7.5)$$

$$X_{i,\text{abs}} = |X_{i,\text{one}}| \quad (7.6)$$

$$S_{\text{dB}} = 10 \log_{10}(X_{i,\text{abs}}^2) \quad (7.7)$$

Figures 7.4 through 7.9 depict the results. Each plot contains one frequency which falls into the passband, one which falls into the filter's edge, and one which falls into the stopband. The passband and edge frequency components are expected to be relatively high (depending on where exactly in the edge the measurement point falls). The measurement point in the stopband should be 60 dB below the passband measurement point. Most satisfactorily, this is the case, and the measurement results line up very nicely with the specifications.



**Figure 7.4:** Attenuation in passband, edge and stopband for  $R = 5$ .  $f_s$  is the sampling rate at the chain's output.



**Figure 7.5:** Attenuation in passband, edge and stopband for  $R = 25$ .  $f_s$  is the sampling rate at the chain's output.



**Figure 7.6:** Attenuation in passband, edge and stopband for  $R = 125$ .  $f_s$  is the sampling rate at the chain's output.



**Figure 7.7:** Attenuation in passband, edge and stopband for  $R = 625$ .  $f_s$  is the sampling rate at the chain's output.



**Figure 7.8:** Attenuation in passband, edge and stopband for  $R = 1250$ .  $f_s$  is the sampling rate at the chain's output.



**Figure 7.9:** Attenuation in passband, edge and stopband for  $R = 2500$ .  $f_s$  is the sampling rate at the chain's output.

## 7.5 Summary

In conclusion, the filter chains perform their task mostly according to specifications. The main point of concern is the stark passband droop in the chain for  $R = 5$  chain. More investigating is needed to determine the cause of this behavior. But overall, we consider it to be a success.

Table 7.2 lists the resulting mean values for comparison against each other.  $\text{SNR}_{\text{won}}$  is the theoretical SNR gain caused by the bits which are gained along the processing chain. It is not the actually achieved improvement in SNR as other factors play a role as well.  $\text{Correction}$  is the correction factor which needs to be applied in order to achieve the proper gain (0.707 V RMS for a 2 V<sub>PP</sub> signal, see Section 7.1).

**Table 7.2:** Mean metrics for all filter chains

| $R$  | $V_{\text{RMS}}$ (V) | $V_{\text{Mean}}$ (V) | $S_{\text{SNR}}$ (dB) | Corr.<br>(1) | BITS<br>wON (1) | SNR<br>wON (dB) |
|------|----------------------|-----------------------|-----------------------|--------------|-----------------|-----------------|
| 5    | 0.6203               | -0.1800               | 79.0054               | 1.1398       | 1.8113          | 10.9038         |
| 25   | 0.6329               | -0.1690               | 76.9049               | 1.1171       | 1.8403          | 11.0784         |
| 125  | 0.5643               | -0.0159               | 73.6582               | 1.2529       | 1.6748          | 10.0820         |
| 625  | 0.5529               | -0.0155               | 71.8813               | 1.2787       | 1.6453          | 9.9048          |
| 1250 | 0.4576               | -0.0130               | 69.7006               | 1.5450       | 1.3724          | 8.2617          |
| 2500 | 0.4092               | -0.0127               | 63.5121               | 1.7278       | 1.2111          | 7.2908          |



# CHAPTER 8

## Conclusions

The project has been a success. We can present six chains with sampling rates from 25 MHz to 50 kHz. While not perfect in every aspect, their passbands exhibit negligible droop, SNR has been measured up to 84 dB, and aliasing attenuation is a minimum of 60 dB.

A toolchain to design filters is available; it can quickly specify a number of filters to a wide range of specifications. The results then allow to converge on the desired implementation. This enables even users without extensive filter designing experience to devise filter chains. Implementing the filters is accomplished through a custom FPGA toolchain which allows for easy and reliable creation of a bitstream by way of Tcl scripts.

To read data from the FPGA and provide it to the user via Ethernet, a server application running on an embedded GNU/Linux has been developed, along with a toolchain for compilation of the operating system. Visualizing and analyzing the data is accomplished with a newly developed JavaScript application which can run in any modern browser, ensuring compatibility across many platforms. Data access through other programs like Matlab is also easily possible.

There were a few notable challenges during development: Soon after launch, it was discovered that the existing code base from the STEMlab was not at the time a viable route; this necessitated the expansion of the project's scope far beyond initial plans. Secondly, ensuring that the correct bits are propagated through the filter chains turned out to be a highly non-trivial task, requiring many hours of simulations and verification. Lastly, although the math behind calibrating the oscilloscope is relatively straightforward in theory, correctly applying it in practice revealed some unanticipated subtleties which need to be accounted for to obtain correct results.

If this project were to be continued, we would issue the following recommendations: Firstly, performing more and improved measurements in order to tune the device's performance, particularly for the sampling rate of 25 MHz. Secondly, about 15 % of the FPGA's resources are currently not being utilized. This was done to have some leeway for adding more features or in case of resource problems during development, but those resources could be exploited for adding functionality and/or increasing performance. And thirdly, the oscilloscope can be readily expanded with additional functionality with little effort.

Overall, we are very happy the results. We thank all our supporters for their help, both personal and technical, and to whomever may find our product useful, we say:

*May your harmonics be undistorted, your noise floor minute, and your aliasing effects attenuated.*

# Bibliography

- [1] E. Hogenauer, "An economical class of digital filters for decimation and interpolation," *IEEE Transactions on Acoustics, Speech, and Signal Processing*, vol. 29, no. 2, pp. 155–162, Apr 1981.
- [2] Altera Corporation. (2007, April) Understanding CIC Compensation Filters. [Online]. Available: [https://altera.com/content/dam/altera-www/global/en\\_US/pdfs/literature/an/an455.pdf](https://altera.com/content/dam/altera-www/global/en_US/pdfs/literature/an/an455.pdf)
- [3] R. E. Crochiere and L. R. Rabiner, *Multirate Digital Signal Processing*, 1st ed., ser. Prentice-Hall signal processing series, A. V. Oppenheim, Ed. Englewood Cliffs, New Jersey 07632: Prentice-Hall, 1983.
- [4] Linear Technology. (2017, Aug) LTC2145-14 - 14-Bit, 125MSps Low Power Dual ADCs. [Online]. Available: <http://www.linear.com/product/LTC2145-14>
- [5] Red Pitaya. (2017, Aug) Red Pitaya Ecosystem and Applications. [Online]. Available: <https://github.com/RedPitaya/RedPitaya>
- [6] ——. (2017, Aug) Welcome to the Red Pitaya Documentation. [Online]. Available: <http://redpitaya.readthedocs.io/en/latest/index.html>
- [7] Elector International Media BV. Stemlab 125-14 (Starter Kit). [Online]. Available: <https://www.elektor.com/stemlab-125-14-starter-kit>
- [8] M. Ossmann. (2014, Dec) Red Pitaya: Mehr als ein USB-Oszilloskop! [Online]. Available: <https://www.elektormagazine.de/articles/red-pitaya-mehr-als-ein-usb-oszilloskop>
- [9] R. Bucher and P. Kuery, "Front-End Signal-Processing For Red Pitaya Spectrum Analyzer," Jan 2017.
- [10] N. Huesser and I. Jeras. (2017, Mar) Which Vivado project base is to be used for custom filters on the FPGA? [Online]. Available: <https://github.com/RedPitaya/RedPitaya/issues/107>
- [11] P. Demin. (2017, Aug) Red Pitaya Notes. [Online]. Available: <http://pavel-demin.github.io/red-pitaya-notes/>
- [12] The MIT License. [Online]. Available: <https://opensource.org/licenses/MIT>
- [13] Xilinx Inc. (2016, Oct) CIC Compiler v4.0 – LogiCORE IP Product Guide. [Online]. Available: [https://www.xilinx.com/support/documentation/ip\\_documentation/cic\\_compiler/v4\\_0/pg140-cic-compiler.pdf](https://www.xilinx.com/support/documentation/ip_documentation/cic_compiler/v4_0/pg140-cic-compiler.pdf)

- [14] ——. (2015, Nov) FIR Compiler v7.2 – LogiCORE IP Product Guide. [Online]. Available: [https://www.xilinx.com/support/documentation/ip\\_documentation/fir\\_compiler/v7\\_2/pg149-fir-compiler.pdf](https://www.xilinx.com/support/documentation/ip_documentation/fir_compiler/v7_2/pg149-fir-compiler.pdf)
- [15] N. Huesser. (2017, Aug) ZYNQ Logger. [Online]. Available: [https://github.com/Yatekii/zynq\\_logger](https://github.com/Yatekii/zynq_logger)
- [16] N. Hüsser and R. Schnider, “Design and Implementation of a FPGA Based Fault Record and Data Logging System for PWM Rectifiers,” Dezember 2014.
- [17] Wikipedia. Idempotence. [Online]. Available: <https://en.wikipedia.org/wiki/Idempotence>
- [18] Xilinx Inc. (2016, Nov) Vivado Design Suite Tcl Command Reference Guide. [Online]. Available: [https://www.xilinx.com/support/documentation/sw\\_manuals/xilinx2016\\_4/ug835-vivado-tcl-commands.pdf](https://www.xilinx.com/support/documentation/sw_manuals/xilinx2016_4/ug835-vivado-tcl-commands.pdf)
- [19] ——. (2016, Apr) Vivado Design Suite User Guide – Using Tcl Scripting. [Online]. Available: [https://www.xilinx.com/support/documentation/sw\\_manuals/xilinx2016\\_1/ug894-vivado-tcl-scripting.pdf](https://www.xilinx.com/support/documentation/sw_manuals/xilinx2016_1/ug894-vivado-tcl-scripting.pdf)
- [20] (2017, Aug) Tcl Developer Xchange. [Online]. Available: <http://tcl.tk/>
- [21] R. Fonseca. Introduction to asynchronous programming. [Online]. Available: <http://cs.brown.edu/courses/cs168/s12/handouts/async.pdf>
- [22] R. Gut et al., “Spectrum Analyzer,” 2016.
- [23] J. W. Cooley and J. W. Tukey, “An algorithm for the machine calculation of complex fourier series,” *Mathematics of Computation*, vol. 19, pp. 297–301, Apr 1965. [Online]. Available: <http://www.ams.org/journals/mcom/1965-19-090/S0025-5718-1965-0178586-1/S0025-5718-1965-0178586-1.pdf>
- [24] K. Kwok. (2015, May) Cooley-tukey fft + dct + idct in under 1k of javascript. [Online]. Available: <https://antimatter15.com/2015/05/cooley-tukey-fft-dct-idct-in-under-1k-of-javascript/>
- [25] ——. (2015, May) Small Cooley-Tukey radix-2 DIT FFT in Javascript. [Online]. Available: <https://gist.github.com/antimatter15/0349ca7d479236fdcd8>
- [26] H. Schmid, “How to use the FFT and Matlab’s pwelch function for signal and noise simulations and measurements,” 8 2012. [Online]. Available: <http://www.schmid-werren.ch/hanspeter/publications/2012fftnoise.pdf>
- [27] Red Hat Inc. (2017, Aug) Ansible Documentation. [Online]. Available: <https://docs.ansible.com/>
- [28] R. Frey and N. Hüsser. (2017, Aug) Vivado Install Guide. [Online]. Available: <https://github.com/alpenwasser/pitaya/tree/master/doc/vivado-install>
- [29] Xilinx Inc. (2017, Aug) Support Portal – Documentation. [Online]. Available: <https://www.xilinx.com/support.html#documentation>
- [30] A. Hultman. (2017, Aug) uwebsockets. [Online]. Available: <https://github.com/uNetworking/uWebSockets>

- [31] N. Lohmann. (2017, Aug) JSON For Modern C++. [Online]. Available: <https://github.com/nlohmann/json>
- [32] I. F. . A. Melnikov. (2011, Dec) The WebSocket Protocol. [Online]. Available: <https://tools.ietf.org/html/rfc6455>
- [33] Wikipedia. OSI model. [Online]. Available: [https://en.wikipedia.org/wiki/OSI\\_model](https://en.wikipedia.org/wiki/OSI_model)
- [34] mozilla. WebSockets API. [Online]. Available: [https://developer.mozilla.org/en-US/docs/Web/API/WebSockets\\_API](https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_API)
- [35] [Online]. Available: <https://mithril.js.org/>
- [36] mozilla. CanvasRenderingContext2D. [Online]. Available: <https://developer.mozilla.org/en/docs/Web/API/CanvasRenderingContext2D>
- [37] M. Meyer, *Grundlagen der Informationstechnik*, 1st ed. Vieweg, 2002.



**Part II**

## **Developer Guide**



# 9

## CHAPTER

# Project Structure

This chapter contains answers the question of “What do I find where if I want to do  $\langle X \rangle$ ?”. It guides through the top-level directories of the repository and presents some information on the more important subdirectories. It is not meant as a replacement to reading the READMEs, the technical documentation for the respective components, or toolchain documentation by third parties. Rather, it is indented as a guide to get started and point the developer in the right direction.

The repository tree with its major components is shown on the next page in Figure 9.1, along with some explanatory remarks for respective nodes of the tree. Throughout this chapter, the root directory `/` is understood to be the top-level repository directory. *NOTE:* Only the more important parts of the project structure are mentioned here; less crucial components have been omitted.

The rest of the developer guide explains the setup and usage of the major components listed in the above directory trees. For the most up-to-date and complete documentation, consulting the READMEs and source code is advised.



**design/** contains the filter design toolchain **filter/** which is described in Chapter 11. Additionally, it has a subdirectory for running FPGA tests with the designed filters, used to create the filter resource usage analysis from Appendix B.2.

**doc/** is where the project documentation and associated tools are located. **report/** is where the directory containing this document. The **verification/** directory contains the scripts and results on which Chapter 7 is based. A pictorial guide to the Vivado install (see Section 10.2) can be found in **vivado-install/**.

**env/** has the components which are needed to set up the build box. See Section 10.1 for the instructions on how to accomplish this.

**firmware/** comprises all necessary components to build the Linux image for the STEMlab and the bitstream. **patches/** is a set of patches which are needed to get Ubuntu to run on the STEMlab, **scripts/** are some helper scripts to ease development and deployment, and **server/** contains the server and its needed dependencies.

**bin/** is a collection of helper scripts and binary blobs related to the creation of the board firmware.

**fpga/** consists of the FPGA toolchain, which is primarily the collection of Tcl scripts and Makefiles used to generate the Vivado projects, block designs, and the bitstream.

**scope/** encompasses the scope project (see Chapter 6). Note that this is located in a dedicated Git repository and is included via the mechanism offered by Git submodules.

**resources/** are documents and links which have accumulated through the course of the project and which have proven more or less useful. This primarily includes data sheets and user manuals from third parties.

**Figure 9.1:** Project structure with major directories and subdirectories



# 10

## CHAPTER

# FPGA/SoC Toolchain

Because the FPGA/SoC requires a lot of different utilities and environments, it is advisable to have a fixed development environment as hardware tooling often breaks at the slightest change. For this reason we provide a build box<sup>1</sup> which contains an Ubuntu Linux. To set up the build box, Vagrant and ansible are required. The former pulls the base Linux box from a global repository at HashiCorp<sup>2</sup>. ansible is then used to provision the box to install all the necessary tooling. Once that is set up, the user has to perform a graphical install of the Vivado toolchain.

This chapter describes how to set up the build box and Vivado, and how to generate a Linux image which can be flashed onto the STEMlab's SD card to get the device up and running.

*Note:* The following represents code which has to be entered on the command line:

```
enter commands here
```

## 10.1 Setting Up the Build Box

The following prerequisites need to be installed onto the host machine first:

- Vagrant
- ansible
- VirtualBox

If you are on a Linux or macOS and use a package manager to install VirtualBox. It is also necessary to install the VirtualBox Guest Additions packages. They are required in later stages during the setup of the box. It is also advisable to perform a

```
sudo vboxreload
```

Otherwise a reboot of the host system might be necessary.

After all the tools on the host have been installed, we move to the root of the project repository. For all the further steps we assume that we operate from that root directory.

<sup>1</sup>A virtual machine image for the purposes of development.

<sup>2</sup>The corporation behind Vagrant

To do an initial setup of the box, enter

```
cd env
vagrant up
```

This will boot the box and provision it using ansible. This step requires a stable internet connection. If anything fails during the initial setup (red messages in the shell), you can run and retry the provisioning with

```
vagrant provision
```

Once the provisioning has finished, the build box should be rebooted because some changes (for example the installed desktop environment) require a reboot. Do this by running

```
vagrant halt
vagrant up
```

If you want to make changes to the default box setup, have a look at the file `/env/roles/common/tasks/main.yml` and the official ansible documentation [27].

The fully configured build box should contain two shared folders: `/vagrant/` points to the `/env/` directory on the host, and `/repo/` points to `/` on the host.

**IMPORTANT:** The password of the default vagrant user is **vagrant** and has sudo privileges.

## 10.2 Setting up Vivado

After having completed the basic build box setup as outlined above, Vivado needs to be installed. For this Section, we assume that all commands are executed on the VirtualBox (*guest* for the rest of this manual). Enter

```
/repo/Xilinx/Downloads/Xilinx_Vivado_SDK_2016.2_0605_1_Lin64.bin
```

into a shell to start the Vivado install. The graphical installer will guide through the process. A complete pictorial guide for this is beyond the scope of this document, but an illustrated guide with screenshots for all steps is available under [28]. For documentation on Vivado itself, it is recommended to have a look at the documentation portal from Xilinx [29]; it contains extensive documentation on many topics.

## 10.3 Building a Linux

With the fully set up build box it is possible to build an image with only two commands. While it is possible to only perform the build steps once changes have been made to the FPGA project or the server application, it is advisable to perform an initial build. This allows to check if everything is set up correctly and works as intended.

First we copy the repository to a **non-shared** folder on the guest machine, because u-boot requires `mmap()`, which cannot handle shared folders. After that, building the image is a

single-step process. This includes building the Linux, the bitstream, the required first-stage bootloader and the board support package, the logger kernel module, the server application, and the scope application. After that, a bash script is used to create an image with all the components, mount it and provision the Ubuntu environment which is to be run on the STEMlab. To start this process, enter

```
cp -r /repo ~/local_folder  
cd ~/local_folder  
make init
```

After the build process has finished, the image can be created using

```
cd ~/local_folder/firmware/arm  
sudo sh scripts/image.sh scripts/ubuntu.sh red-pitaya-ubuntu.img 1024
```

These scripts were initially created by the Red Pitaya corporation, altered by Pavel Demin [11], and have again been adapted to the requirements of this project.

# Filter Toolchain

Designing the filters according to the desired specifications (see Chapter 3) is performed through a set of Matlab scripts. This chapter describes the overall structure of the script suite and gives a basic usage example. All functions have a help available which describes their usage, particularly their respective interfaces.

## 11.1 Toolchain Structure

This section explains which files constitute the toolchain and what their purpose is. The entire toolchain can be used either from Matlab's graphical front-end, or from its command-line mode. The following files are present in the filter design directory<sup>1</sup>:

```
design/filter/
├── cliDispatcher.m
├── guiWrapper.m
└── generators/
    ├── decCIC.m
    ├── decFIR.m
    ├── halfbandFIR.m
    ├── compCIC.m
    ├── cascador.m
    ├── parcascador.m
    ├── pardecFIR.m
    └── parhalfbandFIR.m
├── plotData
├── coefData
├── Makefile
└── README.md
```

**cliDispatcher.m** is the main script from where the design functions are initiated. To design a new filter chain, this is where its specifications need to be placed, and from where the filter design scripts are then called with those parameters.

---

<sup>1</sup>relative to global project root

**guiWrapper.m** is a convenience layer which makes it easier to work with **cliDispatcher.m** from Matlab's graphical front-end. It sets some of the configuration parameters for calling **cliDispatcher.m** which are otherwise set in Matlab's command-line interface when calling the dispatcher from there.

**generators/** is the directory where the filter design scripts are located. These are split into two primary groups: Functions which design filters, and functions which combine them into cascades. Note that all generators support generating multiple filters at once for a combination of various parameters and will pass back a cell array with the resulting filters and the parameters used in their specification.

This allows to iteratively generate a large set of filters in an initial step to assess resource usage or other characteristics for a wide range of parameters. The parameters for FIR filters are the ones laid out in Section 1.2.2. For half-band and CIC filters, they slightly differ. The filter design functions are:

- decCIC.m** designs a CIC filter.
- decFIR.m** designs a FIR filter.
- halfbandFIR.m** designs a half-band filter.
- compCIC.m** designs a compensator for a CIC filter.

Note that all of these have a version which iterate in parallel over a given set of specifications, prefixed with `par` (e.g. `pardecFIR`). The interface for all parallel versions is identical. They can be used if Matlab's parallel processing toolbox is available. If only a small set of filters is to be designed, it is recommended to use the regular, serial versions, since starting up a parallel processing pool in Matlab is a slow process and its overhead is usually not worth it in those cases. If Matlab is being run in command-line mode, make sure *not* to start it with the `-nojvm` switch, since the parallel processing pool requires the Java Virtual Machine.

The functions `cascador.m` and `parcascador.m` create cascades of filter cell arrays passed to them. Note that in order to achieve the desired iteration result, some manual intervention in re-structuring the filter objects passed to the cascade functions might on occasion be needed, particularly in the case of cascading other filters with CIC filters. This is because the set of parameters used to design CIC filters is different from the set of coefficients used to design FIR filters, which means the objects passed back from the CIC and FIR filter functions might not always match as needed in their structure to cascade them in all possible manners. This is not an issue when only cascading single CIC filters with other filters.

**coefData** and **plotData** are two directories which are created by the functions (if they do not yet exist) for storing filter property data. **coefData** contains the filter coefficient files for each filter in the Vivado-compliant `.coe` format. This allows a direct import of Matlab's results into Vivado's FIR compiler.

**plotData** contains frequency responses for each filter, as well as any potential cascade, in `.csv` format, as given by Matlab's `freqz()` function. This is used to generate the frequency response plots from this report, for example.

**The Makefile** can be used to call **cliDispatcher.m** from the command line. It has various targets for designing different filters or filter chains. Matlab will be called in command-line mode, and its output redirected to a log file, so not output will generally be visible on screen. This is primarily intended to be used when all filters have been specified and fixed, and the needed files are to be generated for use on the FPGA side.

The `README.md` contains additional information which is beyond the scope of this documentation. In general, it is highly recommended to consult both the `README` and the help of the provided functions in case of questions, as well as the code itself, which is extensively commented (particularly `cliDispatcher.m`).

## 11.2 Usage

Now that we know the location and purpose of the main components, it is time for a basic usage example. For this, it is useful to know how to use the basic FIR filter design function (Listing 11.4), the CIC design function (Listing 11.6), and how to cascade filters (Listing 11.7). All of this code is to be put into the `cliDispatcher.m` file, and a case created for it. This is shown in Listing 11.1.

**Listing 11.1:** Using `cliDispatcher.m`

```

1 % cliDispatcher.m
2 genDir = 'generators';
3 coefDir = 'coefData';
4 plotDir = 'plotData';
5 addpath(genDir);
6 switch filtertype
7     % ... OTHER CASES ...
8     case 'EXAMPLE_FILTER_AND_OR_FILTER_CHAIN'
9         clear all;close all;
10        filtertype='EXAMPLE_FILTER_AND_OR_FILTER_CHAIN';
11        genDir = 'generators';
12        coefDir = 'coefData';
13        plotDir = 'plotData';
14
15        % ... filter design specifications and function calls ...
16 end

```

To execute `cliDispatcher.m`, Matlab's command-line interface can be used:

**Listing 11.2:** Calling `cliDispatcher.m` from Matlab's Command-line Interface

```

>> filtertype='EXAMPLE_FILTER_AND_OR_FILTER_CHAIN';
>> cliDispatcher

```

Alternatively, one may create an entry in `guiWrapper.m` when using Matlab's graphical interface for added convenience. An entry can be created as shown in Listing 11.3.

**Listing 11.3:** Creating an Entry for `cliDispatcher.m` in `guiWrapper.m`

```
%> EXECUTE EXAMPLE_FILTER_AND_OR_FILTER_CHAIN
clear all;close all;clc;
filtertype = 'EXAMPLE_FILTER_AND_OR_FILTER_CHAIN';
disp('Designing Chain for R = EXAMPLE')
run cliDispatcher;
```

Besides knowing how to initiate the filter design toolbox, one must obviously also know how to actually design filters. The code in Listing 11.4 designs *two* FIR filters, with two different stopband edge frequencies ( $F_{st}$ ). The resulting cell array  $Hd$  from Listing 11.5 has in its first column the designed filter system objects, and in the remaining columns the design parameters used to specify the filter.

**Listing 11.4:** Designing two FIR Filtes

```
1 R1      = 5;
2 Fp      = 0.2;
3 Fst     = [0.21 0.225];
4 Ast     = 60;
5 Ap      = 0.25;
6 HdFIR  = decFIR(R1, Fp, Fst, Ap, Ast, coefDir, plotDir);
```

**Listing 11.5:** Cell Array with Two FIR Filters

```
>> Hd

Hd =
2×6 cell array

[1×1 dsp.FIRDecimator]    [0.2000]    [0.2500]    [0.2100]    [60]    [5]
[1×1 dsp.FIRDecimator]    [0.2000]    [0.2500]    [0.2250]    [60]    [5]
```

Designing a CIC filter uses slightly different parameters, but is otherwise similar. One point of note is that the compensator's passband edge is specified relative to the CIC filter's incoming sampling rate, not the sampling rate at which the compensator runs, as is common for FIR filters otherwise. Hence  $FpComp = 1/R2$  on line 8 of Listing 11.6.

The  $HdComp$  cell array will look similar to the cell array from the FIR design function above. However, it will have both the cascade of the CIC filter and its compensator, as well as each filter by itself, in a separate column entry.

**Listing 11.6:** Designing a CIC Filter and Its Compensator

```

1 R2      = 32;
2 RCIC    = 8;
3 AstCIC  = 60;
4 FpCIC   = 1/R2;
5 DL      = 1;
6 HdCIC   = decCIC(RCIC, FpCIC, AstCIC, DL, plotDir);
7 RComp   = 4;
8 FpComp  = 1/R2;
9 FstComp = 1/R2 * 1.1;
10 ApComp  = 0.25;
11 AstComp = 60;
12 HdComp  = compCIC(RComp, FpComp, FstComp, ApComp, AstComp, DL, ...
13                               HdCIC, coefDir, plotDir);

```

Cascading two filters with the `cascador` or `parcascador` functions is shown in Listing 11.7. It accepts filter cell arrays as returned by the filter design functions (so, `Hd` from Listing 11.4, for example). However, before being given to the `cascador` function, the filters which are to be cascaded must be packaged into a single cell array, called `stages` in Listing 11.7. `Hd1` and `Hd2` are presumed to be of the form of `Hd` from above.

No special care needs to be taken when handling `HdComp` cell arrays to `cascador`<sup>2</sup>, despite its first column being a cascade filter object instead of a single filter. `cascador` will notice the difference and unpack the cascade<sup>3</sup>. The other parameters handed to `cascador`, such as `R`, `Fst` etc. should correspond to the overall properties of the cascade, and not the individual stages. Since it is not possible to automatically determine some of these properties before actually cascading the filter, the user must manually set these to the correct values. Particularly when cascading cell arrays with multiple filters, some care must be taken in order to ensure that all desired permutations are produced. This is because the `cascador` iterates over these parameters when cascading the filters. They are also used to name the resulting plot files for the cascade.

**Listing 11.7:** Cascading Two Filter Cell Arrays

```

1 % ... design filter objects first ...
2 stages = cell(2,1);
3 stages{1} = Hd1;
4 stages{2} = Hd2;
5 Hcasc = cascador(R, Fp, Fst, Ap, Ast, 1, plotDir, stages);

```

While the above remarks cover the most essential information, they by no means constitute a comprehensive guide. When designing filters, it is therefore highly recommended to consult the READMEs, the help of the functions, the code and its comments, as well as the official Matlab documentation.

<sup>2</sup>There is a limitation to the numbers of filters Matlab can chain in a single cascade. This limitation might be relevant when designing long filter chains. However, multiple cascades can themselves be cascaded, so this issue can be worked around if necessary.

<sup>3</sup>Because cascade objects can themselves be cascaded, this is not strictly necessary. But for the sake of easier understanding and elegance, we unpack cascades when possible.

## CHAPTER

# 12

## Server

The server application runs on the ARM Linux and is responsible for moving data from the FPGA to the network, and vice versa. This chapter explains how it is to be compiled and how to expand its functionality, if needed.

### 12.1 Building the Server

The server can only be built using the build environment as explained in Chapter 10. A simple

```
cd /repo/firmware/arm/server  
make
```

should suffice to build all the external dependencies and the binary for the ARM core. The external libraries can be rebuilt using

```
cd /repo/firmware/arm/server  
make external
```

The server application can be rebuilt after changes have been made by issuing

```
cd /repo/firmware/arm/server  
make arm
```

The server application is a one-file application and depends on libuWebSockets [30] and a header file called json.hpp from Niels Lohmann's *JSON for Modern C++* project [31], which contains the entire JSON library. There are two important functions for extending the server application: `onHttpRequest` and `onMessage`, which are explained in Section 12.2.1 and 12.2.2, respectively.

### 12.2 Manually Starting the Server

The image for the entire system is built such that it contains a Sys-V init script and the necessary entries to start on boot. The Sys-V init script then in turn calls the server startup script.

This script loads the kernel module and sets the `LD_LIBRARY_PATH` variable such that it can load the shared libraries required. The script can be executed with

```
/opt/server/init_server.sh
```

If the Sys-V init daemon already started an instance, it can be stopped using

```
service server stop
```

For the exact commands, consult the code of the `init_server.sh` script.

### 12.2.1 onHttpRequest

This callback is executed when the user makes an HTTP request which is not an UPGRADE request. Currently it simply serves the files from the file system, but could easily be extended to outline data or similar tasks. Returning a correct HTTP response has to be performed manually and is not well documented in the library itself.

If the response should outline a 200 OK status,

```
res->end(const char*, size_t);
```

can be used with a string and a size. The library then detects that no header is attached to the response and attaches a proper 200 OK header.

If a custom header should be attached, it first has to be written into the answer, after which the actual content can be written into it as well:

```
std::string mime;
char header[128];

mime = std::string("text/css");
std::string content = std::string("Hello World");
int header_length = std::sprintf(
    header,
    "HTTP/1.1 200 OK\r\nContent-Length: %u\r\nContent-Type: %s\r\n\r\n",
    str.size(),
    mime.c_str()
);
res->write(header, header_length);
res->end(str.c_str(), str.size());
```

### 12.2.2 onMessage

This callback is triggered when a new message is received through the WebSocket. For now, this call only handles incoming text messages in our project as those contain the instructions. Binary messages are simply discarded but could be used at a later time to interface with the DAC.

## 12.3 Instruction Set

The server's instruction set contains eight commands which are listed in Listing 12.1 through 12.8. The following sections explain their basic purpose and functionality.

### 12.3.1 Forcing a New Trigger Event

This command forces the logger to finish its current frame. It still respects the pre and suf conditions which have been set. The server does not automatically send the recorded frame; it has to be requested separately. The argument forceTrigger always has to be set to true. Example code for this is shown in Listing 12.1.

**Listing 12.1:** Forcing a New Trigger Event

```
1 // Request
2 {
3     "forceTrigger": "true"
4 }
5 // Response
6 {
7     "response", {
8         {"request", "forceTrigger"}, 
9         {"status", status}, // "error" or "ok"
10        {"error", errorMessage}
11    }
12 }
```

### 12.3.2 Configuring the Frame Sent by the Server

This command tells the server how big the frame should be and how many samples have to be recorded before and after the trigger. All arguments have to be of the numerical type in the JSON format, not strings. Listing 12.2 contains the corresponding example.

**Listing 12.2:** Configuring the Frame Sent by the Server

```
1 // Request
2
3 {
4     "frameConfiguration": {
5         "frameSize": frameSize,
6         "pre": minSamplesBeforeTrigger,
7         "suf": minSamplesAfterTrigger
8     }
9 }
10
11 // Response
12
13 {
14     "response", {
15         {"request", "frameConfiguration"}, 
16         {"status", status}, // "error" or "ok"
17         {"error", errorMessage}
18     }
19 }
```

### 12.3.3 Setting the Number of Logged Channels

This command tells the server how many channels are being logged. This should always be two as the STEMlab does not support more channels. The logger itself would support up to eight channels. The argument has to be a number in the JSON format, not strings.

**Listing 12.3:** Setting the Number of Logged Channels

```

1 // Request
2 {
3     "setNumberOfChannels": 2
4 }
5 // Response
6 {
7     "response", {
8         {"request", "setNumberOfChannels"}, 
9         {"status", status}, // "error" or "ok"
10        {"error", errorMessage}
11    }
12 }
```

### 12.3.4 Reading the Currently Stored Frame

This command forces the server to send the currently stored frame over the binary channel as soon as the current frame is finished. If the logger is not currently recording the frame will be sent immediately. The channel argument has to be a number in the JSON format, not strings.

**Listing 12.4:** Reading the Currently Stored Frame

```

1 // Request
2 {
3     "readFrame": true,
4     "channel": channelID
5 }
6 // Response
7 {
8     "response", {
9         {"request", "readFrame"}, 
10        {"status", status}, // "error" or "ok"
11        {"error", errorMessage}
12    }
13 }
```

### 12.3.5 Requesting a New Frame and Reading It When It Is Ready

This command forces the server to start a new frame and send it over the binary channel as soon as the frame is finished. The channel argument has to be a number in the JSON format, not strings.

**Listing 12.5:** Requesting a New Frame and Reading It When It Is Ready

```

1 //Request
2
3 {
4     "requestFrame": true,
5     "channel": channelID
6 }
7
8 // Response
9
10 {
11     "response", {
12         {"request", "requestFrame"},
13         {"status", status}, // "error" or "ok"
14         {"error", errorMessage}
15     }
16 }
```

### 12.3.6 Setting the Sampling Rate

This command sets the sampling rate. The argument has to be a number in the JSON format and has to be the sampling rate in Hz

**Listing 12.6:** Setting the Sampling Rate

```

1 // Request
2
3 {
4     "samplingRate": samplingRateInHz
5 }
6
7 // Response
8
9 {
10     "response", {
11         {"request", "samplingRate"},
12         {"status", status}, // "error" or "ok"
13         {"error", errorMessage}
14     }
15 }
```

### 12.3.7 Polling the Status of the Logger

This command requests the current logger status. The response contains all the information the logger currently holds.

**Listing 12.7:** Polling the Status of the Logger

```
1 {
2     "status": "true"
3 }
4
5 // Response
6
7 {
8     "response": {
9         "request": "forceTrigger",
10        "status": status, // "error" or "ok"
11        "error": errorMessage,
12        "data":{
13            {
14                {"memorySize", memorySizeInBytes},
15                {"baseAddress", physicalBaseAddress},
16                {"currentAddress", currentPhysicalAddress},
17                {"pre", pre},
18                {"suf", suf},
19                {"numberOfChannels", numberOfChannels},
20                {"started", hasLoggerStarted},
21                {"IRQack", wasIRQAckedActiveLow},
22                {"errorCode", errorCode},
23                {"faultyAddress", faultyPhysicalAddress},
24                {"testMode", isTestModeActive},
25                {"numberOfSamples", numberOfRecordedSamples},
26                {"numberOfSamplesTimes", numberOfRecordedSamplesTimesFull},
27                {"decimationRate", decimationRate},
28            }
29        }
30    }
31 }
```

### 12.3.8 Configuring the Trigger

This command configures the currently active trigger. For now, only rising edge triggers are supported on the server side. The logger itself supports more trigger types. For more information on the logger's capabilities, it is recommended to peruse its code at [15].

channel is the channel on which the trigger should be active. The level at which the trigger shot fire is set with level, while slope is the minimum slope the curve needs to have to trigger. The hysteresis is configured for all triggers at once and makes sure there is no accidental trigger. It is the variance the signal can have until a trigger is armed.

**Listing 12.8:** Configuring the Trigger

```
1 // Request
2
3 {
4     "triggerOn": {
5         "type": "risingEdge",
6         "channel": channel,
7         "level": levelConvertedToUnsigned,
8         "slope": minimalSlope,
9         "hysteresis": hysteresis
10    }
11 }
12
13 // Response
14
15 {
16     "response", {
17         {"request", "triggerOn"},
18         {"status", status}, // "error" or "ok"
19         {"error", errorMessage}
20    }
21 }
```

# 13

## CHAPTER

## Scope

To set up the scope project, the build box is not needed. Still, the build box does have the necessary tools installed, should the need arise for some reason. For development, a local install is advised, however.

This requires installing yarn and nodejs version 8.0 or newer. Package managers for Unix systems usually already have packages for these tools in their repositories. For Windows, the official installers can be used. Once yarn is set up, the project dependencies can be installed using

```
cd ~/repo/scope/  
yarn install
```

The yarn project comes with two main build configurations; one for deployment and one for running a debugging webserver. Building is done with

```
cd ~/repo/scope/  
yarn build
```

This leaves all the built files in `/repo/scope/build`. They can then be copied to any directory served by a webserver, to be provided to a client by said server. Running the debugging webserver brings the benefit of automatic rebuilding and reloading in the browser after rebuilding. The webserver can be started in debugging mode with

```
cd ~/repo/scope/  
yarn watch
```

The webpage can now be reached on `http://localhost:8080` and will be automatically reloaded when yarn detects a change and rebuilds the bundle.



**Part III**

**User Guide**

## Setup

Initializing the system and using the oscilloscope is mostly straightforward, even for beginners. A few notes on the initial setup are outlined here to get the reader started. The next chapter then presents the general concept of the user interface.

Getting the system up and running requires two basic steps: Initializing the STEMlab itself, and connecting to it from a web browser to run the oscilloscope. Assuming the STEMlab board at hand has been delivered with an SD card containing the correct Linux image from this project, the procedure for this is as follows:

1. Insert the SD card coming with the board (*Note:* Check that the SD card has the right side up to make pin contact).
2. Connect the board physically to the network.
3. Connect the board to the power supply.
4. Call the board's IP address and the right port (eg. <https://10.84.130.54:50090>) in a browser of your preference (for best results use Chrome 61.0 and above). Figure 14.1 shows an example how to do this.
5. A notice that a popup menu has been blocked should appear (Figure 14.2). Select *Always allow popups from this application* or similar and reload the previously called page. Figure 14.3 illustrates the necessary setting.
6. A popup tab should now contain the scope and automatically connect to the STEMlab's webserver. Figure 14.4 shows a running scope.

If it is unknown whether the SD card was delivered with a prebuilt image, it can be assumed that it was and the board can be powered on. If the orange LED farthest away from the ports (RJ45, SD card, ...) flashes at 2 Hz, the SD card is fine. If the SD card does not contain a prebuilt image, Section 10.3 in the Developer's Guide explains how one can be acquired or built.



**Figure 14.1:** Using a browser and the correct URL to run the scope application on the STEMlab



**Figure 14.2:** The browser warns about a popup the site has tried to open.



Figure 14.3: Let the browser accept popups in the future



Figure 14.4: The running scope after everything has been set up properly

## Operation

Once the oscilloscope is running, interaction is done via its graphical user interface. Its most important features are summarized here. It has been designed with high discoverability in mind, so a complete guide on every single last button and menu is considered beyond the scope of this document.

Figure 15.1 shows a screenshot of the interface. The UI's most important functions are:

**Zooming and Panning:** To zoom, use the vertical and horizontal scrolling function of the mouse or trackpad. To pan, click and drag the signal. If you are dragging a time trace it will automatically move the trigger location. The two arrow buttons in the general preference pane resize the signal to the visible area.

**Triggering:** To set the trigger level, move the drawn trigger level by clicking and dragging it or enter the level in numbers on the preference pane of the triggering time trace.

**Markers:** There are four markers by default. Two mark the start and end of the area over which power is integrated and two mark the area that contains the signal for the automatic SNR calculation.

Markers can be dragged and moved by clicking and dragging. The number at the bottom of each marker shows the frequency it is at.

The plus button in the general pref pane adds a new marker which can be used to mark certain frequencies.

To move the SNR markers, the SNR mode has to be set from *auto* to *manual* mode in the corresponding FFT trace preference pane.

**Sampling Rate:** To configure the sampling rate, use the drop-down menu at the top of the general preference pane.

**Line Width:** To improve readability, some people might like a thicker line width. To set the line width, use the input at the top of the general pref pane.



Figure 15.1: The scope application in its current state, displaying time and FFT data

## **Appendices**

# Theoretical Background

This chapter contains some supplementary information on the theoretical background behind this project. For those who wish to better understand the internal workings of a CIC filter, the first section details exactly what happens inside a CIC filter for a very simple example. Simulink files are also provided for further study. The second section contains tabulated values for CIC filter design taken from [1]

## A.1 Internal Behavior of a CIC Filter

This section presents an example for a very simple CIC filter to better understand its internal workings. For verification, the filter is also implemented in a Simulink model and simulated.

In the interest of simplicity, we choose a filter with a decimation rate  $R = 2$ , a differential delay  $M = 1$  and  $N = 1$  stages. The corresponding topology is shown in Figure A.1; Figure A.2 shows the magnitude frequency response of the filter. As can be clearly seen, this filter would be of very limited use in practice. However, for the purposes of this example, we feed a DC signal (a constant) into the filter, so the only thing of importance is the filter's DC gain (which is 6 dB, or 2).

Lastly, we restrict numerical accuracy to three bits in two's complement; the entire range of representable values can be found in Figure A.3. This limits the number of steps which need to be calculated to gain the desired insight into the filter's mathematical mechanics.



**Figure A.1:** Topology of the CIC filter for this example



**Figure A.2:** Frequency response of a CIC filter with  $N = 1$ ,  $M = 1$ ,  $R = 2$ . Note the DC gain of 2.

The state of the filter can be calculated by the formulas given in Equations A.1 and A.2:

$$N = 1 \quad M = 1 \quad R = 2$$

$$OUT_{INT}[n] = IN_{COMB}[n] = IN[n] + OUT[n-1] \quad (\text{A.1})$$

$$\begin{aligned} OUT_{COMB}[n] &= IN_{COMB}[n] - IN_{COMB}[n-R \cdot M] \\ &= OUT_{INT}[n] - OUT_{INT}[n-2] \end{aligned} \quad (\text{A.2})$$

The input of the filter shall be a constant of 1, starting at time zero. Once this input is applied to the system, the integrator stage will begin to accumulate the constant. Given an unlimited number of digits (bits), the integrator would in theory reach infinity if it kept running forever. In practice, however, it wraps around once it has reached its maximum representable value ( $011 = 3$ ) and begins counting from its lower numerical limit ( $100 = -4$ ) again. This cycle keeps repeating as long as the filter is running.

The ingenuity of the CIC filter lies in exploiting the fact that this wraparound is irrelevant to the comb stage. Whether the comb stage calculates the difference between an integrator's value whose precision is unbounded or whether it calculates the difference between two values which have potentially been wrapped is without consequence.

As a demonstration of this effect, we shall examine the computation step of cycle  $n = 4$  from Table A.1 (which contains the entire filter's state for 15 steps). At this point, the state of the filter's output is as follows (represented in three-bit two's complement and decimal):

$$\begin{aligned} OUT_{COMB}[4]_b &= OUT_{INT}[4] - OUT_{INT}[2] \\ &= 101_b - 011_b \\ &= 010_b \end{aligned} \quad (\text{A.3})$$

$$\begin{aligned} OUT_{COMB}[4]_d &= -3_d - 3_d \\ &= -6_d \end{aligned} \quad (\text{A.4})$$

Obviously, the decimal and binary results do not match. This is where the wraparound comes into play, for which we shall look at Figure A.3. The figure presents a circular arrangement for all numbers in two's complement with three digits precision. In that arrangement, addition of a positive number corresponds to moving clockwise through the circle, while subtraction of a positive number means moving counterclockwise. Doing this



**Figure A.3:** Subtracting 3 from  $-3$  in two's complement with three digits precision, represented on a circle. Addition of a positive number corresponds to moving clockwise, subtraction of a positive number corresponds to moving counterclockwise.

for the calculation of Equation A.4, we find that moving by three in the counterclockwise direction lands us at 2, exactly as Equation A.3 demands. The computation has *wrapped around* its boundary ( $-4$ ).

What is left to verify is that the result of Equation A.3 is indeed the correct result, i.e. if an unbounded integrator and comb would have yielded the same outcome. And indeed, they would have:

$$OUT_{COMB}[6] = 7 - 5 = 2 \quad (\text{A.5})$$

As a last step to confirm our results, the CIC filter of this exercise is simulated with Simulink. Its block design is given in Figure A.4. All blocks are set to two's complement with three digits of precision (no fractional bits).

The simulation results are given in Figure A.5. As can be seen, the filter states are identical to our manually calculated example. The effect of the integrator's output wrapping around the numerical boundaries is also nicely visible. In case the reader wishes to tinker around with this themselves, the Simulink files can be found in the subdirectory doc/report/images/cicSimu, relative to the repository root. Note that the fixed-point toolbox is required for this.

Lastly, it is shown what happens when the expected output of the filter exceeds the numerical range available. This is accomplished by feeding a constant of 2 into the filter; the expected output is therefore 4.

Because 4 is not within the range of a three-digit two's complement number system, the filter wraps around and produces an incorrect output,  $-4$ . The first few calculation steps for this are presented in Table A.2; running the above simulation with the modified input also confirms this result.

**Table A.1:** Binary and decimal values for the different filter elements during various stages of the filtering process. As expected due to the filter's DC gain of 2, its output is 2. The two right columns contain the calculations as they would occur if the filter's components had unbounded precision. It can be seen that the wraparound effect of the two's complement representation does indeed not change the filter's output.

| Cycle | IN  | OUT <sub>INT</sub><br>IN <sub>COMB</sub> | OUT | OUT <sub>INT</sub> | OUT<br>(bounded<br>integrator) | OUT<br>(unbounded<br>integrator) |
|-------|-----|------------------------------------------|-----|--------------------|--------------------------------|----------------------------------|
|       |     |                                          |     | (unbounded)        |                                |                                  |
| -2    | 000 | 000                                      | 000 | 0                  |                                |                                  |
| -1    | 000 | 000                                      | 000 | 0                  |                                |                                  |
| 0     | 001 | 001                                      | 001 | 1                  | 1 - 0 = 1                      | 1 - 0 = 1                        |
| 1     | 001 | 010                                      |     | 2                  |                                |                                  |
| 2     | 001 | 011                                      | 010 | 3                  | 3 - 1 = 2                      | 3 - 1 = 2                        |
| 3     | 001 | 100                                      |     | 4                  |                                |                                  |
| 4     | 001 | 101                                      | 010 | 5                  | -3 - 3 = -6 = 2 <sub>wr</sub>  | 5 - 3 = 2                        |
| 5     | 001 | 110                                      |     | 6                  |                                |                                  |
| 6     | 001 | 111                                      | 010 | 7                  | -1 - (-3) = 2                  | 7 - 5 = 2                        |
| 7     | 001 | 000                                      |     | 8                  |                                |                                  |
| 8     | 001 | 001                                      | 010 | 9                  | 1 - (-1) = 2                   | 9 - 7 = 2                        |
| 9     | 001 | 010                                      |     | 10                 |                                |                                  |
| 10    | 001 | 011                                      | 010 | 11                 | 3 - 1 = 2                      | 11 - 9 = 2                       |
| 11    | 001 | 100                                      |     | 12                 |                                |                                  |
| 12    | 001 | 101                                      | 010 | 13                 | -3 - 3 = -6 = 2 <sub>wr</sub>  | 13 - 11 = 2                      |
| 13    | 001 | 110                                      |     | 14                 |                                |                                  |
| 14    | 001 | 111                                      | 010 | 15                 | -1 - (-3) = 2                  | 15 - 13 = 2                      |



**Figure A.4:** Simulink model for the filter in Figure A.1. The Rate Compressor is a simple Matlab function which returns every second value of its input vector.



**Figure A.5:** Simulation results from Simulink for the filter in Figure A.4

**Table A.2:** The same CIC filter as before stimulated with an input of 2. A comparison between the right two columns shows that the output of the bounded filter and its unbounded counterpart no longer match starting with the output at  $n = 2$ ; the filter produces a false output.

| Cycle | IN  | OUT <sub>INT</sub> | OUT | OUT <sub>INT</sub><br>(unbounded) | OUT<br>(bounded<br>integrator) | OUT<br>(unbounded<br>integrator) |
|-------|-----|--------------------|-----|-----------------------------------|--------------------------------|----------------------------------|
|       |     | IN <sub>COMB</sub> |     |                                   |                                |                                  |
| -2    | 000 | 000                | 000 | 0                                 |                                |                                  |
| -1    | 000 | 000                | 000 | 0                                 |                                |                                  |
| 0     | 010 | 010                | 010 | 2                                 | 2 - 0 = 2                      | 2 - 0 = 2                        |
| 1     | 010 | 100                |     | 4                                 |                                |                                  |
| 2     | 010 | 110                | 100 | 6                                 | -2 - 2 = -4                    | 6 - 2 = 4                        |
| 3     | 010 | 000                |     | 8                                 |                                |                                  |
| 4     | 010 | 010                | 100 | 10                                | 2 - (-2) = -4                  | 10 - 6 = 4                       |

## A.2 CIC Filter Tables

**Table A.3:** Passband attenuation for CIC filters as a function of the bandwidth-differential delay product. Taken from [1].

| Relative Bandwidth-Differential Delay Product ( $Mf_c$ ) | Passband attenuation at $f_c$ in dB<br>as a Function of Number of Stages ( $N$ ) |      |      |      |      |      |
|----------------------------------------------------------|----------------------------------------------------------------------------------|------|------|------|------|------|
|                                                          | 1                                                                                | 2    | 3    | 4    | 5    | 6    |
| 1/128                                                    | 0.00                                                                             | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 |
| 1/64                                                     | 0.00                                                                             | 0.01 | 0.01 | 0.01 | 0.02 | 0.02 |
| 1/32                                                     | 0.01                                                                             | 0.03 | 0.04 | 0.06 | 0.07 | 0.08 |
| 1/16                                                     | 0.06                                                                             | 0.11 | 0.17 | 0.22 | 0.28 | 0.34 |
| 1/8                                                      | 0.22                                                                             | 0.45 | 0.67 | 0.90 | 1.12 | 1.35 |
| 1/4                                                      | 0.91                                                                             | 1.82 | 2.74 | 3.65 | 4.56 | 5.47 |

**Table A.4:** Passband aliasing attenuation for CIC filters as a function of the bandwidth and the differential delay. Taken from [1].

| Differential Delay ( $M$ ) | Relative Bandwidth ( $f_c$ ) | Aliasing/Imaging Attenuation at $f_{s,low}$ in dB<br>as a Function of Number of Stages ( $N$ ) |      |       |       |       |       |
|----------------------------|------------------------------|------------------------------------------------------------------------------------------------|------|-------|-------|-------|-------|
|                            |                              | 1                                                                                              | 2    | 3     | 4     | 5     | 6     |
| 1                          | 1/128                        | 42.1                                                                                           | 84.2 | 126.2 | 168.3 | 210.4 | 252.5 |
| 1                          | 1/64                         | 36.0                                                                                           | 72.0 | 108.0 | 144.0 | 180.0 | 215.9 |
| 1                          | 1/32                         | 29.8                                                                                           | 59.7 | 89.5  | 119.4 | 149.2 | 179.0 |
| 1                          | 1/16                         | 23.6                                                                                           | 47.2 | 70.7  | 94.3  | 117.9 | 141.5 |
| 1                          | 1/8                          | 17.1                                                                                           | 34.3 | 51.4  | 68.5  | 85.6  | 102.8 |
| 1                          | 1/4                          | 10.5                                                                                           | 20.9 | 31.4  | 41.8  | 52.3  | 62.7  |
| 2                          | 1/256                        | 48.1                                                                                           | 96.3 | 144.4 | 192.5 | 240.7 | 288.8 |
| 2                          | 1/128                        | 42.1                                                                                           | 84.2 | 126.2 | 168.3 | 210.4 | 252.5 |
| 2                          | 1/64                         | 36.0                                                                                           | 72.0 | 108.0 | 144.0 | 180.0 | 216.0 |
| 2                          | 1/32                         | 29.9                                                                                           | 59.8 | 89.6  | 119.5 | 149.4 | 179.3 |
| 2                          | 1/16                         | 23.7                                                                                           | 47.5 | 71.2  | 95.0  | 118.7 | 179.3 |
| 2                          | 1/8                          | 17.8                                                                                           | 35.6 | 53.4  | 71.3  | 89.1  | 106.9 |

# Filter Design

Some additional information about the filter design process is presented here. This comprises a graphical comparison between two variants of the  $R = 625$  filter chain, measurements for resource usage on the FPGA in relation to filter size, and the frequency responses for all filters and filter chains as designed by Matlab (for measurement results, see Chapter 7). For studying the filters and chains in more detail, the reader is referred to the design files themselves.

## B.1 Decimation of 625: Variants

There are many possible ways to implement a filter chain for a given overall downsampling ratio  $R$ . Here, two possibilities are compared for  $R = 625$ , consisting of a  $125 \rightarrow 5$  cascade and a  $25 \rightarrow 5 \rightarrow 5$  cascade. In both cascades, the initial stage is a CIC filter, followed by a FIR compensator and one or more FIR filters. The results are shown in Figure B.1.



**Figure B.1:** Semilog (top) and linear plot (middle) for two variants for implementing a decimation chain for a rate change factor of  $R = 625$ . Both choices show almost the same behavior with regards to magnitude. The bottom plot shows the phase response of the two filters; here, too, the behavior is very similar. Note: These plots show the frequency responses of the entire filter cascade for the two respective variants.

## B.2 Resource Usage for FIR Filters on the FPGA

Measurements on resource usage for various FIR filters have been performed and are presented here. The objective was to determine how many DSP slices a given FIR filter needs when implemented with Xilinx's FIR Compiler [14]. These figures form the basis to define reasonable bounds for the FIR filter 5steep (see Figure 3.1 on page 39). Figure B.2 depicts the results of the measurements, while Table B.1 contains the configuration parameters which were used for the FIR compiler core.

As can be seen in the plot, DSP slice usage rises roughly linearly at these high sampling rates. When using only a single filter, a filter of roughly 760 coefficients is the maximum possible size. Because the STEMlab has two channels, and because other filters are required as well, an upper limit for 5steep of 250 coefficients is set based on these results. A smaller filter is also acceptable as long as it fulfills the general requirements.



**Figure B.2:** Usage report figures for DSP slices using the Xilinx FIR compiler block. The configuration of the filter in terms of bit widths is identical to the actual configuration used in the final implementation. Unlike the implementation, however, only a single channel was configured.

**Table B.1:** The parameters used to configure the FIR compiler core for the usage measurements from Figure B.2

| Parameter                   | Value   |
|-----------------------------|---------|
| Clock Frequency             | 125 MHz |
| Decimation Rate             | 5       |
| Input Data Width            | 24 bit  |
| Input Fractional Bits       | 7       |
| Output Data Width           | 32 bit  |
| Coefficient Fractional Bits | 17      |

### B.3 Filter Frequency Responses

This section lists the frequency responses of the filters and the cascades as specified in Section 3. For detailed filter specifications, see the appropriate chapters and/or the filter design code in the repository.

#### B.3.1 5steep



#### B.3.2 5flat



**B.3.3 2steep****B.3.4 CIC25****B.3.5 CFIR25**

**B.3.6 CIC125****B.3.7 CFIR125**

**B.3.8 Chain for R=25**

**B.3.9 Chain for R=125**

**B.3.10 Chain for R=625**

**B.3.11 Chain for R=1250**

**B.3.12 Chain for R=2500**

# Oscilloscope

Some supplementary information for the technologies underpinning the oscilloscope application is presented here.

## C.1 WebSockets

WebSockets' final RFC 6455[32] was released in December 2011 and is thus still quite young. It is meant to compensate the lack of raw UDP and TCP sockets in JavaScript; while those would offer maximum flexibility, they also pose a significant security risk, and are therefore not available in JavaScript. The WebSockets protocol is located in the Application Layer of the OSI model<sup>1</sup>. Instead of directly opening a raw WebSocket, the handshake is done via HTTP(S). This brings the benefit of communicating through the same ports as the browser (80 or 443) which enables the protocol to function through most firewalls. Furthermore it greatly simplifies the implementation of handshakes for the programmer.

The client sends an upgrade request to the server which then opens a WebSocket connection. This allows for a very convenient way to use TCP Sockets without any entirely new standards. Section 1.5, *Design Philosophy* in RFC 6455 [32] explains it well:

Basically it is intended to be as close to just exposing raw TCP to script as possible given the constraints of the Web.

The only exception is that WebSockets adds framing to make it packet rather than stream based and to differentiate between binary and text data. This differentiation is very useful for this project. Instructions to the server are issued via the text channel whilst data is sent back through the binary channel, allowing for very convenient interfacing with close to no effort.

In summary: WebSockets are close-to-raw TCP sockets whose handle is shared through HTTP(S).

JavaScript provides a WebSockets interface that offers convenient sending and receiving of large amounts of data. As nearly anything in JavaScript this is done using callbacks. There are callbacks which handle connections, messages and errors. The code snippet in Listing C.1 gives some insight how WebSockets in JavaScript are used. For more detailed information, the reader is referred to the Mozilla documentation [34].

---

<sup>1</sup>For those not familiar with the OSI model, Wikipedia provides a good overview in [33].

**Listing C.1:** Using WebSockets in JavaScript

```
1 // Open a new socket
2 this.socket = new WebSocket('ws://localhost');
3 // Make sure the binary data transmitted
4 // is interpreted as an ArrayBuffer
5 // More on ArrayBuffer and Blobs in:
6 // - https://developer.mozilla.org/en/docs/Web/API/Blob
7 // - https://developer.mozilla.org/en-US/docs/Web/JavaScript/
8 // Reference/Global_Objects/ArrayBuffer
9 this.socket.binaryType = 'arraybuffer';
10
11 // Define all the callback handlers
12 connection.onopen = function () {
13     // The connection was established; send some regards.
14     connection.send('Hello World!');
15 };
16
17 connection.onerror = function (error) {
18     // An error has occurred; print it to the console.
19     console.log('WebSocket Error: ' + error);
20 };
21
22 connection.onmessage = function (e) {
23     if (typeof e.data == 'string') {
24         // If a text type message was received, print it out.
25         console.log('Text message received: ' + e.data);
26     } else {
27         // A binary type message was received.
28         // Interpret the values as 16 bit uints.
29         var arr = new Uint16Array(e.data);
30         // Plot the data.
31         plot(arr);
32     }
33 };
```

## C.2 State Tree of Oscilloscope

**Listing C.2:** The state tree of the scope application

```

1 var appState = {
2   scopes: [
3     ui: {
4       prefPane: {
5         open: true,
6         width: 400,
7       }
8     },
9     source: {
10       id: 2,
11       name: 'Source ' + 1,
12       location: 'ws://localhost:50090',
13       frameSize: 4096,
14       samplingRate: 5000000,
15       bits: 16,
16       vpp: 2.1, // Volts per bit
17       trigger: {
18         type: 'risingEdge',
19         level: 32768,
20         channel: 1,
21         hysteresis: 30,
22         slope: 0
23       },
24       triggerTrace: 0,
25       triggerPosition: 1 / 8,
26       numberofChannels: 2,
27       mode: 'normal',
28       activeTrace: 0,
29       traces: [
30         {
31           id: 4,
32           offset: { x: 0, y: 0 },
33           windowFunction: 'hann',
34           halfSpectrum: true,
35           SNRmode: 'auto',
36           info: {}, // Populated during runtime with math
37           name: 'Trace ' + 2,
38           channelID: 1,
39           type: 'FFTrace',
40           color: '#E8830C',
41           scaling: { x: 1, y: 1 },
42           markers: [
43             {
44               id: 'SNRfirst',
45               type: 'vertical',
46               x: 0,

```

**Listing C.2 (cont.):** The state tree of the scope application

```

47           dashed: true,
48           color: 'purple',
49           active: true,
50       }
51     ]
52   ],
53 ],
54 }
55 ];
56 };

```

### C.3 mithril.js

The official mithril webpage describes mithril.js in the following way: “Mithril is a modern client-side JavaScript framework for building Single Page Applications. It’s small (< 8kb gzip), fast and provides routing and XHR utilities out of the box.” [35]

Mithril, like a lot of other frameworks such as React, Angular.js or Vue.js, uses a virtual DOM. This means that it does not modify the DOM which is outlined by the browser, but rather maintains its own DOM. When a new render call is issued, the virtual DOM calculates all the deltas that stem from new content and applies them to the real DOM. This allows mithril.js to calculate and recalculate the DOM based on a descriptive model. The developer does not have to manually modify an object’s state but rather has to describe it.

A redraw generally happens when an event is triggered by any input element but can also be issued manually. A virtual DOM consists of many vnodes (virtual nodes) and can be mounted on any actual node of the browser’s DOM as the example in Listing C.3 shows.

**Listing C.3:** Basic creation and usage of mithril components in JavaScript

```

1 // A mithril component is a simple object that has at minimum a
2 // view() function that returns a vnode.
3 var HelloWorld = {
4   view: function() {
5     // Return the toplevel <div> vnode
6     return m('', [
7       // Create a <h1> vnode with the attribute class="title"
8       m('h1', { class: 'title' }, 'A very interesting title!'),
9       // Create a <p> vnode
10      m('p', 'Hello World!'),
11    ])
12  }
13 }
14 // Get the root div and mount the HelloWorld component
15 var root = document.getElementById('root');
16 m.mount(root, HelloWorld)

```

A component can be mounted on any DOM node and becomes a vnode in the virtual

DOM. The developer can create new components by simply creating an object that holds at least a `view()` function that instantiates new vnodes. The new component can then be instantiated via the `m()` or `m.mount()` command. As this section should only give a base overview on mithril and is not meant to be a manual, further information on mithril's features and usage can be obtained on its webpage [35].

## C.4 WebGL

An application uses the `canvas` DOM element which provides a direct interface to WebGL. The user can render vertices to the canvas and even apply shaders or, in the case of our scope application, simple 2D geometry calls. These are sufficient for our purposes since the scope basically only requires the drawing of lines.

Via the canvas one can retrieve a 2D rendering context on which simple geometry can be drawn. In JavaScript this can be done using the code in Listing C.4 which shows how a single red line can be drawn on the canvas.

**Listing C.4:** Getting a 2D Rendering Context from a Canvas and Drawing on it in JavaScript

```

1 // Get the canvas element from the dom
2 var canvas = document.getElementById('canvas-id');
3 // Get the 2d context of the canvas
4 var context = canvas.getContext('2d');
5
6 // Set brush color to red
7 context.strokeStyle = '#FF0000';
8
9 // Start a new path and move the cursor
10 // from start to end of the line to be drawn
11 context.beginPath();
12 context.moveTo(x, y);
13 context.lineTo(x + 100, y + 100);
14
15 // Finally actually draw the line on the canvas and end the path
16 context.stroke();

```

There is also the possibility to draw rectangles, circles and much more. All of those elements can be styled easily via properties of the context environment. All the functionality is documented on the Mozilla Network [36].

After having acquired the rendering context, something can be drawn on the canvas once. For the creation of a moving image, those draws have to be re-issued over and over again. There are various possibilities in JavaScript to accomplish this, but only one is actually high-performance and recommended.

Instead of simply drawing to the canvas over and over again, it would be ideal to only do that before a new frame is pulled from the framebuffer by the display. JavaScript provides a interface to register a callback that is called before a new frame is released. This callback will be called with the same frequency as the display refresh rate, which nowadays usually is 60 Hz. To make sure that a callback will always be executed, it has to be registered again after a callback has been issued. The example in Listing C.5 shows how this is done. This callback will not affect the rest of the DOM. This allows JavaScript to handle the redraws

of the DOM with high speed while the callback will render a fluent graph of the data onto just one of the DOM elements.

**Listing C.5:** Usage of the requestAnimationFrame callback in JavaScript

```

1 // The register function is not named the same way in every browser
2 // Make sure this is the case
3 window.requestAnimationFrame = window.requestAnimationFrame
4           || window.webkitRequestAnimationFrame;
5
6 // Our callback we call for every frame drawn
7 export const draw = function() {
8     // Draw anything needed
9
10    // End draw
11
12    // Register the callback again
13    requestAnimationFrame(function(){
14        // Execute our callback
15        // We cannot hand this directly to the register function
16        // since it is not yet known inside it's own definition
17        draw();
18    });
19 };
20
21 // Initially call the draw function
22 draw();

```

## C.5 FFT Windowing Parameters

**Table C.1:** FFT windowing parameters, taken from [37]

| Window        | Scaling<br>for<br>Periodical<br>Signals | Factor<br>Quasi-<br>Periodical<br>Signals | Attenuation<br>of<br>Largest<br>Lobe (dB) | Number<br>of<br>Lines<br>per<br>Bundle | Maximum<br>Error<br>in<br>Amplitude<br>(dB) |
|---------------|-----------------------------------------|-------------------------------------------|-------------------------------------------|----------------------------------------|---------------------------------------------|
| Rectangle     | 1                                       |                                           | 13                                        | 1 – 2                                  | -3.8                                        |
| Hanning       | 1/0.5000                                |                                           | 31                                        | 3 – 4                                  | -1.5                                        |
| Hamming       | 1/0.5400                                |                                           | 41                                        | 3 – 4                                  | -1.6                                        |
| Blackman      | 1/0.4200                                |                                           | 58                                        | 5 – 6                                  | -1.1                                        |
| Bartlett      | 1/0.5000                                |                                           | 26                                        | 3 – 4                                  | -1.9                                        |
| Kaiser-Bessel | 1/0.4021                                |                                           | 67                                        | 7 – 8                                  | -1.0                                        |
| Flat-Top      | 1/0.2155                                |                                           | 67                                        | 9 – 10                                 | 0                                           |

## APPENDIX

# D

## Licenses

All components which have been created specifically for this project are provided under the MIT license. Components taken from other sources may fall under different licenses; make sure to check the repository in case of uncertainty.

### D.1 The MIT License [12]

Copyright 2017 Raphael Frey & Noah Hüsser

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.



APPENDIX

E

## Problem Statement

# ***Front-End Signal-Processing for Red Pitaya Spectrum-Analyzer***



Studenten: Raphael Frey  
Noah Hüsser

Dozenten: Prof. Dr. Richard Gut  
Michael Pichler

Experte: Dr. Jürg Stettbacher

## Zielsetzung

Im Herbstsemester 15/16 wurde das Projekt *Implementation von Messgeräten mittels günstigem Daten-Akquisitionssystem* bearbeitet und es steht nun ein funktionsfähiger, Ethernet basierter Spectrum-Analysator mit einem Red-Pitaya-Board zur Verfügung. Unbefriedigend ist die Signalverarbeitung im Front-End des Red-Pitaya-Boards. Die im Herbstsemester 16 / 17 implementierte Abtastratenwandlung mittels rekursiver Filter stellt einen ersten Versuch dar, möchte aber noch nicht in allen Belangen überzeugen. Hauptziel der Arbeit ist, das FPGA zur Filterung/Ratenreduktion mittels CIC<sup>1</sup> - Filter zu verwenden und das Front-End somit flexibler zu gestalten und Aliasing bei tieferen Abtastraten zu reduzieren.

Hauptkomponente des Red-Pitaya ist ein Xilinx ZC7Z010 System-on-Chip Device mit zwei Cortex ARM9 Prozessoren und einem Artix 7 FPGA. Neben diverser Peripherie stehen mit dem LTC2145 zwei schnelle Analogeingänge mit 14 Bit Auflösung und dem IDT DAC1401 zwei schnell Analogausgänge mit ebenfalls 14 Bit Auflösung zur Verfügung. Die beiden Prozessoren sind Host eines Linux Betriebssystems, das eine einfache Anbindung ans Ethernet erlaubt. Das FPGA Artix 7 steht zur schnellen Signalverarbeitung zur Verfügung und erlaubt somit auch eine Vorverarbeitung der anfallenden ADC Werte.



## Ziel der Arbeit

Das Hauptaugenmerk der Projektarbeit soll auf die Vorverarbeitung der ADC Werte gelegt werden. Ziel ist es, mittels Dezimation gängige Abtastraten bei höherer Signalqualität zu erzeugen. Für die dazu benötigten Tiefpassfilter sollen verschiedene Strukturen untersucht und geeignete im FPGA implementiert werden. Zu diesem Zwecke soll zuerst eine Auslegeordnung des Gesamtsystems und dessen Entwicklungsumgebung gemacht und allenfalls angepasst werden.

<sup>1</sup> CIC filters were invented by Eugene B. Hogenauer, and are a class of FIR filters used in multi-rate digital signal processing.

## Aufgabenstellung

1. Erstellen Sie aufgrund der Aufgabenstellung (Lastenheft) ein Pflichtenheft sowie einen Projektplan und definieren Sie darin das Gesamtziel und die Teilziele in übersichtlicher Form. Erarbeiten Sie dazu einen Zeitplan und definieren einige wichtige Meilensteindaten. Der Projektplan soll mit dem betreuenden Dozenten besprochen werden und falls notwendig, im Verlauf des Projekts nach Absprache mit dem Dozenten an den jeweiligen Gegebenheiten angepasst werden.
2. Untersuchen Sie verschiedene Lösungsmöglichkeiten und halten Sie diese in geeigneter Form fest. Teilen Sie dazu das Problem in Teilprobleme auf und überlegen Sie sich grundsätzliche Lösungsmöglichkeiten. Aufgrund einer Analyse der Vor- und Nachteile legen Sie anschließend denjenigen Lösungsansatz fest, der Ihnen am meisten Erfolg verspricht. Beim Entwurf der Implementation soll auf Testbarkeit geachtet werden.
3. Realisieren Sie Ihr Konzept in kleinen, nachvollziehbaren Schritten. Überprüfen Sie die Softwaremodule anhand der in Punkt 2 festgelegten Tests.
4. Fassen Sie Ihre Überlegungen, Ihr Vorgehen und Ihre Resultate zusammen und halten Sie diese in einem Schlussbericht fest.

Besprechen Sie den Verlauf Ihrer Arbeit regelmässig mit dem betreuenden Dozenten.

Die Benotung der Arbeit wird sich hauptsächlich auf die Qualität und Gründlichkeit Ihres Vorgehens sowie auf die Nachvollziehbarkeit und Korrektheit Ihrer Resultate abstützen.

## Termine

- Ausgabe der Arbeit: Montag, 20. Februar 2017
- Abgabetermin: Freitag, 18. August 2017 um 17:00
- Abzugebende Exemplare: 1 pro Gruppe
- Präsentation der Arbeit: Freitag, 15. September 2017, 9:20-10:05, 4.227
- Besprechung der Arbeit: Gemäss Absprache