



**Politecnico  
di Torino**

Computer Engineering Master's Degree Course (LM-32)

---

## Report #2

---

### UART SW

---

*Team members*

S354939 - Pietro Alberto Levo  
S358481 - Sandro Marghella  
S359182 - Gianluca Riva Governanda  
S361451 - Cristina Rizzo

*Professor*

Claudio Passerone

# Introduction

This lab focuses on creating, configuring, and running an embedded system based on the Nios II soft-core processor on the Intel Cyclone V FPGA (DE1-SoC board). The main goal is to build an interface unit that connects the desktop and the FPGA system using the UART (Universal Asynchronous Receiver-Transmitter) protocol. Unlike higher-level systems, where the interface is managed by an operating system driver, this project takes a low-level approach based on "bit-banging".

The project uses General Purpose I/O (GPIO) pins to sample and drive the UART lines directly. This work aims to clarify the relationship between software and hardware, especially how software execution and the timing of physical signals interact.

## Educational Objectives

The educational journey begins with a simple "Hello World" system and progresses to more complex tasks. This includes a standalone UART receiver that can work with different baud rates. The lab design allows for system validation and the real-time study of various signals using an oscilloscope. This setup enables the examination of embedded real-time systems where timing and precise control of the processor are crucial. It allows for the direct observation of start/stop bits, parity bits, and propagation delays.

## System Configuration and Setup

Implementing the design on the DE1-SoC board requires a specific setup to connect the ARM-based Hard Processor System (HPS) and the FPGA fabric:

- **Hardware Initial Boot:** The MicroSD card must be inserted before the board is powered on for the first time. This enables the ARM HPS to route the UART lines to the FPGA.
- **Board Settings:** The manual switch SW(9) is set to 0 to enable software UART mode.
- **System Architecture:** The design uses a Nios II/e soft-core processor integrated through Platform Designer. Since there is no hardware UART block available, the system entirely relies on bit-banging using GPIO pins.
- **Software & BSP Settings:** In the BSP Editor, `stdout` is mapped to `jtag_uart_0` and the `timestamp_timer` to `timer_0`.
- **Implementation Logic:** The C code is designed to detect the Start Bit, pause for 0.5 bit-time to sample at the midpoint of the pulse, and then loop through the data bits using a high-resolution timer.
- **Debugging:** An oscilloscope connected to the GPIO header pins was used to check real-time timing, start/stop bits and propagation delays.

# 1 Projects

## 1.1 Project #1

As a first step, a "Hello world" program is run.

```
#include <stdio.h>
#include "system.h"
#include "sys/alt_timestamp.h"
#include "altera_avalon_pio_regs.h"

int main()
{
    printf("Hello Sandro, Gianluca, Pietro and Cri!\n");

    return 0;
}
```

The program above runs on the hardware described in the VHDL files and programmed onto the FPGA. For software and hardware to work together, a Board Support Package (BSP) is required. It acts as a Hardware Abstraction Layer and is generated from the peripherals and components instantiated in the system, together with other information, all of which is available in the SOPC information file, in this case *nios\_hps\_system.sopcinfo*.

## 1.2 Project #4

The main purpose of Project 4 is to create a UART receiver in software and to test it by sending characters from PuTTY to the Cyclone V board.

To do so, the following constants are first defined at the beginning of the code:

```
#define NBIT       8
#define NSTOPBIT   1
#define NOPARITY   0
#define EVENPARITY 1
#define ODDPARITY  2
#define PARITY     NOPARITY
```

Listing 1: Constants

In this project no parity bit is used, so each transmission consists of exactly 10 bits: 1 start bit, 8 data bits, and 1 stop bit.
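The frame length follows directly from the constants in Listing 1. As a minimal sketch (the helper names `frame_bits` and `frame_time_us` are hypothetical and not part of the lab code), the number of bits per frame and the resulting frame duration can be computed as follows:

```c
#define NBIT       8
#define NSTOPBIT   1
#define NOPARITY   0
#define EVENPARITY 1
#define ODDPARITY  2
#define PARITY     NOPARITY

/* Hypothetical helper: bits on the wire for one character,
 * i.e. start bit + data bits + optional parity bit + stop bits. */
static int frame_bits(int nbit, int parity, int nstop)
{
    return 1 + nbit + (parity != NOPARITY ? 1 : 0) + nstop;
}

/* Hypothetical helper: duration of one frame in microseconds
 * at the given baud rate (integer division, rounded down). */
static long frame_time_us(int bits, long baud)
{
    return (bits * 1000000L) / baud;
}
```

For example, `frame_time_us(frame_bits(NBIT, PARITY, NSTOPBIT), 300)` evaluates to 33333, that is, about 33 ms per character at 300 baud.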

### 1.2.1 Version #1

To create a UART receiver, the starting point is the structure of a UART frame.

```

1 #define BAUDRATE 300
2 int main() {
3     int ticks_per_sec = alt_timestamp_freq();
4     int ticks_per_bit = ticks_per_sec / BAUDRATE;
5     int c[NBIT];
6     int val;
7
8     printf("UART RX ready.\n");
9
10    while (1) {
11        do {
12            val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
13        } while (val == 1);
14
15        alt_timestamp_start();
16        while (alt_timestamp() < (ticks_per_bit >> 1)) {}
17
18        val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
19        if (val != 0) {
20            continue;
21        }
22        alt_timestamp_start();
23        int sample_times[NBIT + NSTOPBIT];
24        for (int i = 0; i < NBIT + NSTOPBIT; i++)
25            sample_times[i] = (i + 1) * ticks_per_bit;
26
27        for (int i = 0; i < NBIT; i++) {
28            while (alt_timestamp() < sample_times[i]) {}
29            val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
30            c[i] = val;           // c[0] = LSB
31        }
32
33        while (alt_timestamp() < sample_times[NBIT]) {}
34        val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
35
36        if (val != 1) {
37            printf("ERROR: stop bit not valid!\n");
38            continue;
39        }
40
41        int result = 0;
42        for (int i = 0; i < NBIT; i++)
43            result |= (c[i] << i);
44
45        printf("Received: %c (0x%02X) (%d)\n", result, result, result);
46    }
47
48    return 0;
49 }
```

Listing 2: Version \_1

The code first waits for the start bit. The idle state of the UART line is 1, so the start bit is represented by 0. To make sure that the falling edge is not a glitch, the code reads the line again after half a bit period (lines 15 to 21 of Listing 2). If the check passes, the timer is restarted, the sampling times are computed (lines 22 to 25), and the 8 data bits of the ASCII-encoded character are sampled in the for loop starting on line 27.

Lines 33 to 39 check that the transmission ends with a stop bit (equal to 1); the character is then assembled by the for loop on line 42.
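The sampling schedule described above can be checked on a desktop without the board. The sketch below is only an illustration under stated assumptions: `line_level` is a hypothetical stand-in for the RX pin that models an ideal idle-high 8N1 frame whose start edge is at time 0, and `decode_frame` applies the same schedule as Listing 2 (half a bit to confirm the start bit, then one sample per bit period).

```c
#include <stdint.h>

#define NBIT 8

/* Hypothetical model of the RX line: level at time t (in timer ticks) for
 * one ideal 8N1 frame carrying `byte`, with the start edge at t = 0. */
static int line_level(uint8_t byte, long t, long ticks_per_bit)
{
    if (t < 0) return 1;                               /* idle before the frame */
    long slot = t / ticks_per_bit;                     /* 0 = start, 1..8 = data */
    if (slot == 0) return 0;                           /* start bit */
    if (slot <= NBIT) return (byte >> (slot - 1)) & 1; /* data bits, LSB first */
    return 1;                                          /* stop bit, then idle */
}

/* Decode one frame: confirm the start bit at its midpoint, then sample
 * every full bit period from there. Returns the byte, or -1 on a false
 * start or an invalid stop bit. */
static int decode_frame(uint8_t sent, long ticks_per_bit)
{
    long t0 = ticks_per_bit / 2;                       /* midpoint of start bit */
    if (line_level(sent, t0, ticks_per_bit) != 0) return -1;

    int result = 0;
    for (int i = 0; i < NBIT; i++) {
        long t = t0 + (long)(i + 1) * ticks_per_bit;   /* midpoint of data bit i */
        result |= line_level(sent, t, ticks_per_bit) << i;
    }

    long t_stop = t0 + (long)(NBIT + 1) * ticks_per_bit;
    if (line_level(sent, t_stop, ticks_per_bit) != 1) return -1;
    return result;
}
```

Because every sample is taken half a bit after the nominal edge, the decoder tolerates a sampling error of up to half a bit period in either direction before it reads a neighbouring bit.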

### 1.2.2 Version #2

In Listing 2 the code uses a for loop to read the 8 data bits of the character transmitted over UART. The branch instructions introduced by the for loop can slow down execution.

For this reason Listing 3 uses loop unrolling to improve performance:

```
// NBIT, NSTOPBIT and BAUDRATE are defined as in the previous listings.
int main() {

    int ticks_per_sec = alt_timestamp_freq();
    int ticks_per_bit = ticks_per_sec / BAUDRATE;

    int c[NBIT];
    int val;

    printf("UART RX ready.\n");

    while (1) {

        // Wait for the falling edge of the start bit (the line idles at 1)
        do {
            val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
        } while (val == 1);

        // Wait half a bit period to reach the midpoint of the start bit
        alt_timestamp_start();
        while (alt_timestamp() < (ticks_per_bit >> 1)) {}

        // Check that the start bit is still low
        val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
        if (val != 0) {
            continue;
        }

        // Restart the timer and precompute the sampling times
        alt_timestamp_start();
        int sample_times[NBIT + NSTOPBIT];
        for (int i = 0; i < NBIT + NSTOPBIT; i++)
            sample_times[i] = (i + 1) * ticks_per_bit;

        // Read the data bits (loop unrolled), LSB first
        while (alt_timestamp() < sample_times[0]) {}
        val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
        c[0] = val;

        while (alt_timestamp() < sample_times[1]) {}
        val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
        c[1] = val;

        while (alt_timestamp() < sample_times[2]) {}
        val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
        c[2] = val;

        while (alt_timestamp() < sample_times[3]) {}
        val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
        c[3] = val;

        while (alt_timestamp() < sample_times[4]) {}
        val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
        c[4] = val;

        while (alt_timestamp() < sample_times[5]) {}
        val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
        c[5] = val;

        while (alt_timestamp() < sample_times[6]) {}
        val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
        c[6] = val;

        while (alt_timestamp() < sample_times[7]) {}
        val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
        c[7] = val;

        // Read the stop bit and verify that it is 1
        while (alt_timestamp() < sample_times[NBIT]) {}
        val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;

        if (val != 1) {
            printf("ERROR: stop bit not valid!\n");
            continue;
        }

        // Assemble the character from the sampled bits
        int result = 0;
        for (int i = 0; i < NBIT; i++)
            result |= (c[i] << i);

        // Print the character, its hex and decimal values, and its bits (MSB first)
        printf("Received: %c (0x%02X) (%d) (%d%d%d%d%d%d%d%d)\n", result, result, result, c[7], c[6], c[5], c[4], c[3], c[2], c[1], c[0]);
    }

    return 0;
}
```

Listing 3: Version \_2

This approach removes the loop-counter update and the loop branch of the for loop between consecutive samples, which reduces the per-bit overhead compared with Listing 2. The busy-wait loops on the timer are still present.
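The eight hand-written copies can also be generated by a preprocessor macro, which keeps the straight-line code while avoiding copy-paste mistakes. The sketch below is not the code used in the lab: `wait_until`, `read_rx`, `fake_byte` and `fake_now` are hypothetical desktop stand-ins for the `alt_timestamp()` busy-wait and for `IORD_ALTERA_AVALON_PIO_DATA`, so that the idea can be compiled and tried without the board.

```c
#include <stdint.h>

/* Hypothetical desktop stand-ins for the HAL calls used in Listing 3. */
static uint8_t  fake_byte;   /* byte assumed to be on the wire */
static unsigned fake_now;    /* simulated time, counted in bit periods */

static void wait_until(unsigned bit_slot) { fake_now = bit_slot; }
static int  read_rx(void) { return (fake_byte >> (fake_now - 1)) & 1; }

/* One unrolled step: wait for the sampling time of bit i, then store it. */
#define SAMPLE_BIT(c, i) do { wait_until((i) + 1); (c)[i] = read_rx(); } while (0)

static int receive_unrolled(void)
{
    int c[8];

    /* Eight straight-line copies: no loop counter and no loop branch
     * between consecutive samples. */
    SAMPLE_BIT(c, 0); SAMPLE_BIT(c, 1); SAMPLE_BIT(c, 2); SAMPLE_BIT(c, 3);
    SAMPLE_BIT(c, 4); SAMPLE_BIT(c, 5); SAMPLE_BIT(c, 6); SAMPLE_BIT(c, 7);

    /* Assemble the byte, LSB first. */
    return  c[0]       | (c[1] << 1) | (c[2] << 2) | (c[3] << 3) |
           (c[4] << 4) | (c[5] << 5) | (c[6] << 6) | (c[7] << 7);
}
```

On the target, the two stand-ins would be replaced by the timer wait and the PIO read of Listing 3; the macro expansion itself is plain C and does not depend on the toolchain.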

## 1.4 Project #5

This project implements the reception and transmission of a character over an 8N1 UART, realized by driving GPIOs in software.

The following time-optimized code enables communication between the PC and the FPGA:

```
// NBIT, NSTOPBIT and BAUDRATE are defined as in the previous listings.
int main() {
    int ticks_per_sec = alt_timestamp_freq();
    int ticks_per_bit = ticks_per_sec / BAUDRATE;

    int c[NBIT];
    int val;

    // Drive the TX line to its idle level (1)
    IOWR_ALTERA_AVALON_PIO_DATA(NIOS_UARTTX_BASE, 1);

    printf("UART RX ready.\n");

    while (1) {
        // RX PHASE

        // 1. Wait for the START bit
        do {
            val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
        } while (val == 1);

        // 2. Start the timer and wait half a bit time
        alt_timestamp_start();
        while (alt_timestamp() < (ticks_per_bit >> 1)) {}

        // Check the start bit
        val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
        if (val != 0) continue;

        // 3. Restart the timer and precompute the sampling times
        alt_timestamp_start();
        int sample_times[NBIT + NSTOPBIT];
        for (int i = 0; i < NBIT + NSTOPBIT; i++)
            sample_times[i] = (i + 1) * ticks_per_bit;

        // 4. Read the DATA bits (unrolled), LSB first
        while (alt_timestamp() < sample_times[0]) {}
        val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
        c[0] = val;

        while (alt_timestamp() < sample_times[1]) {}
        val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
        c[1] = val;

        while (alt_timestamp() < sample_times[2]) {}
        val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
        c[2] = val;

        while (alt_timestamp() < sample_times[3]) {}
        val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
        c[3] = val;

        while (alt_timestamp() < sample_times[4]) {}
        val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
        c[4] = val;

        while (alt_timestamp() < sample_times[5]) {}
        val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
        c[5] = val;

        while (alt_timestamp() < sample_times[6]) {}
        val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
        c[6] = val;

        while (alt_timestamp() < sample_times[7]) {}
        val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;
        c[7] = val;

        // 5. Read the STOP bit and verify it
        while (alt_timestamp() < sample_times[NBIT]) {}
        val = IORD_ALTERA_AVALON_PIO_DATA(NIOS_UARTRX_BASE) & 0x01;

        if (val != 1) {
            printf("ERROR: invalid stop bit!\n");
            continue;
        }

        // 6. Build the result
        int result = 0;
        result |= (c[0] << 0);
        result |= (c[1] << 1);
        result |= (c[2] << 2);
        result |= (c[3] << 3);
        result |= (c[4] << 4);
        result |= (c[5] << 5);
        result |= (c[6] << 6);
        result |= (c[7] << 7);

        printf("Received: %c (0x%02X) (%d) (%d%d%d%d%d%d%d%d)\n", result, result, result, c[7], c[6], c[5], c[4], c[3], c[2], c[1], c[0]);

        // TX PHASE: echo the received character
        alt_timestamp_start();

        // Start bit
        IOWR_ALTERA_AVALON_PIO_DATA(NIOS_UARTTX_BASE, 0);
        while (alt_timestamp() < ticks_per_bit) {}

        // DATA bits, LSB first
        IOWR_ALTERA_AVALON_PIO_DATA(NIOS_UARTTX_BASE, c[0]);
        while (alt_timestamp() < ticks_per_bit*2) {}

        IOWR_ALTERA_AVALON_PIO_DATA(NIOS_UARTTX_BASE, c[1]);
        while (alt_timestamp() < ticks_per_bit*3) {}

        IOWR_ALTERA_AVALON_PIO_DATA(NIOS_UARTTX_BASE, c[2]);
        while (alt_timestamp() < ticks_per_bit*4) {}

        IOWR_ALTERA_AVALON_PIO_DATA(NIOS_UARTTX_BASE, c[3]);
        while (alt_timestamp() < ticks_per_bit*5) {}

        IOWR_ALTERA_AVALON_PIO_DATA(NIOS_UARTTX_BASE, c[4]);
        while (alt_timestamp() < ticks_per_bit*6) {}

        IOWR_ALTERA_AVALON_PIO_DATA(NIOS_UARTTX_BASE, c[5]);
        while (alt_timestamp() < ticks_per_bit*7) {}

        IOWR_ALTERA_AVALON_PIO_DATA(NIOS_UARTTX_BASE, c[6]);
        while (alt_timestamp() < ticks_per_bit*8) {}

        IOWR_ALTERA_AVALON_PIO_DATA(NIOS_UARTTX_BASE, c[7]);
        while (alt_timestamp() < ticks_per_bit*9) {}

        // Stop bit
        IOWR_ALTERA_AVALON_PIO_DATA(NIOS_UARTTX_BASE, 1);
        while (alt_timestamp() < ticks_per_bit*10) {}
    }

    return 0;
}
```

The baud rates considered were 110, 150, 300, 1200, 2400, 4800, 9600 and 19200. PuTTY [1] failed to establish a UART connection at the lowest baud rates. As an alternative, tio [2] was used, which connected successfully. However, the readings were wrong, as shown in (1a) and (1b): the TX and RX signals are completely different. For baud rates above 150 and up to 9600, transmission and reception are correct, as shown in (2a), (2b), (2c), (2d) and (2e). From 19200 onward the sampling becomes unreliable. From (3b) it can be seen that:

- The sampling is delayed, which results in shifted bits in the transmitted data. This is most likely caused by the software overhead of the busy-waiting loops.
- The TX signal is also delayed. At 19200 baud the nominal bit time is $\frac{1}{BR} \approx 52\,\mu s$, but the oscilloscope shows that each symbol lasts $\approx 60\,\mu s$.
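The effect of the longer symbol time can be estimated with simple arithmetic. The sketch below is a rough model under two assumptions that the measurements do not confirm: that the $\approx 60\,\mu s$ reading applies uniformly to every bit, and that the receiving side samples at the nominal bit midpoints. The helper names are hypothetical.

```c
/* Nominal bit time in microseconds for a given baud rate. */
static double bit_time_us(double baud)
{
    return 1e6 / baud;
}

/* Number of bit periods after which the accumulated lateness
 * (measured minus nominal, per bit) first exceeds half a nominal bit,
 * i.e. the point where a receiver sampling at the nominal midpoints
 * would read a neighbouring bit. Returns 0 if the measured period is
 * not longer than the nominal one. */
static int bits_until_half_bit_slip(double nominal_us, double measured_us)
{
    double excess = measured_us - nominal_us;
    if (excess <= 0.0) return 0;

    int n = 1;
    while (n * excess <= nominal_us / 2.0) n++;
    return n;
}
```

With a nominal bit time of about 52.08 µs and a measured symbol time of 60 µs, the excess is about 7.9 µs per bit, and `bits_until_half_bit_slip(bit_time_us(19200), 60.0)` returns 4. Under these assumptions the error would exceed half a bit well within a 10-bit frame, which is consistent with the unreliable behaviour observed at 19200 baud.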

## References

- [1] PuTTY. <https://www.chiark.greenend.org.uk/~sgtatham/putty/>, 2025. Version 0.81.
- [2] tio: a simple serial device i/o tool. <https://github.com/tio/tio>, 2025. Accessed: 2025-12-02.

## A Appendix

Blue signal: RX line. Yellow signal: TX line.



Figure 1

Figure 2, panel (e): 9600 BR

Figure 3, panel (a): 19200 BR; panel (b): 19200 BR, TX and RX compared