

# STM32F103 Flash Loader Protocol: Complete Technical Reference

Ashok Ashwin

February 1, 2026

## Contents

|          |                                                            |           |
|----------|------------------------------------------------------------|-----------|
| <b>1</b> | <b>Introduction</b>                                        | <b>3</b>  |
| 1.1      | Scope and Architecture Overview . . . . .                  | 3         |
| <b>2</b> | <b>Core Concepts: HAL, UART, Flash, and CRC</b>            | <b>3</b>  |
| 2.1      | 1. Hardware Abstraction Layer (HAL) Architecture . . . . . | 3         |
| 2.1.1    | Why HAL Matters for Flash Loading . . . . .                | 3         |
| 2.1.2    | HAL in the Flash Loader . . . . .                          | 3         |
| 2.1.3    | Clock Configuration . . . . .                              | 4         |
| 2.2      | 2. UART Communication Layer . . . . .                      | 4         |
| 2.2.1    | UART Configuration . . . . .                               | 4         |
| 2.2.2    | UART Timeout Handling . . . . .                            | 4         |
| 2.3      | 3. Flash Memory Organization and Programming . . . . .     | 4         |
| 2.3.1    | STM32F1 Flash Organization . . . . .                       | 4         |
| 2.3.2    | Flash Memory Controller (FMC) Registers . . . . .          | 5         |
| 2.3.3    | Flash Write and Erase Mechanics . . . . .                  | 5         |
| 2.3.4    | Detailed HAL FLASH API Reference . . . . .                 | 5         |
| 2.3.5    | Complete Flash Write Function from Bootloader . . . . .    | 8         |
| 2.4      | 4. CRC-16-XMODEM Error Detection . . . . .                 | 9         |
| 2.4.1    | CRC-16-XMODEM Algorithm . . . . .                          | 9         |
| 2.4.2    | CRC in Python . . . . .                                    | 9         |
| 2.4.3    | Why CRC-16-XMODEM? . . . . .                               | 9         |
| <b>3</b> | <b>Foundational Concepts for Flash Loaders</b>             | <b>9</b>  |
| 3.1      | 1. Serial Communication and UART (Mandatory) . . . . .     | 9         |
| 3.2      | 2. Frame-Based Protocols (Mandatory) . . . . .             | 10        |
| 3.3      | 3. Flash Memory Organization (Mandatory) . . . . .         | 10        |
| 3.4      | 4. Bootloader State Machine (Important) . . . . .          | 10        |
| 3.5      | 5. Application Validation (Important) . . . . .            | 10        |
| <b>4</b> | <b>Flash Loader Protocol Specification</b>                 | <b>11</b> |
| 4.1      | Protocol Overview . . . . .                                | 11        |
| 4.2      | Communication Phases . . . . .                             | 11        |
| 4.2.1    | Phase 1: Bootloader Heartbeat (Discovery) . . . . .        | 11        |
| 4.2.2    | Phase 2: Mode Selection . . . . .                          | 12        |
| 4.2.3    | Phase 3: Data Transfer Loop . . . . .                      | 12        |
| 4.2.4    | Phase 4: End of Transfer . . . . .                         | 12        |
| 4.3      | Detailed Frame Structure . . . . .                         | 13        |
| 4.3.1    | XModem Frame Format (CRC-16 Mode) . . . . .                | 13        |
| 4.3.2    | CRC-16 Calculation . . . . .                               | 13        |
| 4.3.3    | Block Number Wraparound . . . . .                          | 13        |
| 4.4      | Control Characters . . . . .                               | 13        |
| 4.5      | Timeout Specifications . . . . .                           | 13        |
| 4.6      | Memory Organization During Transfer . . . . .              | 14        |
| 4.7      | Application Launch and Reset Behavior . . . . .            | 14        |

|          |                                                              |           |
|----------|--------------------------------------------------------------|-----------|
| 4.7.1    | Bootloader Startup and System Clock Initialization . . . . . | 14        |
| 4.7.2    | Bootloader Reset Behavior . . . . .                          | 15        |
| 4.7.3    | Application Validation Before Jump . . . . .                 | 15        |
| 4.7.4    | Bootloader Jump-to-Application Sequence . . . . .            | 16        |
| <b>5</b> | <b>Host-Side Implementation: Python Flash Loader</b>         | <b>17</b> |
| 5.1      | High-Level Flow . . . . .                                    | 17        |
| 5.2      | Key Functions . . . . .                                      | 18        |
| 5.2.1    | crc16_xmodem() . . . . .                                     | 18        |
| 5.2.2    | send_firmware() . . . . .                                    | 18        |
| 5.3      | Error Handling and Retries . . . . .                         | 18        |
| <b>6</b> | <b>Conclusion and Next Steps</b>                             | <b>18</b> |

# 1 Introduction

This document provides a comprehensive technical reference for the STM32F103C8 UART flash loader protocol. It covers the embedded bootloader firmware architecture, communication protocol mechanics, CRC-16 error detection, frame structure, handshake procedures, and the integration of STM32 Hardware Abstraction Layer (HAL), flash programming, and UART drivers. The document targets firmware engineers implementing or extending flash loading capabilities on STM32 microcontrollers.

## 1.1 Scope and Architecture Overview

This flash loader system comprises three main components:

1. **Bootloader Firmware** : Runs on the STM32F103C8 at reset, implements UART communication, flash control, and application validation
2. **Flash Loading Protocol** : Custom XModem-based protocol with CRC-16 error detection operating over UART at 115200 baud
3. **Host – Side Tools** : Python scripts implementing the protocol client, serial communication, and application image management

The integration of these components ensures robust firmware updates even in hostile environments (electrical noise, intermittent connections, power interruptions).

# 2 Core Concepts: HAL, UART, Flash, and CRC

## 2.1 1. Hardware Abstraction Layer (HAL) Architecture

**Definition:** HAL is the standardized interface layer between firmware and microcontroller hardware. STMicroelectronics provides HAL libraries abstracting registers, interrupts, and timing details.

### 2.1.1 Why HAL Matters for Flash Loading

Rather than manipulating raw registers, the bootloader uses HAL functions:

- **Portability:** Code remains compatible if moving from STM32F1 to STM32H7 or STM32L4
- **Abstraction:** Details of flash controller timing, cache behavior, and synchronization are hidden
- **Correctness:** Critical sequences (e.g., wait states, flash unlocking) are pre-tested by STMicroelectronics

### 2.1.2 HAL in the Flash Loader

Key HAL functions used:

- **HAL\_RCC\_\***: Clock configuration (HSE 8MHz, PLL to 72MHz, peripheral clocks)
- **HAL\_UART\_Transmit / Receive**: Synchronized UART I/O with timeout
- **HAL\_FLASH\_Unlock / Lock**: Enable/disable flash programming mode
- **HAL\_FLASH\_Program**: Write 16-bit or 32-bit words
- **HAL\_FLASHEx\_Erase**: Erase full pages (1 KB each on STM32F1)
- **SystemClock\_Config()**: Initialize clock tree during bootloader startup

### 2.1.3 Clock Configuration

The bootloader initializes:

- HSE (8 MHz crystal) as primary oscillator
- PLL with factor 9 ( $\text{HSE}/1 \times 9 = 72 \text{ MHz}$  core clock)
- Flash wait states set to 2 (required for 72 MHz operation)
- AHB, APB1, APB2 dividers configured for proper peripheral timing

This 72 MHz clock provides fast flash operations, deterministic UART timing, and precise timeouts.

## 2.2 2. UART Communication Layer

**Definition:** UART (Universal Asynchronous Receiver Transmitter) is a serial communication standard. The STM32F103C8 contains multiple UART instances; the bootloader uses USART1 (PA9/PA10).

### 2.2.1 UART Configuration

The bootloader configures USART1 as:

- Baud rate: 115200 bits/second
- Data bits: 8
- Stop bits: 1
- Parity: None
- Flow control: None (no RTS/CTS)

At 115200 baud, each bit takes ~8.68 microseconds; one byte takes ~87 microseconds; one 128-byte XModem frame takes ~11 milliseconds.

### 2.2.2 UART Timeout Handling

The bootloader defines critical timeouts:

- **BOOTLOADER\_TIMEOUT (5 seconds):** If no START command received, auto-jump to application
- **HEARTBEAT\_INTERVAL (500 ms):** Send "BOOT" message to inform host of readiness
- **UART\_TIMEOUT\_MS (5 seconds):** Per-byte receive timeout during frame transfer

These timeouts use HAL\_GetTick() system timer for millisecond precision.

## 2.3 3. Flash Memory Organization and Programming

**Definition:** Flash is non-volatile electrically erasable programmable memory. The STM32F103C8 contains 64 KB of flash divided into 1 KB pages.

### 2.3.1 STM32F1 Flash Organization

```
1 Total Flash: 64 KB (0x08000000 - 0x0800FFFF)
2 Page Size: 1 KB (0x400 bytes)
3 Number of Pages: 64
4 Page 0 (0x08000000 - 0x080003FF): Bootloader
5 Pages 1-7 (0x08000400 - 0x08001FFF): Bootloader continuation
6 Pages 8-63 (0x08002000 - 0x0800FFFF): Application area
```

### 2.3.2 Flash Memory Controller (FMC) Registers

Key registers controlled through HAL:

- **FLASH\_ACR:** Access control register (sets wait states, enables prefetch)
- **FLASH\_KEYR:** Key register for unlocking flash programming
- **FLASH\_CR:** Control register (enables erase/program operations)
- **FLASH\_SR:** Status register (indicates busy, operation complete, errors)
- **FLASH\_AR:** Address register (target address for erase)

### 2.3.3 Flash Write and Erase Mechanics

#### Program (Write) Operation:

- Flash must be unlocked (HAL\_FLASH\_Unlock)
- Data written in 16-bit or 32-bit words (half-word or word aligned)
- Flash controller autonomously writes and verifies
- Firmware polls SR register via HAL until BSY=0
- Takes approximately 10-40 microseconds per word

#### Erase Operation:

- Flash must be unlocked
- Set page address in AR register
- Set PER (Page Erase) bit in CR register
- Set STRT (Start) bit to begin erase
- Firmware polls SR.BSY until complete
- Takes approximately 20-40 milliseconds per 1 KB page

### 2.3.4 Detailed HAL FLASH API Reference

The STM32 HAL provides high-level functions to abstract flash hardware complexity. Here are the critical functions used in the bootloader:

#### 1. HAL\_FLASH\_Unlock()

```
1 void HAL_FLASH_Unlock(void)
```

**Purpose:** Unlock flash memory for programming and erase operations.

**Details:**

- Flash is locked by default after reset for safety
- Internally writes magic keys to FLASH\_KEYR register in sequence:
  - First key: 0x45670123
  - Second key: 0xCDEF89AB
- Sets flash controller to allow erase/program commands
- Must be called before any flash programming

#### 2. HAL\_FLASH\_Lock()

```
1 void HAL_FLASH_Lock(void)
```

**Purpose:** Lock flash memory after programming/erase to prevent accidental writes.

**Details:**

- Sets LOCK bit in FLASH\_CR register to 1
- Disables all erase and program operations
- Provides protection against software bugs
- Should be called immediately after HAL\_FLASH\_Program/HAL\_FLASHEx\_Erase
- Safe to call even if flash is already locked

### 3. HAL\_FLASHEx\_Erase()

```
1 HAL_StatusTypeDef HAL_FLASHEx_Erase(
2     FLASH_EraseInitTypeDef *pEraseInit,
3     uint32_t *PageError)
```

**Purpose:** Erase one or more pages of flash memory.

**Parameters:**

- **pEraseInit:** Pointer to erase configuration structure
- **TypeErase:** FLASH\_TYPEERASE\_PAGES (page erase, 1 KB granularity)
- **PageAddress:** Starting address of first page to erase
- **NbPages:** Number of consecutive pages to erase
- **PageError:** Output parameter: page number where error occurred (if any)

**Return value:**

- **HAL\_OK:** Erase successful
- **HAL\_ERROR:** Erase failed (flash is locked or other error)
- **HAL\_TIMEOUT:** Erase operation timed out

### Bootloader usage example:

```
1 FLASH_EraseInitTypeDef erase_init;
2 uint32_t page_error = 0;
3
4 erase_init.TypeErase = FLASH_TYPEERASE_PAGES;
5 erase_init.PageAddress = 0x08002000; /* Application base */
6 erase_init.NbPages = 1; /* Erase 1 page (1 KB) */
7
8 HAL_FLASH_Unlock();
9 status = HAL_FLASHEx_Erase(&erase_init, &page_error);
10 HAL_FLASH_Lock();
11
12 if (status != HAL_OK)
13 {
14     printf("Erase failed
15 }
```

### Important Notes:

- Erase time: ~20-40 ms per page on STM32F103 at 72 MHz
- Entire page is erased to 0xFFFFFFFF (all 1's)
- Bootloader pages (0-7) should NEVER be erased by application code

- After erase, page must be written to complete the cycle

#### 4. HAL\_FLASH\_Program()

```

1 HAL_StatusTypeDef HAL_FLASH_Program(
2     uint32_t TypeProgram,
3     uint32_t Address,
4     uint64_t Data)

```

**Purpose:** Write data to flash memory at specified address.

**Parameters:**

- **TypeProgram:** FLASH\_TYPEPROGRAM\_HALFWORD (16-bit) or FLASH\_TYPEPROGRAM\_WORD (32-bit)
- **Address:** Flash address where data will be written (must be aligned to word size)
- **Data:** Data to write (16-bit or 32-bit value)

**Return value:**

- **HAL\_OK:** Program successful
- **HAL\_ERROR:** Flash not unlocked, address misaligned, or write error
- **HAL\_TIMEOUT:** Flash operation timed out

#### Bootloader usage (16-bit writes):

```

1 uint32_t flash_addr = 0x08002000; /* Application base */
2 uint8_t firmware_data[128];        /* 128-byte chunk from XModem */
3
4 HAL_FLASH_Unlock();
5
6 for (int i = 0; i < 128; i += 2)
7 {
8     /* Combine two bytes into 16-bit word (little-endian) */
9     uint16_t word = (uint16_t)firmware_data[i] |
10                  ((uint16_t)firmware_data[i + 1] << 8);
11
12     status = HAL_FLASH_Program(FLASH_TYPEPROGRAM_HALFWORD,
13                                flash_addr + i,
14                                word);
15
16     if (status != HAL_OK)
17     {
18         HAL_FLASH_Lock();
19         return -1; /* Write error */
20     }
21 }
22
23 HAL_FLASH_Lock();

```

**Key Characteristics:**

- Write time: ~10-40 microseconds per 16-bit word
- Can only program 0 bits to 1 bits (not 1 to 0)
- Page must be erased first before programming
- STM32F1 supports only 16-bit and 32-bit writes (no 8-bit)
- Misaligned writes will silently fail or produce undefined behavior

### 2.3.5 Complete Flash Write Function from Bootloader

Here is the complete **flash\_write\_page()** function as implemented in main.c, demonstrating proper HAL FLASH API usage:

```
1 static int flash_write_page(uint32_t addr, const uint8_t *data, uint16_t size)
2 {
3     uint32_t page_error = 0;
4     FLASH_EraseInitTypeDef erase_init;
5     HAL_StatusTypeDef status;
6     uint16_t i;
7
8     /* Validate address range */
9     if (addr < APP_BASE_ADDR || (addr + size) > (BOOT_BASE_ADDR + FLASH_SIZE))
10    {
11        return -1;
12    }
13
14     /* Validate size is word-aligned */
15     if (size & 1)
16    {
17        return -1;
18    }
19
20     /* Step 1: Unlock flash for programming */
21     HAL_FLASH_Unlock();
22
23     /* Step 2: Configure erase operation for 1KB page */
24     erase_init.TypeErase = FLASH_TYPEERASE_PAGES;
25     erase_init.PageAddress = addr;
26     erase_init.NbPages = 1;
27
28     /* Step 3: Erase the page */
29     status = HAL_FLASHEx_Erase(&erase_init, &page_error);
30     if (status != HAL_OK)
31    {
32         HAL_FLASH_Lock();
33         return -1;
34     }
35
36     /* Step 4: Program data as 16-bit words */
37     for (i = 0; i < size; i += 2)
38    {
39         uint16_t word = (uint16_t)data[i] | ((uint16_t)data[i + 1] << 8);
40         status = HAL_FLASH_Program(FLASH_TYPEPROGRAM_HALFWORD, addr + i, word);
41         if (status != HAL_OK)
42    {
43         HAL_FLASH_Lock();
44         return -1;
45     }
46 }
47
48     /* Step 5: Lock flash after programming */
49     HAL_FLASH_Lock();
50     return 0;
51 }
```

#### Analysis of the function:

1. **Address Validation** : Ensures write stays within application range (0x08002000 - 0x0800FFFF)
2. **Size Validation** : Checks that size is even (required for 16-bit writes)
3. **Unlock** : Enables flash programming via HAL\_FLASH\_Unlock()
4. **Erase** : Clears target page to 0xFF via HAL\_FLASHEx\_Erase()
5. **Program** : Writes data in 16-bit words with error checking
6. **Lock** : Re-protects flash immediately after write via HAL\_FLASH\_Lock()

This page-buffered approach minimizes erase cycles, reduces wear on flash, and fits within 20 KB RAM.

## 2.4 4. CRC-16-XMODEM Error Detection

**Definition:** CRC (Cyclic Redundancy Check) detects transmission errors. The bootloader uses CRC-16-XMODEM polynomial 0x1021 for frame validation.

### 2.4.1 CRC-16-XMODEM Algorithm

The CRC-16-XMODEM polynomial is:

$$G(x) = x^{16} + x^{12} + x^5 + 1$$

Algorithm:

1. Initialize CRC to 0x0000
2. For each byte in the frame:
  - XOR byte into high byte of CRC
  - Shift CRC left by 1 bit, 8 times
  - If MSB was 1, XOR CRC with 0x1021 after each shift
3. Final CRC is appended as 16-bit big-endian value

### 2.4.2 CRC in Python

```
1 def crc16_xmodem(data):
2     """Calculate XModem CRC-16 (polynomial 0x1021)"""
3     crc = 0
4     for byte in data:
5         crc ^= byte << 8
6         for _ in range(8):
7             crc <= 1
8             if crc & 0x10000:
9                 crc ^= 0x1021
10            crc &= 0xFFFF
11    return crc
```

This produces a 16-bit value highly sensitive to any single-bit error. The probability of an undetected error is less than 1 in 65536.

### 2.4.3 Why CRC-16-XMODEM?

The XModem standard is well-tested and industry-proven. Its 16-bit CRC:

- Detects all 1-bit and 2-bit errors
- Detects most burst errors up to 16 bits
- Is computationally efficient (no large lookup tables)
- Is implemented identically on microcontroller and host

## 3 Foundational Concepts for Flash Loaders

Before implementing or understanding the flash loader protocol, you should be comfortable with these foundational topics.

### 3.1 1. Serial Communication and UART (Mandatory)

- Asynchronous communication (no shared clock between sender and receiver)
- Baud rate: Speed in bits per second (115200 for this loader)
- Frame format: Start bit, 8 data bits, 1 stop bit (10 bits total per byte)
- Timeout handling: Must handle device disconnection or power loss

### **3.2 2. Frame-Based Protocols (Mandatory)**

A frame is a discrete message: [Header | Data | Trailer]. The flash loader uses XModem framing with:

- Clear boundaries (SOH delimiter)
- Block numbering (detect duplicates and losses)
- CRC trailer (detect payload corruption)

### **3.3 3. Flash Memory Organization (Mandatory)**

- STM32F103C8 contains 64 KB flash at 0x08000000 - 0x0800FFFF
- Page size: 1 KB (erase granularity)
- Application starts at page 8 (0x08002000) after 8 KB bootloader
- Bootloader flash must never be erased by application

### **3.4 4. Bootloader State Machine (Important)**

The bootloader implements a clear state machine:

- **Heartbeat State:** Send "BOOT" every 500ms, listen for mode
- **Transfer State:** Receive XModem frames, accumulate, write flash
- **Auto-Jump:** Timeout without host response, jump to application

### **3.5 5. Application Validation (Important)**

Before jumping to application, bootloader verifies:

- **Non-zero initial MSP:** If MSP at 0x08002000 is 0x00000000, application is invalid
- **Valid vector table:** First word must point into RAM
- **Code present:** At least one non-0xFF byte in application region

## 4 Flash Loader Protocol Specification



### 4.1 Protocol Overview

**Name:** STM32F103 UART Flash Loader (XModem-CRC variant)

**Transport:** UART at 115200 baud, 8 data bits, 1 stop bit, no parity

**Physical Interface:** USART1 (PA9=TX, PA10=RX) on STM32F103C8

**Error Detection:** CRC-16-XMODEM (polynomial 0x1021)

**Frame Size:** 128-byte data payload + 5-byte header + 2-byte CRC = 135 bytes

**Addressing:** Application space 0x08002000 - 0x0800FFFF (56 KB)

### 4.2 Communication Phases

#### 4.2.1 Phase 1: Bootloader Heartbeat (Discovery)

Upon power-on or reset, bootloader enters heartbeat state:

1. Initialize clocks (72 MHz via PLL)
2. Initialize UART1 (115200 baud)

3. Validate application (check MSP and initial vector)
4. Start 5-second timeout counter
5. Send "BOOT<CR><LF>" every 500 milliseconds

```

1 Bootloader → Host: "BOOT<CR><LF>" (6 bytes)
2 Bootloader → Host: "BOOT<CR><LF>" (after 500ms)
3 Bootloader → Host: ... (continues until mode received or timeout)

```

**Timeout Behavior:** If no mode command within 5 seconds and application is valid, bootloader automatically jumps to application. This allows devices to boot without host intervention.

#### 4.2.2 Phase 2: Mode Selection

Host receives "BOOT" message and sends mode selection:

```

1 Host → Bootloader: 'C' (0x43) - Select CRC-16 mode
2 Host → Bootloader: 'N' (0x4E) - Select checksum mode (legacy)

```

Upon receiving 'C', bootloader exits heartbeat state and begins listening for data frames.

#### 4.2.3 Phase 3: Data Transfer Loop

```

1 Frame Format: SOH BlockNum ~BlockNum Data[128] CRC_H CRC_L
2 Host → Bootloader: 135-byte frame
3 Bootloader: Validate CRC, accumulate in RAM buffer
4 Bootloader: When page full or transfer complete, erase and write flash
5 Bootloader → Host: ACK (0x06) on success or NAK (0x15) on failure
6 Host: Wait for ACK/NAK before sending next frame
7 (Loop continues for each 128-byte block)

```

**Error Recovery:** If NAK received, host retries the same frame (same block number). Bootloader detects duplicate by block number and ignores.

#### 4.2.4 Phase 4: End of Transfer

After all frames sent:

1. Host sends EOT (0x04)
2. Bootloader flushes any remaining buffered data to flash
3. Bootloader sends final ACK (0x06)
4. Bootloader validates complete application
5. Bootloader jumps to application at 0x08002000

```

1 Host → Bootloader: EOT (0x04)
2 Bootloader: Finalize flash write
3 Bootloader → Host: ACK (0x06)
4 Bootloader: Jump to application

```

## 4.3 Detailed Frame Structure

### 4.3.1 XModem Frame Format (CRC-16 Mode)

| Offset  | Size      | Field   | Purpose                            |
|---------|-----------|---------|------------------------------------|
| 0       | 1 byte    | SOH     | 0x01 - marks start of frame        |
| 1       | 1 byte    | Block#  | Frame counter (1-255, wraps)       |
| 2       | 1 byte    | ~Block# | Complement of block# (error check) |
| 3-130   | 128       | Payload | Application data                   |
| 131-132 | 2         | CRC-16  | Big-endian CRC over Payload        |
| Total:  | 135 bytes |         |                                    |

### 4.3.2 CRC-16 Calculation

CRC is calculated over the 128-byte Payload only (not SOH, block numbers, or CRC itself).

- **Polynomial:** 0x1021 (standard XModem)
- **Initial value:** 0x0000
- **Byte order:** Big-endian (MSB first)
- **Output:** Appended as CRC\_H (MSB at offset 131) then CRC\_L (LSB at offset 132)

### 4.3.3 Block Number Wraparound

Block number increments from 1 to 255, then wraps to 1. Block zero is never used. This detects:

- Duplicate frames (same block number received twice)
- Lost frames (block number jumps)
- Out-of-order frames

Sequence: 1, 2, 3, ..., 254, 255, 1, 2, ...

## 4.4 Control Characters

| Name | Value | Meaning                           |
|------|-------|-----------------------------------|
| SOH  | 0x01  | Start of Header (frame delimiter) |
| ACK  | 0x06  | Acknowledge (success)             |
| NAK  | 0x15  | Negative Acknowledge (retry)      |
| CAN  | 0x18  | Cancel (abort transfer)           |
| EOT  | 0x04  | End of Transmission               |
| 'C'  | 0x43  | Initiate CRC-16 mode              |
| 'N'  | 0x4E  | Initiate checksum mode            |

## 4.5 Timeout Specifications

| Context              | Timeout   | Purpose                                      |
|----------------------|-----------|----------------------------------------------|
| Heartbeat            | 500 ms    | Interval between "BOOT" messages             |
| Bootloader Auto-Jump | 5 seconds | Exit bootloader if no host response          |
| Per-Frame Response   | 5 seconds | Wait for ACK/NAK after sending frame         |
| Per-Byte RX          | 5 seconds | Timeout waiting for a single byte            |
| Post-EOT             | N/A       | Bootloader immediately jumps after final ACK |

## 4.6 Memory Organization During Transfer

Bootloader manages flash write as frames arrive:

- **page\_buffer[1024]**: RAM-based accumulation buffer (one flash page size)
- **Buffering strategy**: 8 frames  $\times$  128 bytes = 1024 bytes = 1 complete flash page
- **Write timing**: When buffer full or EOT received, entire page is:
  1. Erased (20-40 ms)
  2. Written as 16-bit words (10-40  $\mu$ s per word)
  3. Verified by read-back

This minimizes erase cycles (erase is slow) while staying within 20 KB RAM.

## 4.7 Application Launch and Reset Behavior

### 4.7.1 Bootloader Startup and System Clock Initialization

When the STM32F103C8 exits reset, the bootloader begins executing at 0x08000000:

```
1 int main(void)
2 {
3     /* Initialize HAL and system clock */
4     HAL_Init();
5     SystemClock_Config();
6
7     /* Initialize UART (115200 baud for flash loader) */
8     MX_USART1_UART_Init();
9
10    /* Run flash loader protocol (heartbeat sent in flash_loader_main) */
11    flash_loader_main();
12
13    /* If we reach here, jump to application in flash */
14    jump_to_application();
15
16    /* Should never reach this point */
17    while (1) { }
18 }
```

#### Key initialization steps:

1. **HAL\_Init()** : Initializes HAL infrastructure (NVIC, SysTick timer, interrupt priorities)
2. **SystemClock\_Config()** : Configures 72 MHz clock from 8 MHz HSE via PLL
3. **MX\_USART1\_UART\_Init()** : Configures USART1 (PA9=TX, PA10=RX) at 115200 baud

```
1 void SystemClock_Config(void)
2 {
3     RCC_OscInitTypeDef RCC_OscInitStruct = {0};
4     RCC_ClkInitTypeDef RCC_ClkInitStruct = {0};
5
6     /* Enable HSE oscillator (8 MHz crystal) */
7     RCC_OscInitStruct.OscillatorType = RCC_OSCILLATORTYPE_HSE;
8     RCC_OscInitStruct.HSEState = RCC_HSE_ON;
9     RCC_OscInitStruct.HSEPredivValue = RCC_HSE_PREDIV_DIV1;
10
11    /* Configure PLL: 8 MHz * 9 = 72 MHz */
12    RCC_OscInitStruct.PLL.PLLState = RCC_PLL_ON;
13    RCC_OscInitStruct.PLL.PLLSource = RCC_PLLSOURCE_HSE;
14    RCC_OscInitStruct.PLL.PLLMUL = RCC_PLL_MUL9;
15
16    HAL_RCC_OscConfig(&RCC_OscInitStruct);
17 }
```

```

18 /* Configure clock distribution */
19 RCC_ClkInitStruct.ClockType = RCC_CLOCKTYPE_HCLK | RCC_CLOCKTYPE_SYSCLK
20           | RCC_CLOCKTYPE_PCLK1 | RCC_CLOCKTYPE_PCLK2;
21 RCC_ClkInitStruct.SYSCLKSource = RCC_SYSCLKSOURCE_PLLCLK;
22 RCC_ClkInitStruct.AHBCLKDivider = RCC_SYSCLK_DIV1;    /* AHB: 72 MHz */
23 RCC_ClkInitStruct.APB1CLKDivider = RCC_HCLK_DIV2;      /* APB1: 36 MHz */
24 RCC_ClkInitStruct.APB2CLKDivider = RCC_HCLK_DIV1;      /* APB2: 72 MHz */
25
26 /* Set flash wait states for 72 MHz (requires 2 wait states) */
27 HAL_RCC_ClockConfig(&RCC_ClkInitStruct, FLASH_LATENCY_2);
28 }

```

### Clock Configuration Details:

- **HSE (High-Speed External):** 8 MHz crystal oscillator
- **PLL multiplication:**  $8 \text{ MHz} \times 9 = 72 \text{ MHz}$  system clock
- **AHB clock:** 72 MHz (no divider, full speed)
- **APB1 clock:** 36 MHz ( $\div 2$ , required limit is 36 MHz max)
- **APB2 clock:** 72 MHz (full speed, used for UART)
- **Flash wait states:** 2 cycles (required at 72 MHz per datasheet)

#### 4.7.2 Bootloader Reset Behavior

##### What happens during reset:

1. Hardware resets all internal state, clears RAM
2. Processor starts in reset handler defined in startup assembly (startup\_stm32f103xb.s)
3. Reset handler initializes stack pointer (MSP) to 0x20005000 (top of 20 KB RAM)
4. Call SystemInit() to set initial clock configuration
5. Initialize BSS section (zero-initialized global variables)
6. Call main()

##### Critical reset details for flash loader:

- Flash controller is LOCKED at reset (HAL\_FLASH\_Unlock must be called)
- All interrupts are disabled at reset
- SysTick timer is not running (HAL\_Init enables it)
- All GPIO pins default to floating input
- UART is not initialized (MX\_USART1\_UART\_Init must be called)
- Vector table at 0x08000000 (bootloader reset vectors)

#### 4.7.3 Application Validation Before Jump

Before jumping to application, bootloader validates:

```

1 static int check_application_valid(void)
2 {
3     uint32_t app_sp = *(volatile uint32_t *) (APP_BASE_ADDR + 0);
4     uint32_t app_reset = *(volatile uint32_t *) (APP_BASE_ADDR + 4);
5
6     /* Check if stack pointer is in valid RAM range */
7     if (app_sp < RAM_START_ADDR || app_sp > RAM_END_ADDR)
8     {

```

```

9     return 0; /* Invalid MSP */
10    }
11
12 /* Check if reset vector is in application flash range */
13 if ((app_reset < APP_BASE_ADDR) || (app_reset > (BOOT_BASE_ADDR + FLASH_SIZE)))
14 {
15     return 0; /* Invalid reset vector */
16 }
17
18 /* Check if reset vector has Thumb bit set (LSB = 1) */
19 if ((app_reset & 0x1) == 0)
20 {
21     return 0; /* Not Thumb code */
22 }
23
24 return 1; /* Application is valid */
25 }

```

### Validation checks explained:

1. **Stack Pointer (MSP)** : Read from offset 0x08002000. Must point into RAM (0x20000000 - 0x20005000)
2. **Reset Vector** : Read from offset 0x08002004. Must point into flash (0x08002000 - 0x0800FFFF)
3. **Thumb Bit** : Reset vector LSB must be 1 (Cortex-M3 requires Thumb mode)

If any check fails, bootloader returns to heartbeat state and continues waiting for flash command.

#### 4.7.4 Bootloader Jump-to-Application Sequence

When jumping to application, bootloader must ensure clean handoff:

```

1 static void jump_to_application(void)
2 {
3     typedef void (*app_entry_t)(void);
4     uint32_t app_sp;
5     uint32_t app_reset;
6     app_entry_t app_reset_handler;
7
8     /* Step 1: Read application vector table */
9     app_sp = *(volatile uint32_t *) (APP_BASE_ADDR + 0);
10    app_reset = *(volatile uint32_t *) (APP_BASE_ADDR + 4);
11
12    /* Step 2: Disable all interrupts before context switch */
13    __disable_irq();
14
15    /* Step 3: Stop SysTick timer */
16    SysTick->CTRL = 0;
17
18    /* Step 4: Disable all NVIC interrupts and clear pending flags */
19    for (uint32_t i = 0; i < 8; i++)
20    {
21        NVIC->ICER[i] = 0xFFFFFFFFU; /* Interrupt Clear Enable */
22        NVIC->ICPR[i] = 0xFFFFFFFFU; /* Interrupt Clear Pending */
23    }
24
25    /* Step 5: Deinitialize HAL peripherals (especially UART) */
26    HAL_UART_DeInit(&huart1);
27    HAL_DeInit();
28
29    /* Step 6: Relocate NVIC vector table to application */
30    SCB->VTOR = APP_BASE_ADDR; /* Vector table offset register */
31
32    /* Step 7: Set application stack pointer */
33    __set_MSP(app_sp);
34
35    /* Step 8: Jump to application reset handler */
36    app_reset_handler = (app_entry_t)app_reset;
37    app_reset_handler();
38

```

```

39  /* Never reached */
40  while (1);
41 }

```

#### Detailed explanation of each step:

1. **Read Vector Table** : Fetch MSP (offset 0) and Reset\_Handler (offset 4) from application base
2. **Disable Interrupts** : CPSID instruction prevents interrupts during transition
3. **Stop SysTick** : Set SysTick->CTRL to 0 to halt system tick timer
4. **Disable All NVIC Interrupts** : Clear all ICER (Interrupt Clear Enable) bits and ICPR (Interrupt Clear Pending) bits
5. **HAL Deinitialization** : Call HAL\_UART\_DeInit() to release UART1 (prevents interference), call HAL\_DeInit() for other HAL cleanup
6. **Set VTOR** : SCB->VTOR (NVIC Vector Table Offset Register) = 0x08002000 relocates interrupt vector table
7. **Set MSP** : \_\_set\_MSP() instruction sets ARM stack pointer to application's initial MSP
8. **Jump to Code** : Function pointer cast and call executes application reset handler

#### Why this sequence is critical:

- **Disable interrupts first:** Prevents IRQ firing while still executing bootloader code
- **Stop SysTick:** SysTick interrupt fires every 1 ms; must be disabled before jump
- **Clear NVIC interrupt flags:** Bootloader may have pending interrupts that application doesn't expect
- **Deinitialize HAL:** Stops UART reception/transmission, closes GPIO handles
- **Set VTOR:** Application's interrupt vectors now routed to 0x08002000, not bootloader's 0x08000000
- **Set MSP last:** After VTOR is set, we update stack pointer to application's expected value
- **Jump last:** Final instruction hands control to application code

After the jump, application's reset handler executes with clean state, all bootloader code and state left behind.

## 5 Host-Side Implementation: Python Flash Loader

The host flash\_loader.py script implements the client side of the protocol.

### 5.1 High-Level Flow

```

1. Open serial port (115200 baud)
2. Wait for "BOOT" message from bootloader
3. Send 'C' to select CRC-16 mode
4. Read firmware binary file
5. Pad to 128-byte blocks (fill with 0xFF)
6. For each 128-byte block:
   a. Construct XModem frame (SOH + BlockNum + ~BlockNum + Data + CRC)
   b. Transmit frame
   c. Wait for ACK/NAK
   d. If NAK: retry (up to 3 times)
7. Send EOT (0x04)
8. Wait for final ACK
9. Close serial port

```

## 5.2 Key Functions

### 5.2.1 crc16\_xmodem()

Computes CRC-16 for frame data (identical algorithm as bootloader):

```
1 def crc16_xmodem(data):
2     crc = 0
3     for byte in data:
4         crc ^= byte << 8
5         for _ in range(8):
6             crc <= 1
7             if crc & 0x10000:
8                 crc ^= 0x1021
9             crc &= 0xFFFF
10    return crc
```

### 5.2.2 send\_firmware()

Main function orchestrating the transfer:

- Opens serial port with retry
- Waits for bootloader heartbeat
- Sends mode selection ('C')
- Loops through firmware blocks
- Builds and transmits XModem frames
- Handles ACK/NAK responses with retry logic (max 3 retries per frame)
- Sends EOT and waits for final ACK

## 5.3 Error Handling and Retries

Host-side error handling includes:

- **Frame Timeout:** If no ACK/NAK within 2 seconds, retry
- **CRC Mismatch NAK:** Bootloader sends NAK; host retransmits same frame
- **Max Retries:** After 3 failed attempts, abort transfer with error message
- **Serial Port Errors:** Graceful handling of disconnection

## 6 Conclusion and Next Steps

The STM32F103 UART Flash Loader provides a complete, robust solution for firmware updates over UART. Key takeaways:

1. **Protocol Design:** XModem-CRC is industry-standard, reliable, and simple to implement
2. **HAL Integration:** Hardware abstraction ensures portability and correctness across STM32 families
3. **State Machine:** Clear, testable bootloader logic with defined phases
4. **Error Recovery:** Timeouts, retries, and CRC validation provide robustness against real-world communication failures
5. **Hands-On Implementation:** Both firmware (C) and host (Python) code provided as complete working references

Future enhancements could include:

- Firmware signatures for cryptographic authentication
- Dual-image OTA (over-the-air) updates with rollback

- USB DFU alternative transport layer
- Watchdog-based auto-recovery for stalled updates
- Bootloader self-update capability
- Progress indication and transfer resumption

This document provides a solid foundation for understanding, implementing, and extending flash loading solutions on STM32 microcontrollers. The combination of clear protocol specification, detailed HAL layer explanation, and working code examples ensures that firmware engineers can confidently deploy reliable firmware update systems in production environments.

## Appendix: Quick Reference

### Bootloader Quick-Start

1. Connect STM32F103C8 to serial port (PA9/PA10)
2. Power on device
3. Observe "BOOT<CR><LF>" message every 500ms
4. Run: python3 flash\_loader.py -p /dev/ttyUSB0 -f application.bin
5. After transfer completes, bootloader jumps to application

### Hardware Requirements

- STM32F103C8 (or STM32F103CB for 128 KB variant)
- USB-to-UART adapter (FTDI, CH340, etc.)
- 5V power supply
- UART connections: TX → PA9, RX → PA10, GND → GND

### Memory Map Summary

```
1 0x08000000 - 0x08001FFF: Bootloader (8 KB)
2 0x08002000 - 0x0800FFFF: Application (56 KB)
3 0x20000000 - 0x20005000: RAM (20 KB)
4 Page Buffer @ RAM: 1 KB for flash write buffering
```

### Frame Checksum Verification

Always verify:

1. Block number consistency (byte 1 = ~byte 2)
2. CRC-16 match over 128-byte payload (bytes 3-130)
3. SOH delimiter presence at frame start

### Working Example: Real Flash Load Session

Below is a real successful flash loading session from a development board:

```
1 $ st-flash reset && sleep 1 && cd /home/ashwin/Desktop/Projects/FlashLoader && python3
   scripts/simple_flash.py -p /dev/ttyUSB0 -f App1/build/Simple_BootLoader.bin
st-flash 1.8.0
2 2026-02-01T11:13:36 INFO common.c: STM32F1xx_MD: 20 KiB SRAM, 64 KiB flash in at least 1 KiB pages.
3 2026-02-01T11:13:36 INFO common.c: NRST is not connected --> using software reset via AIRCR
4 2026-02-01T11:13:36 INFO common.c: Go to Thumb mode
5 Firmware size: 4432 bytes
6 Connected to /dev/ttyUSB0 at 115200 baud
7 Waiting for bootloader...
8 Bootloader detected.
9 Sending START command...
10 START acknowledged.
11 Starting transfer (554 chunks = 4432 bytes)...
12 Chunk 1/554 ( 0 bytes): OK
13 Chunk 129/554 ( 1024 bytes): OK
14 Chunk 257/554 ( 2048 bytes): OK
15 Chunk 385/554 ( 3072 bytes): OK
16 Chunk 513/554 ( 4096 bytes): OK
17 Chunk 554/554 ( 4424 bytes): OK
18
```

```
20 All 554 chunks transferred successfully!
21 Sending END command...
22 Firmware transfer complete!
23 Bootloader will now jump to application at 0x08002000
24 Serial port closed.
25 Flash loading successful.
```

#### Analysis:

- Device detection: STM32F1xx with 20 KiB SRAM and 64 KiB flash
- UART connection established at 115200 baud
- Bootloader heartbeat received ("Bootloader detected")
- Protocol handshake successful (START acknowledged)
- Firmware size: 4432 bytes (fits in 56 KB application space)
- Transfer chunked into 554 blocks with progress indicators at every 128 blocks
- All chunks acknowledged (ACK received for each frame)
- Final confirmation: Application jump initiated to 0x08002000

## Troubleshooting Common Issues

- **No "BOOT" message:** Check UART wiring and baud rate (115200)
- **NAK on every frame:** CRC mismatch—verify CRC algorithm matches
- **Application does not start:** Verify application linked to 0x08002000; check bootloader MSP validation
- **Device hangs mid-transfer:** Flash write timeout—check clock configuration and flash wait states
- **"BOOT" but no mode response:** Try resending 'C' manually; timeout is 5 seconds