

# Bit-Banding GPIO HAL on STM32

Ashok Ashwin

February 15, 2026

## Contents

|                                               |          |
|-----------------------------------------------|----------|
| <b>I What is Bit-Banding?</b>                 | <b>2</b> |
| 1 The Bit-Band Memory Model                   | 2        |
| 1.1 The Alias Address Formula                 | 2        |
| 1.2 Why Bit-Banding Matters                   | 2        |
| <br>                                          |          |
| <b>II The GPIO Bit-Band HAL</b>               | <b>3</b> |
| 2 Project Structure                           | 3        |
| 3 The Core Bit-Band Macro                     | 3        |
| 4 Clock Enable via Bit-Banding                | 3        |
| 5 Pin Configuration                           | 4        |
| 6 Pin Set, Clear, Toggle, and Read            | 4        |
| <br>                                          |          |
| <b>III Application: Blinking PC13</b>         | <b>5</b> |
| 7 The main.c Source                           | 5        |
| 8 Startup and Linker                          | 5        |
| 9 Makefile                                    | 6        |
| <br>                                          |          |
| <b>IV Bit-Banding in Depth</b>                | <b>8</b> |
| 10 Traditional Register Access vs Bit-Banding | 8        |
| 10.1 Traditional: Read-Modify-Write on ODR    | 8        |
| 10.2 Bit-Band: Single Store                   | 8        |
| 11 When to Use Bit-Banding                    | 8        |
| 12 Limitations                                | 8        |

## Part I

# What is Bit-Banding?

ARM Cortex-M3 processors provide a hardware feature called bit-banding that allows atomic, single-instruction access to individual bits in memory. Instead of the conventional read-modify-write pattern required to manipulate a single bit in a register, bit-banding maps each bit in a 1 MB region to a full 32-bit word in a 32 MB alias region. Writing to the alias word sets or clears exactly one bit in the original register — no masking, no shifting, no race conditions.

This project demonstrates bit-banding by building a small GPIO HAL for the STM32F103C8T6 (Blue Pill). The HAL uses bit-band alias addresses to enable clocks, set pins, clear pins, and toggle pins without read-modify-write sequences.

## 1 The Bit-Band Memory Model

The Cortex-M3 defines two bit-band regions:

- Peripheral bit-band region: 0x40000000 – 0x400FFFFF (1 MB)
- Peripheral bit-band alias: 0x42000000 – 0x43FFFFFF (32 MB)
- SRAM bit-band region: 0x20000000 – 0x200FFFFF (1 MB)
- SRAM bit-band alias: 0x22000000 – 0x23FFFFFF (32 MB)

Every bit in the 1 MB region is mapped to a word in the 32 MB alias region. Since 1 MB = 8 million bits and each alias word is 4 bytes, the alias region occupies 32 MB.

### 1.1 The Alias Address Formula

For a peripheral register at address  $A$  and bit number  $n$  (0–31):

$$\text{alias\_addr} = 0x42000000 + (A - 0x40000000) \times 32 + n \times 4$$

The multiplication by 32 converts byte offset to bit offset (8 bits per byte  $\times$  4 bytes per alias word), and the multiplication by 4 selects the correct word within that byte's 8 alias words.

### 1.2 Why Bit-Banding Matters

Consider enabling the GPIOC clock. The traditional approach uses a read-modify-write pattern:

```
1 // Traditional read-modify-write: 3 bus transactions
2 RCC_APB2ENR |= (1 << 4); // READ register, OR with mask, WRITE back
```

This compiles to a load, an OR, and a store — three bus transactions. If an interrupt fires between the load and the store, it could modify the same register, and the write-back would overwrite that change.

With bit-banding:

```
1 // Bit-band: 1 bus transaction, atomic
2 BITBAND_PERIPH(RCC_APB2ENR_ADDR, 4) = 1; // Single STORE to alias
```

This is a single store instruction. The bus hardware translates it into an atomic bit-set on the actual register. No read, no mask, no race.

**Key advantages:**

- Atomic: A single store cannot be interrupted mid-operation
- No masking: No shift or OR/AND operations needed in software
- Interrupt-safe: No window for data corruption between read and write
- Smaller code: One instruction instead of three

## Part II

# The GPIO Bit-Band HAL

## 2 Project Structure

- gpio\_bitband.h
- main.c
- startup.s
- linker.inv
- Makefile

The HAL is implemented as a single header file (gpio\_bitband.h) with static inline functions. This ensures zero call overhead – the compiler inlines every function, producing the same machine code as hand-written register access.

## 3 The Core Bit-Band Macro

```
1 #define PERIPH_BASE      0x40000000U
2 #define PERIPH_BB_BASE   0x42000000U
3
4 #define BITBAND_PERIPH(reg_addr, bit)
5     (*volatile unsigned int *) (PERIPH_BB_BASE +
6         (((unsigned int)(reg_addr) - PERIPH_BASE) * 32U) +
7             ((bit) * 4U))
```

This macro takes a peripheral register address and a bit number, then computes the 32-bit alias address. The result is a volatile dereference, so it can be used on both sides of an assignment:

```
1 BITBAND_PERIPH(addr, bit) = 1;    // Set bit
2 BITBAND_PERIPH(addr, bit) = 0;    // Clear bit
3 x = BITBAND_PERIPH(addr, bit);   // Read bit (returns 0 or 1)
```

## 4 Clock Enable via Bit-Banding

```
1 #define RCC_APB2ENR_ADDR  (RCC_BASE + 0x18U)
2
3 /* RCC_APB2ENR bit positions */
4 #define RCC_IOPAEN_BIT    2U
5 #define RCC_IOPBEN_BIT    3U
6 #define RCC_IOPCEN_BIT    4U
7 #define RCC_IOPDEN_BIT    5U
8
9 static inline void gpio_clock_enable(gpio_port_t port) {
10     BITBAND_PERIPH(RCC_APB2ENR_ADDR, _gpio_rcc_bit(port)) = 1U;
11 }
```

For GPIOC (bit 4 in RCC\_APB2ENR at 0x40021018):

$$\text{alias} = 0x42000000 + (0x40021018 - 0x40000000) \times 32 + 4 \times 4$$

$$= 0x42000000 + 0x21018 \times 32 + 16$$

$$= 0x42000000 + 0x420300 + 0x10$$

$$= 0x42420310$$

Writing 1 to address 0x42420310 atomically sets bit 4 in RCC\_APB2ENR, enabling the GPIOC clock.

## 5 Pin Configuration

Pin configuration involves a 4-bit field (MODE[1:0] + CNF[1:0]) and cannot be done with a single bit-band write. This function uses the traditional read-modify-write pattern:

```
1 static inline void gpio_pin_config(gpio_port_t port, unsigned int pin,
2                                     gpio_mode_t mode, gpio_cnf_t cnf) {
3     unsigned int base = _gpio_base(port);
4     unsigned int offset = (pin < 8U) ? GPIO_CRL_OFF : GPIO_CRH_OFF;
5     unsigned int pos = (pin % 8U) * 4U;
6     volatile unsigned int *cr = (volatile unsigned int *) (base + offset);
7
8     *cr &= ~(0xFU << pos);
9     *cr |= (((unsigned int) cnf << 2U) | (unsigned int) mode) << pos;
10 }
```

This is a deliberate design choice: bit-banding is used where it provides a real benefit (single-bit operations), while multi-bit fields still require the mask-and-set approach.

## 6 Pin Set, Clear, Toggle, and Read

These operations target single bits in ODR or IDR, making them ideal for bit-banding:

```
1 /* Set pin HIGH: write 1 to ODR bit alias */
2 static inline void gpio_pin_set(gpio_port_t port, unsigned int pin) {
3     BITBAND_PERIPH(_gpio_base(port) + GPIO_ODR_OFF, pin) = 1U;
4 }
5
6 /* Clear pin LOW: write 0 to ODR bit alias */
7 static inline void gpio_pin_clear(gpio_port_t port, unsigned int pin) {
8     BITBAND_PERIPH(_gpio_base(port) + GPIO_ODR_OFF, pin) = 0U;
9 }
10
11 /* Toggle: read ODR bit via alias, XOR, write back */
12 static inline void gpio_pin_toggle(gpio_port_t port, unsigned int pin) {
13     volatile unsigned int *alias =
14         (volatile unsigned int *) (PERIPH_BB_BASE +
15             ((_gpio_base(port) + GPIO_ODR_OFF - PERIPH_BASE) * 32U) +
16             (pin * 4U));
17     *alias ^= 1U;
18 }
19
20 /* Read pin: returns 0 or 1 from IDR bit alias */
21 static inline unsigned int gpio_pin_read(gpio_port_t port, unsigned int pin) {
22     return BITBAND_PERIPH(_gpio_base(port) + GPIO_IDR_OFF, pin);
23 }
```

Note that `gpio_pin_toggle` still requires a read-XOR-write on the alias word, but this operates on a single word containing only 0 or 1, which is simpler than masking a 32-bit register. For truly atomic set/clear without any read, use `gpio_pin_set` and `gpio_pin_clear` directly.

# Part III

## Application: Blinking PC13

### 7 The main.c Source

```
1  /* main.c */
2  #include "gpio_bitband.h"
3
4  int main(void) {
5
6      // Enable GPIOC clock via bit-banding (atomic)
7      gpio_clock_enable(GPIO_PORT_C);
8
9      // Configure PC13 as push-pull output, 50 MHz
10     gpio_pin_config(GPIO_PORT_C, 13, GPIO_MODE_OUTPUT_50MHZ, GPIO_CNF_OUTPUT_PP);
11
12     // Blink loop using bit-band toggle
13     while (1) {
14         gpio_pin_toggle(GPIO_PORT_C, 13);
15         for (volatile int i = 0; i < 100000; i++);
16     }
17
18     return 0;
19 }
```

Compare this with the original register-level version:

```
1  /* Original: raw register access */
2  #define RCC_APB2ENR    (*(volatile unsigned int *)0x40021018)
3  #define GPIOC_CRH     (*(volatile unsigned int *)0x40011004)
4  #define GPIOC_ODR     (*(volatile unsigned int *)0x4001100C)
5
6  int main(void) {
7      RCC_APB2ENR |= (1 << 4);
8      GPIOC_CRH &= ~(0xF << 20);
9      GPIOC_CRH |= (0x3 << 20);
10     while(1) {
11         GPIOC_ODR ^= (1 << 13);
12         for(volatile int i = 0; i < 100000; i++);
13     }
14     return 0;
15 }
```

The HAL version is more readable, port-independent, and the clock enable is genuinely atomic thanks to bit-banding.

### 8 Startup and Linker

The startup file and linker script are unchanged from the base project. The startup initializes the stack pointer and calls main():

```

1 .syntax unified
2 .cpu cortex-m3
3 .fpu softvfp
4 .thumb
5
6 .global vtable
7 .global reset_handler
8
9 .section .isr_vector
10 .align 2
11 vtable:
12     .word _estack
13     .word reset_handler + 1
14
15 .section .text
16 .align 2
17 reset_handler:
18     ldr r0, =_estack
19     mov sp, r0
20     bl main
21     b .
22     .ltorg

```

Listing 1: startup.s

The linker script defines Flash and RAM regions and computes the stack top:

```

1 MEMORY
2 {
3     FLASH ( rx )      : ORIGIN = 0x08000000, LENGTH = 64K
4     RAM ( rxw )       : ORIGIN = 0x20000000, LENGTH = 20K
5 }
6
7 _estack = ORIGIN(RAM) + LENGTH(RAM);
8
9 SECTIONS
10 {
11     .text : { *(.isr_vector) *(.text) *(.text*) *(.rodata) *(.rodata*) } > FLASH
12     .data : { *(.data) *(.data*) } > RAM AT> FLASH
13     .bss  : { *(.bss) *(.bss*) *(COMMON) } > RAM
14 }

```

## 9 Makefile

The HAL is header-only, so the Makefile requires no additional source files:

```

1 # Toolchain
2 CC = arm-none-eabi-gcc
3 OBJCOPY = arm-none-eabi-objcopy
4 SIZE = arm-none-eabi-size
5 GDB = arm-none-eabi-gdb
6
7 TARGET = program
8 ASM_SOURCES = startup.s
9 C_SOURCES = main.c
10 OBJECTS = $(ASM_SOURCES:.s=.o) $(C_SOURCES:.c=.o)
11
12 CFLAGS = -Og -g3 -mthumb -mcpu=cortex-m3 -Wall --specs=nosys.specs -nostdlib
13 LDFLAGS = -T linker.inv -Wl,-Map=$(TARGET).map
14
15 all: $(TARGET).elf $(TARGET).bin
16
17 %.o: %.c
18     $(CC) $(CFLAGS) -c $< -o $@
19 %.o: %.s
20     $(CC) $(CFLAGS) -c $< -o $@
21 $(TARGET).elf: $(OBJECTS)
22     $(CC) $(CFLAGS) $(OBJECTS) $(LDFLAGS) -o $@
23     $(SIZE) $@
24 $(TARGET).bin: $(TARGET).elf
25     $(OBJCOPY) -O binary $< $@
26
27 flash: $(TARGET).elf
28     openocd -c "set CPUTAPID 0x1ba01477" -f interface/stlink.cfg -f target/stm32f1x.cfg
29     -c "program $(TARGET).elf verify reset exit"
30
31 clean:
32     rm -f $(TARGET).elf $(TARGET).bin $(TARGET).map $(OBJECTS)
33
34 .PHONY: all flash clean

```

# Part IV

## Bit-Banding in Depth

### 10 Traditional Register Access vs Bit-Banding

To understand the concrete improvement, consider what happens at the instruction level when setting a single GPIO pin.

#### 10.1 Traditional: Read-Modify-Write on ODR

```
1 // C code
2 GPIOC_ODR |= (1 << 13);
```

Compiles to approximately:

```
1 ldr r0, =0x4001100C      ; Load ODR address
2 ldr r1, [r0]                ; READ current ODR value
3 orr r1, r1, #(1 << 13)   ; MODIFY: set bit 13
4 str r1, [r0]                ; WRITE back
```

Four instructions, two bus transactions (load + store). An interrupt between the ldr and str could corrupt the register.

#### 10.2 Bit-Band: Single Store

```
1 // C code
2 BITBAND_PERIPH(0x4001100C, 13) = 1;
```

Compiles to approximately:

```
1 ldr r0, =0x42220034      ; Load alias address for ODR bit 13
2 movs r1, #1                 ; Value to write
3 str r1, [r0]                ; Single STORE - hardware sets bit 13
```

Three instructions, one bus write. The store is atomic — no interrupt can interfere.

### 11 When to Use Bit-Banding

- Single-bit operations on peripheral registers (clock enable, pin set/clear, flag checks)
- Interrupt-safe bit manipulation without disabling interrupts
- SRAM flags shared between main code and ISR handlers
- Any scenario where read-modify-write races are a concern

### 12 Limitations

- Only available on Cortex-M3 and Cortex-M4 (not M0, not M7 in most implementations)
- Only the first 1 MB of each region is bit-addressable
- Multi-bit fields (like GPIO mode configuration) still require read-modify-write
- The alias region consumes 32 MB of address space (no physical memory cost)
- Toggle still needs a read-XOR-write on the alias word