



# AES-128 Encryption Using HPS–FPGA Co-Design

**Presented by:** Husam Aldulaimi

**Email:** [haa190002@utdallas.edu](mailto:haa190002@utdallas.edu), [enghusam1977@gmail.com](mailto:enghusam1977@gmail.com)

**Course:** EEDG 6370 – Design & Analysis of Reconfigurable Systems

**Toolchain:** Quartus Prime, Platform Designer, Altera SoC EDS (Embedded Design Suite), Linux (HPS)

# Project Motivation

AES-128 is a widely used symmetric encryption standard.

Hardware acceleration provides:

- Higher performance
- Deterministic latency
- Lower CPU load

Cyclone V SoC enables **tight integration** between:

- ARM Cortex-A9 (HPS)
- FPGA fabric

**Goal:**  
Implement AES-128 encryption in the FPGA and control it from the HPS using memory-mapped registers.

# System Architecture Overview

Main Components:

- HPS (ARM Cortex-A9)
  - Generates a random 128-bit key
  - Reads the input plaintext
  - Controls encryption
  - Reads the FPGA ciphertext
  - Decrypts the FPGA ciphertext
- FPGA Fabric
  - AES-128 encryption core
  - Control & status logic
- HPS–FPGA Lightweight AXI Bridge
  - Memory-mapped communication

Communication Style: Register-based (PIO peripherals)



# Hardware Architecture (FPGA Side)

## FPGA Modules:

- AES-128 encryption core (128-bit data path)
- Input registers (4 × 32-bit plaintext)
- Key registers (4 × 32-bit key)
- Output registers (4 × 32-bit ciphertext)
- Control register (START)
- Status register (BUSY, DONE)

## Clock Domain:

CLOCK\_50 (50 MHz)

# HPS $\leftrightarrow$ FPGA Memory Map

- **Bridge:** HPS Lightweight AXI $\rightarrow$ Avalon-MM
  - IN0\_addr–IN3\_addr: plaintext
  - KEY0\_addr–KEY3\_addr: key
  - ENC0\_addr–ENC3\_addr: ciphertext
  - CTRL\_addr: control
    - only bit[0] is used, for the START signal
    - `wire start = ctrl_conduit[0];`
  - STATUS\_addr: status
    - bit[0] is BUSY, bit[1] is DONE (the rightmost element of the concatenation is the least significant bit)
    - `assign status_conduit = {30'd0, done, busy};`

| Register            | Address Offset | Width  | Direction              |
|---------------------|----------------|--------|------------------------|
| IN0_addr–IN3_addr   | INx_BASE       | 32-bit | HPS $\rightarrow$ FPGA |
| KEY0_addr–KEY3_addr | KEYx_BASE      | 32-bit | HPS $\rightarrow$ FPGA |
| ENC0_addr–ENC3_addr | ENCx_BASE      | 32-bit | FPGA $\rightarrow$ HPS |
| CTRL_addr           | CTRL_BASE      | 32-bit | HPS $\rightarrow$ FPGA |
| STATUS_addr         | STATUS_BASE    | 32-bit | FPGA $\rightarrow$ HPS |
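
The register map above can be sketched on the HPS side as follows. This is a minimal illustration, not the project's actual source: `0xFF200000` with a 2 MB span is the Cyclone V lightweight HPS-to-FPGA bridge window, while the `*_BASE` offsets, `REG_STRIDE`, and the helper names `reg_ptr` and `map_lw_bridge` are assumptions made for the example. Real offsets come from the header generated for the Platform Designer system.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#define LW_BRIDGE_BASE 0xFF200000u /* Cyclone V lightweight HPS-to-FPGA bridge */
#define LW_BRIDGE_SPAN 0x00200000u /* 2 MB window */

/* Hypothetical byte offsets; replace with the generated *_BASE values. */
#define IN0_BASE    0x00u
#define KEY0_BASE   0x40u
#define ENC0_BASE   0x80u
#define CTRL_BASE   0xC0u
#define STATUS_BASE 0xD0u
#define REG_STRIDE  0x10u /* assumed spacing between consecutive PIO words */

/* Turn a byte offset inside the mapped window into a 32-bit register pointer. */
static volatile uint32_t *reg_ptr(void *virt_base, uint32_t byte_offset)
{
    return (volatile uint32_t *)((uint8_t *)virt_base + byte_offset);
}

/* Map the bridge window through /dev/mem; returns NULL on failure. */
static void *map_lw_bridge(void)
{
    int fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd < 0) {
        return NULL;
    }
    void *base = mmap(NULL, LW_BRIDGE_SPAN, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, LW_BRIDGE_BASE);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    return (base == MAP_FAILED) ? NULL : base;
}
```

With a mapped base, `reg_ptr(base, IN0_BASE + 2 * REG_STRIDE)` would address the third plaintext word under these assumed offsets.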

# Control & Status Register Definition

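## **CTRL Register**

| Bit | Name  | Function         |
|-----|-------|------------------|
| 0   | START | Start encryption |

## **STATUS Register**

| Bit | Name | Function               |
|-----|------|------------------------|
| 0   | BUSY | Encryption in progress |
| 1   | DONE | Encryption completed   |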
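
The bit assignments above map naturally onto C masks. The macro and function names below are illustrative choices, not identifiers taken from the project.

```c
#include <stdbool.h>
#include <stdint.h>

#define CTRL_START  (1u << 0) /* CTRL bit 0: start encryption */
#define STATUS_BUSY (1u << 0) /* STATUS bit 0: encryption in progress */
#define STATUS_DONE (1u << 1) /* STATUS bit 1: encryption completed */

/* Decode a raw 32-bit STATUS value read from the FPGA. */
static bool status_is_busy(uint32_t status)
{
    return (status & STATUS_BUSY) != 0;
}

static bool status_is_done(uint32_t status)
{
    return (status & STATUS_DONE) != 0;
}
```

Naming the masks avoids bare constants such as `0x2` in the polling loop and keeps the software in step with the `{30'd0, done, busy}` concatenation on the FPGA side.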

# FPGA Control Logic (FSM Behavior)

```
// Control logic: latch inputs on START, capture the AES result one clock later.
always @(posedge CLOCK_50 or negedge hps_fpga_reset_n) begin
    if (!hps_fpga_reset_n) begin
        // Reset: clear all registers and return to idle
        pt_latched  <= 128'd0;
        key_latched <= 128'd0;
        enc_latched <= 128'd0;
        busy        <= 1'b0;
        done        <= 1'b0;
    end else begin
        if (start && !busy) begin
            // Idle -> Busy: latch plaintext and key, clear DONE
            pt_latched  <= in;
            key_latched <= key128;
            busy        <= 1'b1;
            done        <= 1'b0;
        end
        if (busy) begin
            // Busy -> Done: capture the combinational AES output
            enc_latched <= enc_comb;
            busy        <= 1'b0;
            done        <= 1'b1;
        end
    end
end
```
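
To make the cycle-by-cycle behavior of the block above easier to follow, here is a small software model of its BUSY and DONE flags. It is a sketch for reasoning only: the 128-bit data registers are omitted, and `fsm_state` and `fsm_clock` are names invented for the example. Both conditions read the state from before the clock edge, which is how non-blocking assignments behave.

```c
#include <stdbool.h>

/* Registered flags of the control logic (data registers omitted). */
struct fsm_state {
    bool busy;
    bool done;
};

/* Apply one rising clock edge and return the state after it. */
static struct fsm_state fsm_clock(struct fsm_state cur, bool reset_n, bool start)
{
    struct fsm_state next = cur;

    if (!reset_n) { /* asynchronous reset, modeled at the edge */
        next.busy = false;
        next.done = false;
        return next;
    }
    if (start && !cur.busy) { /* accept START: go busy, clear DONE */
        next.busy = true;
        next.done = false;
    }
    if (cur.busy) { /* result captured: leave busy, set DONE */
        next.busy = false;
        next.done = true;
    }
    return next;
}
```

Stepping this model with `start` held high shows BUSY and DONE alternating every clock, which is consistent with the design driving START as a 0 → 1 → 0 pulse so that DONE settles once START is low again.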



# Software Workflow (HPS Side)

## C Program Steps:

1. Map FPGA registers using /dev/mem
2. Generate a random 128-bit AES key
3. Read plaintext from the user
4. Pack plaintext and key into 32-bit words
5. Write key and plaintext words to the FPGA registers
6. Clear DONE flag
7. Issue START pulse
8. Poll STATUS until DONE = 1
9. Read ciphertext from the FPGA
10. Decrypt the FPGA ciphertext
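
The packing step in the list above can be sketched as below. The byte order inside each word is an assumption: the slides do not state it, so this example places the first byte of each group in the most significant position. `pack_block` and `unpack_block` are hypothetical helper names.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack 16 bytes into four 32-bit words, first byte in bits 31..24 (assumed order). */
static void pack_block(const uint8_t bytes[16], uint32_t words[4])
{
    for (size_t w = 0; w < 4; w++) {
        words[w] = ((uint32_t)bytes[4 * w] << 24) |
                   ((uint32_t)bytes[4 * w + 1] << 16) |
                   ((uint32_t)bytes[4 * w + 2] << 8) |
                   (uint32_t)bytes[4 * w + 3];
    }
}

/* Inverse of pack_block: recover the 16 bytes from four words. */
static void unpack_block(const uint32_t words[4], uint8_t bytes[16])
{
    for (size_t w = 0; w < 4; w++) {
        bytes[4 * w]     = (uint8_t)(words[w] >> 24);
        bytes[4 * w + 1] = (uint8_t)(words[w] >> 16);
        bytes[4 * w + 2] = (uint8_t)(words[w] >> 8);
        bytes[4 * w + 3] = (uint8_t)words[w];
    }
}
```

Using explicit shifts rather than a pointer cast keeps the result independent of the CPU's native byte order, so the only convention left to agree on is the one between the software and the AES core.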

# Start / Handshake Sequence

```
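*CTRL_addr = 0x0;               // START = 0 (known idle level)
BARRIER();                      // complete this write before the data writes
*INx_addr  = plaintext_word_x;  // x = 0..3: four 32-bit plaintext words
*KEYx_addr = key_word_x;        // x = 0..3: four 32-bit words of the generated key
BARRIER();                      // data must be in place before START rises
*CTRL_addr = 0x1;               // START = 1
BARRIER();
*CTRL_addr = 0x0;               // START = 0, completing the 0 -> 1 -> 0 pulse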
```
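
A compilable version of this sequence might look like the sketch below. Several parts are assumptions: the slides do not show how `BARRIER()` is defined, so it is mapped here to the GCC/Clang builtin `__sync_synchronize()`, and the `aes_regs` struct and `aes_start` function are hypothetical names for a set of already-mapped register pointers.

```c
#include <stdint.h>

/* Assumed definition: full memory barrier provided by GCC and Clang. */
#define BARRIER() __sync_synchronize()

/* Hypothetical grouping of register pointers obtained from the mapping step. */
struct aes_regs {
    volatile uint32_t *in[4];  /* plaintext words, HPS -> FPGA */
    volatile uint32_t *key[4]; /* key words, HPS -> FPGA */
    volatile uint32_t *ctrl;   /* bit 0 = START */
};

/* Load plaintext and key, then pulse START 0 -> 1 -> 0. */
static void aes_start(const struct aes_regs *r,
                      const uint32_t pt[4], const uint32_t key[4])
{
    *r->ctrl = 0x0; /* make sure START is low first */
    BARRIER();
    for (int i = 0; i < 4; i++) {
        *r->in[i] = pt[i];
        *r->key[i] = key[i];
    }
    BARRIER(); /* data writes are ordered before the START write */
    *r->ctrl = 0x1;
    BARRIER();
    *r->ctrl = 0x0;
}
```

The `volatile` qualifiers stop the compiler from merging or dropping the three writes to `ctrl`, and the barriers keep the data writes ahead of the rising edge of START.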

## Polling

- `while (((*STATUS_addr) & 0x2) == 0);` spins until DONE (STATUS bit 1) is set.
- `FPGA_ciphertext = *ENCx_addr;` then reads the four ciphertext words (x = 0..3).
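
The polling loop above spins forever if DONE never rises. A bounded variant is sketched here as one possible hardening; `wait_done` and the `max_polls` limit are assumptions for illustration and are not part of the presented program.

```c
#include <stdbool.h>
#include <stdint.h>

#define STATUS_DONE (1u << 1) /* STATUS bit 1 */

/* Poll STATUS up to max_polls times; returns true once DONE is observed. */
static bool wait_done(volatile const uint32_t *status, uint32_t max_polls)
{
    for (uint32_t i = 0; i < max_polls; i++) {
        if ((*status & STATUS_DONE) != 0) {
            return true;
        }
    }
    return false; /* caller can report a timeout instead of hanging */
}
```

A caller would read the four ciphertext registers only when `wait_done` returns `true`, and could print CTRL and STATUS for debugging otherwise.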

# Timing Behavior

- START asserted → inputs latched
- Compute: AES core runs inside the FPGA
- AES combinational output captured
- DONE: status bit set and read by the HPS
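
A rough latency estimate follows from the 50 MHz clock and the two-step control logic shown earlier. It assumes the AES combinational path settles within one clock period, which the slides do not state. Under that assumption, DONE rises one clock period after the edge that latches the inputs:

$$
T_{clk} = \frac{1}{50\ \text{MHz}} = 20\ \text{ns}, \qquad t_{\text{latch} \rightarrow \text{DONE}} = 1 \cdot T_{clk} = 20\ \text{ns}
$$

Bridge transfer and software polling time come on top of this figure and were not measured here.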

# Experimental Results

- Original plaintext: **I\_LIKE\_EEDG6370**
- Encryption key: **844981AF4DFDD2D8D4B6D70B9D43D76B**
- Ciphertext from FPGA: **358C295C9EB8C0EBF32C5B385AD52EF6**
- SW-decrypted plaintext (from FPGA ciphertext): **I\_LIKE\_EEDG6370**
- SW-encrypted ciphertext: **358C295C9EB8C0EBF32C5B385AD52EF6**

## Verification:

- SW encryption: ✓ matches the FPGA ciphertext
- SW decryption of the FPGA ciphertext: ✓ returns the original plaintext
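
The comparison behind these checks can be sketched with two small helpers. The names `block_to_hex` and `blocks_match` are invented for the example; the software AES reference itself is not shown in the slides and is not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format a 16-byte block as 32 uppercase hex digits plus a terminating NUL. */
static void block_to_hex(const uint8_t block[16], char out[33])
{
    for (int i = 0; i < 16; i++) {
        snprintf(out + 2 * i, 3, "%02X", block[i]);
    }
}

/* True when two 16-byte blocks are identical. */
static bool blocks_match(const uint8_t a[16], const uint8_t b[16])
{
    return memcmp(a, b, 16) == 0;
}
```

Printing both ciphertexts with `block_to_hex` and comparing them with `blocks_match` gives the same pass/fail evidence as the check marks above.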

*Screenshot: PuTTY serial terminal (COM3) on the HPS. The USB drive is mounted at `/media/usb-drive/`, `./my_final_project` is run from the `15_12_2/` directory with the plaintext I\_LIKE\_EEDG6370, and the program prints the key, FPGA ciphertext, SW-decrypted plaintext, and SW-encrypted ciphertext listed above.*

# Debug & Validation Techniques

- Printed CTRL and STATUS registers
- Verified BUSY/DONE transitions
- Printed raw FPGA registers (IN, KEY, OUT)
- Compared FPGA ciphertext vs software AES
- Identified and fixed endianness mismatch

# Challenges & Solutions

| Challenge            | Solution                                                |
|----------------------|---------------------------------------------------------|
| Incorrect ciphertext | Fixed 128-bit word ordering                             |
| Start misbehavior    | Used start pulse ($0 \rightarrow 1 \rightarrow 0$)      |
| Debug difficulty     | Status register instrumentation                         |
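
The slides do not say exactly which reordering fixed the ciphertext, so the sketch below only shows the two transforms that usually come up in this kind of mismatch: swapping the bytes inside a 32-bit word, and reversing the order of the four words that make up the 128-bit block. Both function names are hypothetical.

```c
#include <stdint.h>

/* Reverse the four bytes of a 32-bit word (little-endian <-> big-endian). */
static uint32_t bswap32(uint32_t v)
{
    return (v >> 24) |
           ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) |
           (v << 24);
}

/* Reverse the order of the four words in a 128-bit block, in place. */
static void reverse_words(uint32_t w[4])
{
    uint32_t t = w[0];
    w[0] = w[3];
    w[3] = t;
    t = w[1];
    w[1] = w[2];
    w[2] = t;
}
```

Printing the raw IN, KEY, and OUT registers next to the software values makes it quick to see which of the two transforms, if either, lines the data up.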

# Key Learnings



- HW/SW co-design requires **strict data ordering discipline**
- Avalon PIOs expose **register interfaces**, not variables
- Non-blocking assignments affect control timing
- Clear handshake protocols are essential
- Debug visibility (STATUS/CTRL) saves time

## Future Work:



- DMA-based data transfer
- Performance benchmarking