



# Real-Time Systems

*Lecture Topic – Bus Interfaces for System-on-Chip and Off-Chip*

Dr. Sam Siewert

Electrical, Computer and Energy Engineering

Embedded Systems Engineering Program

# Embedded I/O (HW View)

## ■ Analog I/O

- DAC analog output: servos, motors, heaters, ...
- ADC analog input: photodiodes, thermistors, ...

## ■ Digital I/O

- Direct TTL I/O or GPIO
- Digital Serial (I2C, SPI, ... - Chip-to-Chip)
- Digital Interface to UART/Line Driver, Serializer-deserializer (Serdes), or Physical I/F
  - Digital I/O to Line Driver Interface (e.g. RS232)
  - 10/100 MII, 1G RGMII, 10G XGMII interfaces to Serdes to Phy

# CPU Core I/O (SW View)

## ■ Word

- Register Control/Config, Status, Data
- Typical of Low-Rate I/O Interfaces (RS232)

## ■ Block

- FIFOs (2-32K), Dual-Port RAM and DMA
- Typical of High-Rate I/O Interfaces
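
Block-oriented interfaces usually place a FIFO between the device and the CPU or DMA engine. The following is a minimal software sketch of such a FIFO as a ring buffer; the names `fifo_t`, `fifo_put`, `fifo_get` and the depth of 8 are illustrative assumptions, not taken from any particular device.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_DEPTH 8u /* hypothetical depth; must be a power of two */

typedef struct {
    uint8_t  data[FIFO_DEPTH];
    uint32_t head; /* free-running count of bytes written */
    uint32_t tail; /* free-running count of bytes read */
} fifo_t;

/* Number of bytes currently stored (unsigned wrap-around is intentional). */
static size_t fifo_count(const fifo_t *f) { return f->head - f->tail; }

/* Producer side: returns false when the FIFO is full. */
static bool fifo_put(fifo_t *f, uint8_t byte)
{
    if (fifo_count(f) == FIFO_DEPTH) return false;
    f->data[f->head % FIFO_DEPTH] = byte;
    f->head++;
    return true;
}

/* Consumer side: returns false when the FIFO is empty. */
static bool fifo_get(fifo_t *f, uint8_t *byte)
{
    if (fifo_count(f) == 0) return false;
    *byte = f->data[f->tail % FIFO_DEPTH];
    f->tail++;
    return true;
}
```

A hardware FIFO exposes the same full/empty conditions as status bits; here they are the `false` return values.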

## ■ System Memory Map (32 or 64 bit space)

- Physical Memory (SDRAM, DDR, SRAM)
- BootROM (Typically Flash)
- Configuration and Control Registers
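
Word-oriented I/O through memory-mapped control, status, and data registers can be sketched as below. The register layout, the bit positions, and the names `dev_regs_t`, `dev_enable`, and `dev_try_send` are hypothetical; a real device's data sheet defines the actual map.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register block for a low-rate serial device. */
typedef struct {
    volatile uint32_t ctrl;   /* control/config register */
    volatile uint32_t status; /* status flags */
    volatile uint32_t data;   /* data word */
} dev_regs_t;

#define CTRL_ENABLE     (1u << 0) /* assumed bit position */
#define STATUS_TX_READY (1u << 0) /* assumed bit position */

/* Set the enable bit with a read-modify-write of the control register. */
static void dev_enable(dev_regs_t *regs) { regs->ctrl |= CTRL_ENABLE; }

/* Write one byte if the device reports it can accept data. */
static bool dev_try_send(dev_regs_t *regs, uint8_t byte)
{
    if ((regs->status & STATUS_TX_READY) == 0) return false;
    regs->data = byte;
    return true;
}
```

On hardware, the `dev_regs_t` pointer would be the device's base address from the system memory map cast to this type; for testing on a host it can point at an ordinary struct in RAM. The `volatile` qualifier keeps the compiler from caching or eliding register accesses.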

# Card / Backplane I/O Expansion

- Scalable Embedded Systems
- DoD, Commercial Aviation, etc.



Backplane  
Card Heights



PC / Server  
Mainboard



Compact PCI Express  
Backplane

# VITA VME/VXS/VXI vs. PCI / PCIe

## VITA VME (Versa Module Europa - History)

Asynchronous 20 MHz

A32, A24, A16 Addr Bus

D32, D24, D16 Data Bus

Word or Block Transfer

Daisy-Chained Prio Interrupts

Interrupt Data Cycle

Device Designed in MMIO

Custom Bus Integration on 6U

3U/6U D-shell form factor

VME, VXS Bus, VXI Bus

## PCI 2.1, 2.2, 2.3 (Peripheral Component Interconnect)

Synch Clock 33/66 MHz

Muxed 32/64 bit A/D Bus

Burst Transfer Always

Int A-D Routed to APIC, MSI added

Map onto IRQ 0...15

Built-in Hidden Arbiter

Plug 'n' Play Configuration Space

PCI-to-PCI Bridge Scalability

CPCI, PMC, PC/104+, PCI-X

PCI-Express

# Common RTES Bus Architectures Used

## PCI Express for I/O

- Gen 1 – 2.5 GT/sec
- Gen 2 – 5 GT/sec
- Gen 3 – 8 GT/sec
- Gen 4 – 16 GT/sec
- Gen 5 – 32 GT/sec
- x1 to x16 lanes
- x16 Gen5 ≈ 64 GB/sec per direction
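
The per-generation rates above can be turned into usable bandwidth once line encoding is accounted for. This is a rough sketch, assuming the function names `pcie_gt_per_sec` and `pcie_gbytes_per_sec` (both hypothetical) and ignoring packet and protocol overhead; Gen 1 and Gen 2 use 8b/10b encoding, Gen 3 and later use 128b/130b.

```c
#include <assert.h>

/* Raw per-lane transfer rate in GT/s for PCIe Gen 1..5. */
static double pcie_gt_per_sec(int gen)
{
    static const double rate[] = {2.5, 5.0, 8.0, 16.0, 32.0};
    assert(gen >= 1 && gen <= 5);
    return rate[gen - 1];
}

/* Approximate bandwidth in GB/s for one direction of a link. */
static double pcie_gbytes_per_sec(int gen, int lanes)
{
    /* Fraction of transferred bits that carry payload after line encoding. */
    double efficiency = (gen <= 2) ? 8.0 / 10.0 : 128.0 / 130.0;
    /* One transfer moves one bit per lane; divide by 8 for bytes. */
    return pcie_gt_per_sec(gen) * lanes * efficiency / 8.0;
}
```

With these assumptions `pcie_gbytes_per_sec(5, 16)` comes to about 63 GB/s per direction, and `pcie_gbytes_per_sec(1, 8)` comes to 2 GB/s per direction.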

## AMBA – ARM Ltd

- SoC Block Interconnect
- AMBA Developer

# Why Did Bus I/O Go To Serial?

| Bus                     | Frequency | Potential Bandwidth | Number of Devices                                |
|-------------------------|-----------|---------------------|--------------------------------------------------|
| PCI 2.x 32-bit          | 33 MHz    | 133 Mbytes/sec      | 4-5                                              |
| PCI 2.x 32-bit          | 66 MHz    | 266 Mbytes/sec      | 1-2                                              |
| PCI-X 1.0a              | 133 MHz   | 533 Mbytes/sec      | 1-2                                              |
| PCI-X 2.0               | 266 MHz   | 1066 Mbytes/sec     | 1 Point-to-Point Bus                             |
| PCI-E x8 bi-directional | 2.5 GHz   | 4 GBytes/sec        | Switched Scalable Differential Serial Byte Lanes |
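
The parallel-bus rows follow from bytes moved per clock multiplied by the clock rate. A minimal sketch, assuming one transfer per clock and a hypothetical helper name `parallel_bus_mbytes_per_sec`; the table's parallel-bus figures are consistent with 4 bytes per clock at the listed frequency.

```c
#include <assert.h>

/* Peak bandwidth in MB/s of a parallel bus that moves one word per clock. */
static double parallel_bus_mbytes_per_sec(unsigned width_bits, double clock_mhz)
{
    assert(width_bits % 8u == 0); /* whole bytes only */
    return (width_bits / 8.0) * clock_mhz;
}
```

For example, a 32-bit bus at 33.33 MHz gives roughly 133 MB/s, and the same width at 66.66 MHz gives roughly 266 MB/s. These are peak numbers shared by every device on the bus, which is part of why the device count drops as the clock rises.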

Parallel Bus

PCI 2.x to PCI-X 2.0

Many board traces  
Issues:

1. Signal skew
2. Signal integrity
3. Layout space
4. Trace length

Serial Bus

PCI Express G1...G5

Scaling

1. Speed
2. Width

# Example Bus Architectures – Review SoC and PCB Specifications

Intel Altera Cyclone V HPS, Cyclone SoC  
Dual Core ARM Cortex-A9



NVIDIA Tegra K1  
Quad Cortex-A15



x86 PC System Architecture  
for Memory and I/O Bus Interfaces to Peripherals



# PCI I/O Bus Key Concepts

## Plug 'n' Play (Resource Config Space)

OS and/or BIOS can probe configuration space registers at well-known port addresses (on x86, CONFIG_ADDRESS at 0xCF8 and CONFIG_DATA at 0xCFC)

- For Each Bus (256), Probe to Find all Devices (32)

- Vendor ID and Device ID

- Setup Each Device Function (8)

- Setup Interrupts A-D for Each Function

- Program Command Register for MMIO, IO, and Mastering

- Program BAR 0-5 for MMIO or IO

- Program Int A-D if Applicable
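
The probing steps above can be sketched in C. The address format is the one written to the x86 CONFIG_ADDRESS port (0xCF8); everything else here is an assumption for illustration: the accessor type `pci_read32_fn`, the simulated config space `sim_read32`, and the made-up Vendor/Device IDs. The scan is simplified and probes all 8 functions of every device rather than first checking the multi-function bit in the header type.

```c
#include <stdint.h>

/* Build the CONFIG_ADDRESS value for bus/device/function/register offset. */
static uint32_t pci_config_address(uint8_t bus, uint8_t dev, uint8_t func, uint8_t offset)
{
    return 0x80000000u                        /* enable bit */
         | ((uint32_t)bus << 16)              /* 256 buses */
         | ((uint32_t)(dev & 0x1Fu) << 11)    /* 32 devices per bus */
         | ((uint32_t)(func & 0x07u) << 8)    /* 8 functions per device */
         | (offset & 0xFCu);                  /* dword-aligned register */
}

/* Hypothetical accessor: returns the config dword selected by the address. */
typedef uint32_t (*pci_read32_fn)(uint32_t config_address);

/* Count functions on one bus that answer with a valid Vendor ID. */
static unsigned pci_count_functions(pci_read32_fn read32, uint8_t bus)
{
    unsigned found = 0;
    for (uint8_t dev = 0; dev < 32; dev++) {
        for (uint8_t func = 0; func < 8; func++) {
            uint32_t id = read32(pci_config_address(bus, dev, func, 0x00));
            if ((id & 0xFFFFu) != 0xFFFFu) found++; /* 0xFFFF means nothing there */
        }
    }
    return found;
}

/* Bit 0 of a BAR distinguishes I/O space (1) from memory space (0). */
static int pci_bar_is_io(uint32_t bar) { return (bar & 0x1u) != 0; }

/* Simulated config space: one function at bus 0, device 3, function 0.
 * Vendor ID 0x1234 and Device ID 0xABCD are made-up values. */
static uint32_t sim_read32(uint32_t addr)
{
    return (addr == pci_config_address(0, 3, 0, 0x00)) ? 0xABCD1234u : 0xFFFFFFFFu;
}
```

On real x86 hardware the accessor would write the address to port 0xCF8 and read the dword back from port 0xCFC; passing it in as a function pointer lets the scan logic run against a simulation first.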

## Hidden Arbitration

Request/Grant During Master to Target Bursts

Master Latency Timer is Minimum Burst
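
Because arbitration is hidden, the arbiter can choose the next bus master while the current burst is still in progress. The sketch below uses round-robin selection as one possible fair policy; this choice, the function name `arbiter_next_grant`, and the request bitmask representation are all assumptions rather than anything mandated for a PCI arbiter.

```c
#include <assert.h>
#include <stdint.h>

/* Choose the next master to grant, starting the search just after the
 * previous winner. Bit i of req_mask is set while master i requests the bus.
 * Returns the master index, or -1 if nobody is requesting. */
static int arbiter_next_grant(uint32_t req_mask, int last_grant, int num_masters)
{
    assert(num_masters > 0 && num_masters <= 32);
    assert(last_grant >= -1 && last_grant < num_masters);
    for (int step = 1; step <= num_masters; step++) {
        int candidate = (last_grant + step) % num_masters;
        if (req_mask & (1u << candidate)) return candidate;
    }
    return -1;
}
```

Starting the search after the last winner means a master that keeps requesting cannot starve the others; it is only re-granted once every other requester has been skipped or served.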

# PCI Express I/O Bus Key Concepts

Each Byte Lane is One Tx Differential Pair plus One Rx Differential Pair

Byte Lanes Ganged x1, x2, x4, x8, x12, x16, x32

Serializer/Deserializer on Each Byte Lane

Driver, Buffering, and PLL on Each Byte Lane

Lane-to-Lane De-skewing Done in Physical Layer

Data Tx/Rx with Packet Protocol
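
When lanes are ganged, the bytes of a packet are striped across them in order. The following sketch shows only that distribution step; the buffer sizes and the names `lane_buffers_t` and `stripe_bytes` are hypothetical, and framing, encoding, and de-skew are left out.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_LANES    32u
#define MAX_PER_LANE 64u /* hypothetical per-lane buffer size */

typedef struct {
    uint8_t bytes[MAX_LANES][MAX_PER_LANE]; /* bytes queued on each lane */
    size_t  count[MAX_LANES];               /* bytes queued per lane */
} lane_buffers_t;

/* Stripe a packet across lanes: byte i is sent on lane (i % lanes).
 * Returns 0 on success, -1 if the arguments do not fit the buffers. */
static int stripe_bytes(const uint8_t *pkt, size_t len, unsigned lanes, lane_buffers_t *out)
{
    if (lanes == 0 || lanes > MAX_LANES || len > (size_t)lanes * MAX_PER_LANE) return -1;
    for (unsigned l = 0; l < lanes; l++) out->count[l] = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned l = (unsigned)(i % lanes);
        out->bytes[l][out->count[l]++] = pkt[i];
    }
    return 0;
}
```

The receiver reverses the mapping, which is why lane-to-lane de-skew has to happen first: bytes that left together must be realigned before they are interleaved back into a packet.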



# Single Board Computer SoCs

- SBC = Single Board Computer (Instead of Backplane)
- For RT Systems 2 Boards are Used for High Rate I/O (with Co-Processing)
  - Jetson TK1 – Multi-Core CPU + GPU Co-Processor
  - DE1-SoC – Multi-Core CPU + FPGA Co-Processor
- For Low Rate, Texas Instruments Tiva TM4C is also an Option
- SBCs are Less Scalable than a CPCI or VXS/VXI Backplane, But SoC Packs Multiple Cores and I/O onto a Single Chip!

# SBC – Typical for Cyclic Executive

- TIVA - ARM Cortex M4 Microprocessor, IAR or Code Composer IDE
  - Cyclic Executive or FreeRTOS
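
A cyclic executive runs a fixed schedule of tasks from a single loop, paced by a periodic timer. This is a minimal host-runnable sketch; the frame count, the two placeholder tasks, and the name `cyclic_executive_run` are assumptions, and the wait for the timer tick is omitted so the loop can be exercised without hardware.

```c
#include <stdint.h>

#define NUM_FRAMES 4u /* minor frames per major cycle (assumed) */

static unsigned fast_runs, slow_runs; /* counters stand in for real work */

static void task_fast(void) { fast_runs++; } /* scheduled in every minor frame */
static void task_slow(void) { slow_runs++; } /* scheduled once per major cycle */

/* Execute `ticks` minor frames. On a target, each iteration would begin by
 * blocking until the periodic timer tick; that wait is left out here. */
static void cyclic_executive_run(unsigned ticks)
{
    for (unsigned t = 0; t < ticks; t++) {
        unsigned frame = t % NUM_FRAMES;
        task_fast();
        if (frame == 0) task_slow();
    }
}
```

The schedule is fixed at design time, so every task in a minor frame must finish before the next tick; there is no preemption to recover from an overrun, which is the main trade-off against an RTOS such as FreeRTOS.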

## ARM M-Series SBC:

TIVA TM4C123G DEV BOARD,  
TM4C123G MCU,  
ARM Cortex M4,  
TM4C123GXL Launch Pad

## ARM M-Series IoT Boards:

IoT Enabled TM4C129X Connected Development Kit, TM4C1294XL Connected Launch Pad



# Embedded GP-GPU Jetson SoC – PCI Express Scaling

## Older Jetson TK1 CPU+GPU

NVIDIA "4-Plus-1" 2.32GHz ARM  
quad-core Cortex-A15

NVIDIA Kepler "GK20a" GPU with  
192 SM3.2 CUDA cores (up to 326  
GFLOPS)



## Newer Jetson Nano Tegra X1

Competitive with R-Pi, TI OMAP, etc. – price point, no fan  
Tegra X1 SoC

Much more compact

Good for student projects involving machine vision, AI



<https://developer.nvidia.com/embedded/jetson-nano-developer-kit>

# Embedded FPGA SoC Devices – DE10-SoC

- Reconfigurable SoC with FPGA Co-processing
- Custom digital interfaces
- Dual-Core ARM Cortex A9, Linux or FreeRTOS
- Newer DE10-SoC replaces DE1-SoC



Copyright © 2019 University of Colorado

