

**XILINX®**



# Architecture

# Spartan-II E Technical Details

## Table of Contents

- Spartan-II E Overview
- Logic and Routing
- Embedded Memory
- System Clock Management
- Interfaces – Select I/O
- Configuration Solutions

# Xilinx

## Your Programmable Logic Solution



# The Spartan-II E Solution

## More Than Just Silicon

### I/O Connectivity

SelectIO™ Technology

Supports major I/O standards

### Memory Resources

SRL16 registers

Distributed Memory

Block Memory

External Memory



### Logic & Routing

Flexible logic implementation

Vector Based Routing

Internal 3-State bussing

### System Clock Management

Digital Delay-Locked Loops (DLLs)

# Spartan-II E Features



# Spartan-II E Features



Logic & Routing



# Logic & Routing

- Configurable for simple to complex logic
- Excellent for fast arithmetic operations
- Flexible for logic or distributed RAM implementations

Configurable Logic Block (CLB)



- Predictable routing delays
- Core-friendly architecture
- Quick Place and Route times
- Internal 3-state bussing



# Logic Advantages

- Look Up Table (LUT) versatility
  - CLB primary building block
  - Flexible for logic or distributed RAM implementation
- Fast arithmetic operations
  - Specialized Carry Logic for arithmetic operations
  - Fast DSP functions such as FIR filters
- Configurable for simple to complex logic
  - Implements functions of up to 6 inputs in one logic level



# CLB Structure



- Each slice has 2 LUT-FF pairs with associated carry logic
- Two 3-state buffers (BUFT) associated with each CLB, accessible by all CLB outputs



# CLB Slice Structure

- Each slice contains two sets of the following:
  - Four-input LUT
    - Any 4-input logic function
    - Or 16 x 1 synchronous RAM
    - Or 16-bit shift register
  - Carry & Control
    - Fast arithmetic logic
    - Multiplier logic
    - Multiplexer logic
  - Storage element
    - Latch or flip-flop
    - Set and reset
    - True or inverted inputs
    - Sync. or async. control






# Four-Input LUT

- Implements combinatorial logic
  - Any 4-input logic function
  - Cascaded for wide-input functions

Truth Table

| Inputs(ABCD) | Output(Z) |
|--------------|-----------|
| 0000         | 0         |
| 0001         | 0         |
| 0010         | 1         |
| 0011         | 0         |
| .....        | ..        |
| 1110         | 1         |
| 1111         | 1         |
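A four-input LUT can be thought of as a 16-entry truth table addressed by its inputs. The sketch below models that idea in Python; the names `make_lut4` and `init_from_function` are hypothetical, and the choice of A as the most significant address bit is an assumption made for illustration, not something the slides specify.

```python
def make_lut4(init: int):
    """Return a function modelling a 4-input LUT whose 16-bit word is its truth table."""
    if not 0 <= init <= 0xFFFF:
        raise ValueError("truth table must fit in 16 bits")

    def lut(a: int, b: int, c: int, d: int) -> int:
        # Assumed convention: A is the most significant address bit, D the least.
        index = (a << 3) | (b << 2) | (c << 1) | d
        return (init >> index) & 1

    return lut


def init_from_function(func) -> int:
    """Build the 16-bit truth table by evaluating func on all 16 input combinations."""
    init = 0
    for index in range(16):
        a, b, c, d = (index >> 3) & 1, (index >> 2) & 1, (index >> 1) & 1, index & 1
        init |= (func(a, b, c, d) & 1) << index
    return init


# Any 4-input function fits: here a 4-input AND and a 4-input XOR (parity).
and4 = make_lut4(init_from_function(lambda a, b, c, d: a & b & c & d))
xor4 = make_lut4(init_from_function(lambda a, b, c, d: a ^ b ^ c ^ d))
```

Changing the function only changes the stored 16-bit word; the lookup hardware stays the same, which is why any 4-input function has the same delay.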





# Dedicated Expansion Multiplexers

- MUXF5 combines 2 LUTs to create
  - 4x1 multiplexer
  - Or any 5-input function (LUT5)
  - Or selected functions up to 9 inputs
- MUXF6 combines 2 slices to form
  - 8x1 multiplexer
  - Or any 6-input function (LUT6)
  - Or selected functions up to 19 inputs
- Dedicated muxes are faster and more space efficient
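The way MUXF5 and MUXF6 widen the LUTs can be sketched behaviorally. In this Python model, all function names (`lut4`, `mux2`, `mux4`, `lut5`, `lut6`) and the input ordering are illustrative assumptions; only the structure (two LUTs per MUXF5, two MUXF5 outputs per MUXF6) follows the slide.

```python
def init_from_function(func) -> int:
    """Evaluate func on all 16 input combinations to build a 16-bit LUT truth table."""
    init = 0
    for index in range(16):
        a, b, c, d = (index >> 3) & 1, (index >> 2) & 1, (index >> 1) & 1, index & 1
        init |= (func(a, b, c, d) & 1) << index
    return init


def lut4(init: int, a: int, b: int, c: int, d: int) -> int:
    """Look up one bit of a 16-bit truth table (A assumed to be the top address bit)."""
    return (init >> ((a << 3) | (b << 2) | (c << 1) | d)) & 1


def mux2(low: int, high: int, select: int) -> int:
    """Dedicated 2:1 multiplexer; used here to model both MUXF5 and MUXF6."""
    return high if select else low


# Each LUT is programmed as a 2:1 multiplexer of (d0, d1) steered by s; input A is unused.
MUX2_INIT = init_from_function(lambda unused, d1, d0, s: d1 if s else d0)


def mux4(data, s1: int, s0: int) -> int:
    """4x1 multiplexer: two LUTs each pick within a pair, MUXF5 picks the pair."""
    lower = lut4(MUX2_INIT, 0, data[1], data[0], s0)
    upper = lut4(MUX2_INIT, 0, data[3], data[2], s0)
    return mux2(lower, upper, s1)


def lut5(init32: int, e: int, a: int, b: int, c: int, d: int) -> int:
    """Any 5-input function: two LUTs hold the two halves of a 32-bit truth table."""
    lower = lut4(init32 & 0xFFFF, a, b, c, d)
    upper = lut4((init32 >> 16) & 0xFFFF, a, b, c, d)
    return mux2(lower, upper, e)  # MUXF5


def lut6(init64: int, f: int, e: int, a: int, b: int, c: int, d: int) -> int:
    """Any 6-input function: MUXF6 combines the MUXF5 outputs of two slices."""
    lower = lut5(init64 & 0xFFFFFFFF, e, a, b, c, d)
    upper = lut5((init64 >> 32) & 0xFFFFFFFF, e, a, b, c, d)
    return mux2(lower, upper, f)  # MUXF6
```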





# Distributed RAM

- CLB LUT configurable as Distributed RAM
  - A LUT equals 16x1 RAM
  - Implements Single and Dual-Ports
  - Cascade LUTs to increase RAM size
- Synchronous write
- Synchronous/Asynchronous read
  - Accompanying flip-flops used for synchronous read
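The read and write behavior listed above can be illustrated with a small behavioral model. This is a sketch only: the class and method names are hypothetical, and the read-before-write ordering of the registered output is an assumption.

```python
class DistributedRam16x1:
    """Behavioural model of one LUT used as a 16 x 1 RAM."""

    def __init__(self):
        self.cells = [0] * 16
        self.registered_output = 0  # models the accompanying flip-flop

    def read_async(self, address: int) -> int:
        """Asynchronous read: the addressed bit is visible without a clock edge."""
        return self.cells[address & 0xF]

    def clock_edge(self, address: int, data_in: int = 0, write_enable: bool = False) -> int:
        """One rising clock edge: register the read data, then perform the write."""
        # Assumption: the flip-flop samples the value stored before this edge's write.
        self.registered_output = self.cells[address & 0xF]
        if write_enable:
            self.cells[address & 0xF] = data_in & 1
        return self.registered_output
```

Writes only take effect on `clock_edge`, while `read_async` reflects the stored value immediately; the registered output lags by one clock, which is the synchronous-read option.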



# Shift Register

- Each LUT can be configured as shift register
  - Serial in, serial out
- Dynamically addressable delay up to 16 cycles
- For programmable pipeline
- Cascade for greater cycle delays
- Use CLB flip-flops to add depth
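A LUT-based shift register with a dynamically selectable tap can be modeled as follows. The class name `Srl16`, the helper `delayed`, and the tap numbering (tap 0 is the newest bit) are assumptions made for this sketch.

```python
class Srl16:
    """Behavioural model of a 16-bit serial-in, serial-out shift register in one LUT."""

    def __init__(self):
        self.bits = [0] * 16  # bits[0] holds the most recently shifted-in value

    def clock(self, data_in: int, enable: bool = True) -> None:
        """Shift one position on a clock edge when enabled."""
        if enable:
            self.bits = [data_in & 1] + self.bits[:-1]

    def output(self, address: int) -> int:
        """The 4-bit address picks the tap, so the delay can change at run time."""
        return self.bits[address & 0xF]


def delayed(stream, address: int):
    """Push a bit stream through the register and sample the chosen tap after each edge."""
    srl = Srl16()
    result = []
    for bit in stream:
        srl.clock(bit)
        result.append(srl.output(address))
    return result
```

Selecting a larger tap address lengthens the pipeline without using any flip-flops, which is how one LUT can stand in for up to 16 registers.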



# Shift Register



- Register-rich FPGA
  - Allows for addition of pipeline stages to increase throughput
- Data paths must be balanced to keep desired functionality

# Shift Register



- LUT as shift register
  - Used to add pipeline stages
- Increase overall register count
  - 16-bit shift register per LUT
  - 64-bit shift register per CLB



# CLB Arithmetic Logic

- Dedicated carry logic
  - Provides high performance for counters & arithmetic functions
  - Discrete XOR component for single level sum completion
  - Two separate carry chains in CLB allow for 3 operand functions
  - Can also be used to cascade LUTs for wide-input logic functions



# 3 Operand Adder Function



- A, B, C are two bits wide
  - $SUM = A + B + C$, or $SUM = PARTIAL + C$ where $PARTIAL = A + B$
  - Implementation
    - First 2-operand sum $A + B$ is performed in Slice 0
    - Second 2-operand sum $PARTIAL + C$ is performed in Slice 1
  - Fast local feedback connection within the CLB
    - Very small delay on PARTIAL
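The two-stage sum can be sketched at bit level. The model below assumes a common carry-chain arrangement (LUT computes the propagate signal, a carry multiplexer forwards or regenerates the carry, an XOR forms the sum bit); the function names are hypothetical and the exact slice mapping is not taken from the slides.

```python
def carry_chain_add(a: int, b: int, width: int) -> int:
    """Add two unsigned values bit by bit along a carry chain; result has width+1 bits."""
    carry = 0
    total = 0
    for i in range(width):
        a_bit = (a >> i) & 1
        b_bit = (b >> i) & 1
        propagate = a_bit ^ b_bit              # computed in the LUT
        total |= (propagate ^ carry) << i      # dedicated XOR forms the sum bit
        carry = carry if propagate else a_bit  # carry mux: pass carry or regenerate it
    return total | (carry << width)


def three_operand_add(a: int, b: int, c: int, width: int = 2) -> int:
    """SUM = PARTIAL + C, where PARTIAL = A + B, using two carry chains."""
    partial = carry_chain_add(a, b, width)         # first chain (Slice 0 in the slide)
    return carry_chain_add(partial, c, width + 1)  # second chain (Slice 1 in the slide)
```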



# Carry Logic for Wide Input Functions

- Higher performance
- Efficient resource utilization
- Common applications
  - Wide input decoding
  - Comparators
- HDL design entry
  - LUT can be inferred
  - MUXCY must be instantiated

# 12-Input AND Function



4-Input AND Truth Table

| Inputs(ABCD) | Output(Z) | Output(HEX) |
|--------------|-----------|-------------|
| 0000         | 0         |             |
| 0001         | 0         |             |
| 0010         | 0         | 0           |
| 0011         | 0         |             |
| .....        | ..        | ..          |
| 1011         | 0         |             |
| 1100         | 0         |             |
| 1101         | 0         |             |
| 1110         | 0         | 8           |
| 1111         | 1         |             |

- Utilization
  - 3 LUTs and 3 MUXCYs
  - As opposed to 4 LUTs
- Performance
  - 1 logic level
  - As opposed to 2 logic levels

# 12-Input OR Function



4-Input NOR Truth Table

| Inputs(ABCD) | Output(Z) | Output(HEX) |
|--------------|-----------|-------------|
| 0000         | 1         |             |
| 0001         | 0         |             |
| 0010         | 0         |             |
| 0011         | 0         |             |
| .....        | ..        | ..          |
| 1011         | 0         |             |
| 1100         | 0         |             |
| 1101         | 0         |             |
| 1110         | 0         |             |
| 1111         | 0         | 0           |

- Utilization
  - 3 LUTs and 3 MUXCYs
  - As opposed to 4 LUTs
- Performance
  - 1 logic level
  - As opposed to 2 logic levels
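The 12-input AND and OR constructions can be modeled as three LUTs, each driving the select input of one carry multiplexer. This is a behavioral sketch; `muxcy`, `wide_and`, and `wide_or` are illustrative names, and the constant values fed into the chain are assumptions chosen to make the logic work out.

```python
def muxcy(select: int, data_in: int, carry_in: int) -> int:
    """Carry multiplexer: pass the carry along when select is 1, else force data_in."""
    return carry_in if select else data_in


def wide_and(bits) -> int:
    """AND of 4*n inputs: each LUT ANDs four inputs and drives one MUXCY select."""
    if len(bits) % 4:
        raise ValueError("input count must be a multiple of 4")
    carry = 1  # the chain starts at logic 1
    for start in range(0, len(bits), 4):
        lut_out = int(all(bits[start:start + 4]))  # 4-input AND in the LUT
        carry = muxcy(lut_out, 0, carry)           # a failing group forces the chain to 0
    return carry


def wide_or(bits) -> int:
    """OR of 4*n inputs: each LUT NORs four inputs and drives one MUXCY select."""
    if len(bits) % 4:
        raise ValueError("input count must be a multiple of 4")
    carry = 0  # the chain starts at logic 0
    for start in range(0, len(bits), 4):
        lut_out = int(not any(bits[start:start + 4]))  # 4-input NOR in the LUT
        carry = muxcy(lut_out, 1, carry)               # a group with a 1 forces the chain to 1
    return carry
```

With 12 inputs each function uses three LUT and MUXCY pairs, and the result leaves the chain without passing through a second LUT level.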

# Dedicated CLB Multiplier Logic



- Dedicated AND gate
- Highly efficient 'Shift & Add' implementation
  - For a 16x16 Multiplier
    - 30% reduction in area and one less logic level



# Lower Operating Power

- 1.8V core supply
  - Reduces power consumption
- Advanced signaling standards
  - Smaller voltage transitions
  - Reduces switching power
- DLLs reduce clock speed requirements
  - Faster clock propagation
  - Internal multiplication of clock
  - Reduces power on clock nets



# Logic Summary

- Flexible Configurable Logic Block (CLB) implementations
  - Logic
  - Distributed RAM
  - Shift register
- CLB configurable for simple to complex logic
  - Any 6-input function in one logic level
- Excellent for fast arithmetic operations
  - Specialized carry logic for arithmetic operations
  - Fast DSP functions such as FIR filters



# Spartan-II E Features

Logic & Routing





# Routing

- Core-friendly vector-based routing
  - Provides predictable routing delays independent of
    - IP placement
    - Number of IP
    - Device size
- Superior routing
  - Quick Place and Route times
    - Design to system at 100,000 gates per minute
  - Easier rerouting
- Internal 3-state bussing
  - Eliminates bus routing contention
  - Reduces CLB usage by using 3-state buffers instead of MUXes
  - Increases performance by reducing logic levels



# High-Performance Routing



- Local routing
  - Direct connections
- General Routing Matrix (GRM)
  - Single line, Long line, Hex line
- Dedicated routing
  - Internal 3-state bus
- Global routing
  - Primary Clock Buffer lines, Secondary lines

# Local Routing



- Interconnect among LUTs, FFs, GRM
- CLB feedback path for connections to LUTs in same CLB
- Direct path between horizontally adjacent CLBs

# General Purpose Routing



- 24 single-length lines
  - Route GRM signals to adjacent GRMs in 4 directions
- 96 buffered hex lines
  - Route GRM signals to other GRMs six blocks away in each of the four directions
- 12 buffered Long lines
  - Routing across top and bottom, left and right



# Routing Summary

- Vector-based routing
  - Predictable routing delays independent of device size and routing direction
- Core-friendly architecture
- Quick Place and Route times
  - Design to system at 100,000 gates per minute
  - Easier re-routing
- Internal 3-state bussing
  - Eliminates bus routing contention
  - Improves density and performance



# Spartan-II E Features

Embedded  
Memory





# Spartan-II E Memory Hierarchy

## Shift Register LUT

- 16 registers, 1 LUT
- Compact & fast



- Pipelining
- Buffers

Bytes

## Distributed RAM

- Single-port
- Dual port
- Cascadable



- DSP Coefficients
- Small FIFOs
- Scratch Pad

## Block RAMs

- 4Kbit blocks
- True dual-port



- Cache Tag memory
- Large FIFOs
- Packet buffers
- Video line buffers

Kilobytes

## High-Performance External Memory Interfaces

- DDR I/O
- SSTL, HSTL, CTT



- Collaboration with memory vendors
- IDT, Cypress, Micron, NEC, Samsung, Toshiba...



Megabytes



# Distributed RAM

- CLB LUT configurable as Distributed RAM
  - A LUT equals 16x1 RAM
  - Implements single and dual ports
  - Cascade LUTs to increase RAM size
- Synchronous write
- Synchronous/Asynchronous read
  - Accompanying flip-flops used for synchronous read



# SRL-16 and SRL-16E





# Distributed RAM

## Dual-Port Implementation

- 2 LUTs equal 16x1 dual-port RAM
- A Port
  - Uses A[3:0] address
  - Write and read
- B Port
  - Uses DPA[3:0] address
  - Read only
- Excellent for FIFOs, scratch pads, etc.
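One way to picture why two LUTs are needed is that both hold the same data: every write through the A address updates both copies, and the second copy is read through DPA. The Python sketch below models this; the class and method names are hypothetical.

```python
class DualPortRam16x1:
    """Behavioural model of a 16 x 1 dual-port RAM built from two LUTs."""

    def __init__(self):
        self.lut_a = [0] * 16  # read through the A[3:0] address
        self.lut_b = [0] * 16  # read through the DPA[3:0] address

    def write(self, a: int, data: int) -> None:
        """Synchronous write on port A: both LUT copies are written together."""
        self.lut_a[a & 0xF] = data & 1
        self.lut_b[a & 0xF] = data & 1

    def read_a(self, a: int) -> int:
        """Port A read (same address used for writing)."""
        return self.lut_a[a & 0xF]

    def read_b(self, dpa: int) -> int:
        """Port B read: read-only, independent address."""
        return self.lut_b[dpa & 0xF]
```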





# Block RAM



- Most efficient memory implementation
  - Dedicated blocks of memory
- Ideal for most memory requirements
  - 8 to 72 memory blocks
    - 4096 bits per block
    - Use multiple blocks for larger memories
- Builds both single and true dual-port RAMs
- CORE Generator provides custom-sized block RAMs
  - Quickly generates optimized RAM implementation



# Block RAM

- Configurable synchronous Block RAM
  - Single-port RAM
  - True dual-port RAM
  - Two independent single-port RAMs
- Block count increases with FPGA size

| Device   | No. of Blocks | Block RAM Bits |
|----------|---------------|----------------|
| XC2S50E  | 8             | 32,768         |
| XC2S100E | 10            | 40,960         |
| XC2S150E | 12            | 49,152         |
| XC2S200E | 14            | 57,344         |
| XC2S300E | 16            | 65,536         |
| XC2S400E | 40            | 163,840        |
| XC2S600E | 72            | 294,912        |



# Block RAM

- Flexible 4096-bit block with variable aspect ratio
  - 4096 x 1
  - 2048 x 2
  - 1024 x 4
  - 512 x 8
  - 256 x 16
- Increase memory depth or width by cascading blocks



# Block RAM

## Single-Port Implementation

- Easy cascading of block RAMs
- Utilize variable aspect ratio for desired RAM size
- Example
  - Desired RAM size: 1024 x 8
  - $1024 \times 4 + 1024 \times 4 = 1024 \times 8$
- CORE Generator software
  - Efficiently cascades RAM blocks
  - Quick custom RAM implementation
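The cascading arithmetic can be sketched as a small search over the five aspect ratios. This is only an illustration of the block-count calculation, not how CORE Generator works internally; `blocks_needed` is a hypothetical helper.

```python
import math

# (depth, width) organisations of one 4096-bit block, as listed in the slides
ASPECT_RATIOS = [(4096, 1), (2048, 2), (1024, 4), (512, 8), (256, 16)]


def blocks_needed(depth: int, width: int):
    """Return (block_count, block_depth, block_width) using the fewest blocks."""
    best = None
    for block_depth, block_width in ASPECT_RATIOS:
        # Blocks stacked for depth times blocks placed side by side for width.
        count = math.ceil(depth / block_depth) * math.ceil(width / block_width)
        if best is None or count < best[0]:
            best = (count, block_depth, block_width)
    return best
```

For the 1024 x 8 example, `blocks_needed(1024, 8)` picks two blocks in the 1024 x 4 organisation, matching the slide.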



# Dual-Port Bus Flexibility



- Each port can be configured with a different data bus width
- Provides easy data width conversion without any additional logic

# Two Independent Single-Port RAMs



- Added advantage of True Dual-Port
  - No wasted RAM Bits
- Can split a dual-port 4K RAM into two single-port 2K RAMs
  - Simultaneous independent access to each RAM
- To access the lower RAM
  - Tie the MSB address bit to Logic Low
- To access the upper RAM
  - Tie the MSB address bit to Logic High
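Tying the address MSB can be pictured with a short model. The sketch assumes the 4096 x 1 organisation; the class name and its methods are hypothetical.

```python
class SplitBlockRam:
    """A 4096 x 1 block RAM used as two independent 2048 x 1 single-port RAMs."""

    def __init__(self):
        self.bits = [0] * 4096

    def _full_address(self, upper_half: bool, address: int) -> int:
        # The MSB (bit 11) is tied low for the lower RAM and high for the upper RAM.
        return (int(upper_half) << 11) | (address & 0x7FF)

    def write(self, upper_half: bool, address: int, bit: int) -> None:
        self.bits[self._full_address(upper_half, address)] = bit & 1

    def read(self, upper_half: bool, address: int) -> int:
        return self.bits[self._full_address(upper_half, address)]
```

Because each port only ever reaches its own half, the two ports never collide and no bits are wasted.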

# CAM in Block RAM

- Content Addressable Memory (CAM)
  - Storage array like a RAM
  - Functionally opposite of a RAM
    - Quickly find the location of a particular stored value
    - Output the address and toggle the MATCH line, if data match is found



- Used in telecommunications, networking, Ethernet, ATM switches
- Xilinx provides reference designs and application notes
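The difference between a RAM and a CAM can be shown with a minimal software model: a RAM maps an address to data, a CAM maps data back to an address plus a match flag. The names below are hypothetical, and the loop is sequential only because this is software; a hardware CAM compares all entries in parallel.

```python
class Cam:
    """Minimal content-addressable memory model."""

    def __init__(self, depth: int):
        self.entries = [None] * depth  # None marks an unused location

    def write(self, address: int, value: int) -> None:
        """Store a value at an address, like a RAM write."""
        self.entries[address] = value

    def lookup(self, value: int):
        """Return (match, address); address is None when no entry holds the value."""
        for address, stored in enumerate(self.entries):
            if stored is not None and stored == value:
                return True, address
        return False, None
```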



# External Memory Interface



- Easy access to high-speed external memory
- SelectI/O™ provides interface to most memory types

| External Memory Type | SelectI/O Standard |
|----------------------|--------------------|
| SRAM                 | SSTL               |
| SGRAM                | HSTL               |
| ZBT SRAM/NoBL        | LVTTL              |
| QDR SRAM             | HSTL               |
| SDRAM                | LVTTL              |
| DDR SRAM             | SSTL2              |
| EDO                  | TTL                |
| FPM                  | TTL                |
| PB                   | TTL                |
| PC100/133            | LVTTL / SSTL       |



# Memory Controller Designs

## Memory Resources

Free!

- DRAM controller
  - 64-bit DDR DRAM controller
  - 16-bit DDR DRAM controller
  - SDRAM controller
- SRAM controller
  - ZBT SRAM controller
  - QDR SRAM controller
  - SigmaRAM controller
- Flash controller
  - NOR / NAND flash controller
- Embedded memory
  - CAMs, FIFOs

- Memory Solutions Portal
- [www.xilinx.com/memory](http://www.xilinx.com/memory)



Download Now!



# Embedded Memory Summary

- Fast distributed RAM
  - Data right beside logic
- Memory requirements solved by Block RAM
  - Single and True Dual-Port RAM implementations
  - FIFO for buffering data
  - Data width conversion
  - Cache
  - Register stacks
  - CAM for high-speed parallel searches
  - Many more
- Direct connection to external high-speed memory



# Spartan-II E Features

System Clock  
Management



# System Clock Management



- 100% Digital DLL Design
  - Noise insensitive
  - Scalable to new processes
  - Excellent Jitter specifications
    - +/- 100ps, <<50ps Typical
    - No cumulative phase error
  - Used in advanced memories
- Every Spartan-II E device has
  - 4 DLLs
  - External clock outputs

4 DLLs in every device

*Delay-Locked Loops Lower Board Costs*

# System Clock Management



*Delay-Locked Loops (DLLs) Lower Board Costs*

# Generic DLL Operation

- A DLL inserts delay on the clock net until the clock input rising edge is in phase with the clock feedback rising edge
- Requires a well-designed clock distribution network: the clock edges arrive simultaneously everywhere in the part
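The locking idea can be sketched numerically: the DLL keeps adding delay until the inserted delay plus the clock distribution delay equals a whole number of clock periods. The tap size of 50 ps and the function name `dll_lock` are assumptions for illustration, not device parameters.

```python
def dll_lock(period_ps: int, distribution_delay_ps: int, tap_ps: int = 50):
    """Step the inserted delay until the fed-back edge lines up with the input edge.

    Returns (inserted_delay_ps, residual_error_ps).
    """
    inserted = 0
    # Sweep slightly more than one period so an alignment point is always covered.
    while inserted <= period_ps + tap_ps:
        phase = (inserted + distribution_delay_ps) % period_ps
        error = min(phase, period_ps - phase)  # distance to the nearest input edge
        if 2 * error <= tap_ps:                # within half a tap: treat as locked
            return inserted, error
        inserted += tap_ps
    raise RuntimeError("no lock found")
```

With a 10 ns period and 3 ns of distribution delay, the model settles on 7 ns of inserted delay, so the internal clock edge coincides with the next input edge.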





# DLL Capabilities

- Easy clock duplication
  - System clock distribution
  - Cleans and reconditions incoming clock
- Quick and easy frequency adjustment
  - Single crystal easily generates multiple clocks
- Faster state machines using different clock phases
  - Excellent for advanced memory types
- De-skew incoming clock
  - Achieve fast setup and hold times or fast clock-to-out





# DLL: Clock Mirrors



- Input clock duplication
  - Provides on- and off-chip clocks
  - Clock distribution across system
- Cleans and reconditions backplane or noisy clocks
- Extremely low output skew





# Spartan-II E DLL Example

## 1X Clock Mirror with 180° Output Phase (100MHz)



Benefit - DDR Memory Interface - Avoid external DLLs

# DLL: Multiplication

- Use 1 DLL for 2x multiplication
- Combine 2 DLLs for 4x multiplication
- Reduce board EMI
  - Route low-frequency clock externally and multiply clock on-chip



# DLL: Multiplication Example

- Reduce EMI by increasing data width and decreasing clock frequency
- Cross over clock domains without worries
  - Synchronized clock edges
  - No external drift
  - Minimal external clock skew



# DLL: 2x Multiplication Implementation

- Requires one CLKDLL primitive
- CLK0 output removes skew between registers on the chip
- CLK2X is 2X clock output



# DLL: Division

- Selectable division values
  - 1.5, 2, 2.5, 3, 4, 5, 8, or 16
- Cascade DLLs to combine functions
- 50/50 duty cycle correction available





# DLL: Phase Shift

- Phase shifts
  - $0^\circ$, $90^\circ$, $180^\circ$, and $270^\circ$
- Increase system performance by utilizing additional clock phases
- 50/50 duty cycle correction available
- Excellent for external memory interfaces
  - DDR and QDR RAM
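The multiplication, division, and phase-shift options translate into simple arithmetic on the input clock. The sketch below computes the resulting frequencies and phase offsets; the function name and dictionary keys are hypothetical, and only the divide values and phases come from the slides.

```python
DIVIDE_VALUES = (1.5, 2, 2.5, 3, 4, 5, 8, 16)
PHASES_DEG = (0, 90, 180, 270)


def dll_outputs(input_mhz: float, divide: float = 2):
    """Return the clock frequencies and phase offsets available from one input clock."""
    if divide not in DIVIDE_VALUES:
        raise ValueError("unsupported divide value")
    period_ns = 1000.0 / input_mhz
    return {
        "doubled_mhz": input_mhz * 2,      # one DLL
        "quadrupled_mhz": input_mhz * 4,   # two DLLs combined
        "divided_mhz": input_mhz / divide,
        # Time offset of each phase-shifted output relative to the 0 degree output.
        "phase_offsets_ns": {deg: period_ns * deg / 360 for deg in PHASES_DEG},
    }
```

For a 100 MHz input, the 180 degree output is offset by half the 10 ns period, which is the relationship a DDR interface relies on.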



# DLL: Speedup Tsu/h and Tco



| External Spec | No DLL | With DLL |
|---------------|--------|----------|
| Setup         | 2.0ns  | 1.7ns    |
| Hold          | 0ns    | -0.4ns   |
| Clock to out  | 4.7ns  | 3.1ns    |

- Nullify clock line delay
  - External clock pin and internal clock are aligned
- Optional duty cycle correction
  - 50/50 duty cycle correction applied when specified
- Low sensitivity to clock input noise
  - Lower-cost oscillator

\* Spartan-II E data sheet module 3 Pin-to-Pin Parameters, LVTTL, 12 mA, Fast Slew Rate
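To see what the improved numbers mean for a system, consider a simplified chip-to-chip transfer in which the clock period must cover the sender's clock-to-out plus the receiver's setup time. The sketch assumes both ends meet the figures in the table and that board delay and clock skew are zero unless supplied; `max_frequency_mhz` is a hypothetical helper.

```python
def max_frequency_mhz(clock_to_out_ns: float, setup_ns: float, board_delay_ns: float = 0.0) -> float:
    """Highest clock rate at which clock-to-out + board delay + setup fits in one period."""
    return 1000.0 / (clock_to_out_ns + board_delay_ns + setup_ns)


without_dll = max_frequency_mhz(clock_to_out_ns=4.7, setup_ns=2.0)  # 6.7 ns path
with_dll = max_frequency_mhz(clock_to_out_ns=3.1, setup_ns=1.7)     # 4.8 ns path
```

Under these assumptions the path shrinks from 6.7 ns to 4.8 ns, which is the extra timing budget the DLL frees up for board delay or slower external devices.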





# Spartan-II E DLL Example

## Clock-to-Out Improvement Using DLLs

Output standard = SSTL-3 Class-II

(OBUF\_SSTL3\_II)

Temp=100C, Vdd=2.375V, Vcco=3.3V, Vtt=1.5V

Waveforms:

1: CLKIN

2: DATA OUT (no DLL)

3: DATA OUT (DLL deskewed)

Timing:

| Transition | w/o DLL | w/ DLL |
|------------|---------|--------|
| r->r       | 3.5 ns  | 1.1 ns |
| r->f       | 3.8 ns  | 1.3 ns |



Benefit - Increases Timing Budget - Allows Use of Cheaper Memories





# System Clock Management Summary

- All digital DLL Implementation
  - Input noise rejection
  - 50/50 duty cycle correction
- Clock mirror provides system clock distribution
- Multiply input clock by 2x or 4x
- Divide clock by 1.5, 2, 2.5, 3, 4, 5, 8, or 16
- Provides 0°, 90°, 180°, and 270° clock phase shifts
- De-skew clock for fast setup, hold, or clock-to-out times

# Spartan-II E Features

## System Interfaces





# Comprehensive I/O Connectivity



**8 I/O banks enable multiple simultaneous standards**

- Single ended and differential
  - Up to 514 single-ended, 205 differential pairs
  - 400 Mb/sec LVDS: ideal for Consumer Applications
  - 19 I/O standards, 8 flexible I/O banks
  - PCI 32/33 and 64/66 support
- Multiple package options
- 3 IOB registers: in, out, 3-state
- Voltages: 3.3V, 2.5V, 1.8V, 1.5V

**Chip-to-Chip Interfacing:**



**Backplane Interfacing:**



**High-speed Memory Interfacing:**



# Basic I/O Block Structure





# Programmable Output Driver

## Simultaneous Switching Output Guidelines

- Significant EMI reduction benefit
- Programmable driver strength
  - Pull-up and Pull-down drivers can be individually controlled
  - 16 different settings for each
  - 2 slew rate settings

| Standard                         | BGA, CS, FGA packages | HQ package | PQ, TQ packages |
|----------------------------------|-----------------------|------------|-----------------|
| LVTTL Slow Slew Rate, 2mA drive  | 68               | 49 | 36       |
| LVTTL Slow Slew Rate, 4mA drive  | 41               | 31 | 20       |
| LVTTL Slow Slew Rate, 6mA drive  | 29               | 22 | 15       |
| LVTTL Slow Slew Rate, 8mA drive  | 22               | 17 | 12       |
| LVTTL Slow Slew Rate, 12mA drive | 17               | 12 | 9        |
| LVTTL Slow Slew Rate, 16mA drive | 14               | 10 | 7        |
| LVTTL Slow Slew Rate, 24mA drive | 9                | 7  | 5        |
| LVTTL Fast Slew Rate, 2mA drive  | 40               | 29 | 21       |
| LVTTL Fast Slew Rate, 4mA drive  | 24               | 18 | 12       |
| LVTTL Fast Slew Rate, 6mA drive  | 17               | 13 | 9        |
| LVTTL Fast Slew Rate, 8mA drive  | 13               | 10 | 7        |
| LVTTL Fast Slew Rate, 12mA drive | 10               | 7  | 5        |
| LVTTL Fast Slew Rate, 16mA drive | 8                | 6  | 4        |
| LVTTL Fast Slew Rate, 24mA drive | 5                | 4  | 3        |
| LVCMOS2                          | 10               | 7  | 5        |
| PCI                              | 8                | 6  | 4        |
| GTL                              | 4                | 4  | 4        |
| GTL+                             | 4                | 4  | 4        |
| HSTL Class I                     | 18               | 13 | 9        |
| HSTL Class III                   | 9                | 7  | 5        |
| HSTL Class IV                    | 5                | 4  | 3        |
| SSTL2 Class I                    | 15               | 11 | 8        |
| SSTL2 Class II                   | 10               | 7  | 5        |
| SSTL3 Class I                    | 11               | 8  | 6        |
| SSTL3 Class II                   | 7                | 5  | 4        |
| CTT                              | 14               | 10 | 7        |
| AGP                              | 9                | 7  | 5        |

# Post-PCB Signal Integrity Adjustment

## Optimizing Performance “As Built”

Initial Design: LVTTL\_F16 (Fast slew, 16 mA)  
*Driver impedance too low – Undershoot!*



Final Design: LVTTL\_F8 (Fast slew, 8 mA)  
*Driver impedance ~50 $\Omega$ -- No Undershoot*



Requires a Bitstream Change Only!

# System Interfaces -- SelectI/O™

## Voltage Standards

3.3V    2.5V    1.8V    1.5V

19 Different Standards Supported!

## Chip-to-Chip Interfaces

LVDS    LVPECL    LVCMOS    LVTTL

## Backplane Interfaces

AGP    GTL    GTL+    PCI    BLVDS

## High-speed Memory Interfaces

CTT    HSTL    SSTL

- Supports multiple voltage and signal standards simultaneously
- Eliminates costly bus transceivers



# SelectI/O™ Standards

## Chip to Chip Interface

| Standard | $V_{REF}$ (V) | $V_{CCO}$ (V) |
|----------|---------------|---------------|
| LVTTL    | na            | 3.3           |
| LVCMOS2  | na            | 2.5           |
| LVCMOS18 | na            | 1.8           |
| LVDS     | na            | 2.5           |
| LVPECL   | na            | 3.3           |

## Backplane Interface

| Standard       | $V_{REF}$ (V) | $V_{CCO}$ (V) |
|----------------|---------------|---------------|
| PCI 33MHz 3.3V | na            | 3.3           |
| PCI 66MHz 3.3V | na            | 3.3           |
| GTL            | 0.80          | na            |
| GTL+           | 1.00          | na            |
| AGP-2X         | 1.32          | 3.3           |
| Bus LVDS       | na            | 2.5           |

## Memory Interface

| Standard      | $V_{REF}$ (V) | $V_{CCO}$ (V) |
|---------------|---------------|---------------|
| HSTL-I        | 0.75          | 1.5           |
| HSTL-III & IV | 0.90          | 1.5           |
| SSTL3-I & II  | 1.50          | 3.3           |
| SSTL2-I & II  | 1.25          | 2.5           |
| CTT           | 1.50          | 3.3           |



- $V_{REF}$ defines the input threshold reference voltage
- Available as user I/O when using internal reference



# I/Os Separated into 8 Banks



IOB=I/O Blocks

# I/O Signal Types



NOTE: Only the popular I/O types are shown here

# Single Ended I/O

- Traditional means of data transfer
- Data is carried on a single line
- Bigger voltage swing between logic Low and High





# System I/O

## Single-Ended I/O Standards Summary

| Type           | Chip to chip                  | Chip to Backplane              | Chip to Memory (HSTL)                               | Chip to Memory (SSTL)                                                      |
|----------------|-------------------------------|--------------------------------|-----------------------------------------------------|----------------------------------------------------------------------------|
| Key Standards  | LVTTL, LVCMOS                 | GTL, GTL+, AGP                 | HSTL I, III, IV                                     | SSTL2, SSTL3                                                               |
| Key Highlights | Higher voltage swing          | Low voltage swing              | Low voltage swing, low power, low noise, 200-400MHz | Low voltage swing, low power, low noise, SSTL3 82-166MHz, SSTL2 166-333MHz |
| Primary Usage  | Legacy interface              | Pentium CPU, backplanes        | High speed SRAM, MIPS/UltraSparc-II                 | Synchronous DRAM interfaces (SDR & DDR)                                    |
| Applications   | Glue logic, ASIC chip to chip | Datacom, Pentium, add-in cards | Line cards, graphics cards, digital cameras, modems | 3-D graphics cards, plasma LCD displays, DTV interfaces, Set-Top Boxes     |
| Vendors        | Most vendors                  | Intel, TI                      | Micron, IDT, Cypress, MIPS, IBM, etc.               | Micron, Samsung, Toshiba, Hyundai, NEC, Siemens, etc.                      |

# Differential I/O

- Latest means of data transfer
- One data bit is carried through two signal lines
  - Voltage difference determines logic High or Low
- Smaller voltage swing between logic Low and High
  - Higher performance
  - Lower power
  - Lower noise





# SelectI/O: Differential I/O Types

- LVDS (Low Voltage Differential Signaling)
  - Unidirectional data transfer
- Bus LVDS
  - Bi-directional communication between 2 or more devices
  - Can transmit and receive LVDS signals through the same pins
- LVPECL (Low Voltage Positive Emitter Coupled Logic)
  - Unidirectional data transfer
  - Popular industry standard for fast clocking



# More Differential I/O Information

- Xilinx web site  
(<http://www.xilinx.com/apps/xapp.htm>)
  - Application Notes
    - XAPP230, XAPP231, XAPP232, XAPP233,
    - XAPP237, XAPP238, XAPP243, XAPP245
- National Semi. web site  
(<http://www.national.com/appinfo/lvds>)
  - LVDS Design Guide
  - BLVDS White Paper



# System Interface Summary

- SelectI/O™ supports 19 IEEE/JEDEC I/O standards
  - High speed with differential I/Os
    - Low power, less noise
  - External high speed memory interface
    - Use HSTL and SSTL standards
  - High performance backplane applications
    - Use PCI, GTL and GTL+ standards
- Flexible I/O block
  - Programmable slew rate for EMI and ground bounce control
  - Independent input, output and programmable 3-state registers
  - Input delay for 0 hold time

# Spartan-II E Features



Configuration

# Configuration Basics



- Spartan-II E device
  - Is SRAM-based and hence volatile
    - Needs a configuration data source
    - Needs to be re-configured (re-programmed) upon power-up
  - In-system programmable (ISP)
    - Re-programmable/upgradable in the field
- Configuration
  - Programming the device with design logic





# Configuration

- Configuration data source
  - PROM
    - Serial/Parallel PROMs
  - Hard disk
  - Microprocessor memory
- Configuration interface
  - Simple serial
  - High-speed parallel
  - JTAG or boundary scan
  - Internet Reconfigurable Logic (IRL)
  - Microprocessor
  - CPLD





# JTAG Basics

- Also known as
  - IEEE/ANSI standard 1149.1
  - Boundary scan
- Set of design rules that facilitate
  - Testing
  - Programming
  - Debug
- Can be done at the chip, board, and systems level
- Can also have user-defined instructions
  - Example: vendor-specific instructions: configure and verify

**IEEE JTAG  
1149.1**





# JTAG Basics (cont'd)

- Rapid and automatic detection and isolation of defects due to common failures
  - Detect opens and shorts
- Ensure all components on PCB are
  - Mounted properly
  - In right place
  - Have proper interconnects among them
- Allows complete control and access to the boundary pins of a device without the need for
  - Bed-of-nails
  - Other test equipment




# JTAG Compliant Device



- Includes a boundary-scan cell connected to each input, output or bi-directional pin
  - Transparent and inactive under normal conditions
- Test mode
  - Input signals captured and output signals set to affect other devices on the board
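The capture-and-shift behavior of boundary-scan cells can be sketched as a shift register per device, with the devices daisy-chained from TDI to TDO. The model below covers only the data path and leaves out the TAP controller state machine; the class and function names are hypothetical.

```python
class BoundaryScanRegister:
    """One device's boundary-scan cells, modelled as a shift register."""

    def __init__(self, pin_count: int):
        self.cells = [0] * pin_count

    def capture(self, pin_values) -> None:
        """Test mode: latch the current pin values into the scan cells."""
        if len(pin_values) != len(self.cells):
            raise ValueError("one value per pin is required")
        self.cells = [value & 1 for value in pin_values]

    def shift(self, tdi: int) -> int:
        """One TCK edge: the last cell goes out on TDO, the TDI bit enters the first cell."""
        tdo = self.cells[-1]
        self.cells = [tdi & 1] + self.cells[:-1]
        return tdo


def shift_chain(devices, tdi_bits):
    """Shift bits through a chain; each device's TDO feeds the next device's TDI."""
    tdo_bits = []
    for bit in tdi_bits:
        for device in devices:
            bit = device.shift(bit)  # passes on the value held before this edge
        tdo_bits.append(bit)
    return tdo_bits
```

Shifting as many bits as there are cells in the chain reads out every captured pin value while loading new values, which is how board-level opens and shorts are detected without a bed-of-nails fixture.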



# JTAG Mode



- Supports readback through boundary scan port
- Can mix any Xilinx device (FPGA, CPLD, PROM) and non-Xilinx devices in the chain





# JTAG Mode (Cont'd)

- Dedicated TDI, TCK, TDO and TMS pins must operate at LVTTL
  - $V_{CCO}$ for bank 2 must be at 3.3V
- Maximum configuration rate of 33 MHz



# Xilinx Web: Configuration Solutions








# Xilinx Download Cables

- Types
  - MultiLINX™ cable
  - Parallel cable
- Well suited to prototyping and debugging
- Supports all traditional and JTAG-based configuration methods





# Cable Software Support

- iMPACT software
  - Included in Xilinx Alliance and Foundation ISE software tools

# Summary



# Spartan-II E: A System-Level Solution

- Hierarchical memory support
  - SelectRAM+ can be used to create bytes or Kbytes of internal storage and access megabytes of fast external memory
- System speedup and synchronization
  - Nullify clock distribution delays - 160 MHz system performance
  - Synthesize clocks for internal and external use
  - Synchronize systems: create clock mirrors/nullify board delay
- System level integration
  - Connect directly to existing and emerging I/O standards
- Vector-based interconnect
  - Much more predictable before place and route
  - Enhances synthesis-based flows



# Spartan-II E: A System-Level Solution

- IP solutions
- Software
  - Based on proven timing-driven place and route technology
- System-level features
  - RAM, DLLs, I/O standards
- Re-programmable



# Reference Slides



# SelectI/O™

- I/O can be programmed for 19 signal standards
  - Provides industry-standard IEEE/JEDEC I/O standards
  - Single-ended and differential
- Allows connection to
  - Processors, memory, bus-specific standards, mixed signal
  - High-performance backplanes
- Improved power-and-ground pin ratio to minimize ground bounce
- Simple entry of I/O standards in design tools



# Chip-to-Chip Interface Standards



ETL: Enhanced Transceiver Logic

# Chip-to-Chip Interface Standards (Cont'd)



# Backplane Interface Standards



# Memory Interface Standards





# SelectI/O Input Bank Rules

- Each bank has a single input reference voltage ($V_{REF}$)
  - Shared among all I/Os in the bank
  - All I/O types in a bank must use the same reference voltage
  - All $V_{REF}$ pins in a bank must be tied to the same voltage
- Inputs not requiring a $V_{REF}$ fit in the bank
  - LVTTL, LVCMOS, LVPECL, LVDS, PCI
- $V_{REF}$ pins in a bank are available as additional I/O if and only if no I/O type in the bank requires $V_{REF}$
  - Otherwise, all $V_{REF}$ pins must be used to supply the reference voltage
- OBUFTs with Keepers require a reference voltage and are treated as IOBUFs
- Input buffers with LVTTL, LVCMOS2/18, PCI33/66 are supplied by $V_{CCO}$





# SelectI/O Output Banks

- Each bank has a single source voltage ($V_{CCO}$)
  - Shared among all I/Os in that bank
  - All I/O types in a bank must use the same voltage source
  - All $V_{CCO}$ pins in a bank must be tied to the same voltage
- Only one $V_{CCO}$ voltage for smaller pin count packages
  - TQ144, PQ208
- Outputs not requiring $V_{CCO}$ fit in the bank
  - GTL, GTL+
- Configuration pins need special consideration
  - Configuration pins are located on the right side of the device in Banks 2 and 3
  - $V_{CCO}$ must be 3.3 volts for serial PROM configuration
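The input and output bank rules can be combined into a simple compatibility check. The sketch below is a simplification: it treats every pin as if it were both an input and an output, uses the $V_{REF}$ and $V_{CCO}$ values from the SelectI/O standards table in this deck, and ignores the differential and configuration-pin special cases. The dictionary keys and the function name are hypothetical.

```python
# standard: (V_REF, V_CCO) in volts; None stands for "na" in the standards table
STANDARDS = {
    "LVTTL": (None, 3.3), "LVCMOS2": (None, 2.5), "LVCMOS18": (None, 1.8),
    "LVDS": (None, 2.5), "LVPECL": (None, 3.3), "Bus LVDS": (None, 2.5),
    "PCI 33MHz 3.3V": (None, 3.3), "PCI 66MHz 3.3V": (None, 3.3),
    "GTL": (0.80, None), "GTL+": (1.00, None), "AGP-2X": (1.32, 3.3),
    "HSTL-I": (0.75, 1.5), "HSTL-III": (0.90, 1.5), "HSTL-IV": (0.90, 1.5),
    "SSTL3-I": (1.50, 3.3), "SSTL3-II": (1.50, 3.3),
    "SSTL2-I": (1.25, 2.5), "SSTL2-II": (1.25, 2.5),
    "CTT": (1.50, 3.3),
}


def bank_is_compatible(standards) -> bool:
    """True if the standards need at most one V_REF and at most one V_CCO value."""
    vrefs = {STANDARDS[name][0] for name in standards} - {None}
    vccos = {STANDARDS[name][1] for name in standards} - {None}
    return len(vrefs) <= 1 and len(vccos) <= 1
```

For example, LVTTL, SSTL3-I, and CTT can share a bank (one 1.5 V reference, one 3.3 V supply), while LVTTL and LVCMOS2 cannot because they need different $V_{CCO}$ values.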



# Single-Ended I/O Standards

## Benefits

- Reduced EMI compared to 3.3V TTL
  - Low Output Voltage Swing
  - Slow Edge Rates ($dV/dt$)
- Reduced Power Consumption
- Reduced Noise With External Termination
  - Reduced reflection
  - Reduced ringing
  - Reduced cross-talk
- Higher Performance/Higher Bandwidth



# Differential I/O Benefits

## I/O Connectivity

- Significant Cost Savings
  - Reduced EMI
  - Fewer pins
  - Fewer PCB layers, fewer PCB traces (PCB area savings)
  - Fewer/smaller connectors
  - No external transceivers
- High performance per pin pair - up to 400 Mb/sec
- Reduced EMI due to low output voltage swing
- High noise immunity
- Reduced power consumption
- Spartan-II E Supports LVDS, Bus LVDS, and LVPECL





# SelectI/O: Differential I/O

- Differential I/O is a standard feature
  - Supported in all device densities, all speed grades
- More differential I/Os within a device
  - Up to 240 I/O pairs
  - Offers flexibility in board layout
- Flexible differential I/Os
  - Use any I/O as input, output or bi-directional
- Spartan-II E
  - Can be driven by any standard LVDS/LVPECL driver
  - Complies with LVDS/LVPECL receiver specs





# SelectI/O: Differential I/O Configurations

- Point to Point
  - One transmitter and one receiver
  - Mostly used by LVDS/LVPECL in chip-to-chip applications
- Multi-Drop
  - One transmitter and multiple receivers
  - Used by Bus LVDS/LVPECL in backplane applications
- Multi-point
  - Multiple transceivers
  - Used by Bus LVDS/LVPECL in backplane applications





# SelectI/O: LVDS & LVPECL

- All I/Os have LVDS/LVPECL capability
- Differential signal pairs can be used as
  - Synchronous inputs or outputs
  - Asynchronous inputs
  - Some as asynchronous outputs
- Synchronous
  - Signal comes from IOB flip-flop
- Asynchronous
  - Signal comes from internal logic



# What is LVDS?



- LVDS - Low Voltage Differential Signaling
- LVDS is a differential signaling interconnect technology
  - Requires two pins per channel
- LVDS was first used as an interconnect technology in laptops and displays to alleviate EMI issues
- Technology is now widely used
  - A broad spectrum of telecom and networking applications
  - Mainstream consumer applications like digital video and displays



# LVDS Benefits

- Higher I/O speed
- Lower cost
  - Serialize multiple single-ended signals onto a differential channel
  - Save I/O pins
  - Use a smaller package
  - Save board space
- Technology and process independent
  - Easy migration path for lower supply voltages
  - Maintain same signal levels
  - Maintain same performance
- Low power
- Low noise
- Low EMI
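The pin saving from serialization can be estimated with a rough sketch like the one below. It assumes the 400 Mb/sec per-pair figure quoted for Spartan-IIE earlier in this deck, one extra differential pair for a forwarded clock, and no framing or encoding overhead. The function name and parameters are hypothetical, chosen only for illustration.

```python
import math

def lvds_pins_needed(num_signals, signal_rate_mbps,
                     pair_rate_mbps=400.0, clock_pairs=1):
    """Estimate pins needed to carry single-ended signals over LVDS pairs.

    Assumptions (not from the source): aggregate bandwidth is packed
    perfectly onto the pairs, and one pair carries a forwarded clock.
    """
    total_mbps = num_signals * signal_rate_mbps
    data_pairs = math.ceil(total_mbps / pair_rate_mbps)
    # Each differential pair uses two pins.
    return 2 * (data_pairs + clock_pairs)

# Hypothetical example: 32 single-ended signals at 50 Mb/sec each.
single_ended_pins = 32
lvds_pins = lvds_pins_needed(num_signals=32, signal_rate_mbps=50)
print(f"single-ended: {single_ended_pins} pins, LVDS: {lvds_pins} pins")
```

A real design would also need pins for control and framing signals, so this is a lower bound rather than a pin budget.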





# LVDS Low Power Advantage

- LVDS technology saves power in several important ways
- Power dissipation at the terminator is ~1.2 mW
  - RS-422 driver delivers 3 V across a termination of  $100 \Omega$ , for 90 mW power consumption... 75 times more than LVDS!
- Due to the current mode driver design, the frequency component of ICC is greatly reduced
  - By contrast, the dynamic power consumption of TTL/CMOS transceivers rises steeply with frequency
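The termination power figures above follow from $P = V^2 / R$. The sketch below reproduces them, assuming a 350 mV LVDS swing and a 100 Ω LVDS terminator (the typical values for the standard; the slide states 100 Ω only for the RS-422 case).

```python
def termination_power_mw(swing_volts, r_term_ohms):
    """Power dissipated in the termination resistor, in mW (P = V^2 / R)."""
    return (swing_volts ** 2) / r_term_ohms * 1e3

LVDS_SWING_V = 0.35    # assumed typical LVDS differential swing
RS422_SWING_V = 3.0    # from the slide
R_TERM_OHMS = 100.0    # from the slide (RS-422); assumed for LVDS

lvds_mw = termination_power_mw(LVDS_SWING_V, R_TERM_OHMS)
rs422_mw = termination_power_mw(RS422_SWING_V, R_TERM_OHMS)

print(f"LVDS:   {lvds_mw:.3f} mW")
print(f"RS-422: {rs422_mw:.1f} mW")
print(f"ratio:  {rs422_mw / lvds_mw:.1f}x")
```

With these inputs the LVDS figure is about 1.2 mW and the RS-422 figure is 90 mW. The ratio works out near 73 with the unrounded LVDS value and to 75 when the LVDS power is rounded to 1.2 mW first.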



# LVDS Noise Immunity Advantage

- $R_{OUT}$  is clean even in cases of extreme common mode noise contamination





# LVDS Benefits - Low EMI

- Low voltage swing (~350mV)
- Slow edge rates compared to other technologies (1V/ns)
- Current mode of operation ensures low  $I_{CC}$  spikes
- High noise immunity
  - Switching noise cancels between the two lines
  - Data is not affected by the noise
    - External noise affects both lines, but the voltage difference stays about the same
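The noise-cancellation argument can be illustrated with a small simulation. This is a minimal sketch, not a model of a real receiver: it assumes a 350 mV swing, a 1.2 V common-mode level, and noise that couples identically onto both lines. All function names are hypothetical.

```python
import random

COMMON_MODE_V = 1.2   # assumed typical LVDS offset voltage
SWING_V = 0.35        # approximate LVDS swing

def transmit(bits):
    """Encode each bit as (positive, negative) line voltages."""
    half = SWING_V / 2
    return [(COMMON_MODE_V + half, COMMON_MODE_V - half) if b
            else (COMMON_MODE_V - half, COMMON_MODE_V + half)
            for b in bits]

def add_common_mode_noise(pairs, amplitude_v, rng):
    """Add the same random noise voltage to both lines of each pair."""
    noisy = []
    for pos, neg in pairs:
        noise = rng.uniform(-amplitude_v, amplitude_v)
        noisy.append((pos + noise, neg + noise))
    return noisy

def receive_differential(pairs):
    """Recover bits from the sign of the voltage difference."""
    return [1 if (pos - neg) > 0 else 0 for pos, neg in pairs]

def receive_single_ended(pairs):
    """Recover bits from one line against a fixed threshold."""
    return [1 if pos > COMMON_MODE_V else 0 for pos, _ in pairs]

rng = random.Random(0)
bits = [rng.randint(0, 1) for _ in range(1000)]
noisy = add_common_mode_noise(transmit(bits), amplitude_v=1.0, rng=rng)

diff_errors = sum(a != b for a, b in zip(bits, receive_differential(noisy)))
se_errors = sum(a != b for a, b in zip(bits, receive_single_ended(noisy)))
print("differential errors:", diff_errors)
print("single-ended errors:", se_errors)
```

Because the noise term appears on both lines, it drops out of the subtraction and the differential receiver recovers every bit. The single-ended threshold comparison on the same data does not.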





# LVDS Applications

- Communications and Networking
  - Switches
  - Repeaters
  - Wireless base stations
- Data Communications
  - Routers
  - Hubs



# LVDS Applications (cont'd)

- Consumer Electronics
  - Digital cameras
  - Flat panel displays
- Office/Home
  - Printers
  - Copiers
- Various backplane applications



# Spartan-IIE LVDS Benefits

- Exceptional performance
  - Up to 400 Mb/sec per differential pair
- Significant Cost Savings
  - Reduced EMI
  - Fewer pins (smaller package)
  - Fewer PCB layers
  - Fewer PCB traces (PCB area savings)
  - Fewer/smaller connectors
  - No transceivers
- Quicker Time-to-market
  - Fewer EMI issues



# LVDS Driver and Receiver

## Driver



x133\_19\_122799

## Receiver



x133\_29\_122799



# SelectI/O: Bus LVDS

- All I/Os have Bus LVDS capability
- Fully compatible with industry-standard Bus LVDS devices from National Semiconductor and other vendors

# LVDS Benefits – Reduced I/O Count

Example

- Single-ended I/O: 80 pins
- LVDS I/O: 46 pins





# Spartan-IIE LVDS Example

## Clock Distribution



**Benefits - Higher performance, low EMI, lower cost, fewer components**



# Spartan-IIE LVDS Example

## Clock Conversion with Zero Delay



Benefits - Low EMI, lower cost, fewer components



# LVPECL Benefits

- Higher I/O speed
- Board-level clock distribution
  - Zero-delay conversion of LVPECL clocks into virtually any other I/O standard
- Lower cost
  - Serialize multiple single-ended signals onto a differential channel
  - Save I/O pins
  - Use a smaller package
  - Save board space
- Low power
- Low noise
- Low EMI





# LVPECL Applications

- Backplanes
- High performance clocking
  - 100 MHz and above
- Optical transceivers
- High speed networking
- Mixed-signal interfacing



# LVPECL Driver and Receiver

## Driver



x133\_20\_122799

## Receiver



x133\_21\_122799

# LVPECL: Clock Conversion



- Receive and convert high speed clocks with zero delay
- Zero-delay clock generation to any of the SelectI/O standards
- Eliminate costly bus translators



# Configuration Methods

- Master serial mode
- Slave serial mode
- Slave parallel mode
- JTAG mode
- IRL
- Multiple devices can be daisy-chained in
  - Master serial mode
  - Slave serial mode
  - JTAG mode





# Master Serial Mode



- Spartan-IIE device acts as the master
  - Generates configuration clock (CCLK) using internal oscillator
- PROM stores the configuration data
- Configuration rate selectable from 4-60 MHz
  - -30% to +45% variance due to process dependence
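The tolerance above translates into a frequency window around each nominal CCLK setting. The sketch below computes that window for the two ends of the selectable range. The function name is hypothetical, and treating the tolerance as a simple multiplier on the nominal rate is an assumption.

```python
def cclk_range_mhz(nominal_mhz, low_tol=-0.30, high_tol=0.45):
    """Return the (min, max) CCLK frequency in MHz for a nominal setting."""
    return nominal_mhz * (1 + low_tol), nominal_mhz * (1 + high_tol)

# Ends of the selectable range quoted on the slide.
for nominal in (4, 60):
    f_min, f_max = cclk_range_mhz(nominal)
    print(f"nominal {nominal} MHz -> {f_min:.1f} to {f_max:.1f} MHz")
```

Under this reading, the configuration PROM would need to keep up with the upper bound of the window, not just the nominal rate.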



# Slave Serial Mode



- Spartan-IIE device acts as a slave
  - An external clock source drives the CCLK pin
- Configuration data is stored in PROM, flash, micro-controller or microprocessor memory
- Maximum configuration rate of 66 MHz





# Slave Parallel Mode



*Single or multiple Spartan-IIE devices connected in parallel*



# Slave Parallel Mode (cont'd)

- Spartan-IIE device acts as a slave
  - An external clock source drives the CCLK pin
  - A microprocessor, microcontroller, or CPLD controls configuration
- Configuration data is stored in parallel PROM, flash, microcontroller, or microprocessor memory
- Fastest configuration mode
  - 8 bits per CCLK cycle
  - 50 MHz configuration rate (400 Mbit/sec)
- Supports Readback
  - Bi-directional read/write port for configuration and readback
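The 400 Mbit/sec figure is the CCLK rate multiplied by the port width. The sketch below compares it with the 66 MHz, one-bit-per-cycle slave serial mode and estimates load times. The bitstream size is a made-up round number for illustration, not a Spartan-IIE specification, and the function names are hypothetical.

```python
def throughput_mbps(cclk_mhz, bits_per_cycle):
    """Raw configuration throughput in Mbit/sec."""
    return cclk_mhz * bits_per_cycle

def config_time_ms(bitstream_bits, cclk_mhz, bits_per_cycle):
    """Time to shift in a bitstream at the raw throughput, in ms."""
    bits_per_second = throughput_mbps(cclk_mhz, bits_per_cycle) * 1e6
    return bitstream_bits / bits_per_second * 1e3

HYPOTHETICAL_BITSTREAM_BITS = 1_000_000  # illustrative only

modes = {
    "slave parallel": (50, 8),  # 50 MHz, 8 bits per CCLK cycle
    "slave serial": (66, 1),    # 66 MHz, 1 bit per CCLK cycle
}
for name, (cclk_mhz, width) in modes.items():
    rate = throughput_mbps(cclk_mhz, width)
    t_ms = config_time_ms(HYPOTHETICAL_BITSTREAM_BITS, cclk_mhz, width)
    print(f"{name}: {rate} Mbit/sec, {t_ms:.2f} ms")
```

These are raw shift-in times. They ignore start-up sequencing and any wait states on the parallel port.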





# IRL and Xilinx Online

- Internet Reconfigurable Logic (IRL)
  - IRL is a design methodology to create field upgradable applications
  - Supported by products, design guidelines and reference designs
- Xilinx Online
  - Xilinx program to enable, identify and promote field upgradable applications



# IRL Methodology Elements

- 4 main elements in IRL model
  - Host / Server
  - Network
  - Target to be updated
  - Payload(s)
- Xilinx provides an API (PAVE) and a set of design guidelines that define how remote devices can be upgraded via a network.

Figure labels: WindRiver®, PAVE, Host





# PAVE Features

- Configures FPGAs / CPLDs
  - IEEE 1149.1 JTAG / SelectMAP
- PAVE Payload upgrades PLD + system software
- Systems Integration Framework (SIF) within Wind River's Tornado® environment
- PAVE source distributed and supported by Xilinx





# MultiLINX™ Cable

- Configuration and Readback support
  - Using boundary scan (JTAG) mode
  - Slave serial/parallel mode
- Supports USB interface on PC
  - Fastest configuration
  - Data rate up to 12 Mb/sec
- Supports RS-232 interface on PC and UNIX
  - Baud rate
    - Up to 57.6 kbaud on PC
    - Up to 38.4 kbaud on UNIX





# MultiLINX Cable

## Top View



## Bottom View



# Parallel Cable

- Configuration and Readback support
  - Using boundary scan (JTAG) mode
- Supports parallel port on PC
  - Baud rate up to 57.6 kbaud



# Parallel Cable

