

## 1. Design Process for Integrated Circuits (ICs) and Printed Circuit Boards (PCBs)

### 1.1 Design and Realization Alternatives

### 1.2 Design Methods

### 1.3 Use of Hardware Description Languages (HDLs)





Fakultät Informatik, Institut für Technische Informatik, Professur für Adaptive Dynamische Systeme

There are more embedded electronic systems around us than we think

Audio  
Video  
Television  
Medical Equipment  
House keeping  
Office

In an automobile



Fakultät Informatik, Institut für Technische Informatik, Professur für Adaptive Dynamische Systeme

## Characteristics: distributed system

In a premium class car up to:

- 80 computers (Electronic Control Units ECU)
- > 100 Electrical Motors
- > 2 km wiring
- millions of lines of code



# Automotive Electronic Control Units

## Characteristics: distributed, mechatronic

Fakultät Informatik, Institut für Technische Informatik, Professur für Adaptive Dynamische Systeme



Mechatronic Example:  
Hydraulic  
Brake System



Mit freundlicher Genehmigung  
der Daimler AG



ECU's are part of  
mechatronic systems  
for  
measurement and  
control

Software  
is part of  
ECU

Fakultät Informatik, Institut für Technische Informatik, Professur für Adaptive Dynamische Systeme







Xilinx VIRTEX  
XCV2000 E

## Properties:

- ca. 2.500.000 gates
- 80\*120 CLB-Array
- 43.200 “Logic Cells”
- Board system clock: 33MHz
- 0.18  $\mu$ m 6- Layer CMOS-Process
- Can be reprogrammed infinitely times
  - > thanks to SRAM memory
- 1 MB internal SRAM memory
- 130 MHz internal CLB clock frequency
- I/O Performance: 622 Mb/s
- 404 In-/outputs



## FPGA-Architectures:

- SRAM-based Look-up Tables (LUTs)
- Problems:
  - Interconnection: reduces performance
  - Relation: active / passive elements



**Rekonfigurable Interconnections (Switching Boxes)**



Source: R. Hartenstein

**Set of programmable “Logic Gates”, is integrated into a flexible interconnection network -> a “user-programmable” alternative to dedicated circuits**



**Programmable Gate**

## Solution:



**Look-up-Table (LUT)  
With 2 inputs**



## Virtex-II FPGA: integrated PowerPC ® 405 RISC CPUs (PPC405)





- Increasing Complexity
  - Structure size decreases
  - Chip dimensions increase
  - Number of transistors per chip quadruples every three years
- Higher Performance (clock, speed, power dissipation)
- Pressure of Competition
  - Almost no monopolies: extreme price pressure
  - Continually decreasing time for development
    - ⇒ “Time to market” decides about success or failure of a product
- “Design Reuse” for Productivity and Quality Improvement
  - ⇒ Requires complete and understandable documentation (IP - Intellectual Property)

Fakultät Informatik, Institut für Technische Informatik, Professur für Adaptive Dynamische Systeme



## 130nm Itanium® 2 Processor Highlights



- 410M transistors
- 374mm<sup>2</sup> die size
- 6MB on-die L3 cache
- 1.5GHz at 1.3V
- 6.4GB/s 400MT/s 4-way bus interface
- Plug-in compatible with existing platforms
- Extensive RAS, DFT and DFM features

**Dunnington:**

**6 Penryn cores  
on one chip**

**3 MByte L2-Cache  
per dual-core**

**16 MByte L3-Cache**

**1,8 billion transistors**

Publication:  
Intel Developer Forum  
Spring 2008





**Westmere-EX:**

**10 cores, 20 logical cores**

**30 MByte L3-Cache**

**2,9 billion transistors**

**32 nm structures**

**Publication:  
Spring 2011**

Fakultät Informatik, Institut für Technische Informatik, Professur für Adaptive Dynamische Systeme



**General Purpose  
Graphics Processing  
Unit (GPGPU)**

**Spring 2011  
largest chip ever built**

**3 billion transistors**

**512 cores**



- 32 cores form a SM Streaming Multiprocessor
- 16 SIMD units
- ECC support Error Detection and Correction
- Unified Level 2 Cache
- Six 64-bit memory partitions (384 bit memory interface)

Fakultät Informatik, Institut für Technische Informatik, Professur für Adaptive Dynamische Systeme

- 512 CUDA Cores (Compute Unified Device Architecture)
- Parallel computing architecture developed by NVIDIA
- Executes one floating point and one integer instruction per clock
- 16 Load / Store Units per SM
- SFU special function unit handles sine, cosine, square root, interpolation etc.
- 64KB Level 1 Cache



Fakultät Informatik, Institut für Technische Informatik, Professur für Adaptive Dynamische Systeme

| GPU                                        | G80               | GT200               | Fermi                       |
|--------------------------------------------|-------------------|---------------------|-----------------------------|
| Transistors                                | 681 million       | 1.4 billion         | 3.0 billion                 |
| CUDA Cores                                 | 128               | 240                 | 512                         |
| Double Precision Floating Point Capability | None              | 30 FMA ops / clock  | 256 FMA ops /clock          |
| Single Precision Floating Point Capability | 128 MAD ops/clock | 240 MAD ops / clock | 512 FMA ops /clock          |
| Warp schedulers / SM                       | 1                 | 1                   | 2                           |
| Special Function Units (SFUs) / SM         | 2                 | 2                   | 4                           |
| Shared Memory / SM                         | 16 KB             | 16 KB               | Configurable 48 KB or 16 KB |
| L1 Cache                                   | None              | None                | Configurable 16 KB or 48 KB |
| L2 Cache                                   | None              | None                | 768 KB                      |
| ECC Memory Support                         | No                | No                  | Yes                         |
| Concurrent Kernels                         | No                | No                  | Up to 16                    |

Principles for mastering complexity:

**1. Modularization**

**2. Hierarchy**

**3. Abstraction**

(Divide and Conquer)

Fakultät Informatik, Institut für Technische Informatik, Professur für Adaptive Dynamische Systeme

Modularization = subdivide a system into several smaller, logically related parts (subsystems, modules), based on several criteria\*:

- Black Box Concept
- Completeness
- Fully defined interfaces
- Manageability
- Testability
- Limited access  
(information hiding, access only through well defined interfaces)
- Minimum (no) mutual interference

\* according to E. Denert: "Software Modularisierung", Informatik Spektrum, Volume 2, Magazine 4, 1979

## Entity concept in VHDL

VHDL = VHSIC Hardware Description Language

VHSIC = Very High Speed Integrated Circuits



- Hierarchy = Principle of order (formally: partial order) for the components of a system. Different relations between the components are considered for the creation of hierarchies.
- Example: “Uses” hierarchy  
Module A uses module B, if the correct operation of module A depends on the correct operation of module B, but not the other way round.

“Uses” hierarchy in VHDL via instances



- A partial order  $\leq$  on a set  $X$  is a binary relation that is reflexive, antisymmetric and transitive, i.e., it holds for all  $a, b$  and  $c$  in  $X$  that:

- $a \leq a$  (reflexivity)
- $a \leq b \wedge b \leq c \rightarrow a \leq c$  (transitivity)
- $a \leq b \wedge b \leq a \rightarrow a = b$  (antisymmetry)

- Example:

Set  $X := \{a, b, c, d, e, f, g, h\}$

Relation:  $\leq$  (Hierarchy)

Hierarchy  $\leq$  on  $X$  is a partial order:

- $\forall x \in X: x \leq x$  ( $a \leq a, b \leq b, \dots$ )
- From  $d \leq a$  and  $e \leq d$  results  $e \leq a$
- From  $h \leq g$  and  $g \leq h$  results  $g = h$





n: Number of Gates  
m: Number of Modules

## Without Modularization



## With Modularization and Hierarchy



Fakultät Informatik, Institut für Technische Informatik, Professur für Adaptive Dynamische Systeme

- Abstraction is hiding of details:  
Important and unimportant objects are determined.
- Creation of abstraction levels:  
Only those features of an object that are important on a certain level are considered on that level; unimportant details are not considered.
- Important basis for the creation of abstraction levels is the evenness of abstraction:  
All data selected for the description on a certain abstraction level should have the same degree of abstraction.
- Abstraction supported in VHDL:  
Algorithmic Level, Register Transfer Level, Logic Level

## Y-Diagram according to Gajski and Walker



- Geometrical view

- Polygons
- Cells
- Floor plan



- Structural view

- Transistors, Wires
- Gates, Flip-Flops
- Modules, Subsystems



- Behavioral view

- Differential Equations
- Boolean Equations
- Register Transfers, Algorithms

```

A := 3*B+C
IF (D=TRUE) THEN
    A := A + 1
ELSE A := A - 1
ENDIF

```

Fakultät Informatik, Institut für Technische Informatik, Professur für Adaptive Dynamische Systeme







```
-- Mealy Machine
ENTITY mealy IS
  PORT (x, clock : IN bit;
        z         : OUT bit);
END mealy;

ARCHITECTURE behavioral OF mealy IS
  TYPE state_type IS (s0, s1, s2, s3);
  SIGNAL current_state, next_state:
    state_type;
BEGIN
  -- process to hold synchronous elements
  sync: PROCESS
  BEGIN
    WAIT UNTIL clock'EVENT AND clock='1';
    current_state <= next_state;
  END PROCESS;

  -- process to hold combinational logic
  comb: PROCESS (current_state, x)
  BEGIN
    CASE current_state IS
      WHEN s0 =>
        IF x='0' THEN
          z <= '0';
          next_state <= s0;
        ELSE

```

```
          z <= '1';
          next_state <= s2;
        END IF;
      WHEN s1 =>
        IF x='0' THEN
          z <= '0';
          next_state <= s0;
        ELSE
          z <= '0';
          next_state <= s2;
        END IF;
      WHEN s2 =>
        IF x='0' THEN
          z <= '1';
          next_state <= s2;
        ELSE
          z <= '0';
          next_state <= s3;
        END IF;
      WHEN s3 =>
        IF x='0' THEN
          z <= '0';
          next_state <= s3;
        ELSE
          z <= '1';
          next_state <= s1;
        END IF;
      END CASE;
    END PROCESS;
  END behavioral;
```

- **System Level**

- Basic Characteristics  
compute performance  
and communication  
bandwidth
- Simplified Functionality
- No Timing

– Description Methods:

- Behavior:
- Structure:
- Geometry:



System/Block Specification  
CPUs, Memory, Interface Units  
Partitioning of Chip/Board Area

- Algorithmic Level

- Small kernel algorithms
- No consideration of target architectures (floating point, word length)
- No Timing Details
- No System Clock

- Description Methods:

- |              |                          |
|--------------|--------------------------|
| – Behavior:  | Functions and Procedures |
| – Structure: | Subsystems, Busses       |
| – Geometry:  | Cluster                  |

## Behavioral View

```

A := 3*B+C
IF (D=TRUE) THEN
    A := A + 1
ELSE A := A - 1
ENDIF

```



- Register-Transfer Level

- Basic Operations (e.g. Addition, bit\_vectors)
- Transfers between Registers
- Timing via Clock/Reset Signals
- discrete values, discrete time
- Two- (multi-) valued logic signals



- Description Methods:

- Behavior:      Arithmetic Operations,  
                        Finite State Machines
- Structure:     Modules (Registers, Encoder, ...)
- Geometry:      Floor Plan



- Logic Level

- Multi-valued logic signals  
still discrete time
- Single wires and busses
- Timing (Delays)



- Description Methods:
  - Behavior:
  - Structure:
  - Geometry:

Boolean Equations  
 Gates, Flip-Flops (Net List)  
 Cells



- Circuit Level

- Value- and Time-Continuous Signals (Voltages and Currents)
- Actual Components (transistors, diodes, resistors, capacitors, inductors)
- Detailed Component Models
- Description Methods:
  - Behavior: Ordinary Nonlinear Differential Equations
  - Structure: Transistors, Resistors, Capacitors, Wires, Current/Voltage Sources
  - Geometry: Polygons





Fakultät Informatik, Institut für Technische Informatik, Professur für Adaptive Dynamische Systeme









| <b>Phase</b>                        | <b>Subjects</b>                                                                                                                                        | <b>Result</b>                                                                                |
|-------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------|
| <b>Study</b>                        | <b>Brainstorming, Market analysis,<br/>Realization analysis,<br/>Check for principle realization</b>                                                   | <b>Definition of the requirements<br/>General description<br/>Rough cost / time estimate</b> |
| <b>Concept</b>                      | <b>Behavior oriented description,<br/>Defining the function as a goal,<br/>Defining test strategy</b>                                                  | <b>Task manual<br/>Behavioral specification</b>                                              |
| <b>Design</b>                       | <b>Architecture development,<br/>Hardware /Software tradeoff,<br/>Technical description of the development,<br/>Detail specification, test methods</b> | <b>Target specification<br/>Architecture specification<br/>Test specification</b>            |
| <b>Implementation</b>               | <b>Refinement<br/>Encoding,<br/>Manual design and synthesis,<br/>Module Test</b>                                                                       | <b>Verified components<br/>Implementation documentation<br/>Integration specification</b>    |
| <b>Integration and Test</b>         | <b>Integration of modules<br/>Verification of subsystems<br/>Verification of system</b>                                                                | <b>Verified prototype<br/>Integration documentation</b>                                      |
| <b>Production<br/>Commissioning</b> | <b>Transfer to product management<br/>Production</b>                                                                                                   | <b>Cleared product</b>                                                                       |
| <b>Maintenance and Service</b>      | <b>Changes, Improvements, Upgrades</b>                                                                                                                 | <b>Maintenance / error protocols</b>                                                         |
| <b>Field Service</b>                | <b>Recycling , Disposal</b>                                                                                                                            | <b>Lifecycle documentation</b>                                                               |

Fakultät Informatik, Institut für Technische Informatik, Professur für Adaptive Dynamische Systeme



Fakultät Informatik, Institut für Technische Informatik, Professur für Adaptive Dynamische Systeme



Independent of  
Technology

Dependent on  
Technology

## Hardware Description

```
architecture structural of first_tap is
signal x_q_red : std_logic_vector(bitwidth-1
downto 0);
signal mult   : std_logic_vector(2*bitwidth-1
downto 0);

begin
delay_register:
process(reset,clk)
begin
if reset='1' then
  x_q <= (others => '0');
elsif (clk'event and clk='1') then
  x_q <= x_in;
end if;
end process;

mult <= signed(coef)*signed(x_q);
```

**Synthesis  
(Synopsys)**

**Net List**



**Synthesis  
(Synopsys)**



**Chip Wafer**

**Place & Route  
(Cadence/  
Mentor)**

**Manufacture**

**Levels of Abstraction**



**Mask  
Layout**





## Description Data on Level i



## Description Data on Level i + 1

Fakultät Informatik, Institut für Technische Informatik, Professur für Adaptive Dynamische Systeme

## Development of Electronic Systems

=

### Sequence of Design Steps

(Analysis and Synthesis, creative, manual and automatic)

and





- Abstraction is applied both Top-Down and Bottom-Up
  - Example:
    - Abstraction top-down:  
No separate data path and control path after switching from register-transfer level to logic level.
    - Abstraction bottom-up:  
Value- and time-continuous description on the electric circuit level becomes value- and time-discrete description on the logic level.
- Modularization is possible on every abstraction level
- Hierarchy (concept orthogonal to abstraction)
  - A hierarchical description may comprise different abstraction levels.
  - A hierarchical description can be applied to one distinct abstraction level.





## Architecture specification



## Algorithmic specification

```

A := 3*B+C
IF (D=TRUE) THEN
  A := A + 1
ELSE A := A - 1
ENDIF
    
```

## Block Diagram (independent of manufacturer and technology)



## Schematic Net list (independent of manufacturer and technology)



## Implementation

### Architecture Synthesis

- Partitioning
- Parallel processes
- Analysis

### Algorithmic Synthesis

- Synthesis of digital circuits
- Optimization
- Design alternatives (serial / parallel)

### Net list optimization

- Inclusion of Libraries
- Refining the structural description

### Conventional CAD-System

Standard ICs  
PCBs

Standard cells  
Gate Arrays



























- Complexity
  - About every three years: complexity \* 4
  - Today: typically >> 100 000 gates/chip
  - Computer aided design necessary as manual design is no longer possible
- Computer aided design tools in various areas:
  - Specification
  - Simulation
    - Prototype production is too expensive and time-consuming
    - Early verification is only possible through simulation
  - Synthesis
    - Given current design complexity, manual design is no longer possible
    - Machine-readable description for specification, simulation and synthesis is necessary

- Data exchange
  - Between customer and developer (requirements specification, target specification)
  - During development between
    - Team members
    - Different tools
    - Different computer systems
  - Between development and production (production data)
    - » Uniform format for data exchange is necessary
- Documentation
  - Maintenance / Service / Further development
  - Re-use of designs
    - » Human readable format necessary for documentation

- System design without using an HDL:

- System developer defines architecture and requirement specification for modules
- Modules are developed bottom-up by different development teams, each team designing and verifying its own module
- All modules must have been designed and connected before simulation of the system is possible
- Comparison of specified data and characteristic data occurs very late in the development cycle
- Specification data is generally subject to change during the design process

- System design with an HDL:

- System developer defines architecture and functional requirement specification for modules using an HDL
- HDL as functional specification can be simulated, therefore the system's performance can be determined at an early stage: e.g. hardware/software-tradeoff, hardware and micro-code can be developed in parallel
- Tools for logic synthesis can be used earlier. A specification written in an HDL is no longer a reference document for the design, but actually represents the design itself. (Synthesis simplifies verification: "correct by construction")
- Regular comparison of specified data and characteristic data is possible