

# 高级计算机体系结构

Advanced Computer Architecture

集成电路设计流程

沈明华



# 目录

CONTENTS

01

芯片设计

02

数字流程

03

模拟流程

04

总结思考

# PART **02**

## 数字流程

# ■ 数字芯片设计流程



上述过程通过计算机软件(EDA)完成芯片设计

# ■ 数字芯片设计方法

□ 数字流程: Verilog 电路程序 → GDSII 版图

□ 示例: Western Digital RISC-V SweRV Core™ 处理器芯片[1]



- 32位
- 9级流水
- 28nm CMOS
- 1.8GHz

[1] [https://github.com/westerndigitalcorporation/swerv\\_eh1](https://github.com/westerndigitalcorporation/swerv_eh1).

# ■ Function Verification (数字仿真器)



# ■ Function Verification (Verification IP)



# ■ Universal Verification Methodology (UVM)

- UVM是一个以SystemVerilog为主体的验证平台开发框架
- 验证工程师利用其可重用组件可以构建具有标准化层次结构和接口的功能验证环境
- UVM是一个库，在这个库中，几乎所有的东西都是使用类(class)来实现的。
- 基于UVM的验证平台可以类似分为五个层次：信号层、命令层、功能层、场景层和测试层



# ■ Function Verification (硬件加速仿真器)



# ■ 高层次综合(High-Level Synthesis, HLS)

- HLS provides an automated path from system description to RTL
- Accelerate design time by working at higher level of abstraction
  - 1 Year reduced to a few months
  - New features added in days not weeks
- Cut verification time by 25-50%
  - C++/SystemC Simulates 50-1000x faster
  - Faster functional verification and debug iterations
- Determine optimal hardware microarchitecture
  - Rapidly explore multiple options for optimal PPA



# ■ 逻辑综合(Logic Synthesis)

- 逻辑综合是将所设计数字电路的RTL级描述，在满足约束条件下，将RTL级描述转化为指定的工艺库中单元电路的连接。主要包括三个阶段：翻译、优化和工艺映射
- 翻译：用HDL语言描述的电路转化为用GTECH库元件组成的逻辑电路的过程。
- 优化：对翻译后的初始电路在满足设计约束的前提下对功耗、速度和面积进行优化。
- 工艺映射：将优化后的电路映射到制造商提供的包含具体工艺信息(面积、时延、功耗、负载等)的库上，输出工艺相关的门级网表。



# ■ 逻辑综合(Logic Synthesis)

## ➤ 逻辑综合要做的事情:

- ✓ 逻辑变化: 从行为级描述变换成结构化的网表
- ✓ 保证变化前后的电路功能与行为描述一致
- ✓ 是一种多目标优化问题

## ➤ 优化目标

- ✓ Timing满足周期约束 (delay最小)
- ✓ 面积最小
- ✓ 功耗最小
- ✓ Congestion问题最少
- ✓ Fanout满足约束 (fanout数最小)
- ✓ DRC满足约束 (max transition/max cap 最小)
- ✓ ...

## ➤ 现有优化流程

- ✓ 分阶段逐次逼近目标
- ✓ 启发式迭代

## ➤ 现有流程存在的问题

- ✓ 无法达到最优解
- ✓ 逻辑优化缺乏对physical design的预测, 逻辑综合的最优解 (或次优解) 不一定是placement和routing后的最优解



# ■ Logic Synthesis History



# ■ Logic Synthesis Flow (E.g. Using Design Compiler)



# ■ 物理综合与签核(Physical Synthesis and Signoff)



# ■ 布局布线(Placement & Routing)

Placement



Floorplanning



Macro Placement



Cell Placement

Routing



Global Routing



Detailed Routing



Maze Routing

# ■ 布图规划(Floorplanning)

**Floorplanning : mapping between the logic description (graph) and the physical description (floorplan)**

**Goal: arrange blocks, pin assignment, feedthrough, decide locations of the I/O pads and power pads, etc.**

**Objectives:** Minimize chip area, delay, thermal, routing congestion, etc.



# ■ 布局(Placement)



**Global Placement:** Compute the best position for each node(cells, macros) to minimize the cost(e.g., WL, Rout ability, Timing), with local nodes overlaps allowed.

**Legalization:** Remove overlaps & place nodes to legal rows.

**Detailed Placement:** Refine the solution(Timing satisfied, Rout ability improved, etc)

Often bundled as Detailed Placement

# ■ 布线(Routing)

## Problem Definition

Routing : Given a placement, and a fixed number of metal layers, find a valid pattern of horizontal and vertical wires that connect the terminals of the nets.

Goal: 100% connectivity of a system, minimum wirelength

Constraints: Number of routing layers, design rules, timing, crosstalk, process variations

Two-step process: Global routing, detailed routing



## Global Routing



## Detailed Routing



## Layer Stacks



# ■ 静态时序分析(Static Timing Analysis, STA)

Types of paths considered for timing analysis



Setup and hold checks



# ■ 图划分(Graph Partition)

**诉求:** 最大百亿个有依赖的计算任务，放到有限的K个可执行单元（最大执行10K级任务数且有内存限制）进行执行，确保执行**性能最优**。



## 切分目标:

执行单元再负载均衡，满足容量多约束，执行单元间交互尽量少，整体的执行性能最优

| 约束    | 详细要求                               |
|-------|------------------------------------|
| 负载均衡  | 各执行单元负载均衡，允许一定范围波动                 |
| 容量多约束 | 单个执行单元满足一定的计算任务个数，需要同时考虑计算任务的内存占用  |
| IO交互  | IO交互有一定的通信带宽限制，IO尽量少               |
| 通信时延  | 需要考虑执行单元间的通信时延，执行间隔更远可能需要多跳到达，时延更长 |
| 指定节点  | 某些任务必须在同一个执行单元执行                   |

# ■ DFM (Design-for-Manufacture)

## DFM Analysis

### Litho Friendly Design



### Mask Analysis & Scoring



### Pattern Matching



### Via Redundancy Checking



### Chemical Mechanical Polishing



### Critical Area Analysis



Broadest range of applications, foundries, processes

## YieldEnhancer for Layout Modification

### Advance Filling With SmartFill



### ECO Fill Flow



### Via Modification



### Net/Timing Aware Flow



### Programmable Edge Modification



### Power Via Flow



# ■ DFT (Design-for-Test)

Ensure semiconductor devices begin defect-free



High quality manufacturing test  
for 0 DPPM device ship quality

Ensure any new defects are quickly detected throughout device operational life



Embedded device monitoring for ongoing defect detection

# ■ 测试、诊断、良率(Test, Diagnosis, Yield Learning)



# ■ 自动化测试向量生成

## □ 故障仿真在芯片测试流程中运行时间长影响芯片开发效率

- 自动测试向量生成(Automatic Test Pattern Generation, ATPG)

每条或每组测试向量生成之后，都需要进行故障仿真

在ATPG流程中故障仿真运行时间可以达到30%以上

- 故障诊断Diagnosis

故障诊断是根据测试结果反向推导根因的过程

在诊断流程中故障仿真运行时间可以达到70%以上



ATPG流程

# ■ 故障仿真

- 给定电路(circuit)、故障模型(fault model)、测试向量(test patterns),评估电路中有哪些故障可以被哪些向量检测到。



# ■ Layout Pattern Analysis for Yield

核心问题：如何定义Layout Representation，在保持旋转、镜像、平移不变性前提下，有效计算Pattern相似距离函数。



1. Open/Bridge POI 定义:



1. 以POI为中心点提取LP;



**感谢！**

---