

# CircuitFusion: Multimodal Circuit Representation Learning for Agile Chip Design

Wenji Fang, Shang Liu, Jing Wang, Zhiyao Xie

wfang838@connect.ust.hk

Hong Kong University of Science and Technology

ICLR 2025

# Circuit Representation Learning



# Background: AI for IC Design

- **Remarkable achievements**
  - **Design quality evaluation**
    - Power, timing, area, routability, etc.
  - **Functional reasoning**
    - Arithmetic word-level abstraction, SAT, etc.
  - **Optimization**
    - Design space exploration, etc.
  - **Generation**
    - RTL code, verification, etc.
  - ...



# Paradigm Shift in AI for IC Design

- Traditional: **task-specific supervised learning**
  - Tedious, time-consuming, and hard to generalize

- New trend: **general self-supervised representation learning**
  - Encode circuits into **embeddings** for various **EDA tasks**
    - *Pre-training* to capture circuit intrinsics
    - *Fine-tuning* for EDA tasks





# Existing Circuit Representation Learning

- Explorations emerging since 2021
- Cover all main design stages
  - HLS, RTL, netlist, layout
- Research focus
  - Pre-training techniques
    - Circuit-related supervision
    - Circuit-specific self-supervision
  - Predictive EDA tasks
- **Limitation: focus only on circuit *graph structure***





# Multimodal Representation Learning

- **Encode & fuse information from diverse modalities**
  - Vision-language
  - Graph-language
  - Software-graph
  - ...



- **Can we fuse multiple circuit modalities to learn better circuit representation?**

# CircuitFusion: Multimodal Circuit Representation Learning



# Summary of Circuit Modalities

- Multimodal nature of RTL-stage circuits

  - **Functionality summary** (*semantic*)
  - **Implementation details**

```
reg [1:0] R0,R1;
reg [2:0] R2;
wire [2:0] W1,W2;
...
assign W1 = R0 + R1;
...
always @ (posedge clk)
    R2 <= W2;
```

**HDL code**




**Structure graph** (*structure*)



# CircuitFusion Overview

- Multimodal fused & implementation-aware RTL encoder
  - Propose 4 strategies according to 4 unique circuit properties
  - Achieve SOTA performance on various EDA tasks





# CircuitFusion Model Architecture

- 3 **unimodal encoders**: graph, summary, code
- 1 **multimodal fusion encoder**: cross-attention, summary-centric fusion
- 1 **auxiliary netlist encoder**: implementation-aware
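
The cross-attention fusion step can be illustrated with a small sketch. It assumes that "summary-centric" means the summary tokens act as queries while graph and code tokens supply keys and values; the slides do not spell this out, and the function names, the single head, and the absence of learned projections are simplifications for illustration only.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention without learned projections."""
    d = len(keys[0])
    fused = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors
        fused.append([sum(w * v[j] for w, v in zip(weights, values))
                      for j in range(len(values[0]))])
    return fused

# Hypothetical toy embeddings: summary tokens query graph/code tokens
summary_tokens = [[1.0, 0.0], [0.0, 1.0]]
graph_and_code_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_attention(summary_tokens, graph_and_code_tokens, graph_and_code_tokens)
```

Each fused vector has the same length as the summary sequence, so the summary stays the anchor while absorbing structure and code information.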





# Step 1: Circuit Preprocessing

- **Property 1: parallel execution**
  - Combinational logic evaluates simultaneously
  - Sequential registers are updated only once per clock cycle

- **Strategy 1: sub-circuit generation**
  - Split based on register cones
    - Backtrace all combinational input logic
  - Advantages
    - Consistency in modality & stage
    - Complete state transition of 1 cycle
    - Intermediate granularity
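
The register-cone split can be sketched as a backward traversal. This is a minimal illustration, assuming a hypothetical netlist representation in which `fanin` maps each signal to the signals that drive it and `registers` is the set of register names; neither is taken from the CircuitFusion code. The traversal stops at other registers and at primary inputs, so each cone covers one clock cycle.

```python
from collections import deque

def register_cone(reg, fanin, registers):
    """Backtrace the combinational fan-in of `reg`.

    Returns (cone, boundary): the combinational signals inside the cone and
    the register outputs / primary inputs where the backtrace stops.
    """
    cone, boundary = set(), set()
    queue = deque(fanin.get(reg, []))
    while queue:
        sig = queue.popleft()
        if sig in cone or sig in boundary:
            continue
        if sig in registers or not fanin.get(sig):
            boundary.add(sig)   # cone input: register output or primary input
        else:
            cone.add(sig)       # combinational node inside the cone
            queue.extend(fanin[sig])
    return cone, boundary

# Toy design loosely following the HDL snippet. The driver of W2 is elided in
# the slide, so W2 = f(W1, IN) is a hypothetical assumption made here.
fanin = {
    "W1": ["R0", "R1"],
    "W2": ["W1", "IN"],
    "R2": ["W2"],
}
registers = {"R0", "R1", "R2"}
cone, boundary = register_cone("R2", fanin, registers)
```

Running the same function for every register yields one sub-circuit per register, which is what gives the intermediate granularity listed above.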





# Step 2: CircuitFusion Pre-Training

- **Property 2: functionally equivalent transformation**
  - Circuits with similar **function** may have different **structures**
- **Strategy 2: semantic-structure pre-training**
  - Self-supervised Tasks #1-3 for each modality and for multimodal fusion





# Step 2: CircuitFusion Pre-Training

- **Property 3: multiple design stages**
  - RTL (high-level semantics) → netlist (low-level details)
- **Strategy 3: implementation-aware alignment**
  - Pre-train with a netlist encoder across design stages (Task #4)
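
One common way to align embeddings across two views is a contrastive (InfoNCE-style) loss. The sketch below assumes that form; the slides do not state the actual objective used for Task #4, and `alignment_loss`, the temperature, and the toy vectors are illustrative choices. Row `i` of each list is taken to be the same sub-circuit at the RTL and netlist stages.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def alignment_loss(rtl_embs, netlist_embs, temperature=0.1):
    """InfoNCE-style loss pulling each RTL embedding toward its own netlist embedding."""
    total = 0.0
    for i, r in enumerate(rtl_embs):
        logits = [cosine(r, n) / temperature for n in netlist_embs]
        m = max(logits)
        # log-sum-exp over all netlist candidates (positives and negatives)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(rtl_embs)

# Hypothetical embeddings: matched pairs should give a lower loss than mismatched ones
rtl = [[1.0, 0.0], [0.0, 1.0]]
aligned = alignment_loss(rtl, [[0.9, 0.1], [0.1, 0.9]])
shuffled = alignment_loss(rtl, [[0.1, 0.9], [0.9, 0.1]])
```

Minimizing a loss of this kind pushes the RTL embedding to carry information that is only explicit after synthesis, which is the sense in which the encoder becomes implementation-aware.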





# Step 3: Application for EDA Tasks

- **Property 4: circuit reusability**
  - Reuse circuit IPs rather than designing from scratch
- **Strategy 4: retrieval-augmented inference**
  - Retrieve the most similar circuits based on embeddings
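
Embedding-based retrieval can be sketched as a top-k nearest-neighbor lookup. This example assumes cosine similarity and assumes the estimate is the mean label of the retrieved circuits; the database contents and function names are hypothetical, and the slides do not specify how retrieved circuits are combined.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def retrieve_top_k(query, database, k):
    """Return the k (similarity, label) pairs most similar to `query`."""
    scored = [(cosine(query, emb), label) for emb, label in database]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

def retrieval_estimate(query, database, k):
    """Estimate a label as the mean label of the k nearest circuits (an assumption)."""
    top = retrieve_top_k(query, database, k)
    return sum(label for _, label in top) / len(top)

# Hypothetical database of (embedding, known metric) pairs for previously seen circuits
database = [([1.0, 0.0], 10.0), ([0.9, 0.1], 12.0), ([0.0, 1.0], 50.0)]
estimate = retrieval_estimate([1.0, 0.05], database, k=2)  # mean of 10.0 and 12.0
```

Because the lookup needs no task-specific training, the same routine also illustrates the zero-shot setting, where labels of retrieved circuits stand in for a fine-tuned predictor.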



# Experimental Results

# Design Quality Prediction Tasks



- SOTA performance on RTL-stage PPA prediction vs.
  - Circuit task-specific solutions
  - Text encoders
  - Software code encoders

| Type                  | Method               | Slack R     | Slack MAPE | WNS R       | WNS MAPE   | TNS R       | TNS MAPE   | Power R     | Power MAPE | Area R      | Area MAPE  |
|-----------------------|----------------------|-------------|------------|-------------|------------|-------------|------------|-------------|------------|-------------|------------|
| Hardware Solution     | RTL-Timer            | 0.85        | 17%        | 0.9         | 16%        | 0.96        | 25%        | N/A         |            | N/A         |            |
|                       | MasterRTL            | N/A         |            | 0.89        | 18%        | 0.94        | 28%        | 0.89        | 26%        | 0.98        | 16%        |
|                       | SNS v2               | N/A         |            | 0.82        | 22%        | N/A         |            | 0.76        | 28%        | 0.93        | 25%        |
| Text Encoder          | NV-Embed-v1          | N/A         |            | 0.49        | 17%        | 0.97        | 55%        | 0.85        | 44%        | 0.86        | 24%        |
| Software Code Encoder | UnixCoder            | N/A         |            | 0.46        | 21%        | 0.95        | 44%        | 0.83        | 29%        | 0.85        | 26%        |
|                       | CodeT5+ Encoder      | N/A         |            | 0.55        | 21%        | 0.63        | 43%        | 0.49        | 46%        | 0.45        | 39%        |
|                       | CodeSage             | N/A         |            | 0.23        | 25%        | 0.86        | 45%        | 0.8         | 38%        | 0.77        | 41%        |
| <b>Ours</b>           | <b>CircuitFusion</b> | <b>0.87</b> | <b>12%</b> | <b>0.91</b> | <b>11%</b> | <b>0.99</b> | <b>15%</b> | <b>0.99</b> | <b>13%</b> | <b>0.99</b> | <b>11%</b> |
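
The two metrics in the table can be computed as follows. This is a minimal sketch, assuming R denotes the Pearson correlation coefficient and MAPE the mean absolute percentage error; the sample labels and predictions are made up for illustration and are not results from the paper.

```python
import math

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between labels and predictions."""
    n = len(y_true)
    mean_t, mean_p = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    std_t = math.sqrt(sum((t - mean_t) ** 2 for t in y_true))
    std_p = math.sqrt(sum((p - mean_p) ** 2 for p in y_pred))
    return cov / (std_t * std_p)

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (labels must be non-zero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up example values
labels = [1.0, 2.0, 4.0]
preds = [1.1, 1.8, 4.4]
r_value = pearson_r(labels, preds)
mape_value = mape(labels, preds)  # every prediction is off by 10% of its label
```

Higher R and lower MAPE are better, so a strong row in the table combines an R close to 1 with a small percentage error.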



# Ablation Study

- Impact of the proposed **strategies** and circuit modalities





# Design Quality Prediction Tasks

- Zero-shot retrieval
  - First hardware solution to support zero-shot inference
  - Outperforms text and software code encoders

Table 3: MAPE(%) results of the zero-shot top-k similar circuit retrieval.

| Method            | Slack top-1 | Slack top-3 | Slack top-5 | Slack top-10 | Sub-circuit Power top-1 | Sub-circuit Power top-3 | Sub-circuit Power top-5 | Sub-circuit Power top-10 | Sub-circuit Area top-1 | Sub-circuit Area top-3 | Sub-circuit Area top-5 | Sub-circuit Area top-10 |
|-------------------|-------|-------|-------|--------|-------------------|-------|-------|--------|------------------|-------|-------|--------|
| LLM Encoder       | 51    | 35    | 33    | 34     | 92                | 90    | 90    | 90     | 90               | 88    | 88    | 88     |
| UnixCoder         | 56    | 36    | 36    | 36     | 90                | 89    | 90    | 91     | 89               | 88    | 89    | 89     |
| CodeT5+ Embedding | 57    | 35    | 35    | 36     | 88                | 87    | 89    | 90     | 87               | 86    | 87    | 88     |
| CodeSage          | 50    | 36    | 36    | 36     | 89                | 87    | 88    | 91     | 88               | 85    | 86    | 87     |
| Ours              | 21    | 22    | 23    | 26     | 36                | 40    | 42    | 53     | 35               | 40    | 42    | 51     |

- Performance scales up with model & data size



# Conclusion & Future Work



# Conclusion

- **CircuitFusion:** first multimodal RTL encoder
  - 4 strategies according to 4 unique circuit properties
  - Support various EDA tasks
- Future work
  - Multimodal netlist encoder via text-attributed graph [DAC'25]
  - Align circuit encoders with generative LLM decoders



# Thank You!