

# Ziyao Yin

(470) 656-1775 • [tzuyaoyin@gmail.com](mailto:tzuyaoyin@gmail.com) • [LinkedIn](#)

## Education

### Georgia Institute of Technology

M.S., Department of Electrical and Computer Engineering

Aug 2024 - May 2026

Coursework: Advanced VLSI Systems; Advanced Computer Architecture; Digital Systems Testing; Physical Design Automation of VLSI Systems; GPU Hardware and Software; Physical Foundations of Computer Engineering; HW-SW Co-Design of ML Systems.

### Southern University of Science and Technology

B.E., Department of Computer Science & Engineering

Sep 2020 - Jul 2024

GPA: 3.85/4.0, Rank: 16/220

Coursework: Computer Architecture; Digital Logic; Operating Systems; Object-Oriented Programming; Data Structures.

## Work Experience

### Silicon Jackets, Georgia Institute of Technology

Verification Team Member

Aug 2025 - Present  
Atlanta, GA, USA

- Owned UVM verification for the Tapeout-2 RISC-V CPU RAS predictor: authored test plan + functional/spec docs, built UVM environment, and achieved 97%+ functional coverage.
- Developing a cycle-accurate C++ golden model for Tapeout-3 RISC-V superscalar CPU; implemented Spike trace-driven checking for incremental validation as pipeline/RTL modules land.
- Designed a modular reverse-order simulation framework (stable interfaces + unit tests + docs) to enable parallel development across multiple contributors.

### EIC Lab, Georgia Institute of Technology

Graduate Research Intern

Aug 2024 - Dec 2024  
Atlanta, GA, USA

- Built a systolic-array accelerator baseline RTL in Verilog and a simulation testbench to validate functional correctness.
- Prototyped an MoE runtime switching mechanism, providing an extensible baseline for subsequent feature completion.

### CORSA Lab, UC Irvine

Undergraduate Research Intern

May 2023 - Sep 2024  
Irvine, CA, USA

- Built HyTrans, an HBM-based end-to-end Transformer accelerator on Xilinx Alveo U280 (HBM2) using Vitis HLS.
- Designed hybrid dataflow optimizations and integrated 2D + 1D systolic arrays; achieved 1.69 ms end-to-end prefill latency at 250 MHz; invited to present a poster at DAC 2024. [\[Project Page\]](#)

## Selected Projects

### RTL-to-GDSII for Double-Buffered Memory Compute Block (Silicon Jackets) | SystemVerilog, Tcl, STA

- Designed a 32-bit microprocessor in SystemVerilog, featuring memory access and double-buffered data movement.
- Built a structured SystemVerilog testbench (driver/monitor/scoreboard), reaching 99% functional coverage.
- Took the design through an RTL-to-GDSII flow and post-route STA using OpenLane and Cadence Innovus, with Tcl/Python automation for timing report analysis and macro placement.

### GPU Microarchitecture Analysis (GT ECE 8803) | C/C++, CUDA C++, SASS Lifter, LLVM

- Modeled and profiled a modern GPU microarchitecture in MacSim to establish a baseline for performance characterization.
- Implemented and compared multiple warp scheduling policies across compute and tensor execution.
- Built a SASS-to-LLVM analysis flow to extract CFG information, enabling automated identification of branch divergence patterns.

### Digital System Test (GT ECE 6140) | Python, Fault Simulation, ATPG (PODEM)

- Implemented a Python-based ATPG engine using PODEM with 5-valued logic to generate tests for ISCAS stuck-at faults.
- Built a deductive fault simulator to report per-vector detected faults at primary outputs, validating ATPG correctness.
- Added random testing and coverage analysis to estimate the number of vectors needed for 90% fault coverage.

### Corograph: Graph Algorithm Framework (Research) | C/C++, Intel Xeon, NUMA/Cache Optimization [\[Paper\]](#)

- Built a NUMA- and cache-aware graph computing framework for Intel Xeon server CPUs, improving data locality and intra-thread parallelism while delivering state-of-the-art performance on mainstream graph algorithms; published in **VLDB 2024**.

## Technical Skills

### Programming

Verilog/SystemVerilog, UVM, C/C++, Python, CUDA C++, Java, MATLAB, JAX

### EDA Tools

Synopsys VCS/DC, Cadence Genus/Innovus, Cadence Virtuoso, Vivado, Vitis HLS, OpenRAM

### Hardware Platforms

FPGA (Xilinx), ASIC flow (sky130 PDK), GPU microarchitecture, Google TPU (JAX)