

# TinyTapeout

Matt Venn\*

\*Cofounder, TinyTapeout

Email: matt@tinytapeout.com

**Index Terms**—Open Source Silicon, TinyTapeout

## I. INTRODUCTION

**TINYTAPEOUT** [1] is an educational project that makes it easier and cheaper than ever to get ASIC designs manufactured. The digital design flow consists of templating a GitHub [2] repository, adding a design, waiting for the tests and the generation of the binary layout files (GDS [3]) to complete, then submitting to a quarterly shuttle.

Up to 500 designs are multiplexed to 24 general purpose input/output (GPIO) pins, and after manufacture the chips are mounted to a demonstration board for easy testing. Each design can be activated and tested in turn. Documentation submitted with each project forms a printable datasheet [4] as well as an online index at [TinyTapeout.com/runs/](http://tinytapeout.com/runs/) [5].

Design entry is done mostly with Verilog or Wokwi [6]. Wokwi is a web-based schematic editor that offers an easy way to get started for people with no prior hardware description language (HDL) experience. The TinyTapeout website includes a basic getting started guide for drawing circuits with Wokwi, available in English and Spanish.

The first [7] free and experimental shuttle with 152 designs was submitted to the seventh Google sponsored [8] lottery multi project wafer (MPW) shuttle in September 2022. The next 4 shuttles combined 582 designs and were sponsored by and manufactured with the Efabless [9] chipIgnite MPW service.

Each tile is approximately  $100 \times 160\mu m$ , enough for around 1000 logic gates and is priced at \$50. The physical chip and demo board are optional and cost an additional \$250. Individuals pay a reduced \$100 for their first chip and board thanks to sponsorship by Efabless [9].

By separating the cost of area and the cost of the chip, a group of 10 could submit 10 designs and share 1 board for \$600.
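The arithmetic behind that figure can be checked with a short Python sketch. The prices are the ones quoted above; the function name `group_cost` and the assumption that the reduced first-board price applies to exactly one board in the group are illustrative, not part of the TinyTapeout pricing rules.

```python
TILE_PRICE = 50          # USD per tile, from the text
BOARD_PRICE = 250        # USD for the optional chip and demo board
FIRST_BOARD_PRICE = 100  # USD, reduced price for an individual's first chip and board


def group_cost(tiles, boards, first_time_discount=True):
    """Total cost in USD for `tiles` tiles and `boards` chip-and-board sets.

    Assumption for this sketch: the discount applies to one board only.
    """
    cost = tiles * TILE_PRICE
    if boards > 0:
        first = FIRST_BOARD_PRICE if first_time_discount else BOARD_PRICE
        cost += first + (boards - 1) * BOARD_PRICE
    return cost


print(group_cost(10, 1))         # 600
print(group_cost(10, 1, False))  # 750
```

Ten tiles cost \$500 and the shared, discounted board adds \$100, which gives the \$600 total.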

The GitHub templates [10] make use of GitHub Actions [11] - an automatic continuous integration system that is triggered every time the repository is updated. There are 4 main jobs:

- 1) GDS - installs OpenLane and the Sky130 process design kit (PDK), then builds the GDS, and generates a summary of the design that includes utilization, standard cells used, and a 2D and 3D model of the GDS. This job can optionally also run a gate-level verification.
- 2) Verification - installs the YosysHQ open source CAD suite which includes many common electronic design automation (EDA) tools. Then iVerilog [12] and cocotb [13] are used to run any testbenches included.
- 3) Documentation - generates a preview of the documentation.
- 4) Precheck - a number of tests are run to make sure that the design doesn't cause design rule check (DRC) errors after integration into the chip.

Successful GDS, Documentation, and Precheck jobs are required to submit to a shuttle. Verification is optional but highly encouraged. Wokwi designs can make use of an integrated truth table testing system [14].

While the process can be done entirely in the browser, it's also possible to install a local copy of the tools, which can help to reduce iteration time, especially for tests and verification.

### Routing stats

| Utilisation (%) | Wire length (um) |
|-----------------|------------------|
| 51.7            | 16572            |

### Cell usage by Category

| Category    | Cells                                                                                                                                                                                                                      | Count |
|-------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------|
| Fill        | decap fill                                                                                                                                                                                                                 | 1145  |
| Combo Logic | o21ai nand3b a221o o31a o2b a21o a21bo nor2b o31ai a41o a21oi a211o o21ba o21a o21ai a26b2o or3b o221a a31o a22o o32a o22a a32o<br>a2111o and2b and3b or4b or4b o211a a2111oi o2bb2a a211oi and4bb o211ai a22oi a31oi o31a | 249   |
| Tap         | tappyprvnd                                                                                                                                                                                                                 | 246   |
| Flip Flops  | d1tp                                                                                                                                                                                                                       | 146   |
| Buffer      | b1f b1bf                                                                                                                                                                                                                   | 127   |
| AND         | and2 a21boi and3 and4                                                                                                                                                                                                      | 97    |
| Misc        | dlygate4sd3 dlymeta6s2s comb                                                                                                                                                                                               | 84    |
| OR          | or3 xor2 or2 or4                                                                                                                                                                                                           | 81    |
| NOR         | xnor2 nor2 nor3 nor4                                                                                                                                                                                                       | 64    |
| NAND        | nand2 nand3 nand4 nand2b                                                                                                                                                                                                   | 52    |
| Inverter    | inv                                                                                                                                                                                                                        | 37    |
| Multiplexer | mxu2 mxu4                                                                                                                                                                                                                  | 9     |
| Diode       | diode                                                                                                                                                                                                                      | 1     |

947 total cells (excluding fill and tap cells)

Fig. 1. The summary table of the GDS job.



Fig. 2. The 2D render of the cells in use, with empty areas visible in the lower left corner.

TABLE I  
STATISTICS FOR EACH OF THE TINYTAPEOUT SHUTTLE RUNS.

| Run  | Launched   | Closed     | Shuttle | Designs | Chips Expected | Estimated delivery date         |
|------|------------|------------|---------|---------|----------------|---------------------------------|
| TT01 | 2022-08-17 | 2022-09-01 | MPW7    | 152     | 2024-01-30     | Not expecting to ship this test |
| TT02 | 2022-11-09 | 2022-12-02 | 2211Q   | 165     | 2023-10-17     | 2024-01-30                      |
| TT03 | 2023-03-01 | 2023-04-23 | 2304C   | 249*    | 2024-01-15     | 2024-02-28                      |
| TT04 | 2023-07-01 | 2023-09-08 | 2309    | 143     | 2024-02-28     | 2024-04-15                      |
| TT05 | 2023-09-11 | 2023-11-04 | 2311    | 174     | 2024-04-12     | 2024-05-12                      |
| TT06 | 2024-02-01 | 2024-04-19 | 2404    | TBD     | TBD            | TBD                             |



Fig. 3. The interactive 3D viewer.



Fig. 4. A deep zoom to a D-type flip-flop, isolated from the rest of the design.

Community engagement has been strong, with 756 designs submitted over the 5 shuttles. The Discord community has 1000 members, and 1600 people are subscribed to the mailing list.

## II. SCANCHAIN ARCHITECTURE

TinyTapeout started as an experiment in fitting as many designs as possible into the  $10\text{mm}^2$  available on the Google lottery shuttles. As a fast proof of concept, a scan chain was chosen. Each design had 8 inputs and 8 outputs. Clock and reset were optional and not treated specially. The chain was formed of scan flops [15], a type of flip flop with an integrated multiplexer at its input.

Each design sends data into the scan flop's secondary input and receives input from the output of the flop via a latch. The chain is built [16] by sending data from the output of the previous scan flop into the next scan flop's primary input.



Fig. 5. How TT04 submitters identified themselves.



Fig. 6. 500 designs connected in a chain for TT01, with the scan chain driver in the lower left corner.



Fig. 7. A simplified view of 2 designs in the chain.

This arrangement allows the loading of data into any of the designs, and then capturing the output and clocking that through the rest of the chain to the output.
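The load, capture, and shift-out sequence can be sketched as a small behavioural model. This is a toy illustration in Python, not the TinyTapeout RTL: the class `ScanChain`, its method names, and the two example designs are hypothetical, and the model keeps one 8-bit register per design instead of modelling individual scan flop cells.

```python
WIDTH = 8  # inputs and outputs per design, from the text


class ScanChain:
    """Toy behavioural model of a scan chain with one WIDTH-bit register per design."""

    def __init__(self, designs):
        self.designs = designs                         # functions: list of bits -> list of bits
        self.bits = [0] * (WIDTH * len(designs))       # scan flop contents in chain order
        self.latched = [[0] * WIDTH for _ in designs]  # design inputs held by the latches

    def shift(self, bit_in):
        """Shift mode: every flop takes the value of its predecessor."""
        bit_out = self.bits[-1]
        self.bits = [bit_in] + self.bits[:-1]
        return bit_out

    def latch(self):
        """Copy the flop outputs into the latches that feed each design."""
        for i in range(len(self.designs)):
            self.latched[i] = self.bits[i * WIDTH:(i + 1) * WIDTH]

    def capture(self):
        """Capture mode: flops load the design outputs via the secondary input."""
        for i, design in enumerate(self.designs):
            self.bits[i * WIDTH:(i + 1) * WIDTH] = design(self.latched[i])

    def transact(self, inputs):
        """Load inputs[i] into design i and return the outputs of every design."""
        flat = [b for word in inputs for b in word]
        for bit in reversed(flat):  # the bit sent first travels furthest down the chain
            self.shift(bit)
        self.latch()
        self.capture()
        out = [self.shift(0) for _ in range(len(self.bits))]
        out.reverse()               # the far end of the chain is shifted out first
        return [out[i * WIDTH:(i + 1) * WIDTH] for i in range(len(self.designs))]


def invert(bits):
    return [1 - b for b in bits]


def passthrough(bits):
    return list(bits)


chain = ScanChain([invert, passthrough])
result = chain.transact([[1, 0, 1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 0, 1]])
print(result[0])  # [0, 1, 0, 1, 1, 1, 1, 1]
print(result[1])  # [1, 1, 0, 0, 0, 0, 0, 1]
```

In this model one transaction costs `WIDTH * len(designs)` shift clocks in each direction, so the time to reach any design grows with the length of the chain.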

While relatively easy to implement, the downside is the latency. The more designs in the chain, the longer it takes to send and receive data.

Assuming a 50MHz scan chain clock and 250 designs with 8 inputs and 8 outputs, the maximum refresh rate is $50\text{MHz}/(8 \times 250) = 25\text{kHz}$.
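The same calculation as a minimal Python sketch, so other chain lengths or clock rates can be tried. The function name `scan_refresh_rate_hz` is illustrative. The final line is an inference rather than something the text states: if a design's clock is itself driven through the chain, one clock period needs two refreshes.

```python
def scan_refresh_rate_hz(scan_clock_hz, num_designs, bits_per_design=8):
    """Refreshes per second when every design's bits are shifted through the chain."""
    return scan_clock_hz / (bits_per_design * num_designs)


rate = scan_refresh_rate_hz(50e6, 250)
print(rate)      # 25000.0, the 25 kHz refresh rate from the text
print(rate / 2)  # 12500.0, assuming two refreshes per design clock period
```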

TT01's scan chain was embedded into each design, which meant that a user could unintentionally remove it, breaking the chain. This risk was mitigated by formally [17] proving the chain was present in the submitted design. For TT02 and TT03, the scan chain was moved into a separate macro block that the user can't modify.



Fig. 8. TT02 designs with separate scan chain blocks.

Another concern was hold violations due to the large number of serially connected flops and potentially large clock skews due to long signal wires. This was mitigated by reclocking the output data with a negedge flop, providing substantially more hold margin.

After static timing analysis (STA) it was discovered that the clock duty cycle could change substantially due to the 500 sequential clock drivers. Depending on the clock buffers and capacitance between each design, the clock duty cycle could either increase or decrease, with this effect accumulated over the chain.

For TT01 and TT02 each design used two clock buffers, with the internal flops driven after the first buffer. TT03 used inverting clock buffers, with only one between the clock in and out.



Fig. 9. Differences between TT02 and TT03 scanchain clock buffers.

By inverting the clock between each design, any asymmetry in the clock pulse is evenly spread across the negative and positive cycles.
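A toy model makes the cancellation concrete. Assume, purely for illustration, that every clock buffer delays a rising output edge by `d_ps` picoseconds more than a falling output edge. A non-inverting buffer then shortens the high phase by `d_ps` at every stage, while an inverting buffer turns the input's low phase into the output's high phase before shortening it, so two consecutive stages undo each other. The value of `d_ps`, the 10MHz clock, and the stage count are invented numbers, not measurements from TinyTapeout silicon.

```python
def high_time_after_chain(period_ps, high_ps, d_ps, stages, inverting):
    """Clock high time in picoseconds after passing through `stages` buffers."""
    for _ in range(stages):
        if inverting:
            # Output high phase is the input low phase, shortened by d_ps.
            high_ps = (period_ps - high_ps) - d_ps
        else:
            # Output high phase is the input high phase, shortened by d_ps.
            high_ps = high_ps - d_ps
    return high_ps


PERIOD_PS = 100_000  # 10 MHz clock
D_PS = 50            # invented per-buffer rise/fall asymmetry
STAGES = 250

plain = high_time_after_chain(PERIOD_PS, 50_000, D_PS, STAGES, inverting=False)
inverted = high_time_after_chain(PERIOD_PS, 50_000, D_PS, STAGES, inverting=True)
print(f"non-inverting: {100 * plain / PERIOD_PS:.1f}% duty cycle")  # 37.5%
print(f"inverting:     {100 * inverted / PERIOD_PS:.1f}% duty cycle")  # 50.0%
```

In this simplified model the error with inverting buffers never exceeds a single stage's asymmetry, whereas it grows linearly with chain length for non-inverting buffers.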

The verification effort [18] was broad and included a community review, register transfer level (RTL) and gate level (GL) simulation, Formal Verification [19], static timing analysis, layout vs schematic (LVS), DRC, and device level static verification [20].

## III. CIRCUIT BOARDS

After manufacture, the chips are mounted onto small carrier boards with 0.1 inch headers. This allows people with limited equipment or surface mount technology (SMT) assembly experience to build their own demonstration boards.

The carrier fits onto the demonstration board which provides:

- USB-C for power connection,
- 1.8 V and 3.3 V power supplies for core and IO,
- 20 MHz oscillator,
- buttons for reset and single-step clock,
- an 8-way DIP switch for inputs,
- a 9-way DIP switch for design selection,
- a 7-segment LED display for the outputs,
- headers for all IO, including 2 standard Digilent ports (PMOD),
- a header to select internal or external clock,
- a header to select internal or external scan chain driver,
- a header to engage an automatic clock divider on input pin 0.



Fig. 10. The demonstration board. Certified Open Source Hardware ES000040 [21].

## IV. SCANCHAIN SILICON RESULTS

TT02 chips were received in October 2023, 11 months after the chips were submitted for manufacture on Efabless chipIgnite 2211Q. The chips were tested for the first time in public on a livestream [22]. The chain was validated, and a few of the designs were shown to be working.



Fig. 11. Measurement from TT02 silicon, with input clock in yellow and the distorted output clock in blue.

In the following days another 30 designs were tested and shown to be working.

After measuring the clock skew and maximum frequency it was decided to run the production boards with a  $20\text{MHz}$  oscillator, resulting in a  $10\text{MHz}$  scan chain.

Some designs didn't function as expected, which in most cases was due to faults in the submitted design.

As well as 82 Verilog designs, 64 used the Wokwi graphical editor, 5 used Amaranth, and 1 used Chisel. Some Wokwi designs using combinational logic in flop clocks failed in hardware but worked in simulation. This is due to the lack of timing data in the simulation, and wasn't detected by STA because the clocks were not known. A detailed analysis has yet to be carried out. The addition of SR flops to Wokwi will help to alleviate this, as well as the start of an ERC check.



Fig. 12. The TT02 demo board running a design.



Fig. 13. Combinational logic in the clock path of one of the failed designs.

At the time of writing, PCBs are in production and are expected to ship to customers by the end of January 2024.

TinyTapeout 3 silicon was received in January 2024, and the updated scanchain shows a more symmetric output clock at the end of the chain.



Fig. 14. Measurement from TT03 silicon.

## V. BEYOND THE SCANCHAIN

The biggest limitation of the Tiny Tapeout architecture was the IO latency. For Tiny Tapeout 4 a new architecture was needed, and a series of proposals was gathered from the community. An online video call was held and the 10 proposals discussed. The winning design was a fairly straightforward multiplexer design.



Fig. 15. The multiplexer design.

The physical layout consists of a central controller connected up and down to two vertical spines. Twenty-four horizontal muxes connect to the spine with each supporting 16 designs. This allows up to 384 separate single tile designs. Multiple tile designs were also enabled, allowing a maximum project size of  $2 \times 8$  tiles or  $1359 \times 225\mu m$  - around 20,000 logic cells.



Fig. 16. The TT03.5 test design.

Another major limitation of TT01 to TT03 was the small number of IO. The scan controller used 9 GPIOs to select the currently active design, which, while simplifying the demo board, wasted valuable pins. With TT04, the parallel design selection was dropped in favor of a serial protocol. The extra pins were then used as bidirectional pins, giving each design clock, reset, and 24 IO.

An invite-only experimental shuttle [23] was submitted with 32 designs to Efabless chipIgnite 2306C. Two of the designs included a power gate as a stepping stone to supporting analog and mixed-signal designs.

TABLE II  
COMPARISON BETWEEN TT03 AND TT04

| Parameters             | Tiny Tapeout 3        | Tiny Tapeout 4         |
|------------------------|-----------------------|------------------------|
| Max clock speed        | $12.5\text{kHz}$      | $50\text{MHz}$         |
| Max design size        | $150 \times 170\mu m$ | $1359 \times 225\mu m$ |
| Input pins             | 8                     | 10                     |
| Output pins            | 8                     | 8                      |
| Bidirectional I/O pins | None                  | 8                      |
| Custom GDS file        | X                     | ✓                      |

## VI. MULTIPLEXER SILICON RESULTS

After silicon was received, the worst round trip latency was measured to be  $20\text{ns}$ .



Fig. 17. Round trip latency on a rising edge of about  $20\text{ns}$ .



Fig. 18. Round trip latency on a falling edge of about  $16\text{ns}$ .

The new chip pinout and serial design selection required a new demo board that included an easy way to select the design. The RP2040 microcontroller was chosen as a co-processor because it offers:

- drag and drop firmware updates on any OS,
- MicroPython [24] support, ideal for beginners testing their designs,
- external memory emulation via PIO and DMA.

An additional PMOD expansion port was added for the bidirectional pins, and the community has started to standardize on pinouts [27] making it easier to test each other's designs. A new repository was created to house user-contributed PMODs [28].

```
enabling design tt_um_test by sending 102 [0b01100110] pulses
design repo https://github.com/TinyTapeout/tt03p5-test @ 434c5d508d20053bea346881a61355f87ea1ca91
0 0 0 0
1 0 0 0
0 1 0 0
1 1 0 0
0 0 1 0
1 0 1 0
0 1 1 0
1 1 1 0
0 0 0 1
```

Fig. 19. A MicroPython program [25] enabling a design, clocking it, and printing the results.
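The idea behind the listing in Fig. 19 can be mimicked on a host computer with plain Python. The sketch below is not the firmware from [25] and touches no hardware: `MockMux`, `Counter4`, `select_design`, and `run` are hypothetical stand-ins. It assumes that a design is selected by resetting an address counter and sending one pulse per address step, as the "102 pulses" message suggests, and that the test design behaves like a 4-bit counter printed least significant bit first, which is a guess based on the printed pattern.

```python
class MockMux:
    """Stand-in for the chip: counts select pulses, then enables that design."""

    def __init__(self, designs):
        self.designs = designs  # dict mapping address -> design object
        self.address = 0
        self.active = None

    def reset_select(self):
        self.address = 0
        self.active = None

    def pulse_select(self):
        self.address += 1

    def enable(self):
        self.active = self.designs[self.address]


class Counter4:
    """Hypothetical 4-bit counter design, used only for this illustration."""

    def __init__(self):
        self.value = 0

    def clock(self):
        self.value = (self.value + 1) % 16

    def outputs(self):
        return [(self.value >> i) & 1 for i in range(4)]  # LSB first


def select_design(mux, address):
    """Reset the selection, then send one pulse per address step."""
    mux.reset_select()
    for _ in range(address):
        mux.pulse_select()
    mux.enable()


def run(mux, address, cycles):
    """Enable a design, then record its outputs before each clock pulse."""
    select_design(mux, address)
    lines = []
    for _ in range(cycles):
        lines.append(" ".join(str(b) for b in mux.active.outputs()))
        mux.active.clock()
    return lines


mux = MockMux({102: Counter4()})
print(f"enabling design by sending 102 [{102:#010b}] pulses")
for line in run(mux, 102, 9):
    print(line)
```

On real hardware the pulse and clock methods would toggle GPIO pins on the RP2040 instead of updating Python attributes.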



Fig. 20. The TT04+ demo board [26].

## VII. IMPROVING THE MULTIPLEXER AND MIXED SIGNAL SUPPORT

TT05 split the mux into two parts to improve performance. As each spine segment is now half as long, it will have half the capacitance. We expect to reduce the round trip latency to around 10ns.

For TT06, the Caravel harness will be replaced by Open-Frame [29], an alternative harness provided by Efabless that uses the same padding but removes the RISC-V coprocessor. This adds 5mm<sup>2</sup> more space for user designs and an extra 12 pins that will be used for analog.

For increased safety, all designs will be power-gated, which will allow designers to take more risks or use custom flows.

Analog and mixed-signal designs will be enabled by adding an analog multiplexer based on transmission gates [30]. This allows up to 192 designs to share the 8 analog pins between them.

TT06 is planned to open for digital designs at the end of January 2024, for analog designs at the end of February, and to close on April 19th, 2024.

## VIII. SILICON SHOWCASE

A small sample of the types of designs possible with TinyTapeout is listed below:

- Serial FPGA (Link)
- Synthesizable Digital Temperature Sensor (Link)
- 395 standard cells with mux (Link)
- FM transmitter with I2S input (Link)
- USB full speed (Link)
- A Linux capable RISC-V CPU (Link)



Fig. 21. A user-contributed VGA output PMOD.



Fig. 22. VGA clock design running on TT03.5 silicon.



Fig. 23. Transmission gate tested in TT05 that will be used to form the analog multiplexer.



Fig. 24. The Synthesizable Digital Temperature Sensor.

## REFERENCES

- [1] “TinyTapeout,” <https://tinytapeout.com/>.
- [2] “GitHub,” <https://github.com/>.
- [3] “GDS2 File Format - Zero to ASIC Course,” <https://www.zerotoasiccourse.com/terminology/gds2/>.
- [4] “Datasheet PDF - TinyTapeout Repository,” <https://github.com/TinyTapeout/tinytapeout-02/raw/tt02/datasheet.pdf>.
- [5] “TinyTapeout Runs,” <https://tinytapeout.com/runs/>.
- [6] “Wokwi,” <https://wokwi.com/>.
- [7] “First Shuttle - TinyTapeout Runs,” <https://tinytapeout.com/runs/tt01/>.
- [8] “Announcing the GlobalFoundries Open MPW Shuttle Program - Google Open Source Blog,” <https://opensource.googleblog.com/2022/10/announcing-globalfoundries-open-mpw-shuttle-program.html>.
- [9] “Efabless,” <https://efabless.com/>.
- [10] “Verilog Template - TinyTapeout Repository,” <https://github.com/TinyTapeout/tt06-verilog-template>.
- [11] “GitHub Actions Documentation,” <https://docs.github.com/en/actions>.
- [12] “iVerilog,” <https://github.com/steveicarus/iverilog>.
- [13] “cocotb,” <https://www.cocotb.org/>.
- [14] “Wokwi Automated Testing - TinyTapeout,” <https://tinytapeout.com/digital_design/wokwi_automated_testing/>.
- [15] “Skywater PDK Documentation,” <https://skywater-pdk.readthedocs.io/en/main/contents/libraries/sky130_fd_sc_hdll/cells/sdfxtp/README.html>.
- [16] “Updating Inputs and Outputs of a Design - TinyTapeout Repository,” <https://github.com/TinyTapeout/tinytapeout-02/blob/tt02/INFO.md#updating-inputs-and-outputs-of-a-specified-design>.
- [17] “TinyTapeout Scan - GitHub Repository,” <https://github.com/jix/tinytapeout_scan>.
- [18] “Verification Documentation - TinyTapeout Repository,” <https://github.com/TinyTapeout/tinytapeout-02/blob/tt02/VERIFICATION.md>.
- [19] “SymbiYosys (SBY) - YosysHQ Repository,” <https://github.com/YosysHQ/sby>.
- [20] “CVC - Verification Documentation - TinyTapeout Repository,” <https://github.com/TinyTapeout/tinytapeout-02/blob/tt02/VERIFICATION.md#cvc>.
- [21] “Open Source Hardware Certification - TinyTapeout,” <https://certification.oshwa.org/es000040.html>.
- [22] “TT02 Silicon is Alive! - Zero to ASIC Course Blog,” <https://www.zerotoasiccourse.com/post/tt02-silicon-is-alive/>.
- [23] “TinyTapeout 03p5 - GitHub Repository,” <https://github.com/TinyTapeout/tinytapeout-03p5>.
- [24] “MicroPython Official Website,” <https://micropython.org/>.
- [25] “Demo Firmware Test Script - TinyTapeout Repository,” <https://github.com/TinyTapeout/tt3p5-demo-fw/blob/main/tt3p5-test/test.py#L119>.
- [26] “TT04+ Demoboard PCB - TinyTapeout Repository,” <https://github.com/TinyTapeout/tt-demo-pcb>.
- [27] “Pinouts Specifications - TinyTapeout,” <https://tinytapeout.com/specs/pinouts/>.
- [28] “Awesome TinyTapeout PMODs - GitHub Repository,” <https://github.com/TinyTapeout/awesome-tinytapeout-pmobs>.
- [29] “Caravel Openframe Project,” <https://github.com/efabless/caravel-openframe_project>.
- [30] “TT05 Analog Test,” <https://github.com/iic-jku/tt05-analog-test>.