

# Tiny Tapeout: A Shared Silicon Tapeout Platform Accessible To Everyone

Matt Venn  
 Tiny Tapeout, YosysHQ  
 Email: matt@tinytapeout.com

**Index Terms**—ASIC, Multi Project Chip, Open Source Silicon, Tiny Tapeout.

## I. INTRODUCTION

**TINY TAPEOUT** is a multi project chip platform that makes it easier and cheaper to get application specific integrated circuit (ASIC) designs manufactured.

Open source tools and process design kits (PDKs [?]) are used, so no licenses or non disclosure agreements (NDAs) are required. As the tools run on remote cloud servers, no software needs to be installed locally on the user's machine. As long as the template structure is followed, however, Tiny Tapeout does support the use of proprietary tools.

Each Tiny Tapeout ASIC production run sees around 400 open source designs multiplexed to 24 general purpose input/output (GPIO) pins. After manufacture the resulting chip is mounted to a demonstration board for ease of testing. Each chip contains a copy of every design, which can be selected and tested in turn.

At the same time each participant submits documentation for their design, which is used to create a printable datasheet [?] along with an online project index at [TinyTapeout.com/runs/](http://TinyTapeout.com/runs/) [?]. The datasheet helps participants explore other designs on the chip in addition to their own.

By separating the cost of area on a silicon wafer and the finished physical chip, the Tiny Tapeout participant group is able to share the cost of chip packaging and circuit board manufacture while still being able to test and measure all the designs on the chip. For use in educational settings it is possible for multiple students to submit individual designs while sharing the finished chips and circuit boards, reducing the cost still further.

Each Tiny Tapeout tile (Fig. 1) is approximately  $160 \times 100 \mu\text{m}^2$ . This provides enough room for around 1000 logic gates when built upon the SkyWater 130nm open source PDK. Multiple tiles can be interconnected to enable larger designs, while analog and mixed signal support is on the roadmap for the next shuttle.

Community engagement in Tiny Tapeout has been strong, with 756 designs submitted over the first five shuttles. A curated selection of projects is provided in section IX. An online chat server for participants has 1000 members with 1600 subscribers to the project's mailing list. Individuals submitting designs to Tiny Tapeout tend to self identify as hobbyists, students, and teachers, as shown in Fig. 2.



Fig. 1. A 2-D render of a single Tiny Tapeout tile.



Fig. 2. Tiny Tapeout 4 participant self identification.

The first [?] Tiny Tapeout production run, which was provided as a free and experimental effort with a total of 152 designs, was submitted to the seventh Google-sponsored [?] lottery based multi project wafer (MPW) shuttle in September 2022. The next four shuttles combined a total of 582 designs, all sponsored by and manufactured through the Efabless [?] chipIgnite MPW service. Table I shows a summary of all Tiny Tapeout shuttle runs to date.

The rest of this paper will detail the Tiny Tapeout design flow, multiplexer evolution, circuit board design, the results of post production silicon testing, and the project's next steps.

TABLE I  
TINY TAPEOUT SHUTTLE SUMMARY

| Run  | Launched   | Shuttle | Designs | Delivery date | Architecture              | Number of IOs | IO bandwidth | Analog support |
|------|------------|---------|---------|---------------|---------------------------|---------------|--------------|----------------|
| TT01 | 2022-08-17 | MPW7    | 152     | n/a           | Scan chain                | 16            | 5 kHz        | no             |
| TT02 | 2022-11-09 | 2211Q   | 165     | 2024-01-30    | Scan chain                | 16            | 5 kHz        | no             |
| TT03 | 2023-03-01 | 2304C   | 249     | 2024-02-28    | Scan chain inverted clock | 16            | 10 kHz       | no             |
| TT04 | 2023-07-01 | 2309    | 143     | 2024-04-15    | Mux                       | 26            | 50 MHz       | no             |
| TT05 | 2023-09-11 | 2311    | 174     | 2024-05-12    | Split Mux                 | 26            | 50 MHz       | no             |
| TT06 | 2024-02-01 | 2404    | TBD     | 2024-11-30    | Split Mux                 | 38            | 50 MHz       | yes            |



Fig. 3. Design Submission Flow

## II. DESIGN SUBMISSION FLOW

Tiny Tapeout designs are primarily developed in the Verilog hardware description language (HDL) or Wokwi [?]. Wokwi is a web based visual schematic editor for hardware description, designed as an easier way for individuals with no prior HDL experience to get started. The Tiny Tapeout website [?] includes a basic Wokwi getting started guide, demonstrating how to use the tool to draw circuits, which is made available in English and Spanish.

The submission flow [?] starts with the participant creating a GitHub [?] source code repository based on provided templates then adding their ASIC design. This triggers automated tests and the generation of binary layout files in GDSII [?]. If all tests pass and the binary layout files are correctly generated, the design can then be submitted to a quarterly shuttle for production in silicon.

The Tiny Tapeout GitHub templates [?] make use of GitHub Actions [?], an automatic continuous integration system triggered every time the repository is updated. This reduces duplicated effort and makes it possible for Tiny Tapeout to support large numbers of participants without excessive technical overhead.

There are four main jobs in the continuous integration system:

- 1) GDS: installs OpenLane [?] and the SkyWater Sky130 [?] PDK, builds the binary layout files, and generates a summary of the design (Fig. 4). The summary includes utilization, standard cells used, a 2-D render (Fig. 1) and an interactive 3-D viewer (Fig. 5). This job can also optionally run a gate-level verification of the design.
- 2) Verification: installs the YosysHQ open source computer-aided design (CAD) suite, which includes many common electronic design automation (EDA) tools; uses iVerilog [?] and cocotb [?] to run included testbenches.

**Cell usage by category**

| Category    | Cells | Count |
|-------------|-------|-------|
| Fill        | decap fill | 1145 |
| Combo Logic | o21a nand3b a221o o31a or2b a21o a21bo nor2b o31ai a41o a21oi a211o o21ba o21a o21bai a2bb2o or3b o221a a31o a22o o32a o22a a32o a2111o and2b and3b or4b or4bb o211a o2bb2a a211oi and4bb o211ai a22oi a31oi o311a | 249 |
| Tap         | tapvpwrvgnd | 246 |
| Flip Flops  | dfxtp | 146 |
| Buffer      | buf clkbuf | 127 |
| AND         | and2 a21boi and3 and4 | 97 |
| Misc        | dlygate4sd3 dlymetal6s2s conb | 84 |
| OR          | or3 xor2 or2 or4 | 81 |
| NOR         | xnor2 nor2 nor3 nor4 | 64 |
| NAND        | nand2 nand3 nand4 nand2b | 52 |
| Inverter    | inv | 37 |
| Multiplexer | mux2 mux4 | 9 |
| Diode       | diode | 1 |

947 total cells (excluding fill and tap cells)

Fig. 4. A summary table from the GDS continuous integration job.

- 3) Documentation: generates a preview of the documentation.
- 4) Precheck: runs design rule check (DRC) tests to ensure the design can be integrated into the multi project chip.

Successful completion of the GDS, Documentation, and Precheck jobs is required for a design to be submitted to a shuttle for production. Verification is optional but highly encouraged. Submissions designed in Wokwi are able to make use of its integrated truth table testing system [?].

Projects are submitted to a shuttle through the Tiny Tapeout website [?]. Projects can be continually updated up until the closing date of the shuttle.

While the Tiny Tapeout continuous integration system can be run entirely in the user's web browser, it is also possible to install a local copy of the tools [?] on a participant's computer. Locally installed tools can help to reduce the time between design iterations, especially for the test and verification jobs.

## III. SCAN CHAIN ARCHITECTURE

Tiny Tapeout started as an experiment in fitting as many designs as possible into the 10 mm<sup>2</sup> available on the Google lottery shuttles (Fig. 6). To rapidly prove the concept, initial designs were based on a scan chain architecture to simplify testing. Each Tiny Tapeout 1 design has eight inputs and eight outputs. Clock and reset signals were optional and not treated specially. The chain was formed of scan flops [?], a type of flip flop with a multiplexer integrated at its input. An example showing a two design scan chain is shown in Fig. 7.

Fig. 5. The interactive 3-D viewer.

Fig. 6. 500 designs connected in a chain for Tiny Tapeout 1; the scan chain driver can be seen in the lower left corner.

Each design sends data into the secondary input of the scan flop and receives its own input from the output of the flop via a latch. The chain is built [?] by sending data from the output of the previous scan flop into the primary input of the next scan flop. This arrangement allows the loading of data into any of the designs, followed by the capturing of the output and its clocking through the rest of the chain to the overall chain output.
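The load, capture, and shift-out sequence described above can be sketched as a small behavioural model. This is an illustrative sketch only, not the Tiny Tapeout RTL: the class `ScanChain`, the helper `exchange`, the bit ordering, and the example designs are all assumptions made for the example.

```python
from typing import Callable, List

BITS = 8  # scan bits per design: eight inputs and eight outputs
Design = Callable[[int], int]  # hypothetical design: 8-bit input -> 8-bit output


def word_to_bits(word: int) -> List[int]:
    """Most significant bit first."""
    return [(word >> (BITS - 1 - b)) & 1 for b in range(BITS)]


def bits_to_word(bits: List[int]) -> int:
    word = 0
    for bit in bits:
        word = (word << 1) | bit
    return word


class ScanChain:
    """Behavioural model of a chain of scan flops shared by several designs."""

    def __init__(self, designs: List[Design]):
        self.designs = designs
        self.flops = [0] * (BITS * len(designs))  # one scan flop per bit
        self.latches = [0] * len(designs)         # input word held for each design

    def shift(self, bit_in: int) -> int:
        """Scan mode: each flop takes the output of the previous flop."""
        bit_out = self.flops[-1]
        self.flops = [bit_in & 1] + self.flops[:-1]
        return bit_out

    def latch_inputs(self) -> None:
        """Present each design's slice of the chain to it via its latch."""
        for i in range(len(self.designs)):
            self.latches[i] = bits_to_word(self.flops[i * BITS:(i + 1) * BITS])

    def capture_outputs(self) -> None:
        """Capture mode: each flop loads from its design's output instead."""
        for i, design in enumerate(self.designs):
            out = design(self.latches[i]) & 0xFF
            self.flops[i * BITS:(i + 1) * BITS] = word_to_bits(out)


def exchange(chain: ScanChain, words: List[int]) -> List[int]:
    """Load one input word per design, then read back one output word per design."""
    desired = [bit for word in words for bit in word_to_bits(word)]
    for bit in reversed(desired):      # the first bit shifted in travels furthest
        chain.shift(bit)
    chain.latch_inputs()
    chain.capture_outputs()
    out_bits = [chain.shift(0) for _ in range(len(chain.flops))]
    out_bits.reverse()                 # the last flop is read out first
    return [bits_to_word(out_bits[i * BITS:(i + 1) * BITS])
            for i in range(len(chain.designs))]


# Three made-up designs: an inverter, an incrementer, and a pass-through.
chain = ScanChain([lambda x: x ^ 0xFF, lambda x: (x + 1) & 0xFF, lambda x: x])
results = exchange(chain, [0x0F, 0xFF, 0xA5])
print([hex(r) for r in results])
```

Every exchange costs one full pass through the chain in each direction, which is the source of the latency discussed next.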

While relatively easy to implement, a scan chain architecture has a downside: high latency. As more designs are added to the chain it takes longer to send and receive data through it. For example: assuming a 50 MHz scan chain clock with 250 designs each having eight inputs and eight outputs, the maximum refresh rate of the resulting chain is  $50 \text{ MHz}/(8 \times 250) = 25 \text{ kHz}$ .
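The refresh rate estimate above is simple enough to check numerically. The sketch below only restates the formula from the text; the function name `scan_refresh_rate_hz` is made up for the example.

```python
def scan_refresh_rate_hz(clock_hz: float, designs: int, bits_per_design: int = 8) -> float:
    """Maximum refresh rate of a scan chain, following the formula in the text."""
    return clock_hz / (bits_per_design * designs)


# The example from the text: 50 MHz clock, 250 designs, eight bits each.
rate_50mhz = scan_refresh_rate_hz(50e6, 250)
# The same chain length with a 10 MHz scan chain clock.
rate_10mhz = scan_refresh_rate_hz(10e6, 250)
print(rate_50mhz / 1e3, "kHz")
print(rate_10mhz / 1e3, "kHz")
```

With a 10 MHz clock the same formula gives 5 kHz, which is consistent with the IO update rate quoted for the production boards in Section V.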



Fig. 7. A simplified view of two Tiny Tapeout 1 designs in the scan chain.



Fig. 8. Tiny Tapeout 2 designs, showing the discrete scan chain blocks.

The Tiny Tapeout 1 scan chain was embedded into each design, meaning a user could unintentionally remove it and break the chain. This risk was mitigated with a formal [?] equivalence check which proves the chain was present in each submitted design. For Tiny Tapeout 2 and Tiny Tapeout 3 the scan chain was separated into a discrete macro block which participants cannot modify.

Another concern with the scan chain design was hold violations, due to the large number of serially connected flops and potentially large clock skews over long signal wires. This was mitigated by reclocking the output data with a negative edge (negedge) flop, providing substantially more hold margin.



Fig. 9. The Tiny Tapeout 3 architecture buffers the output from the clock network into each design. Clock polarity is alternated between designs to minimize asymmetry between positive and negative cycles.

Following static timing analysis (STA) it was discovered that the clock duty cycle could change substantially due to the 500 sequential clock drivers in the chain. Depending on the clock buffers and capacitance between each design the clock duty cycle could either increase or decrease, with this effect accumulated over the chain.

For Tiny Tapeout 1 and Tiny Tapeout 2 each design used two clock buffers, with the internal flops driven after the first buffer. Tiny Tapeout 3 used inverting clock buffers, with only one between the clock input and output. Fig. 9 shows a comparison between the TT02 and TT03 clock buffer designs. By inverting the clock between each design any asymmetry in the clock pulse is evenly spread across the negative and positive cycles.
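A toy model helps show why alternating the polarity keeps the distortion bounded. The sketch below assumes every buffer stage delays falling edges by a fixed amount more than rising edges (`delta_ns`); the function name, the 100 ns period, and the 0.05 ns per-stage mismatch are illustrative assumptions, not measured values from the chips.

```python
def duty_cycle_after(stages: int, period_ns: float, delta_ns: float, inverting: bool) -> float:
    """Duty cycle after a chain of identical clock buffers.

    delta_ns is the assumed amount by which each stage delays an output
    falling edge more than an output rising edge, so every stage stretches
    the high phase of its own output by delta_ns.
    """
    high = period_ns / 2  # start from a symmetric clock
    for _ in range(stages):
        if inverting:
            # The output high phase is produced by the input low phase.
            high = period_ns - high
        high += delta_ns
        high = min(max(high, 0.0), period_ns)  # a pulse cannot leave the period
    return high / period_ns


non_inverting = duty_cycle_after(500, period_ns=100.0, delta_ns=0.05, inverting=False)
alternating = duty_cycle_after(500, period_ns=100.0, delta_ns=0.05, inverting=True)
print(f"non-inverting chain: {non_inverting:.3f}")
print(f"inverting chain:     {alternating:.3f}")
```

In this model the non-inverting chain accumulates the mismatch at every stage, while the inverting chain cancels it on every second stage and never drifts by more than one stage's worth.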

The verification effort [?] for the scan chain was broad and included a community review, register transfer level (RTL) and gate level (GL) simulation, formal verification [?], STA, layout vs. schematic (LVS), DRC, and device level static verification [?].

## IV. CIRCUIT BOARDS

After manufacture, Tiny Tapeout chips are mounted onto small carrier boards with 0.1 inch pin headers. These carriers allow people with limited surface mount technology (SMT) assembly experience to build their own demonstration boards around the chips.

The carrier fits onto the demonstration board shown in Fig. 10. The demonstration board is designed primarily for ease of use by beginners, though it offers enough flexibility for power users. As all signals are below 50 MHz, no special layout was needed in the design of the demonstration board.

The demonstration board provides:

- USB Type-C as a power connection,
- 1.8 V and 3.3 V power supplies for core and IO,
- 20 MHz oscillator,
- Buttons for reset and single step clock,
- An eight way DIP switch for inputs,
- A nine way DIP switch for design selection,
- A seven segment LED display for the outputs,
- Headers for all IO, including two standard Digilent ports (PMOD),
- A header to select the internal clock or to provide one externally,
- A header to select an internal or external scan chain driver,
- A header to engage an automatic clock divider on input pin zero.

Fig. 10. The demonstration board, which has been recorded as Certified Open Source Hardware ES0000040 [?].

Fig. 11. Measurement from Tiny Tapeout 2 silicon, with input clock in yellow and the distorted output clock in blue.

## V. SCAN CHAIN SILICON RESULTS

Tiny Tapeout 2 chips were received in October 2023, 11 months after the chips were submitted for manufacture on the Efabless chipIgnite 2211Q shuttle. The chips were tested for the first time on a public livestream [?]. During this testing the chain was validated and a small number of the designs were demonstrated to be working. In further testing following the livestream another 30 designs were shown to be working.

After measuring the clock asymmetry (Fig. 11) and maximum frequency it was decided to run the production boards with a 20 MHz oscillator, resulting in a 10 MHz scan chain clock and a 5 kHz IO update rate in order to maximize stability.



Fig. 12. Combinational logic in the clock path of one of the failed designs.



Fig. 13. Measurement from Tiny Tapeout 3 silicon.

Some Tiny Tapeout 2 designs did not function as expected; in most cases this was determined to be due to faults in the submitted design.

Of the designs submitted 82 used the Verilog HDL, 64 were created using the Wokwi graphical editor, and six used alternative HDLs including VHDL, Amaranth [?], and Chisel [?].

Some of the designs created in Wokwi that used combinational logic in their clock paths (Fig. 12) worked in simulation but failed in hardware. This was determined to be due to the lack of timing data in the simulation, and was not detected by STA because the clock paths were not known. A detailed analysis of these failures has yet to be carried out.

At the time of writing, PCBs are in production and are expected to ship to customers by the end of January 2024.

Silicon from Tiny Tapeout 3 production was received in January 2024. The updated scan chain design shows a more symmetric output clock at the end of the chain (Fig. 13). The improved clock symmetry will allow for the use of a faster scan chain clock, resulting in a faster update frequency.

## VI. BEYOND THE SCAN CHAIN

The biggest limitation of the scan chain based architecture was its IO bandwidth and latency. As a result, a new architecture was needed for Tiny Tapeout 4 and proposals were gathered from the community. An online video call was held with community members, and the ten submitted architecture proposals were discussed. The winning architecture was a straightforward multiplexer design, shown in Fig. 14. This architecture was chosen as the simplest to implement while providing the most benefit in terms of additional IOs and higher bandwidth.



Fig. 14. A simplified diagram of the Tiny Tapeout 4 multiplexer architecture.



Fig. 15. The Tiny Tapeout 3.5 test design.

The physical layout (shown in Fig. 15) consists of a central controller connected up and down to two vertical wiring spines. For this experimental run we also included loopback test designs at the end of each multiplexer so we could measure performance for each multiplexer position. Twenty-four horizontal multiplexers connect to the wiring spines, each of which supports 16 designs. This allows for up to 384 separate single tile designs. The new architecture also enabled multiple tile designs, allowing a maximum project size of  $8 \times 2$  tiles or  $1359 \times 225 \mu\text{m}^2$ —around 20 000 logic cells. Table II shows the key differences between Tiny Tapeout 3 and Tiny Tapeout 4.

Another major limitation of Tiny Tapeout 1 to Tiny Tapeout 3 was the small number of IO pins. The scan controller used nine GPIOs to select the currently active design. While this simplified the demonstration board, it also wasted valuable IO pins.

With Tiny Tapeout 4 the parallel design selection architecture used in previous chips was dropped in favor of a serial protocol. The extra pins were then used bidirectionally, giving each design a clock pin, reset pin, and 24 IO pins.
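The text does not spell out the serial protocol, but the log in Fig. 19 ("sending 102 [0b01100110] pulses") suggests that a design is selected by counting pulses. The sketch below models that idea under stated assumptions: a reset line, an increment line, and an enable line. The class `MuxSelectModel`, the helper `select_design`, and the three signal names are hypothetical and not taken from the Tiny Tapeout sources.

```python
from typing import Optional


class MuxSelectModel:
    """Hypothetical on-chip selector: a counter advanced by pulses."""

    def __init__(self) -> None:
        self.counter = 0
        self.enabled = False
        self._inc_prev = 0

    def set_pins(self, rst_n: int, inc: int, ena: int) -> None:
        """Apply one set of pin levels (all three names are assumptions)."""
        if not rst_n:
            self.counter = 0                 # active-low reset clears the selection
        elif inc and not self._inc_prev:
            self.counter += 1                # count rising edges of the increment line
        self._inc_prev = inc
        self.enabled = bool(ena)

    @property
    def active_design(self) -> Optional[int]:
        """Index of the enabled design, or None while nothing is enabled."""
        return self.counter if self.enabled else None


def select_design(mux: MuxSelectModel, index: int) -> None:
    """Host-side sequence: reset the counter, send `index` pulses, then enable."""
    mux.set_pins(rst_n=0, inc=0, ena=0)
    mux.set_pins(rst_n=1, inc=0, ena=0)
    for _ in range(index):
        mux.set_pins(rst_n=1, inc=1, ena=0)
        mux.set_pins(rst_n=1, inc=0, ena=0)
    mux.set_pins(rst_n=1, inc=0, ena=1)


mux = MuxSelectModel()
select_design(mux, 0b01100110)   # the pulse count shown in Fig. 19
print(mux.active_design)
```

A scheme of this kind needs only a handful of control lines regardless of how many designs are on the chip, compared with the nine GPIOs used for parallel selection on the earlier chips.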

An invitational experimental shuttle dubbed Tiny Tapeout 3.5 [?] was submitted for production, with 32 designs, to Efabless chipIgnite 2306C. Two of the designs included a power gate as a stepping stone to supporting analog and mixed signal designs.

TABLE II  
A COMPARISON BETWEEN TINY TAPEOUT 3 AND TINY TAPEOUT 4.

| Parameters            | TT03                           | TT04                            |
|-----------------------|--------------------------------|---------------------------------|
| Max clock speed       | 8 kHz                          | 50 MHz                          |
| Max design size       | $150 \times 170 \mu\text{m}^2$ | $1359 \times 225 \mu\text{m}^2$ |
| Input pins            | 8                              | 10                              |
| Output pins           | 8                              | 8                               |
| Bidirectional IO pins | None                           | 8                               |
| Custom GDS file       | X                              | ✓                               |



Fig. 16. The Tiny Tapeout 3.5 round trip latency on a rising edge, measured at about 20 ns.



Fig. 17. The Tiny Tapeout 3.5 round trip latency on a falling edge, measured at about 16 ns.

## VII. MULTIPLEXER SILICON RESULTS

After Tiny Tapeout 3.5 silicon was received and tested, the worst round trip latency was measured to be 20 ns, as shown in Figs. 16 and 17.

Some Tiny Tapeout 3.5 designs have been validated, including a VGA clock project (Fig. 18) which takes advantage of the new higher speed IO.

The overhead of multiplexing multiple tiles makes power consumption of the infrastructure only a minor concern at this point. No direct comparison of the power consumption impact of the multiplexer over the previous scan chain architecture is available at the time of writing. The motivation of the multiplexer approach, however, was to substantially increase the bandwidth of the IOs.

The new chip pinout and serial design selection required the creation of a new demonstration board (Fig. 20), which was required to include an easy way to select a chosen design. The Raspberry Pi RP2040 microcontroller was chosen as a coprocessor on the revised demonstration board, as it offers:

- Drag and drop firmware updates on any operating system,
- MicroPython [?] support, an ideal language and programming environment through which beginners can enable and test their designs (Fig. 19),
- External memory emulation via programmable input output (PIO) and direct memory access (DMA) support.

Fig. 18. A VGA clock design running on Tiny Tapeout 3.5 silicon.

```
enabling design tt_um_test by sending 102 [0b01100110] pulses
design repo https://github.com/TinyTapeout/tt03p5-test @ 434c5d508d20053bea346881a61355f87ea1ca91
0 0 0 0
0 0 0 0
0 1 0 0
1 1 0 0
0 0 1 0
1 0 1 0
0 1 1 0
1 1 1 0
0 0 0 1
```

Fig. 19. A MicroPython program [?] enabling a chosen design, clocking it, and printing the results.

Fig. 20. The Tiny Tapeout 4+ demonstration board [?].

An additional PMOD expansion port was added to the demonstration board for the bidirectional pins. The Tiny Tapeout community has begun to standardize on pinouts [?], making it easier to test each design. A new repository was created to house user-contributed PMOD [?] accessories, for example the VGA PMOD shown in Fig. 21.

A further set of three PMOD expansion ports was added, mixing inputs and outputs, which allows the most common standard PMODs to be used with the demonstration board. For more information about the circuit board, pinout, and PMOD support see the repository [?].

Fig. 21. A user-contributed VGA output PMOD accessory for the demonstration board.

## VIII. IMPROVING THE MULTIPLEXER AND MIXED SIGNAL SUPPORT

Tiny Tapeout 5 split the multiplexer into two parts in order to improve performance. As a result of the split the controller was also updated to multiplex between the two halves. Some other small changes to the multiplexer include the tweaking, addition, or removal of buffers to improve the STA results.

As each spine segment in Tiny Tapeout 5 is now half as long as in Tiny Tapeout 4, it will exhibit half the capacitance. As a result, we expect to see the round trip latency reduced to around 10 ns.
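The expectation can be written as a first order scaling argument. It assumes that the round trip latency is dominated by drivers charging the spine capacitance, so that latency is roughly proportional to that capacitance, and that fixed delays in the controller and pads are negligible. The symbols $t_{\text{old}}$, $t_{\text{new}}$, $C_{\text{old}}$, and $C_{\text{new}}$ are introduced here for illustration only.

$$
t_{\text{new}} \approx t_{\text{old}} \cdot \frac{C_{\text{new}}}{C_{\text{old}}} = 20\,\text{ns} \times \frac{1}{2} = 10\,\text{ns}
$$

If fixed delays are not negligible, the improvement would be smaller than this estimate suggests.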

For Tiny Tapeout 6 the Caravel harness will be replaced by OpenFrame [?], an alternative harness provided by Efabless that uses the same pad ring but removes the RISC-V coprocessor. This results in an additional  $5\text{ mm}^2$  of space for user designs, and 12 more IO pins which will be used for analog signals.

For increased safety, all designs will be power-gated. This allows designers to take more risks in their submissions, or to use custom flows not previously used in Tiny Tapeout.

Analog and mixed signal designs will be enabled by adding an analog multiplexer based on transmission gates [?]. This allows up to 192 designs to share the analog pins between them. These transmission gates were tested as part of an experimental analog submission to Tiny Tapeout 5, shown in Fig. 22.

Noise coupling between analog and digital power domains is a known concern. However, due to other limitations of our current setup including its limited number of low bandwidth analog interfaces, we target educational low to medium performance analog and mixed signal designs where noise coupling is a lesser concern.

Tiny Tapeout 6 is planned to open for submission of digital designs at the end of January 2024, for analog designs at the end of February 2024, and to close for submissions on 19 April 2024.

Fig. 22. A ring oscillator and digital to analog converter (DAC) design submitted to Tiny Tapeout 5, with a transmission gate highlighted. This design was automatically placed and routed using an experimental analog place and route (P&R) tool.

## IX. SILICON SHOWCASE

A curated list showcasing the types of designs possible through Tiny Tapeout is provided below:

- Serial FPGA ([Link](#))
- Synthesizable Digital Temperature Sensor ([Link](#))
- 395 standard cells with mux ([Link](#))
- FM transmitter with I2S input ([Link](#))
- USB Full Speed ([Link](#))
- A Linux capable RISC-V CPU ([Link](#))

An index of all designs submitted to Tiny Tapeout can be found at [TinyTapeout.com/runs/](http://TinyTapeout.com/runs/) [?].

## ACKNOWLEDGMENTS

- 1) U. Shaked for Wokwi development and lots more.
- 2) S. Munaut for help with scan chain improvements and multiplexer design.
- 3) M. Thompson and M. Bailey for verification expertise.
- 4) Jix for formal verification support.
- 5) Proppy for help with GitHub Actions.
- 6) M. Balestrini for 3-D renders and the interactive GDS viewer.
- 7) H. Pretl for advice and analog support.
- 8) The team at YosysHQ and all the other open source EDA tool makers.
- 9) Efabless for running the shuttles and providing OpenLane and sponsorship.
- 10) T. Ansell and Google for supporting the open source silicon movement.
- 11) P. Deegan for PCB development.
- 12) The Tiny Tapeout community for all its contributions.
- 13) The Zero to ASIC course community for all its support.