

November 30, 2025

Professor Sorin Cotofana  
Editor-in-Chief  
IEEE Transactions on Nanotechnology

Dear Professor Cotofana,

This letter accompanies the manuscript “RTL-to-Atoms Synthesis of a Machine Learning Accelerator on Atomic-Scale Computers”, submitted for consideration as a Regular Paper in the IEEE Transactions on Nanotechnology (TNANO) Special Section associated with the 25th IEEE International Conference on Nanotechnology (IEEE NANO 2025). The work lies primarily in computational nanotechnology, with strong connections to circuits and architectures, and aligns with the NANO 2025 scope on quantum, neuromorphic, and unconventional computing.

The manuscript presents an end-to-end synthesis framework that maps a quantized matrix-multiply processing element for machine learning acceleration from register-transfer level (RTL) descriptions down to dot-accurate silicon dangling bond (SiDB) layouts suitable for fabrication. By combining a hierarchical, parameterized RTL architecture with SiDB-aware logic synthesis and physical design, the study shows how design decisions at the architecture, arithmetic, and layout levels jointly influence area, robustness, and scalability in atomic-scale computing platforms, thereby contributing to TNANO’s focus on nanoscale devices, circuits, and systems.

This manuscript is a substantially extended journal version of our IEEE NANO 2025 conference paper “Building a Machine Learning Accelerator with Silicon Dangling Bonds: From Verilog to Quantum Dot Layout”, which received the Best Student Paper Award. The conference paper is cited in the manuscript. Table 1 compares the original and new contributions in detail and shows substantial additions at every step of the synthesis pipeline. In particular, the journal version introduces parameterizable bit-widths for systematic scaling studies, SiDB-specific arithmetic logic unit (ALU) implementations, placement-and-routing cost functions tailored to SiDB logic, and a significantly expanded experimental evaluation across multiple precision settings.

The authors confirm that this submission represents original work that has not been published previously and is not under consideration for publication elsewhere. All co-authors have approved the manuscript and its submission to IEEE Transactions on Nanotechnology. The author order has been updated relative to the conference version to reflect additional contributions to the extended study.

Thank you very much for considering this manuscript for publication in the IEEE Transactions on Nanotechnology Special Section associated with the 25th IEEE International Conference on Nanotechnology. The authors appreciate the time and effort of the editors and reviewers in evaluating this work.

Sincerely,

Samuel S. H. Ng  
Ph.D. Candidate  
Department of Electrical and Computer Engineering  
University of British Columbia  
Vancouver, BC, Canada  
[samueln@ece.ubc.ca](mailto:samueln@ece.ubc.ca)

Table 1: Summary of conference and journal contributions.

| Topic             | Conference contributions                                                                                                                                                                                                                                                                          | New contributions for journal extension                                                                                                                                                                                                                                                                                                                                                                                                        |
|-------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| RTL design of MXU | <ul style="list-style-type: none"> <li>Hierarchical RTL for the matrix-multiply unit with a processing-element core and clocked shell.</li> <li>Each layer in the hierarchy verified with dedicated test benches.</li> <li>Fixed weight and activation precision at W8A8.</li> </ul>              | <ul style="list-style-type: none"> <li>Parameterizable weight and activation bit-widths that enable scaling studies and allow state-of-the-art placement-and-routing algorithms to be benchmarked at lower precisions when 8-bit layouts are intractable.</li> </ul> |
| RTL-to-netlist    | <ul style="list-style-type: none"> <li>Used <i>Yosys</i> for RTL-to-AIG conversion with its default ALU mapping geared toward CMOS libraries.</li> <li>Optimized the AIG using ABC’s <i>&amp;deepsyn</i> strategy.</li> </ul>                                                                     | <ul style="list-style-type: none"> <li>Implemented SiDB-aligned ripple-carry adders and array multipliers as gate-level netlists and integrated them into <i>Yosys</i> for technology-aware mapping.</li> </ul>                                                                                                                                                                                                                                |
| Netlist-to-atoms  | <ul style="list-style-type: none"> <li>Applied figure-of-merit-aware technology mapping that favored robust SiDB gates despite the area overhead.</li> <li>Extended the hexagonalization flow to align input and output pins with the fabric clocking scheme.</li> </ul>                          | <ul style="list-style-type: none"> <li>Added SiDB-specific cost objectives to the <i>gold</i> placement-and-routing algorithm so that optimization targets the layout rules of SiDB fabrics rather than metrics tuned for other FCN platforms.</li> </ul>                                                                                                                                                                                      |
| Experiment        | <ul style="list-style-type: none"> <li>Compared synthesized MXU layouts against manually estimated blueprints from prior studies.</li> <li>Evaluated uniform versus figure-of-merit-aware synthesis for the W8A8 configuration using the <i>ortho</i> placement-and-routing algorithm.</li> </ul> | <ul style="list-style-type: none"> <li>Compared synthesis results across W8A8, W4A4, and W2A2.</li> <li>Evaluated all bit-width configurations with figure-of-merit-aware technology mapping.</li> <li>Reported <i>gold</i> placement-and-routing results for W4A4 and W2A2, which remain within <i>gold</i>’s tractable range.</li> <li>Benchmarked new <i>gold</i> cost objectives to quantify SiDB-specific layout improvements.</li> </ul> |