



# Silicon nitride-on-silicon bi-layer grating couplers designed by a global optimization method

JASON C. C. MAK,<sup>1,\*</sup> QUENTIN WILMART,<sup>2</sup> SÉGOLÈNE OLIVIER,<sup>2</sup>  
SYLVIE MENEZO,<sup>2</sup> AND JOYCE K. S. POON<sup>1</sup>

<sup>1</sup>*Department of Electrical and Computer Engineering, University of Toronto, 10 King's College Road, Toronto, Ontario M5S 3G4, Canada*

<sup>2</sup>*CEA-Leti, 17, rue des Martyrs, 38000 Grenoble, France*

\*[jcc.mak@mail.utoronto.ca](mailto:jcc.mak@mail.utoronto.ca)

**Abstract:** Silicon nitride-on-silicon bi-layer grating couplers were designed for the O-band using an optimization-based procedure that accounted for design rules, and were fabricated on a 200 mm wafer. The designs were sufficiently robust to fabrication variations to function well across the wafer. A peak fiber-to-chip coupling efficiency to standard single mode fiber of -2.2 dB and a 1-dB bandwidth of 72.9 nm were achieved in the representative device. Over several chips across the wafer, we measured a median peak coupling efficiency of -2.1 dB and a median 1-dB bandwidth of 70.8 nm. The measurements had good correspondence with simulation.

© 2018 Optical Society of America under the terms of the [OSA Open Access Publishing Agreement](#)

**OCIS codes:** (230.3120) Integrated optics devices; (130.0130) Integrated optics; (230.4170) Multilayers.

## References and links

1. E. Bernier, P. Dumais, D. J. Goodwill, H. Mehrvar, D. Celo, J. Jiang, C. Zhang, F. Zhao, X. Tu, C. Zhang, S. Yan, J. He, M. Li, W. Liu, Y. Wei, and D. Geng, "Large-scale silicon photonic switch," in "Optical Fiber Communication Conference," (Optical Society of America, 2018), pp. Th1J-1.
2. K. Suzuki, R. Konoike, J. Hasegawa, S. Suda, H. Matsuura, K. Ikeda, S. Namiki, and H. Kawashima, "Low insertion loss and power efficient 32 × 32 silicon photonics switch with extremely-high-Δ PLC connector," in "Optical Fiber Communication Conference Postdeadline Papers," (Optical Society of America, 2018), p. Th4B.5.
3. L. Qiao, W. Tang, and T. Chu, "32× 32 silicon electro-optic switch with built-in monitors and balanced-status units," *Scientific Reports* **7**, 42306 (2017).
4. J. Sun, E. Timurdogan, A. Yaacobi, Z. Su, E. S. Hosseini, D. B. Cole, and M. R. Watts, "Large-scale silicon photonic circuits for optical phased arrays," *IEEE Journal of Selected Topics in Quantum Electronics* **20**, 264–278 (2014).
5. J. C. Hulme, J. K. Doylend, M. J. R. Heck, J. D. Peters, M. L. Davenport, J. T. Bovington, L. A. Coldren, and J. E. Bowers, "Fully integrated hybrid silicon two dimensional beam scanner," *Optics Express* **23**, 5861–5874 (2015).
6. A. Ramaswamy, J. Roth, E. Norberg, R. S. Guzzon, J. Shin, J. Imamura, B. Koch, D. Sparacin, G. Fish, B. G. Lee, R. Rimolo-Donadio, C. Baks, A. Rylyakov, J. Proesel, M. Meghelli, and C. Schow, "A WDM 4x28Gbps integrated silicon photonic transmitter driven by 32nm CMOS driver ICs," in "Optical Fiber Communication Conference," (Optical Society of America, 2015), pp. Th5B-5.
7. W. D. Sacher, Y. Huang, G.-Q. Lo, and J. K. S. Poon, "Multilayer silicon nitride-on-silicon integrated photonic platforms and devices," *Journal of Lightwave Technology* **33**, 901–910 (2015).
8. W. D. Sacher, Z. Yong, J. C. Mikkelsen, A. Bois, Y. Yang, J. C. C. Mak, P. Dumais, D. Goodwill, C. Ma, J. Jeong, E. Bernier, and J. K. S. Poon, "Multilayer silicon nitride-on-silicon integrated photonic platform for 3D photonic circuits," in "Conference on Lasers and Electro-Optics," (Optical Society of America, 2016), p. JTh4C.3.
9. S. Malhouitre, B. Szelag, S. Brision, Q. Wilmart, D. Fowler, C. Dupré, and C. Kopp, "Heterogeneous and multi-level integration on mature 25Gb/s silicon photonic platform," in "CPMT Symposium Japan (ICSJ), 2017 IEEE," (IEEE, 2017), pp. 223–226.
10. C. Baudot, M. Douix, S. Guerber, S. Crémér, N. Vulliet, J. Planchot, R. Blanc, L. Babaud, C. Alonso-Ramos, D. Benedikovich, D. Pérez-Galacho, S. Messaoudéne, S. Kerdiles, P. Acosta-Alba, C. Euvrard-Colnat, E. Cassan, D. Marriss-Morini, L. Vivien, and F. Boeuf, "Developments in 300mm silicon photonics using traditional CMOS fabrication methods and materials," in "Electron Devices Meeting (IEDM), 2017 IEEE International," (IEEE, 2017), p. 34.
11. A. Michaels and E. Yablonovitch, "Inverse design of near unity efficiency perfectly vertical grating couplers," *Optics Express* **26**, 4766–4779 (2018).
12. W. D. Sacher, Y. Huang, L. Ding, B. J. F. Taylor, H. Jayatilleka, G.-Q. Lo, and J. K. S. Poon, "Wide bandwidth and high coupling efficiency Si3N4-on-SOI dual-level grating coupler," *Optics Express* **22**, 10938–10947 (2014).

13. F. Van Laere, T. Claes, J. Schrauwen, S. Scheerlinck, W. Bogaerts, D. Taillaert, L. O'Faolain, D. Van Thourhout, and R. Baets, "Compact focusing grating couplers for silicon-on-insulator integrated circuits," *IEEE Photonics Technology Letters* **19**, 1919–1921 (2007).
14. W. S. Zaoui, A. Kunze, W. Vogel, M. Berroth, J. Butschke, F. Letzkus, and J. Burghartz, "Bridging the gap between optical fibers and silicon photonic integrated circuits," *Optics Express* **22**, 1277–1286 (2014).
15. C. R. Doerr, L. Chen, Y.-K. Chen, and L. L. Buhl, "Wide bandwidth silicon nitride grating coupler," *IEEE Photonics Technology Letters* **22**, 1461–1463 (2010).
16. Q. Zhong, V. Veerasubramanian, Y. Wang, W. Shi, D. Patel, S. Ghosh, A. Samani, L. Chrostowski, R. Bojko, and D. V. Plant, "Focusing-curved subwavelength grating couplers for ultra-broadband silicon photonics optical interfaces," *Optics Express* **22**, 18224–18231 (2014).
17. M. T. Wade, F. Pavanello, R. Kumar, C. M. Gentry, A. Atabaki, R. Ram, V. Stojanović, and M. A. Popović, "75% efficient wide bandwidth grating couplers in a 45 nm microelectronics CMOS process," in "Optical Interconnects Conference (OI), 2015 IEEE," (IEEE, 2015), pp. 46–47.
18. D. R. Jones, C. D. Perttunen, and B. E. Stuckman, "Lipschitzian optimization without the Lipschitz constant," *Journal of Optimization Theory and Applications* **79**, 157–181 (1993).
19. J. M. Gablonsky and C. T. Kelley, "A locally-biased form of the DIRECT algorithm," *Journal of Global Optimization* **21**, 27–37 (2001).
20. Q. Liu, G. Yang, Z. Zhang, and J. Zeng, "Improving the convergence rate of the direct global optimization algorithm," *Journal of Global Optimization* **67**, 851–872 (2017).
21. Y. Shi and R. C. Eberhart, "Parameter selection in particle swarm optimization," in "International Conference on Evolutionary Programming," (Springer, 1998), pp. 591–600.
22. Á. E. Eiben, R. Hinterding, and Z. Michalewicz, "Parameter control in evolutionary algorithms," *IEEE Transactions on Evolutionary Computation* **3**, 124–141 (1999).
23. W. J. Morokoff and R. E. Caflisch, "Quasi-Monte Carlo integration," *Journal of Computational Physics* **122**, 218–230 (1995).
24. S. G. Johnson, "The NLOpt nonlinear-optimization package," <http://ab-initio.mit.edu/nlopt>. [Online].
25. J. Notaros and M. Popović, "Band-structure approach to synthesis of grating couplers with ultra-high coupling efficiency and directivity," in "Optical Fiber Communication Conference," (Optical Society of America, 2015), pp. Th3F–2.
26. L. Su, R. Trivedi, N. V. Sapra, A. Y. Piggott, D. Vercruyse, and J. Vučković, "Fully-automated optimization of grating couplers," *Optics Express* **26**, 4023–4034 (2018).
27. Lumerical Inc., <http://www.lumerical.com/tcad-products/fdtd/>.

## 1. Introduction

Silicon (Si) photonics can potentially realize low-cost, large-scale photonic integrated circuits (PICs) for applications such as optical switching [1–3], phased arrays [4, 5] and transceivers [6]. However, waveguide loss and fiber-to-chip coupling loss limit the size of Si PICs. Multi-layer silicon nitride (SiN)-on-Si foundry-fabricated integrated photonic platforms address this problem by monolithically integrating SiN waveguide layers, which have lower optical loss, nonlinearity, and thermo-optic coefficient, atop the Si layer [7–10]. Such multi-layer platforms also support improved surface fiber-to-chip coupling through bi-layer grating couplers (GCs) with composite SiN and Si teeth. Bi-layer GCs allow for enhanced upwards directivity over single layer gratings by providing sufficient degrees of freedom to control the phases and amplitudes of scattered waves, such that the upward emitted waves coupled into the fiber constructively interfere and other scattered waves and modes destructively interfere, as per the analysis in [11]. A coupling efficiency or insertion loss (IL) of -1.3 dB with a 1-dB bandwidth ($\Delta\lambda_{1dB}$) of 80 nm near a center wavelength of 1550 nm was demonstrated by the SiN-on-Si grating in [12], without the need for a back-reflector or other post-processing.

In this work, we expand upon [12] by demonstrating a systematic optimization design method for bi-layer GCs, using a new SiN-on-Si platform at CEA-Leti. The fabricated GCs have high performance consistent with simulation. Over the 9 chips measured from across the wafer, we measured a median IL of -2.1 dB and a median $\Delta\lambda_{1dB}$ of 70.8 nm. To compare GCs with respect to both coupling efficiency and bandwidth at different center wavelengths, we define a figure of merit, the efficiency-bandwidth product (EBWP), as

$$EBWP = \eta \frac{\Delta\lambda_{1dB}}{\lambda_{center}}, \quad (1)$$

where $\eta$ is the peak coupling efficiency expressed as a linear fraction, $\Delta\lambda_{1dB}$ is the 1-dB bandwidth, and $\lambda_{center}$ is the wavelength at which the spectrum achieves the peak coupling efficiency. The worst performing device with respect to EBWP had IL = -4.5 dB with $\Delta\lambda_{1dB} = 56.6 \text{ nm}$, and the device with the best EBWP had IL = -2.2 dB with $\Delta\lambda_{1dB} = 72.9 \text{ nm}$. In comparison with prior work, the median EBWP of these GCs exceeds that of reported single layer GCs, even those with back-reflectors [13–16]. The EBWP is lower than that of the Si-on-Si bi-layer GC reported in [17], but that GC requires additional post-processing, and wafer-scale measurements have not been reported for it.

Fig. 1. (a) Cross-section of the SiN-on-Si platform in this work. Target layer dimensions are: $t_{BOX} = 2 \mu m$, $t_{Si} = 300 \text{ nm}$, $t_{etch} = 150 \text{ nm}$, $\sigma = 200 \text{ nm}$, $t_{SiN} = 600 \text{ nm}$, and $t_{clad} = 1 \mu m$. Refractive indices at the nominal center wavelength of 1310 nm are: $n_{Si} = 3.507$, $n_{SiO2} = 1.447$, and $n_{SiN} = 1.873$. (b) Side view of the geometry parameterization of the 1D uniform bi-layer GC used for 2D FDTD simulation. Layer thicknesses and color scheme correspond to Fig. 1(a). A Gaussian source is launched from the top at angle $\theta$ into the grating, and the transmission spectrum is measured at the left output SiN waveguide to the PIC. The grating is periodic and continues to the right beyond the figure. Units for the geometric variables are $\mu m$, and $\theta$ is in degrees. The figure is not to scale. (c) Top view of the focusing bi-layer GC derived from the 1D design, drawn to scale, using the color scheme of Fig. 1(a) to indicate material layers and thicknesses. Light coupled into the GC exits the taper into an 840 nm wide SiN waveguide.
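As a quick numerical check of Eq. (1), the sketch below converts an insertion loss in dB to a linear efficiency and evaluates the EBWP for the best and worst measured devices. The function names are illustrative, and the center wavelengths are taken from Table 2. Because the IL values are rounded to 0.1 dB, the results may differ slightly in the last digit from the tabulated EBWP values.

```python
def il_db_to_efficiency(il_db: float) -> float:
    """Convert an insertion loss in dB (negative for loss) to a linear efficiency."""
    return 10.0 ** (il_db / 10.0)


def ebwp(il_db: float, bandwidth_1db_nm: float, center_nm: float) -> float:
    """Efficiency-bandwidth product of Eq. (1): eta * bandwidth / center wavelength."""
    return il_db_to_efficiency(il_db) * bandwidth_1db_nm / center_nm


# Best and worst measured devices quoted in the text (center wavelengths from Table 2).
best = ebwp(-2.2, 72.9, 1306.2)
worst = ebwp(-4.5, 56.6, 1343.2)
print(f"best EBWP  = {best:.4f}")
print(f"worst EBWP = {worst:.4f}")
```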

We proceed in Section 2 by describing the optimization procedure to design the bi-layer GCs. Fabrication and measurements are described in Section 3. We compare our results with prior work in Section 4 and conclude after a discussion on avenues for improvement in Section 5.

## 2. Design

### 2.1. Device parameterization and simulation setup

In our previous work [12], we heuristically found designs for uniform bi-layer GCs, which were subsequently apodized. Here, we present a systematic procedure for searching for uniform bi-layer GCs with the maximum coupling efficiency using a global optimization algorithm. The GCs were implemented in a new SiN-on-Si platform that is currently being developed at CEA-Leti. The layer thicknesses and material refractive indices were fixed and are shown in Fig. 1(a). The GCs used the partially etched Si (150 nm thick) and the SiN (600 nm thick) layers, since GCs using the 300 nm thick Si layer were found to be less efficient.

For the optimization algorithm, the GCs were parameterized according to Fig. 1(b). All the parameters have units of $\mu m$, except angle parameters, which have units of degrees. The GCs couple light from an angle-polished SMF-28 fiber into a SiN waveguide on the chip. The SiN teeth have width $t$ and gap $g$, and the Si teeth have an offset $o_x$ and width $t_x$. We include two transition teeth, which have a width of $t_a$, a gap of $g_a$, and an offset of $o_a$, at the start of the GC. These transition teeth have been found to improve coupling efficiency [12]. The minimum feature sizes and clearances of this foundry process are accounted for by setting lower bounds on the parameters. The distance between the SiN waveguide and the center of the fiber is $x_s$, and $\theta$ is the emission angle of the grating.

Table 1. The top results from coarse sampling. Lengths are in $\mu m$ and $\theta$ is in degrees.

| Sample No. | $t$   | $g$   | $o_x$  | $t_x$ | $x_s$ | $\theta$ | $\eta$ |
|------------|-------|-------|--------|-------|-------|----------|--------|
| 781        | 0.736 | 0.842 | -0.491 | 0.456 | 5.3   | 30.4     | 0.26   |
| 1795       | 0.790 | 0.736 | -0.656 | 0.296 | 6.8   | 30.3     | 0.24   |
| 925        | 0.759 | 0.898 | -0.967 | 0.917 | 6.6   | 32.7     | 0.22   |
| 1431       | 0.920 | 0.660 | -0.474 | 0.364 | 5.6   | 30.0     | 0.21   |
| 859        | 0.877 | 0.750 | 0.757  | 0.511 | 6.0   | 31.6     | 0.21   |
| <b>min</b> | 0.736 | 0.660 | -0.967 | 0.296 | 5.3   | 30.0     |        |
| <b>max</b> | 0.920 | 0.898 | 0.757  | 0.917 | 6.8   | 32.7     |        |

The optimization algorithm uses the value of the merit function computed using 2D Finite Difference Time Domain (FDTD) simulations of the 1D uniform GC in Fig. 1(b). For the FDTD calculations, the fiber input is approximated as a Gaussian source with a mode field diameter (MFD) of 9.2  $\mu\text{m}$  at 1310 nm, matching that of SMF-28. The merit function is coupling efficiency,  $\eta$ , at 1310 nm, which is extracted from the optical power coupled into the SiN waveguide mode. After the 1D uniform GC with the highest  $\eta$  is found, the grating teeth are curved to form a focusing GC [Fig. 1(c)] [13].

### 2.2. The DIRECT global optimization algorithm

For this work, we chose the DIviding RECTangles (DIRECT) optimization algorithm [18, 19] for its reliability and ease of use. DIRECT is derivative-free (i.e., only evaluations of the merit function are needed). This is suitable here because FDTD does not directly provide the gradient of $\eta$. DIRECT takes advantage of a regularity condition of the merit function (Lipschitz continuity) to prioritize the search in the most promising regions of the domain, making it much more efficient than brute-force parameter sweeping. This is done by recursively taking a rectangular domain, partitioning it, and estimating the optimal values of the sub-domains from the size of each sub-domain and the maximum rate of change of the merit function (Lipschitz constant). Sub-domains that are not promising are eliminated, and promising sub-domains are refined toward the optimum. For electromagnetic design problems, the merit function tends to be a continuous, smoothly varying function of the geometric parameters (and hence Lipschitz continuous over a bounded domain); thus, DIRECT is often compatible. This is certainly the case for finding the GCs with the maximum $\eta$.
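The role of the Lipschitz constant in ranking sub-domains can be illustrated with a deliberately simplified 1D sketch. This is not DIRECT itself: it assumes a known Lipschitz constant `lipschitz_k` and always trisects the single interval with the highest optimistic bound, whereas DIRECT [18] works in many dimensions without requiring the constant. The toy merit function and all names are hypothetical.

```python
def optimistic_bound(f_center, half_width, lipschitz_k):
    """Largest value a K-Lipschitz function could reach anywhere in an interval,
    given its value at the interval center."""
    return f_center + lipschitz_k * half_width


def maximize_1d(f, lo, hi, lipschitz_k, n_iter):
    """Repeatedly trisect the interval whose optimistic bound is highest."""
    # Each entry is (left edge, right edge, merit value at the center).
    intervals = [(lo, hi, f(0.5 * (lo + hi)))]
    for _ in range(n_iter):
        idx = max(
            range(len(intervals)),
            key=lambda j: optimistic_bound(
                intervals[j][2], 0.5 * (intervals[j][1] - intervals[j][0]), lipschitz_k
            ),
        )
        a, b, f_mid = intervals.pop(idx)
        third = (b - a) / 3.0
        # The middle third shares the parent's center, so its sample is reused.
        intervals.append((a, a + third, f(a + 0.5 * third)))
        intervals.append((a + third, b - third, f_mid))
        intervals.append((b - third, b, f(b - 0.5 * third)))
    a, b, f_best = max(intervals, key=lambda iv: iv[2])
    return 0.5 * (a + b), f_best


def toy_merit(x):
    """Hypothetical smooth merit function with a single peak at x = 0.3."""
    return -((x - 0.3) ** 2)


# On [0, 1] the slope of toy_merit never exceeds 1.4, so K = 2 is a valid bound.
x_best, f_best = maximize_1d(toy_merit, 0.0, 1.0, lipschitz_k=2.0, n_iter=40)
print(f"best x = {x_best:.4f}, merit = {f_best:.6f}")
```

Intervals far from the peak keep a low optimistic bound and are rarely subdivided, which is the sense in which unpromising regions receive little of the simulation budget.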

A major advantage of DIRECT is that it works reliably without requiring extensive tuning of algorithm parameters to balance between global exploration and local search [20], unlike particle swarm [21] or genetic algorithms [22], which may get trapped in local optima without tuning of the algorithm parameters. However, a limitation of DIRECT is that its rate of convergence tends to slow down as the optimization progresses [20], making the search for larger domains disproportionately more time consuming than smaller domains. Reducing the size of the search domain greatly speeds up the convergence.

### 2.3. GC optimization

To speed up the convergence of the optimization for the GC, we first used a coarse sampling step that searched over a large feasible design domain to identify a smaller promising region, on which we applied DIRECT. Using this procedure, we were able to identify a promising design within 4630 simulations.



Fig. 2. (a) Convergence of DIRECT towards the nominal 1D design. The red dots represent $\eta$ sampled at each iteration, while the blue line is the best $\eta$ sampled hitherto. An $\eta$ of 0.73 at 1316 nm (-1.37 dB) is achieved for a 1D design. (b) Comparison of simulation spectra of the 1D and focusing designs. Focusing the grating directly to an 840 nm SiN waveguide incurs a slight penalty to the coupling efficiency, reducing the IL to -1.5 dB. (c) Fraction of power emitted upwards $T_{up}$ (green) and back-reflection (blue). (d) Profile of the emitted power by the GC at the chip surface. The exponentially decaying behavior is characteristic of a uniform GC. The fiber is optimally positioned at $x = 5.83\mu\text{m}$, which is approximately at the center of this emission profile.

#### 2.3.1. Coarse sampling

The sampling step was restricted to periodic designs parameterized as $p^{(0)} = (t, g, o_x, t_x, x_s, \theta)$, with the addition of 2 transition teeth below the SiN waveguide. To reduce the number of dimensions, the transition teeth retain the same tooth and gap widths as the Si teeth of the main grating (i.e., $t_a = t_x$, $g_a = g + t - t_x$, and $o_a = o_x + t_x - t$, so that the gap between the first Si tooth in the grating and the first transition tooth is also $g + t - t_x$). We sampled over a wide yet physically realistic range for each design parameter. Due to factors such as substrate reflections and mode matching, the emission angle $\theta$ that maximizes $\eta$ cannot be known *a priori*. We let the coupling angle vary in the range $\theta \in (0^\circ, 35^\circ)$, corresponding to positive coupling angles with a safe margin below the total internal reflection condition for the cladding-air interface ($43.7^\circ$). The center of the source $x_s$ is bounded within $(-2, 10)\mu\text{m}$, since this is where a non-trivial amount of emission from the grating is likely. The parameters $g$, $t$, $t_x$, $t_a$, and $g_a$ are set between the minimum allowable feature sizes (0.2 $\mu\text{m}$ for $g$ and $t$, and 0.12 $\mu\text{m}$ for $t_x$, $t_a$, and $g_a$) and coarse features of around 1 $\mu\text{m}$, corresponding to an upper limit of roughly one wavelength per period, beyond which the resultant grating period would not be efficient for the target wavelength. The offset $o_x$ is allowed to vary approximately over an entire period, with $o_x \in (-1, 1)\mu\text{m}$. These ranges give rise to lower bounds $l^{(0)} = (0.2, 0.2, -1, 0.12, -2, 0)$ and upper bounds $u^{(0)} = (1, 1, 1, 1, 10, 35)$ for the sampling.
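Two of the numbers in this paragraph are easy to reproduce. The sketch below computes the total internal reflection angle for the cladding-air interface from $n_{SiO2} = 1.447$ and expands a 6-parameter coarse sample into the full 9-parameter design using the tying relations for the transition teeth. The function names are illustrative, and sample 781 of Table 1 is used as the example input.

```python
import math

N_SIO2 = 1.447  # cladding index at 1310 nm, from Fig. 1(a)


def critical_angle_deg(n_clad: float, n_out: float = 1.0) -> float:
    """Total internal reflection angle at the interface from n_clad into n_out."""
    return math.degrees(math.asin(n_out / n_clad))


def expand_tied_parameters(t, g, o_x, t_x, x_s, theta):
    """Map the 6 coarse-sampling parameters to the 9-parameter design
    (t, g, o_x, t_x, o_a, t_a, g_a, x_s, theta) with tied transition teeth."""
    t_a = t_x
    g_a = g + t - t_x
    o_a = o_x + t_x - t
    return (t, g, o_x, t_x, o_a, t_a, g_a, x_s, theta)


print(f"critical angle = {critical_angle_deg(N_SIO2):.1f} deg")

# Sample 781 from Table 1 (lengths in um, theta in degrees).
full_design = expand_tied_parameters(0.736, 0.842, -0.491, 0.456, 5.3, 30.4)
print(full_design)
```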

Instead of a grid-based sweep, we used a low-discrepancy sequence (LDS) to select sampling points. An LDS is a deterministic substitute for uniformly distributed random numbers with more even coverage; it is useful for sampling high-dimensional functions and is commonly applied to perform Monte-Carlo integration [23]. An LDS allows us to select the number of points that we want to sample without being constrained by a grid, and to increase the resolution of the sampling as necessary. We sampled 4000 points from this parameter space using the Halton low-discrepancy sequence, which struck a balance between adequate resolution and simulation time. The best 5 results with respect to the coupling efficiency $\eta$, listed in Table 1, were then used to define a range of parameters to search in the DIRECT optimization.

Fig. 3. (a)-(f) Sensitivity of the 1D design to (a) x-offset between layers, variations in fill factor for (b) SiN and (c) Si teeth, and variation in thicknesses of (d) $\text{SiO}_2$ interlayer spacing, (e) Si teeth layer, (f) SiN teeth layer.
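A minimal sketch of Halton sampling over the bounds $l^{(0)}$ and $u^{(0)}$ is given below. The choice of the first six primes as bases and the absence of any scrambling or skipped initial points are assumptions; the text does not state which variant was used. Each generated point would then be passed to a 2D FDTD simulation, which is not shown.

```python
def radical_inverse(index: int, base: int) -> float:
    """Van der Corput radical inverse of a positive integer in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_points(n_points, lower, upper, bases=(2, 3, 5, 7, 11, 13)):
    """First n_points of a Halton sequence, scaled to the box [lower, upper]."""
    points = []
    for i in range(1, n_points + 1):  # index 0 would map to the lower corner
        unit = [radical_inverse(i, b) for b in bases[: len(lower)]]
        points.append(
            tuple(lo + u * (hi - lo) for u, lo, hi in zip(unit, lower, upper))
        )
    return points


# Bounds l^(0) and u^(0) for (t, g, o_x, t_x, x_s, theta) from the text.
LOWER = (0.2, 0.2, -1.0, 0.12, -2.0, 0.0)
UPPER = (1.0, 1.0, 1.0, 1.0, 10.0, 35.0)

samples = halton_points(4000, LOWER, UPPER)
print(len(samples), samples[0])
```

Because the sequence is deterministic and extensible, sampling more points later simply continues the index `i` rather than requiring a new grid.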

#### 2.3.2. DIRECT refinement

After the sampling, the GC designs are refined using DIRECT. In this step, the transition teeth are allowed to vary. We parameterize the grating with $p^{(1)} = (t, g, o_x, t_x, o_a, t_a, g_a, x_s, \theta)$, and initialize the bounds as $l^{(1)} = (0.5, 0.5, -1, 0.14, -1, 0.14, 0.14, 3, 10)$, $u^{(1)} = (1.2, 1.2, 1, 1, 1, 1.5, 6, 32)$, using the minimum and maximum of Table 1 with some padding. We use the NLopt [24] implementation of DIRECT. Including the sampling, the entire process took around 16 hours to complete on a desktop with an 8-core Intel i7 processor and 16 GB of RAM. The improvement of the coupling efficiency over 630 iterations is shown in Fig. 2(a). Although the best result was found after 335 iterations, the algorithm continued to search other partitions. In contrast to local optimization algorithms, where the figure of merit converges with increasing iterations, DIRECT is a global optimization algorithm that continues exploring other regions of the parameter space to test whether better solutions exist. Here, we terminated the algorithm when the figure of merit no longer improved.
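The step from Table 1 to a reduced search box can be sketched as follows. The code reproduces the min and max rows of Table 1 and then widens them by a fraction of the span. The uniform 10% padding is a hypothetical choice for illustration only; the bounds quoted above were evidently padded by different amounts per parameter and were not necessarily generated this way.

```python
# Top five coarse samples from Table 1: (t, g, o_x, t_x, x_s, theta).
TOP_SAMPLES = [
    (0.736, 0.842, -0.491, 0.456, 5.3, 30.4),
    (0.790, 0.736, -0.656, 0.296, 6.8, 30.3),
    (0.759, 0.898, -0.967, 0.917, 6.6, 32.7),
    (0.920, 0.660, -0.474, 0.364, 5.6, 30.0),
    (0.877, 0.750, 0.757, 0.511, 6.0, 31.6),
]


def padded_bounds(samples, padding_fraction):
    """Per-parameter min and max over the samples, widened by a fraction of the span."""
    lower, upper = [], []
    for column in zip(*samples):
        lo, hi = min(column), max(column)
        pad = padding_fraction * (hi - lo)
        lower.append(lo - pad)
        upper.append(hi + pad)
    return lower, upper


tight_lower, tight_upper = padded_bounds(TOP_SAMPLES, padding_fraction=0.0)
# A 10% padding is a hypothetical value, not one stated in the text.
padded_lower, padded_upper = padded_bounds(TOP_SAMPLES, padding_fraction=0.1)
print("min:", tight_lower)
print("max:", tight_upper)
```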

We select the best design, $p^{(1)*} = (0.85, 0.617, -0.675, 0.304, -0.74, 0.283, 0.255, 5.83, 28)$, achieving a coupling efficiency of 73%, to be our nominal 1D design. Figure 2(b) plots the transmission of the 1D design and of the corresponding focusing design simulated with 3D-FDTD. An IL of -1.5 dB at 1312 nm with $\Delta\lambda_{1dB} = 73$ nm is expected for the focusing design, representing an EBWP of $3.9 \times 10^{-2}$. Figure 2(c) shows that the GC has a high directionality with low back-reflection, characteristic of bi-layer gratings and consistent with [11, 12]. Figure 2(d) shows the profile of the emitted power by the GC at the chip surface. The centering of this emission profile is consistent with where the fiber core is expected to be positioned ($x = 5.83\mu\text{m}$). Figures 3(a)-3(f) show the sensitivity of the nominal 1D design to variations in layer offset, duty cycle, and layer spacing and thickness. The peak coupling efficiency stays within 1 dB for layer offsets of $\pm 60$ nm and layer spacing variations of $\pm 60$ nm. The worst case is due to a variation of -40 nm in the width of the Si teeth. The bandwidth is generally negligibly affected, but the center wavelength shifts by up to 50 nm for a layer spacing variation of +60 nm from the nominal.

Fig. 4. (a) Optical micrographs of the fabricated GC test structures and an individual GC. (b) Wafer map of the measured chips, with measured SiO<sub>2</sub> interlayer spacing thickness $\sigma$ (in nm) annotated. The intended value of $\sigma$ is 200 nm [see Fig. 1(a)]. (c) Measured spectra of the nominal focusing design from samples over the wafer, as indicated in (b). The dotted black line is the simulated transmission spectrum of the focusing design assuming $\sigma = 200$ nm. (d) Comparison of the insertion loss (green) and center wavelength (blue) with respect to $\sigma$ between measured (circle markers) and simulated data (dotted lines). A close correspondence is present, with some additional loss for the measured data due to various factors (e.g., lack of planarization at the chip surface, fiber array positioning, etc.). Other variations, such as layer offset and under/over etching of the teeth, contribute to the loss and center wavelength variations in the measurement.

## 3. Fabrication and measurement

Test structures consisting of pairs of GCs connected by SiN waveguides, as in Fig. 4(a), were fabricated on a 200 mm silicon-on-insulator wafer using 193 nm deep ultraviolet (DUV) photolithography for the Si layer and 248 nm DUV photolithography for the SiN layer, as part of a trial for a SiN-on-Si platform that was under development at CEA-Leti. Fabrication targeted the layer thickness values in Fig. 1(a). While uniformity was good across the wafer for the Si and SiN layers, the SiO<sub>2</sub> interlayer spacing thickness $\sigma$ had a large variability. Across the wafer, the layer thicknesses were characterized through ellipsometry: $t_{Si}$ had a range of 8 nm with a median of 300 nm and a standard deviation of 2 nm, $t_{etch}$ had a range of 6 nm with a median of 153 nm and a standard deviation of 1.4 nm, and $t_{SiN}$ had a range of 14 nm with a median of 605 nm and a standard deviation of 4.4 nm. The intended values were 300 nm, 150 nm, and 600 nm, respectively. From examining Figs. 3(e) and 3(f), we do not expect this range of variations to have a major impact on performance. In contrast, the interlayer spacing $\sigma$ varied significantly. Over the wafer, $\sigma$ had a range of 140 nm with a median of 229 nm and a standard deviation of 37.4 nm. The intended value was 200 nm. Based on the simulation in Fig. 3(d), we expect the variation of $\sigma$ to be a dominant factor in the interdie variations across the measured samples.

Table 2. Performance summary of nominal designs from across the wafer [see also Fig. 4(c)].

| $\sigma$ [nm] | $\Delta\lambda_{1dB}$ [nm] | IL [dB] | $\lambda_{center}$ [nm] | $\eta$ | EBWP [ $\times 10^{-2}$ ] |
|---------------|----------------------------|---------|-------------------------|--------|---------------------------|
| 193           | 68.9                       | -2.0    | 1319.3                  | 0.63   | 3.28                      |
| (best) 206    | 72.9                       | -2.2    | 1306.2                  | 0.61   | 3.40                      |
| 216           | 72.5                       | -2.0    | 1337.7                  | 0.63   | 3.40                      |
| 218           | 71.3                       | -2.1    | 1316.6                  | 0.62   | 3.34                      |
| 239           | 70.4                       | -2.1    | 1319.6                  | 0.62   | 3.30                      |
| 259           | 73.2                       | -2.1    | 1332.0                  | 0.61   | 3.37                      |
| 262           | 66.6                       | -2.2    | 1331.4                  | 0.60   | 3.01                      |
| 276           | 54.5                       | -3.4    | 1335.5                  | 0.46   | 1.87                      |
| (worst) 333   | 56.6                       | -4.5    | 1343.2                  | 0.36   | 1.50                      |
| <b>max</b>    | 73.2                       | -2.0    | 1343.2                  | 0.63   | 3.40                      |
| <b>median</b> | 70.9                       | -2.1    | 1331.7                  | 0.61   | 3.32                      |
| <b>min</b>    | 54.5                       | -4.5    | 1306.2                  | 0.36   | 1.50                      |

Fig. 5. Measured spectra of geometry variation corners in the chip with an interlayer spacing of 206 nm. Measured spectra showing the effects of lithographically defined interlayer offsets in the (a) $x$-direction and (b) $y$-direction, as well as variations in the widths of the (c) SiN and (d) Si teeth.

We tested 9 samples from across the wafer [Fig. 4(b)], with the corresponding $\sigma$ labeled. We used a fiber array polished at $29^\circ$ with index matching fluid (Norland IML150, refractive index $n = 1.5$) applied at the fiber-chip interface. The input laser polarization was set to transverse electric (TE) with a polarization controller to maximize the coupling efficiency. The transmission spectra are plotted in Fig. 4(c), and Table 2 summarizes the performance of the measured samples. Over the 9 chips measured from across the wafer, we obtained a worst performance of $EBWP = 1.5 \times 10^{-2}$ (IL = $-4.5$ dB with $\Delta\lambda_{1dB} = 56.6$ nm), a best performance of $EBWP = 3.4 \times 10^{-2}$ (IL = $-2.2$ dB with $\Delta\lambda_{1dB} = 72.9$ nm), and a median EBWP of $3.3 \times 10^{-2}$. The insertion loss and center wavelength of the measured data were plotted against the interlayer spacing $\sigma$ and compared with simulated values in Fig. 4(d). The close correspondence of the trends confirms that the interdie variations are dominantly driven by the interlayer spacing variations in this set of samples.

The measurements agreed well with the simulations, showing an additional loss of approximately 0.7 dB. This discrepancy may be caused by the lack of planarization of the oxide cladding, a slight mismatch in the refractive index between the index matching fluid ($n = 1.5$) and $\text{SiO}_2$, variation or uncertainty in the refractive indices, slopes in the sidewalls of the grating features, and the precise position and polish angle of the optical fiber.

Table 3. Comparison of grating couplers

| Feature                    | (This work)<br>2018 | [12]<br>2014 | [17]<br>2015           | [13]<br>2007 | [15]<br>2010 | [14]<br>2014          | [16]<br>2014          |
|----------------------------|---------------------|--------------|------------------------|--------------|--------------|-----------------------|-----------------------|
| Type                       | SiN-on-Si           | SiN-on-Si    | p-Si on c-Si<br>(CMOS) | Si           | SiN          | Si+Back-<br>reflector | Si Subwave-<br>length |
| IL [dB]                    | -2.2                | -1.3         | -1.2                   | -5.25        | -4.2         | -0.62                 | -4.7                  |
| $\eta$                     | 0.61                | 0.74         | 0.75                   | 0.3          | 0.38         | 0.87                  | 0.34                  |
| Emission Angle             | 29°                 | 21°          | 15°                    | 10°          | 8°           | 11°                   | 20°                   |
| $\Delta\lambda_{1dB}$ [nm] | 72.9                | 80           | 78                     | 56           | 67           | 40                    | 115                   |
| $\lambda_{center}$ [nm]    | 1306                | 1536         | 1310                   | 1540         | 1570         | 1530                  | 1550                  |
| EBWP [ $10^{-2}$ ]         | 3.4                 | 3.56         | 4.52                   | 0.58         | 1.62         | 2.27                  | 2.51                  |

For the chip with the grating having the best EBWP, which was also the one with $\sigma = 206$ nm, closest to the nominal value of 200 nm, we studied the geometry variation corners of the device for comparison with simulation. Figures 5(a) and 5(b) show the measured spectra for lithographically defined $x$- and $y$-direction offsets of $\pm 30$ and $\pm 60$ nm between the Si and SiN layers from the nominal design. A $\pm 60$ nm $x$-offset compromised the coupling efficiency by at most 0.5 dB. The $y$-offset negligibly affected the performance. Figures 5(c) and 5(d) show the measured spectra for lithographically defined variations in teeth width (i.e., duty cycle). A -40 nm variation in SiN teeth width decreased the peak coupling efficiency by only 0.5 dB. The coupling efficiency is most sensitive to the width of the Si teeth, which are the smallest features in the device. A -40 nm deviation from the nominal width of 304 nm in the Si teeth led to a peak coupling efficiency of -3.1 dB but maintained a broad bandwidth of $\Delta\lambda_{1dB} = 69$ nm. Overall, the measurements match the trends in the simulations and show reasonable robustness to variations in the interlayer offset and duty cycle.

## 4. Comparison

Table 3 compares the present GC with past GCs in the literature. This work achieves an EBWP comparable to that of the previous 1550 nm bi-layer GC [12], validating the effectiveness of the design procedure at a different $\lambda_{center}$ while using a new platform. The polysilicon (p-Si) on c-Si GC in a CMOS platform has the highest EBWP to date [17]. However, it requires post-processing steps, such as the removal of the Si handle substrate. Compared to conventional single layer Si [13] and SiN GCs [15], Si GCs with back-reflectors [14], and subwavelength Si GCs [16], the EBWP of bi-layer GCs is substantially higher.

## 5. Discussion

In the design procedure, the majority of the simulations were done for the sampling step. By using the grating equation to relate the emission angle to the average effective index of the grating and the grating period, the number of simulations may be reduced. Here, the sampling step was not time consuming and we let the emission angle be a free parameter, so we did not use the grating equation as a constraint. Using the grating equation as a constraint may be helpful in designs that have a larger number of variables to optimize. For example, adding more parameters to the optimization, such as layer thicknesses, has the potential to yield even better solutions. On the other hand, it is not straightforward to apply the grating equation to multilayer gratings. Adapting the grating equation to a bi-layer grating without an interlayer spacing is explored in [11], but the approach does not apply to our geometry, which has a separation between independent waveguide layers. An incident waveguide mode from the SiN waveguide excites the symmetric and anti-symmetric TE modes in the bi-layer, so the effective indices used in the grating equation need to account for two modes. In contrast, in [11], the bi-layer region supports only one fundamental TE mode.
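As a point of reference, the single-mode form of the grating equation mentioned above can be sketched as follows. This is only an illustration and not part of the design procedure used here: the function name, the cladding index, and all numerical values are assumptions, and a single effective index is used, which, as noted, is not adequate for the two-mode bi-layer region.

```python
import math


def emission_angle_deg(n_eff, period_nm, wavelength_nm, n_clad=1.45, order=1):
    """Emission angle in the cladding, in degrees from the surface normal.

    Uses the phase-matching condition
        n_clad * sin(theta) = n_eff - order * wavelength / period.
    Returns None when the requested diffraction order does not radiate.
    """
    s = (n_eff - order * wavelength_nm / period_nm) / n_clad
    if abs(s) > 1.0:
        return None
    return math.degrees(math.asin(s))


# Hypothetical values, chosen only to show how the period shifts the angle.
for period in (500.0, 550.0, 600.0):
    angle = emission_angle_deg(n_eff=2.4, period_nm=period, wavelength_nm=1310.0)
    print(f"period = {period:.0f} nm -> angle = {angle:.2f} deg")
```

Fixing a target angle and solving this relation for the period would give one way to restrict the period bounds before any simulation is run.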

In this work, it was more effective to use a sampling step than to apply the optimization algorithm over the entire feasible domain. To ascertain whether our design was a global optimum, we attempted to explore the entire design domain using DIRECT without limiting the number of iterations. The initial domain had lower and upper bounds $l^{(1)} = (0.2, 0.2, -1, 0.14, -1, 0.14, 0.14, -2, 0)$ and $u^{(1)} = (1, 1, 1, 1, 1, 1, 1.5, 10, 35)$. DIRECT required around 6000 iterations to reach a solution similar to that of the sample-and-refine strategy, and made negligible improvement afterwards up to at least 20,000 iterations. While our design focused on exploring all feasible coupling angles, if the design is instead aimed at a particular emission angle, it may be possible to use the grating equation to generate a smaller range of bounds and skip the sampling step altogether. Newer variants of DIRECT [20] have better convergence behavior, and thus may be able to handle the entire feasible domain directly and bypass the sampling step.
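The idea of using a sampling step to shrink the domain before refinement can be sketched as below. This is one plausible reading and not the exact procedure used in this work: the merit function is a hypothetical stand-in for an FDTD simulation, and the sample count, the number of retained points, and the bounding-box rule are all assumptions. Only the bounds are taken from the text.

```python
import random

# Bounds of the full design domain quoted above (nine design variables).
LOWER = (0.2, 0.2, -1, 0.14, -1, 0.14, 0.14, -2, 0)
UPPER = (1, 1, 1, 1, 1, 1, 1.5, 10, 35)


def toy_merit(x):
    """Hypothetical stand-in for a simulated coupling loss (lower is better)."""
    target = (0.6, 0.5, 0.0, 0.3, 0.2, 0.4, 0.8, 4.0, 12.0)  # arbitrary
    return sum(((xi - ti) / (u - l)) ** 2
               for xi, ti, l, u in zip(x, target, LOWER, UPPER))


def sample_and_narrow(merit, lower, upper, n_samples=2000, keep=20, seed=0):
    """Sample the box uniformly and return the bounding box of the best points."""
    rng = random.Random(seed)
    points = [tuple(rng.uniform(l, u) for l, u in zip(lower, upper))
              for _ in range(n_samples)]
    best = sorted(points, key=merit)[:keep]
    new_lower = tuple(min(p[i] for p in best) for i in range(len(lower)))
    new_upper = tuple(max(p[i] for p in best) for i in range(len(upper)))
    return new_lower, new_upper, best[0]


new_lower, new_upper, best_point = sample_and_narrow(toy_merit, LOWER, UPPER)
print("narrowed lower bounds:", [round(v, 3) for v in new_lower])
print("narrowed upper bounds:", [round(v, 3) for v in new_upper])
```

The narrowed box could then be passed as the bounds of a DIRECT run. SciPy's `scipy.optimize.direct` is one publicly available implementation, although it is not necessarily the one used in this work.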

Other methods of designing bi-layer gratings efficiently through custom solvers have also been reported, including a band-structure approach to reduce the simulation domain [25] and, more recently, adjoint methods [11, 26], which can converge more quickly by taking advantage of gradient information. Experimental demonstrations of fabricated devices are still pending for these methods. In our work, we opted to use a combination of available implementations of the optimization algorithms and a validated FDTD tool [27] to minimize debugging. In future work, alternative optimization and simulation methods can also be tried on the SiN-on-Si geometry.

With respect to the fabrication, the large variability in interlayer spacing was a major source of interdie variation. The platform used in this work was under development, and this was the first batch of chips made with the SiN and Si layers that allowed these grating designs to be tested. The planarization processes had not been fine-tuned, which resulted in the large variability in interlayer spacing; the uniformity of the interlayer spacing has since been significantly improved. In addition, a planarization step could also be added after the final $\text{SiO}_2$ encapsulation to reduce the losses from surface coupling.

Further apodization of the design to match the input mode is possible by increasing the number of freely varying teeth and successively using local optimization with the DIRECT-designed GC as a starting point. A multi-objective design approach, beyond maximizing coupling efficiency, can be implemented by modifying the merit function to incorporate other metrics, such as the bandwidth. By doing so, we can generate designs that optimally trade off bandwidth and coupling efficiency.
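A merit function that incorporates bandwidth could, for instance, take the form of a weighted sum. The sketch below is an assumption rather than the merit function used in this work: the weight, the bandwidth scale, the helper names, and the synthetic parabolic spectrum are all hypothetical.

```python
def db_to_linear(db):
    """Convert a power ratio from dB to linear units."""
    return 10.0 ** (db / 10.0)


def one_db_bandwidth(wavelengths_nm, efficiency_db):
    """Return (1-dB bandwidth in nm, peak efficiency in dB).

    The bandwidth is the contiguous span around the peak that stays within
    1 dB of it; band edges are found by linear interpolation between samples.
    """
    last = len(efficiency_db) - 1
    peak = max(range(len(efficiency_db)), key=efficiency_db.__getitem__)
    threshold = efficiency_db[peak] - 1.0

    def crossing(i_in, i_out):
        # Interpolate between an in-band sample and its out-of-band neighbor.
        w0, w1 = wavelengths_nm[i_in], wavelengths_nm[i_out]
        e0, e1 = efficiency_db[i_in], efficiency_db[i_out]
        return w0 + (threshold - e0) * (w1 - w0) / (e1 - e0)

    lo = hi = peak
    while lo > 0 and efficiency_db[lo - 1] >= threshold:
        lo -= 1
    while hi < last and efficiency_db[hi + 1] >= threshold:
        hi += 1
    left = crossing(lo, lo - 1) if lo > 0 else wavelengths_nm[0]
    right = crossing(hi, hi + 1) if hi < last else wavelengths_nm[-1]
    return right - left, efficiency_db[peak]


def merit(wavelengths_nm, efficiency_db, weight=0.5, bw_scale_nm=100.0):
    """Weighted-sum merit to minimize; `weight` sets the emphasis on bandwidth."""
    bw, peak_db = one_db_bandwidth(wavelengths_nm, efficiency_db)
    return -((1.0 - weight) * db_to_linear(peak_db) + weight * bw / bw_scale_nm)


# Synthetic spectrum: a parabola in dB, peaking at -2 dB at 1310 nm.
wl = [1250.0 + 5.0 * k for k in range(25)]
spectrum = [-2.0 - ((w - 1310.0) / 35.0) ** 2 for w in wl]
print(one_db_bandwidth(wl, spectrum), merit(wl, spectrum))
```

Sweeping `weight` between 0 and 1 and re-running the optimizer for each value would trace out one approximation of the trade-off between bandwidth and coupling efficiency.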

## 6. Conclusion

In summary, we have demonstrated the effectiveness of the global optimization algorithm DIRECT in designing bi-layer GCs. To the best of our knowledge, this is the first time this algorithm has been applied to the design of photonic devices and bi-layer GCs that have shown good performance across a 200 mm wafer. Starting from a large range of physically reasonable parameters, we used a sampling step to identify a smaller promising design space, which was then refined using DIRECT to reach a good design. This methodology was successfully applied to design a SiN-on-Si GC for the O-band. Over dies from across the wafer, the achieved EBWP ranged from $1.5 \times 10^{-2}$ to $3.4 \times 10^{-2}$, with a median of $3.3 \times 10^{-2}$. Correspondingly, a median peak coupling efficiency of -2.1 dB and a median 1-dB bandwidth of 70.8 nm were measured. The main source of interdie variation was the large variability in the interlayer spacing, which has since been investigated and improved for future runs.

## Funding

H2020 LEIT Information and Communication Technologies (688516).