

# Energy Efficient and Energy Proportional Optical Interconnects for Multi-Core Processors: Driving the Need for On-Chip Sources

Martijn J. R. Heck, *Member, IEEE*, and John E. Bowers, *Fellow, IEEE*

**Abstract**—Silicon photonics is the prime candidate technology to realize an optical network-on-chip for global interconnects in future multi-core processors. Since silicon photonics lacks efficient native-substrate optical sources, the question is whether off-chip or heterogeneously integrated on-chip sources are the preferred technology. In this paper we argue, based on arguments of energy efficiency and energy proportionality, that on-chip sources provide a dramatic overall system efficiency improvement, as compared to using an off-chip (comb) source. We estimate an increase in source efficiency for on-chip lasers of close to 20 dB. These results provide a clear case to include on-chip lasers, such as hybrid silicon lasers, into the network architecture design.

**Index Terms**—Integrated optoelectronics, hybrid silicon platform, optoelectronic devices, semiconductor lasers, silicon-on-insulator (SOI) technology, optical interconnections, network on chip.

## I. INTRODUCTION AND RATIONALE

MULTI-CORE processors allow for the continuous scaling of Moore's Law by overcoming bottlenecks of heat dissipation and data synchronization. This is done by fabricating chips with multiple cores, thereby allowing for more efficient parallel processing. At the time of writing this paper, quad-core processors are commonplace, and processors having 80 cores are at the high end [1]. It is expected that the number of cores will increase to around one hundred in the next few years and multiple hundreds by 2020. The performance gains will eventually be limited by Amdahl's Law, and the maximum number of cores is expected to be around a thousand cores. Architectures have already been designed for a large number of cores, such as the Corona architecture, having 256 cores in 64 clusters [20], and the ATAC architecture, having 1024 cores in 16 clusters [10].

The global on-chip communication between the processor cores consumes an increasing portion of the total power budget, which is assumed to be around 200 W, and the bandwidth is expected to run into the tens to hundreds of terabits per second [1], [3], [11], [20]. Interconnect power consumption needs to be addressed in order to keep the performance scaling that can be obtained by increasing the number of cores on a chip. The three key metrics for future interconnect technology are bandwidth density, energy efficiency and latency [11].

Optical links have all but replaced electrical links for telecommunications applications and are replacing the datacommunications interconnect links at increasingly short lengths. It is expected that to enable exascale systems, optics will penetrate into the modules (link lengths 5–100 mm) after 2015 [2]. The question is then whether optics will also be the enabler for on-chip communications and enable an optical network-on-chip (NoC) for communication between the multiple cores. This will only happen when optical interconnects can clearly outperform electrical interconnects on the combination of bandwidth density, energy efficiency and latency [3], [4]. This typically means a ~100 fJ/b system energy target, with about 10–20 fJ/b allocated for the optical source [3].

The prime candidate technology to realize an optical NoC is silicon photonics and its compatible materials, such as silicon nitride and silicon oxide. It is an ongoing discussion and an active research topic where such an optical NoC will access the stack and hence the process flow. Options include front-end-of-line (FEOL) and back-end-of-line (BEOL) monolithic implementations [5], [6] and heterogeneous implementations using 3-dimensional chip stacking [20]. Already silicon photonics has been proven to be a cost-effective and high-performance option for high-speed communication between silicon chips, when compared to electrical interconnects [7].

Using silicon photonics as the benchmark technology, link and NoC architecture simulations have been done over the last few years to assess the feasibility of optical on-chip interconnects. Studies based on the current state of the technology conclude that optical interconnects are not a feasible option for on-chip communications, due to a lack of improvement in latency and energy consumption [8], or that they do not even offer enough bandwidth for off-chip communications when memory bandwidth is included [9]. Looking forward, however, the 22-nm technology node seems promising for optical interconnects. In this node the circuit transistor capacitances are small enough to be driven directly by a photodetector, thereby eliminating power-hungry trans-impedance amplifiers and hence greatly reducing the power consumption of the link [10]. But even in this 22-nm technology, the advantages of optical over electrical interconnects are not obvious, especially when fixed overheads, e.g., laser power and thermal tuning, are taken into account or when a more aggressive electrical reference is chosen [11]. Looking even further along the technology roadmap, in [4] it is argued, based on a single link comparison, that optics will not be required before the 8-nm node. For global interconnects longer than 1.5 cm, though, optics are shown to outperform electrical RC and transmission-line links even at the 22-nm node.

Manuscript received October 1, 2013; revised November 18, 2013; accepted November 26, 2013. This work was supported by DARPA MTO through the POEM project.

M. J. R. Heck was with the University of California, Santa Barbara, CA 93106, USA. He is now with Aarhus University, DK-8200 Aarhus, Denmark (e-mail: mheck@eng.au.dk).

J. E. Bowers is with the Department of Electrical and Computer Engineering, University of California, Santa Barbara, CA 93106, USA (e-mail: bowers@ece.ucsb.edu).

Color versions of one or more of the figures in this paper are available online at <http://ieeexplore.ieee.org>.

Digital Object Identifier 10.1109/JSTQE.2013.2293271

In conclusion it can be said that based on the evolutionary ITRS roadmap, electrical interconnects will not be able to keep up within the next decade and optics will be the better choice. Whether optics will be the technology of choice depends on other developments that can revolutionize the landscape of interconnects, such as carbon-nanotube-based interconnects [12]. The prospect, however, definitely justifies the research in this field. Moreover it seems clear that for on-chip global interconnects, such as used between multiple cores, optics is the near-term interconnect technology of choice, since typical lengths are on the order of a few centimeters on a $20 \times 20 \text{ mm}^2$ die.

For on-chip and off-chip networks, much like, e.g., datacenters and telecommunications networks, a holistic approach is required, where all the design parameters are taken into account to find the optimum trade-off between the key metrics bandwidth density, energy efficiency and latency [13]. Much work on such architecture analysis has been done, but the optical source, i.e. the laser, is generally excluded from the analysis, and assumed to be lumped into a static off-chip power overhead [1], [7], [11], [18]–[20], [46]. However in [14] it was argued that on-chip hybrid silicon lasers are promising candidates for interconnect purposes. The integration density [15] and performance [16] of the hybrid silicon technology have approached the native substrate performance. It is the main purpose of this paper to show that the state-of-the-art technology in silicon photonics justifies the addition of the source as an extra design dimension. We will show how the overall system performance can greatly benefit from this approach, especially in terms of energy efficiency.

Throughout this work we will use the approaches as shown in Fig. 1 as a starting point for the analysis. We will compare an off-chip comb laser, fiber-coupled to the chip (Fig. 1(a)), with an on-chip single-wavelength distributed-feedback (DFB) laser array (Fig. 1(b)). Both approaches feed an array of waveguides, i.e., an optical bus, with a set of wavelengths on the required WDM grid. In Section II we will first discuss the trade-offs for using on-chip or off-chip lasers, with the emphasis on coupling losses. In Section III we will show how on-chip lasers can greatly improve the energy efficiency of multi-core processors when taking the core utilization into account. We will do this by studying two case examples. In Section IV we will address some advantages of having the freedom to design a distributed and individually controlled network of sources on the chip surface. The purpose of this paper is to highlight the advantages, in terms of energy efficiency and energy proportionality, that on-chip lasers can bring to a NoC, by flexibility of layout, flexibility of switching on and off sources and eliminating coupling losses. It is meant to be complementary to existing work on NoC architectures by providing an additional angle. We hope our input in this field will generate feedback from full architecture analysis and show future directions and targets for silicon photonics research on devices and circuits.

Fig. 1. Schematics of the approaches studied in this work for (a) an off-chip comb laser and (b) an on-chip DFB array. Spatial multiplexers and/or interleavers are used to divide the $N$ wavelengths over a bus of waveguides. These multiplexers can be, e.g., MMI or directional coupler trees, where crossing losses can be minimized according to [17], or star couplers.

## II. ON-CHIP OR OFF-CHIP LASERS: COUPLING LOSS

There is some debate whether the optical source, i.e., in most foreseeable implementations a single-transverse-mode laser, will be integrated on the processor chip or in the 3-D chip stack, or whether it will remain off-chip, coupled to the silicon chip with an optical fiber, as shown in Fig. 1. In the latter case it acts as an optical power supply. An off-chip laser does not count towards the processor power budget of  $\sim 200$  Watts, but the total system energy efficiency is still impacted, which is a severe issue for systems that are limited or constrained in total energy consumption [18], [46]. Further advantages include easy replacement and temperature stability. Disadvantages of off-chip lasers are the additional coupling losses and packaging expenses.

It is assumed that interconnects will not be driven beyond twice the clock speed, due to power consumption by serialization and deserialization (SERDES). Optimum bitrates have been calculated to be 4 – 10 Gbps [1], [19], [20] in the near term, moving up to 15 Gbps over the next decade, based on a 4% projected annual growth of clock frequency [ITRS 2011 Executive Summary]. Given the vast amount of bandwidth required, this means that tens of thousands of communication channels will be required for NoCs. To avoid prohibitively large losses due to waveguide crossings and to increase the bandwidth density, generally wavelength-division multiplexed (WDM) solutions are considered, besides increasing the number of physical channels. The number of wavelengths and channel spacings considered are, for example,  $64 \times 80 \text{ GHz}$  [20] and  $64 \times 60 \text{ GHz}$  [46], where generally the free-spectral range (FSR) of the ring-based modulators is used as an upper bound for the width of the spectrum, i.e. 4 – 5 THz.
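As a rough sanity check of the channel count and spectral widths quoted above, the arithmetic can be sketched as follows. The 100 Tb/s aggregate bandwidth is an illustrative assumption picked from the "tens to hundreds of terabits per second" range, and the function names are hypothetical.

```python
import math

def channels_required(aggregate_bps: float, line_rate_bps: float) -> int:
    """Parallel channels needed to carry an aggregate bandwidth at a given line rate."""
    return math.ceil(aggregate_bps / line_rate_bps)

def wdm_span_thz(n_wavelengths: int, spacing_ghz: float) -> float:
    """Spectral width occupied by a WDM grid, in THz."""
    return n_wavelengths * spacing_ghz / 1000.0

# Assumed aggregate bandwidth: 100 Tb/s (illustrative, not a value from the text).
AGGREGATE_BPS = 100e12

for rate_gbps in (4, 10, 15):
    n = channels_required(AGGREGATE_BPS, rate_gbps * 1e9)
    print(f"{rate_gbps} Gbps per channel -> {n} channels")

print(wdm_span_thz(64, 80))  # 64 x 80 GHz grid
print(wdm_span_thz(64, 60))  # 64 x 60 GHz grid
```

Under this assumption the count lands in the range of roughly ten to twenty-five thousand channels for 4 to 10 Gbps, and the two grids span 5.12 THz and 3.84 THz respectively.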

To avoid large packaging costs of multiple single-longitudinal-mode DFB lasers, a comb laser is generally assumed as an off-chip source. A comb laser is a laser that simultaneously emits a set of longitudinal optical modes. These modes have a constant spacing in the frequency domain, thereby forming a spectral ‘comb’. Lasers that emit short optical pulses at a fixed period, such as mode-locked lasers, are an example of a comb laser. These combs can have a Gaussian shape [21], typically when the pulse is close to transform limited, or a more flat profile, when the pulse is heavily chirped [22]. Typical parameters used in the simulations are 1 – 10 W output power and ~30% efficiency [20], [46]. The coupling loss of an off-chip laser to a single-mode fiber is typically ~2 dB. Similarly coupling from the fiber to the silicon chips is generally assumed to be ~2 dB, when using vertical grating couplers. It is the purpose of this section to show that the required optical bandwidth, the choice of the laser type and the coupler all have a significant impact on the link budget and energy efficiency of the system.

To have a more realistic look at the comb laser, two typical examples can be used, i.e., having a Gaussian-shaped spectrum or a ‘flat’ comb, with some limited power uniformity over the bandwidth and a steep fall-off of the power in the modes outside the bandwidth [22], [23]. If a uniform-power WDM comb of modes is required, a Gaussian output spectrum leads to an additional loss, since modes around the central wavelength have excess power per mode and modes outside the bandwidth carry power that will be wasted, as shown in Fig. 2(a). This additional loss can be calculated according to:

$$\eta_{\text{source}} = \frac{P_{\text{usable}}}{P_{\text{total}}}$$

$$P(\lambda) = \exp \left( - \left( \frac{\lambda - \lambda_0}{0.5 \cdot \Delta\lambda} \right)^2 \right)$$

$$P_{\text{total}} = \int P(\lambda) d\lambda = 0.5 \cdot \Delta\lambda \cdot \sqrt{\pi}$$

$$P_{\text{usable}} = BW \cdot P(\lambda_0 \pm 0.5 \cdot BW),$$

in which $\eta_{\text{source}}$ is the efficiency, $\lambda$ the wavelength, $\lambda_0$ the center wavelength, $\Delta\lambda$ the $e^{-1}$ bandwidth of the spectrum and $BW$ the required width of the WDM comb. It can be derived that the usable power is maximized for $BW = \Delta\lambda \cdot 2^{-0.5}$, which gives an efficiency $\eta_{\text{source}} = e^{-0.5} \cdot 2^{0.5} \cdot \pi^{-0.5} \approx 0.48$, i.e., $-3.2$ dB.
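The Gaussian-comb result can be checked numerically. The sketch below evaluates $P_{\text{usable}}/P_{\text{total}}$ from the equations above as a function of the ratio $r = BW/\Delta\lambda$ (a normalization introduced here for convenience) and scans for the optimum; function and variable names are hypothetical.

```python
import math

def gaussian_comb_efficiency(r: float) -> float:
    """eta_source for a Gaussian comb, with r = BW / delta_lambda.

    P_usable = BW * P(lambda_0 +/- 0.5*BW) = r * exp(-r**2)   (in units of delta_lambda)
    P_total  = 0.5 * sqrt(pi)                                  (in units of delta_lambda)
    """
    p_usable = r * math.exp(-r ** 2)
    p_total = 0.5 * math.sqrt(math.pi)
    return p_usable / p_total

def to_db(ratio: float) -> float:
    return 10.0 * math.log10(ratio)

# Analytical optimum from the text: BW = delta_lambda / sqrt(2).
r_opt = 1.0 / math.sqrt(2.0)
eta_opt = gaussian_comb_efficiency(r_opt)

# Brute-force scan over r in (0, 2] as an independent check of the optimum.
grid = [i / 1000.0 for i in range(1, 2001)]
r_best = max(grid, key=gaussian_comb_efficiency)

print(f"eta = {eta_opt:.3f} ({to_db(eta_opt):.2f} dB), scan optimum r = {r_best:.3f}")
```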

The efficiency of a flat-comb laser depends on the mode-power uniformity $\Delta P$, i.e., the difference between the modes with the lowest and highest power, as indicated in Fig. 2(b). If we assume no significant power in the modes outside the comb bandwidth and a ‘wide’ comb, i.e., a mathematically infinite number of comb lines, then the efficiency of a flat-comb source can be calculated as follows:

$$\eta_{\text{source}} = \frac{P_{\min}}{P_{\text{average}}},$$

with

$$P_{\min} = P_{\text{average}} - \xi \cdot \Delta P,$$

in which $P_{\text{average}}$ is the total output power divided by the number of comb lines, i.e., the average power per comb line, and $P_{\min}$ is the power of the weakest comb line within the target bandwidth. This equation merely represents the fact that the power in the least intense comb line dictates the source power metric, which should match the NoC system requirements.

Fig. 2. Calculation parameters of comb generation efficiency for (a) a Gaussian comb and (b) a square or ‘flat’ comb. (c) Efficiency of a flat-comb laser (colored) as a function of comb uniformity and distribution $\xi$. The Gaussian comb efficiency is given for reference (dashed). The uniform distribution $\xi = 0.5$ is indicated by the thick dark red line.

In Fig. 2(c) the efficiency as a function of comb uniformity is shown, with the uniformity $P_{\max}/P_{\min}$ defined as the ratio between the highest-power comb line $P_{\max}$ and the lowest-power comb line $P_{\min}$. The parameter $\xi$ indicates the distribution of the variation in the mode power. For a uniform distribution, $\xi$ equals 0.5. Limiting cases are $\xi \approx 0$ and $\xi \approx 1$, which both represent flat combs with a single comb line within the bandwidth of relevance having $\Delta P$ more power or less power than $P_{\text{average}}$ respectively. It can be seen that the uniformity of combs, such as presented in refs. [22]–[26] and assuming uniform distribution ($\xi = 0.5$), should be better than 5 dB to compare advantageously to a Gaussian comb. This result also means that combs generated by modulating a continuous-wave laser are by definition the less favorable choice, since the (amplitude) modulation itself already brings down the efficiency by at least 3 dB, without even taking the power-hungry driver electronics into account [27], [28].

Fig. 3. Best-case grating coupling loss as a function of comb width (red). Data based on curved grating couplers [32] (black dotted) and straight grating couplers [33] (blue dashed).

In this paragraph we have highlighted the importance of comb power uniformity. Work reported in literature often uses rather loose definitions of ‘comb generation’, and spectral uniformity, relative intensity noise (RIN), optical linewidth and coherence of the modes are often not fully addressed. We argue that for the purpose of a WDM ‘flat’ comb source, a uniformity of about 5 dB has to be achieved, in addition to the requirements on RIN and linewidth of every comb line [26], [29]. These are key metrics if such a comb source is to be considered for an energy-efficient NoC.
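The 5 dB uniformity threshold can be reproduced with a short calculation. The sketch below assumes that $\Delta P = P_{\max} - P_{\min}$ in linear units, so that the two flat-comb equations combine into $\eta_{\text{source}} = 1 / (1 + \xi \cdot (U - 1))$ with $U = P_{\max}/P_{\min}$. This closed form is derived here for illustration and is not stated explicitly in the text; the function names are hypothetical.

```python
import math

def flat_comb_efficiency(uniformity_db: float, xi: float = 0.5) -> float:
    """eta_source = P_min / P_average for a flat comb.

    Assumes delta_P = P_max - P_min (linear units), which turns
    P_min = P_average - xi * delta_P into eta = 1 / (1 + xi * (U - 1)).
    """
    u = 10.0 ** (uniformity_db / 10.0)  # linear ratio P_max / P_min
    return 1.0 / (1.0 + xi * (u - 1.0))

def to_db(ratio: float) -> float:
    return 10.0 * math.log10(ratio)

# Reference value for the Gaussian comb: e^-0.5 * 2^0.5 * pi^-0.5.
GAUSSIAN_ETA = math.exp(-0.5) * math.sqrt(2.0) / math.sqrt(math.pi)

for uniformity_db in (1, 3, 5, 7):
    eta = flat_comb_efficiency(uniformity_db)
    print(f"uniformity {uniformity_db} dB -> {to_db(eta):.2f} dB "
          f"(Gaussian reference {to_db(GAUSSIAN_ETA):.2f} dB)")
```

With $\xi = 0.5$ and 5 dB uniformity this gives an efficiency within a few hundredths of a dB of the Gaussian value, consistent with the break-even point quoted above.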

The light of the comb source needs to be coupled to the silicon die using a grating-based coupler. Surface coupling allows for on-wafer testing, which is essential for industrial scale fabrication and testing. Side coupling using spot-size converters is possible, but the examples presented in literature require additional processing [30] and/or small spot-size fibers [31], thereby losing compatibility with the CMOS-based process flows and increasing packaging cost.

The grating couplers presented in [32], [33] are fabricated in mature CMOS process lines and hence give a good indication of the current state of the art. Fig. 3 shows the worst-case loss values as a function of the comb width, i.e., the loss at the wings of the comb, using these two reported grating designs as a reference. This loss value at the wings is the value that is of interest for the system design when comb envelopes as shown in Fig. 1(a) and (b) are used. It can be seen that for comb widths of 4 – 5 THz, as discussed above, the grating coupler loss is between 2.1 dB and 2.4 dB.

In conclusion the following can be said about off-chip lasers:

- 1) The total accumulated loss adds up to around 7 – 8 dB from laser output to on-chip comb. This is significantly higher than what is generally assumed in literature and needs to be taken into account when a fair trade-off has to be made. For reference, on-chip hybrid silicon lasers show coupling losses of $\sim 0.5$ dB [34] and wall-plug efficiencies up to 15% [16]. Even if we take the lower reported efficiency of on-chip lasers into account, i.e., 15% on-chip versus 30% off-chip, an on-chip laser outperforms an off-chip laser by 4 dB.
- 2) Significant improvements in efficiency can be achieved when the output spectrum of the off-chip laser can be fully adjusted to minimize the excess and wasted optical power. One option is to use a multi-wavelength laser, with individually addressable gain media [35]–[37] or a fully integrated DFB array, i.e., a full WDM source photonic integrated circuit (PIC) [38].
- 3) We propose that comb lasers having a ‘flat’ comb should be characterized by their 5 dB bandwidth metric, since that makes them comparable to a Gaussian comb laser.
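The loss figures in the list above can be tallied explicitly. The sketch below is one plausible reading of how the 7 – 8 dB total is composed, namely laser-to-fiber coupling, grating coupling at the comb wings and the Gaussian comb-shape penalty; that decomposition and the constant names are assumptions made here for illustration.

```python
import math

def to_db(ratio: float) -> float:
    return 10.0 * math.log10(ratio)

# Off-chip contributions quoted in this section (dB).
LASER_TO_FIBER_DB = 2.0       # laser to single-mode fiber
GRATING_DB = (2.1, 2.4)       # grating coupler at the comb wings, 4 - 5 THz
GAUSSIAN_COMB_DB = 3.2        # comb-shape penalty for a Gaussian spectrum

# On-chip reference values quoted in this section.
ON_CHIP_COUPLING_DB = 0.5
ON_CHIP_WPE = 0.15            # wall-plug efficiency, on-chip hybrid silicon laser
OFF_CHIP_WPE = 0.30           # wall-plug efficiency assumed for the off-chip comb laser

# Assumed decomposition: the three off-chip terms simply add in dB.
off_chip_loss_db = tuple(LASER_TO_FIBER_DB + g + GAUSSIAN_COMB_DB for g in GRATING_DB)

# The on-chip laser pays its coupling loss plus the wall-plug efficiency gap.
wpe_penalty_db = to_db(OFF_CHIP_WPE / ON_CHIP_WPE)
on_chip_total_db = ON_CHIP_COUPLING_DB + wpe_penalty_db

advantage_db = tuple(loss - on_chip_total_db for loss in off_chip_loss_db)
print(f"off-chip loss: {off_chip_loss_db[0]:.1f} to {off_chip_loss_db[1]:.1f} dB")
print(f"on-chip advantage: {advantage_db[0]:.1f} to {advantage_db[1]:.1f} dB")
```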

In the comparison we have not taken directly modulated VCSELs, micro-ring or disk lasers into account. For the large NoC bandwidths that are studied in this work, a WDM approach is required. Since VCSEL operating wavelengths are determined by epitaxial growth, there is no clear path forward on how to integrate large multi-wavelength VCSEL arrays or matrices on a chip. Micro-ring and disk laser technologies are currently not mature, but could in the future allow for on-chip sources [39], [40]. As we will discuss in Section V, however, technological breakthroughs are required to increase the efficiency of these devices to an acceptable level. Moreover, on a practical level, it is generally assumed that modulating the drive current decreases the lifetime and reliability of the device, and requires more stringent burn-in and testing, all of which increase the cost [7]. Another option we did not discuss is the use of parametric oscillators to generate a comb from a single-mode input laser, using nonlinear processes. Typically, noise performance and dynamics are rather complex and prohibitive for generating a comb that is useful for the purpose of on-chip networks [41]. However, recent work on soliton mode-locking in microresonators is promising for low-noise and stable operation [42].

## III. ON-CHIP OR OFF-CHIP LASERS: ENERGY PROPORTIONALITY

It is well known that datacenters and even high-performance computers (HPC) do not operate at maximum utilization, e.g., [43]. However, architecture studies in literature are generally focused on maximum bandwidth and utilization, to aim for the worst-case scenario in terms of demands on technology and power consumption. To minimize overall power consumption though, there is a clear need for energy-proportional computing [44]. This means that with lower utilization, the power consumption decreases proportionally.

It can be argued that the same rationale would apply for multi-core processors, e.g., saving power by switching off idle cores [45]. Focusing on the laser only, it can be seen in, e.g., [11], that architectures with an optical interconnected NoC have an increased energy per bit metric at lower bandwidths, mainly due to the static laser power consumption. Obviously this is not an issue for systems that are limited only by the 200 W processor power consumption limit, but overall system energy efficiency is severely impacted. Arguing that the laser acts as an optical power supply, equivalent to an electrical power supply, is not a correct approach, since the laser power is wasted when not used, and cannot be partially shut off. The point of this section is to show that on-chip lasers can lead to a more energy-proportional system, and hence a far more energy efficient system. Again, we focus on the source performance only, with the intention that architecture designers will add the laser as an extra dimension to optimize in the full system equation.

Fig. 4. Layouts for $64 \times 64$ crossbar with (a) long single-serpentine layout and (b) shorter single-serpentine layout. The transmitter and receiver block are shown in (c). The bus consists of 64 waveguides, carrying 64 wavelengths in each direction. Each ring actually represents 64 rings, tuned to a different wavelength. The ring array $\alpha$ modulates/detects for one direction and the ring array $\beta$ for the opposite direction. Electrical drivers and circuitry are schematically shown in red. Picture taken from [11].

We will use two example cases, namely a crossbar and a butterfly network, based on existing NoC architectural analysis in literature [11], [20]. We will then simulate the performance gains that can be achieved by making the laser configuration part of the design equation, instead of just treating the optical power source as a fixed power overhead. To do this we implement a simple model, where each active cluster communicates with all other active clusters simultaneously. No communication takes place with idle clusters, although we can envision some low-bandwidth electrical or optical arbitration network running in parallel. We consider two extreme cases, namely a completely uniform workload, where the set of active clusters is chosen randomly and arbitrarily placed over the chip surface [20], and an optimized workload, where the set of active clusters is chosen to maximize the energy proportionality. We note that the exact workload distribution and management will depend on the application and whether the clusters are actually homogeneous or heterogeneous [46], but a full review is beyond the scope of this paper. Our simple approach gives us a lower and upper bound of the potential improvements.

### A. Photonic Global Crossbar – Case #1

A $64 \times 64$ global crossbar layout is shown in Fig. 4 [11]. In this layout 64 waveguides carry 64 wavelengths in two directions. Each transmitter can put data on a single bus waveguide using 64 ring modulators for each direction, i.e., 128 modulators total. Each receiver is connected by the same number of rings to every bus waveguide, to obtain a single-write multiple-read (SWMR) architecture. This layout is very similar to the Corona architecture [1], [20]. We note that the clusters can have multiple cores, in principle allowing for scaling this $64 \times 64$ NoC to over a thousand cores.

Fig. 5. Number of wavelength pairs used and total fixed laser power savings as a function of the percentage of active clusters. Results are shown for a transmitter connected to a single waveguide with a modulator array, as shown in Fig. 4 (red), and for a transmitter with a single modulator per waveguide (blue).

If the (two) comb lasers in the original approach are replaced by a set of two times 63 on-chip single-wavelength lasers, energy efficiency can be optimized by switching on or off lasers (in pairs for both directions, ignoring energy savings by optimizing the directionality) depending on the required bandwidth, traffic and core utilization. The number of required wavelengths  $N_\lambda$  scales linearly with the number of active clusters  $N_{\text{active}}$  according to:

$$N_\lambda = 2 \cdot (N_{\text{active}} - 1),$$

when it is assumed that all active clusters need to communicate with each other simultaneously and no communication with idle clusters is required. The factor of 2 represents the wavelength pairs that are required for bidirectional communication. A similar relation holds for the Corona architecture, which is a multiple-write single-read (MWSR) architecture. We note that all clusters have an identical position in the layout and hence it does not matter which ones are switched on or off. An interesting option is to connect a transmitter to each bus waveguide with a single modulator. This is considered to be an inferior option in terms of energy consumption, since the modulator has to be tuned over the full FSR, whereas a modulator array only has to be tuned over a fraction of the FSR, when barrel-shifting is used [19]. However, by allowing a single transmitter to make use of multiple waveguides, it can transmit to multiple receivers using a single wavelength only. In this case the number of required wavelengths scales according to:

$$N_\lambda = 2 \cdot \text{roundup} \left[ \frac{N_{\text{active}} \cdot (N_{\text{active}} - 1)}{N_{\text{waveguides}}} \right],$$

with  $N_{\text{waveguides}}$  the number of bus waveguides and ‘roundup’ means rounding up to the nearest integer.

In Fig. 5 the number of required (bidirectional) wavelength pairs is plotted as a function of the percentage of active clusters. It can be seen that power savings, as compared to an off-chip comb laser, can be significant, as expected. What is also interesting is that the system is made more efficient with respect to laser power consumption, when the architecture is slightly adjusted, as explained above, with a maximum difference of 24% for 42% – 59% utilization. When these extra laser power savings outweigh the extra power consumption for the modulator tuning, the option to connect the transmitter to each bus is clearly the most favorable.

Fig. 6. Layouts for 8-ary 2-stage butterfly with (a) point-to-point layout and (b) serpentine layout. The transmitter and receiver block are shown in (c). The bus consists of the indicated number of waveguides, carrying 64 wavelengths in each direction. Each ring actually represents 64 rings, tuned to a different wavelength. Further conventions are as in Fig. 4. Picture taken from [11].
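The two crossbar scaling relations can be compared directly. The sketch below counts wavelength pairs ($N_\lambda / 2$) for both transmitter options and locates the utilization range where they differ most. It assumes every laser draws equal power, so that savings scale with the number of pairs switched off relative to the 63 pairs needed at full utilization; the function names are hypothetical.

```python
N_CLUSTERS = 64      # clusters in the 64 x 64 crossbar
N_WAVEGUIDES = 64    # bus waveguides
MAX_PAIRS = N_CLUSTERS - 1  # wavelength pairs needed at full utilization

def pairs_modulator_array(n_active: int) -> int:
    """Wavelength pairs when each transmitter drives a single waveguide."""
    return max(n_active - 1, 0)

def pairs_single_modulator(n_active: int) -> int:
    """Wavelength pairs when each transmitter has one modulator per waveguide."""
    links = n_active * (n_active - 1)
    return -(-links // N_WAVEGUIDES)  # integer ceiling division ('roundup')

def laser_power_savings(pairs: int) -> float:
    """Fraction of laser power saved, assuming equal power per laser."""
    return 1.0 - pairs / MAX_PAIRS

# Difference in required pairs between the two options, per active-cluster count.
gaps = {n: pairs_modulator_array(n) - pairs_single_modulator(n)
        for n in range(1, N_CLUSTERS + 1)}
largest_gap = max(gaps.values())
widest = [n for n, gap in gaps.items() if gap == largest_gap]

print(f"largest gap: {largest_gap} pairs "
      f"({100 * largest_gap / MAX_PAIRS:.0f}% of laser power)")
print(f"reached for {widest[0]} to {widest[-1]} active clusters "
      f"({100 * widest[0] / N_CLUSTERS:.0f}% to {100 * widest[-1] / N_CLUSTERS:.0f}%)")
```

The largest gap is 15 of 63 pairs, reached between 27 and 38 active clusters, which reproduces the 24% difference over the 42% – 59% utilization range quoted above.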

This straightforward analysis leads to two major conclusions:

- 1) Using on-chip lasers allows for multiple single-frequency sources. Since these sources are closely integrated with the NoC control electronics, they can be switched on and off, leading to significant energy savings that scale linearly with the cluster utilization.
- 2) Making the laser part of the architecture design equation allows for further optimization and leads to different trade-offs. Architectures that might seem optimal using an off-chip laser with fixed overhead can be the lesser choice when on-chip lasers are taken into account.

### B. Photonic Butterfly Network – Case #2

Another typical example of an implementation of a NoC is the butterfly network. In Fig. 6 an 8-ary 2-stage network layout is shown, that was presented in [11]. In this implementation clusters of eight tiles, that may contain multiple cores themselves, are connected by an optical NoC, whereas the tiles within a cluster are connected by an electrical router with each other and with the optical part of the NoC. A bus of 28 waveguides, having 64 wavelength pairs for bidirectional operation, connects all 16 routers, i.e., 8 per stage. The 64 wavelengths are used to connect the 8 tiles in a cluster with 8 tiles in another cluster, so a single bus waveguide is necessary per router pair link.

Fig. 7. Number of wavelength pairs used and total fixed laser power savings as a function of the percentage of active tiles in a layout as shown in Fig. 6. Results are shown for the case where the selection of active tiles is optimized for minimum laser power consumption (red) and for a random distribution (blue). Both the Monte-Carlo simulation data points (light blue) and their average with standard deviation (dark blue) are shown.

The number of required wavelengths is determined by the two clusters with the highest utilization. It can be found by multiplying the number of active tiles in these two clusters. Unlike the crossbar example, as discussed in the previous section, the tiles do not all have an equivalent position in the architecture. This means that it matters which tiles are switched on or off. We study this by taking two extreme and limiting cases, namely the case where one has complete control over which cores to use and the case where the core utilization is completely random. The results are shown in Fig. 7. Maximum energy efficiency from the perspective of laser power consumption is achieved when the load is equally distributed over all clusters. The exception is when only up to 8 tiles are used, since within a cluster only the electrical interconnects need to be used. We use a Monte-Carlo approach to calculate the potential laser power savings when the load is randomly distributed. At 50% utilization the average power savings are 51%, with 75% best-case.
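The butterfly estimate can be sketched with a small Monte-Carlo model. The code below assumes 8 clusters of 8 tiles, takes the number of wavelength pairs as the product of the active-tile counts in the two busiest clusters, and assumes equal power per laser. The random sampling model, the seed and all names are assumptions made here, so the random-case average should only be expected to land near the reported value, not to match it exactly.

```python
import random

N_CLUSTERS = 8
TILES_PER_CLUSTER = 8
MAX_PAIRS = TILES_PER_CLUSTER ** 2  # 64 wavelength pairs at full load

def pairs_required(active_per_cluster):
    """Product of the active-tile counts in the two busiest clusters."""
    busiest, second = sorted(active_per_cluster, reverse=True)[:2]
    return busiest * second

def even_spread(n_active):
    """Active tiles per cluster following the equal-distribution rule in the text.

    This mirrors the text's rule; it is not verified here to be the global
    optimum for every possible tile count.
    """
    if n_active <= TILES_PER_CLUSTER:
        # Everything fits in one cluster: electrical links only, no wavelengths.
        return [n_active] + [0] * (N_CLUSTERS - 1)
    base, extra = divmod(n_active, N_CLUSTERS)
    return [base + 1] * extra + [base] * (N_CLUSTERS - extra)

def random_spread(n_active, rng):
    """Active tiles per cluster when tiles are picked uniformly at random."""
    counts = [0] * N_CLUSTERS
    for tile in rng.sample(range(N_CLUSTERS * TILES_PER_CLUSTER), n_active):
        counts[tile // TILES_PER_CLUSTER] += 1
    return counts

def savings(pairs):
    """Fraction of laser power saved, assuming equal power per laser."""
    return 1.0 - pairs / MAX_PAIRS

def mean_random_savings(n_active, trials=5000, seed=0):
    rng = random.Random(seed)
    total = sum(savings(pairs_required(random_spread(n_active, rng)))
                for _ in range(trials))
    return total / trials

best_case = savings(pairs_required(even_spread(32)))
random_case = mean_random_savings(32)
print(f"50% utilization: best case {best_case:.0%}, random average {random_case:.0%}")
```

At 32 active tiles the even spread gives four tiles per cluster, hence 16 of 64 wavelength pairs and 75% savings, matching the best-case figure in the text.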

This analysis leads to two major conclusions:

- 1) Similar to the case of the crossbar, the butterfly network can benefit greatly from on-chip lasers that can be switched on and off to minimize laser power consumption.
- 2) Making the laser part of the software design equation, i.e. which cores to use, allows for further optimization.

## IV. ON-CHIP LASERS: ADVANTAGES OF LAYOUT FLEXIBILITY

The architectures presented in [11], [46] and shown in Fig. 4 and Fig. 6 make use of two comb lasers to feed the crossbar, one for each direction. This means that on average half of the number of channels will never be used, even at full utilization, and the system inherently takes a factor of two hit in laser energy efficiency because of this, assuming the lasers are part of the equation. Furthermore we note that both lasers need to be locked to each other, i.e., having their combs spaced by half a channel spacing, to avoid crosstalk and signal degradation. Obviously a single comb source, having a mode spacing of half the channel spacing, can be used too. In this case an on-chip channel duplexer can be used to separate all the odd numbered modes from the even numbered modes to obtain the clockwise and counterclockwise propagating combs.

Fig. 8. (a) Transmitter block including a local comb laser and (b,c) including a CW laser. Depending on the position in the network, the configuration in (b) transmits part of the channels clockwise and part counterclockwise. If the bus is laid out in a loop, the configuration in (c) can be used. Colors and schematics chosen according to the convention in Fig. 4 and Fig. 6.

The proposed Corona architecture [1], [20] overcomes this problem by using a single propagation direction only. This however comes at the expense of additional waveguide length, since the light needs to propagate up to an additional roundtrip through the crossbar (similar to Fig. 4). Typical quoted propagation losses for SOI-based photonics are 1 dB/cm for single-mode waveguides and 0.3 dB/cm for multi-mode waveguides [20]. The additional $\sim 12$ cm of waveguide (long serpentine layout, Fig. 4(a)) hence leads to a 3.6-dB efficiency hit in the best case, i.e. at 0.3 dB/cm, which exceeds the factor-of-two (3-dB) penalty of the bi-directional approach. So no improvement is obtained by this approach over a bi-directional approach. On the other hand a factor of 2 fewer modulators are used in the unidirectional layout as compared to the bi-directional case, so the approach has its merits in terms of overall energy consumption reduction, as the modulators have a static power consumption due to thermal tuning requirements.
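
The comparison between the two layouts comes down to simple decibel arithmetic, sketched below. The variable names are illustrative; the 12 cm and 0.3 dB/cm values are the ones quoted in the text, and the 0.5 fraction for the bi-directional case reflects the unused half of the channels.

```python
def db_to_fraction(loss_db):
    """Fraction of optical power that survives a loss given in dB."""
    return 10 ** (-loss_db / 10)


extra_length_cm = 12.0           # additional serpentine waveguide length
loss_multimode_db_per_cm = 0.3   # best-case (multi-mode) propagation loss

extra_loss_db = extra_length_cm * loss_multimode_db_per_cm
unidirectional_fraction = db_to_fraction(extra_loss_db)
bidirectional_fraction = 0.5     # half of the channels are never used

if __name__ == "__main__":
    print(f"extra loss: {extra_loss_db:.1f} dB")
    print(f"unidirectional useful power: {unidirectional_fraction:.2f}")
    print(f"bi-directional useful power: {bidirectional_fraction:.2f}")
```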

Recognizing that these losses come from the long passive interconnect lines that connect the transmitting tile to the laser leads to the conclusion that these losses can be avoided by placing the laser closer to the tiles. Since the tiles are distributed over the chip area, the source should be distributed over the area too. This is possible with on-chip lasers. In the layout shown in Fig. 4(c) a single comb laser per transmitter can be used, and a possible transmitter implementation is shown in Fig. 8(a). Some first realizations of potential on-chip comb lasers have been reported [47], [48]. Since a transmitter is connected to a single waveguide only and since only one transmitter is connected to that waveguide, the comb lasers do not have to be locked to each other. More specifically, the $N_{\text{waveguides}}$ combs, with $N_{\text{waveguides}}$ the number of tiles, and each having $N_{\lambda}$ wavelengths, can be independent, as long as their channel spacing is fixed. So no wavelength locking of the $N_{\text{waveguides}}$ comb lasers is required. This makes this approach quite feasible and attractive.

Fig. 9. Calculated laser energy savings by placing the laser close to the transmitting die as a function of the number of tiles and the waveguide propagation losses.

However, to avoid inherent loss issues with comb lasers, as explained above, preferably a single-mode laser is used in the transmitter. We propose a reconfiguration of the layout in Fig. 4(c) to achieve that. This is shown in Fig. 8(b) and (c). A single-mode laser is connected to a waveguide and the optical power is tapped off in equal absolute amounts using couplers with gradually increasing coupling strength, much like the work presented in [49]. This distributes the power over the  $N_{\text{waveguides}}$  bus waveguides. This approach would require wavelength locking of all the single-mode lasers to the WDM grid.
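
The requirement of equal absolute tapped power fixes how the coupling strength must increase along the bus. The sketch below assumes an idealized, lossless waveguide and couplers without excess loss; the function names are illustrative and do not come from [49]. In that idealized case the k-th of N taps needs a power coupling ratio of 1/(N - k), counting k from 0.

```python
def tap_coupling_ratios(n_taps):
    """Power coupling ratio per tap so that each tap drops 1/n_taps of the input."""
    return [1.0 / (n_taps - k) for k in range(n_taps)]


def tapped_powers(ratios, p_in=1.0):
    """Power delivered by each tap along a lossless bus fed with p_in."""
    powers = []
    remaining = p_in
    for kappa in ratios:
        powers.append(remaining * kappa)   # power coupled out at this tap
        remaining *= 1.0 - kappa           # power continuing down the bus
    return powers


if __name__ == "__main__":
    ratios = tap_coupling_ratios(4)
    print(ratios)
    print(tapped_powers(ratios))
```

The last ratio equals 1, meaning the remaining light is simply routed to the final bus waveguide. With propagation or excess coupler loss the ratios would have to be adjusted accordingly.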

In both cases the laser is now on the same tile as the transmitter, avoiding long on-chip passive interconnects. An additional advantage is that the lasers are distributed over the chip area and hence the heat load is distributed too. To quantify the energy savings of this approach we assume that the additional loss of an off-chip laser due to additional interconnect length of a long-serpentine layout scales according to:

$$\text{Loss} = \left( \frac{\sqrt{N_{\text{tiles}}}}{2} + 2 \right) \cdot \left( l_{\text{chip}} \cdot \frac{\sqrt{N_{\text{tiles}}} - 2}{\sqrt{N_{\text{tiles}}}} \right) \cdot \alpha_{\text{loss}},$$

with $l_{\text{chip}} = 20$ mm the length of the side of the chip, $N_{\text{tiles}}$ ($= N_{\text{waveguides}}$) the number of tiles and $\alpha_{\text{loss}}$ the waveguide loss per unit length. A shortest possible path layout is assumed. Typical values for the waveguide loss are 3 dB/cm and 0.3–0.5 dB/cm for single mode SOI ridge and rib waveguides respectively [50], [51], 6–15 dB/cm for back-end-of-line polysilicon waveguides [52] and 3–4 dB/cm for front-end-of-line silicon, integrated in the CMOS layer [53]. Best reported results are 0.026 dB/cm for a multimode waveguide [54]. The results for waveguide loss values in the range of 0.03 – 3 dB/cm are shown in Fig. 9. Significant power savings of 46% are obtained for a 64-tile case for a waveguide loss of 0.3 dB/cm, which is a typical best-case value for foundry processes. Power savings approach 90% and up for loss values above 1 dB/cm, but in this regime the transmitter to receiver interconnect lines would be too lossy anyway. Two major conclusions can be drawn based on these results:

- 1) Waveguide losses need to be brought down into the <0.3 dB/cm regime to improve the design of Fig. 4(a) by using only a single source;
- 2) Placing the laser close to the transmitter tile leads to significant savings of 46% at 0.3 dB/cm, going down to only 6% at 0.03 dB/cm.

Fig. 10. Overview of the laser power savings that can be achieved by using the results from this work for (a) a crossbar and (b) a butterfly architecture, as discussed in this work. Since laser placement and design optimization [purple and green, (a)] are not compatible with each other, the bars are split and the comparison is side by side.

This means that using on-chip lasers increases the efficiency by about 3 dB for foundry-quoted waveguide loss values of 0.3 dB/cm, as compared to off-chip lasers.
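
The loss scaling given above is straightforward to evaluate numerically. The following sketch implements that expression as written, with $l_{\text{chip}} = 20$ mm, and converts the resulting loss in dB to a fractional laser power saving. The function names are illustrative.

```python
import math


def extra_loss_db(n_tiles, alpha_db_per_cm, l_chip_cm=2.0):
    """Additional interconnect loss (dB) of an off-chip laser, serpentine layout."""
    root = math.sqrt(n_tiles)
    length_cm = (root / 2 + 2) * (l_chip_cm * (root - 2) / root)
    return length_cm * alpha_db_per_cm


def laser_power_savings(n_tiles, alpha_db_per_cm, l_chip_cm=2.0):
    """Fraction of laser power saved by placing the laser at the transmitter tile."""
    loss_db = extra_loss_db(n_tiles, alpha_db_per_cm, l_chip_cm)
    return 1 - 10 ** (-loss_db / 10)


if __name__ == "__main__":
    for alpha in (0.03, 0.3, 1.0, 3.0):
        loss = extra_loss_db(64, alpha)
        saving = laser_power_savings(64, alpha)
        print(f"{alpha:5.2f} dB/cm: {loss:5.2f} dB extra loss, {saving:6.1%} saved")
```

For 64 tiles the expression gives 9 cm of additional waveguide, i.e. 2.7 dB at 0.3 dB/cm, which corresponds to the 46% savings and the roughly 3 dB efficiency improvement quoted above.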

## V. DISCUSSION AND CONCLUSION

In this paper we have discussed the advantages of on-chip optical sources versus an off-chip optical comb source for an optical NoC. We looked at three angles, namely the coupling losses, the energy proportionality and the advantages of layout flexibility. We draw some major conclusions:

- 1) Coupling losses of off-chip comb lasers are generally underestimated, improving the case for on-chip lasers;
- 2) Multiple on-chip lasers allow for better energy proportionality;
- 3) By making the on-chip lasers part of the architecture design equation and part of the load distribution design, additional power savings can be achieved;
- 4) The flexibility of placing on-chip sources at any desired position in the NoC layout leads to improved energy efficiency.

These effects are quantified in the overview graphs in Fig. 10, where the cumulative laser power savings are shown for the studied cases, i.e., the crossbar and the butterfly NoC. As can be seen, laser power savings run into the 10 – 20 dB regime (90% – 99%), especially for low utilizations.

The on-chip lasers and the off-chip comb source represent two ends of the spectrum of options. As mentioned, another option is to use multi-wavelength lasers or multiplexed DFB arrays [35]–[38]. This approach does not suffer from non-uniform comb line power, but the comb lines have to be stabilized on the WDM grid separately, which increases power consumption slightly. With (electronic) control feedback from the processor, these comb lines can be switched on and off individually, as required for energy proportional operation. Fiber-to-chip coupling is still needed though, which adds loss and packaging cost.

Fibers can be eliminated when a laser or amplifier array is facet-coupled to the silicon chip, either by packaging [55] or by bonding [56]. With a typical laser pitch of 125–250  $\mu\text{m}$  [56], [57], this allows for 100–200 sources per chip side, which is sufficient for the NoCs discussed in this paper. Moreover, techniques like quantum-well intermixing can increase the optical bandwidth to over 200 nm [58]. Such a hybridly integrated approach has most of the benefits of on-chip lasers, but will also count towards the total power budget, since the sources are in the same package.

To estimate the impact of these effects on the total NoC energy consumption, we can compare to the work in, e.g., [11]. For low (25%) throughput and high (100%) throughput an aggressive and conservative estimation of the NoC power breakdown is done. For 100% throughput the power consumption of the laser is estimated to be 22% – 63% of the total NoC power consumption, respectively. For 25% throughput these numbers are 60% – 89%. This means that the energy efficiency and energy proportionality of the total NoC can be greatly increased. In conclusion, an optical NoC becomes clearly comparable to an electrical NoC, in terms of energy efficiency, even for the 22-nm node. This holds for both low and high throughput. We note that we cannot use the full coupling loss benefits, as calculated in this work, since the higher (and realistic) coupling loss values for off-chip lasers are not taken into account by most literature. We also note that this implies that the on-chip sources do not increase the thermal load on the processor more than electrical interconnects would in the 22-nm node.
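
To see how a laser power reduction carries over to the total NoC power, the laser share can be scaled while the rest of the budget is held constant. The sketch below is only an illustration: the laser fractions are the ones quoted above from [11], while the 10 dB laser saving is an assumed example value from the range shown in Fig. 10, and non-laser power is assumed to be unaffected.

```python
def total_noc_power_fraction(laser_fraction, laser_savings_db):
    """Total NoC power, relative to the baseline, after reducing laser power."""
    remaining_laser = laser_fraction * 10 ** (-laser_savings_db / 10)
    return (1.0 - laser_fraction) + remaining_laser


if __name__ == "__main__":
    assumed_savings_db = 10.0  # illustrative value, not a result of the paper
    for label, fraction in (("100% throughput, aggressive", 0.22),
                            ("100% throughput, conservative", 0.63),
                            ("25% throughput, aggressive", 0.60),
                            ("25% throughput, conservative", 0.89)):
        total = total_noc_power_fraction(fraction, assumed_savings_db)
        print(f"{label}: {total:.2f} of baseline NoC power")
```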

A few assumptions are made in this paper. The first assumption is that NoCs can be easily reconfigured. This seems a valid statement for silicon-based networks that need to be thermally tuned and where tuning times of less than 0.1 $\mu\text{s}$ have been shown [59]. Lasers have typical start-up (relaxation) oscillations in the 100s of megahertz range, which are damped within the 0.1 $\mu\text{s}$ timescale, assuming optimized design. The thermal transients will be longer though, typically in the millisecond range. The detrimental effects of these transients, however, can and will be avoided by feedback loops, which are required in any case to map the laser operating wavelength to the WDM grid. Close integration of electronics with the laser source enables feedback loops well into the gigahertz regime. So concluding we can state that on-chip optical NoCs can probably be reconfigured on the 1 $\mu\text{s}$ timescale or faster.

Fig. 11. Literature and commercial laser product overview of laser efficiency as a function of the specified optical output power. Laser cooling is not taken into account. Data are at room temperature, i.e. 20° – 25 °C. Wavelength ranges from 1250 – 1600 nm. Only single transverse mode lasers are taken into account. In-plane lasers include FP, EC-FP, DFB and DBR configurations. In (product description) cases where fiber-coupled power is quoted, a 2-dB fiber-to-chip coupling loss was assumed. Data from [39], [65]–[78].

Another assumption is that the laser source efficiency is a constant when comparing different architectures and on-chip and off-chip lasers. There are four angles to this analysis. First we have a look at the laser efficiency as a function of optical output power. Designing and analyzing a laser is a complex problem, where, e.g., core geometry, device size, epitaxial layer design and thermal and electrical impedance all play a crucial role. Most laser models do not include all of these dimensions, which makes it hard to study the fundamental limits in laser efficiency. To gain insight into this dependence we take another approach and look at the practical reality. Assuming that state-of-the-art literature and commercial devices represent (close to) optimized designs, we plot the laser efficiency of a large number of published and commercially available devices versus the output power in Fig. 11. It can be seen that efficiencies of >30% are obtained for lasers with output powers larger than ~10 mW. In this work we compare an off-chip comb laser with a typical output power of ~1 W with 64 on-chip lasers, with power levels of ~10 mW each. So the assumption that the lasers have comparable performance seems valid.

It can be seen however that below 10 mW there is a steep drop-off of power efficiency, especially for in-plane lasers. This means that higher granularity of the sources is not desirable and will have a negative impact on overall system efficiency. More specifically, this means that direct modulation of on-chip sources is not a valid option, since only a single, low-power (~0.1 – 1 mW) laser per channel can be used. VCSELs show a better performance than in-plane lasers for ~1 mW power levels, but such lasers are not suitable for WDM-based NoCs.

The second angle is the technology impact on the laser performance, as integration will add additional boundary conditions on the technology. Fig. 11 shows results obtained with a technology that is optimized for the laser only. As such it is representative for off-chip lasers, where maximum design freedom is allowed, but not for on-chip lasers necessarily. The hybrid silicon technology is a likely candidate for on-chip lasers [14], [15], although with a more general 3-D integration approach other options might be viable too. Reported hybrid silicon laser efficiency performance is up to 15% [16]. This is a relatively new technology with no obvious limitations as compared to native-substrate technologies, and 30%-40% efficiency should be possible for output powers above 10 mW. However, with current 15% performance, the total laser power savings shown in Fig. 10 would only take a 3-dB hit. Efforts to grow the laser diode epitaxial layer stack directly on silicon substrates are under way [60], [61].

The third angle is the difference in environment between on-chip and off-chip lasers. It can be expected that on-chip lasers will have to operate at elevated temperatures of up to 80 °C, i.e. the processor temperature. Off-chip lasers, however, will have their own heatsink and possibly thermo-electric cooling. Laser efficiency drops of –1.3 dB for a hybrid silicon laser operating at 80 °C [16] and of –0.6 dB for a quantum dot laser operating at 65 °C [23] as compared to room temperature have been reported. It has to be noted that the on-chip laser will not increase the processor temperature beyond its budget of 80 °C, since it will replace the power-hungry electrical interconnects. This also means that by proper design, steep temperature gradients over a modulator array can be avoided. The laser footprint is typically 100 $\mu$m $\times$ 300 $\mu$m (including contact pads) and can be considered to introduce a ‘global’ temperature variation, as compared to the modulator size. This means that the array can be shifted as a whole [62] and by making use of a barrel-shifter approach [19], the laser power dissipation will not affect the modulator tuning power consumption.

The last angle is the required power consumption for stabilization of the laser and mapping the wavelengths to the WDM grid. In an off-chip comb laser, the whole comb is stabilized when a single comb line and the mode spacing are stabilized. When multiple on-chip lasers are used, the number of control loops is equal to the number of lasers. The question is then whether these control loops consume a significant amount of power. We can argue that the control circuitry for detecting the wavelength shift is similar for both lasers and ring modulators. The ratio lasers versus rings scales as  $O(N/N^3)$ , assuming cross-bar architectures like the work in [1], [11]. If we assume that a ring array of N rings is tuned as a whole [62], the ratio of control circuit power consumption is  $O(N/N^2)$ , which is negligible for large N. We can also argue, based on, e.g., tunable lasers based on Vernier filters [63], [64], that the tuning power of a laser is twice that of a ring. If we assume barrel shifting of the receiver demultiplexers, i.e., the rings are only tuned over  $2\pi/N$ , that means the ratio of tuning power consumption for lasers versus rings is  $O(2N/N^2)$ , which is also negligible for large N.

So concluding we can state that the trade-offs required for on-chip lasers will cause an approximate maximum 4 dB drop in efficiency, using current state-of-the-art performance metric values. This means that the bars in Fig. 10 will be lowered, but still compare favorably to an off-chip laser in all cases.
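
One plausible way to arrive at a figure of this size is to add the technology and temperature penalties discussed above. The sketch below assumes that the roughly 4 dB combines the 15% versus 30% efficiency gap with the reported 1.3 dB drop at 80 °C [16]; this reading is an assumption, as is treating the two penalties as simply additive in dB.

```python
import math


def efficiency_penalty_db(eff_reference, eff_actual):
    """Penalty in dB of operating at eff_actual instead of eff_reference."""
    return 10 * math.log10(eff_reference / eff_actual)


technology_db = efficiency_penalty_db(0.30, 0.15)  # native-substrate vs. hybrid silicon
temperature_db = 1.3                               # hybrid silicon laser at 80 degrees C [16]
total_penalty_db = technology_db + temperature_db


def net_savings_db(gross_savings_db, penalty_db=total_penalty_db):
    """Laser power savings that remain after the on-chip penalties."""
    return gross_savings_db - penalty_db


if __name__ == "__main__":
    print(f"total penalty: {total_penalty_db:.1f} dB")
    for gross in (10.0, 20.0):
        print(f"{gross:.0f} dB gross savings -> {net_savings_db(gross):.1f} dB net")
```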

The conclusion of our analysis, based on two simple NoC case studies, is that on-chip lasers dramatically outperform off-chip lasers in terms of energy efficiency and energy proportionality. Based on this conclusion, it is our main recommendation that NoC architecture designers include the laser as part of the architecture and layout design space.

#### ACKNOWLEDGMENT

The authors thank J. Shah and A. Saleh for helpful discussions.

#### REFERENCES

- [1] D. Vantrease, R. Schreiber, M. Monchiero, M. McLaren, N. P. Jouppi, M. Fiorentino, A. Davis, N. Binkert, R. G. Beausoleil, and J. H. Ahn, "Corona: System implications of emerging nanophotonic technology," in *Proc. Int. Symp. Comput. Archit.*, 2008.
- [2] A. F. Benner, "Cost-effective optics: Enabling the exascale roadmap," in *Proc. IEEE 17th Symp. High Perform. Interconnects*, pp. 133–137, Aug. 25–27, 2009.
- [3] D. Miller, "Device requirements for optical interconnects to silicon chips," *Proc. IEEE*, vol. 97, no. 7, pp. 1166–1185, Jul. 2009.
- [4] M. Stucchi, S. Cosemans, J. Van Campenhout, Z. Tókei, and G. Beyer, "On-chip optical interconnects versus electrical interconnects for high-performance applications," *Microelectron. Eng.*, vol. 112, pp. 84–91, 2013.
- [5] J. S. Orcutt, A. Khilo, C. W. Holzwarth, M. A. Popovic, H. Li, J. Sun, T. Bonfield, R. Hollingsworth, F. X. Kärtner, H. I. Smith, V. Stojanovic, and R. J. Ram, "Nanophotonic integration in state-of-the-art CMOS foundries," *Opt. Exp.*, vol. 19, pp. 2335–2346, 2011.
- [6] I. A. Young, E. Mohammed, J. T. S. Liao, A. M. Kern, S. Palermo, B. A. Block, M. R. Reshotko, and P. L. D. Chang, "Optical I/O technology for tera-scale computing," *IEEE J. Solid-State Circuits*, vol. 45, no. 1, p. 235, Jan. 2010.
- [7] C. Gunn, "CMOS photonics for high-speed interconnects," *IEEE Micro*, vol. 26, no. 2, pp. 58–66, Mar./Apr. 2006.
- [8] R. Ho, P. Amberg, E. Chang, P. Koka, J. Lexau, G. Li, F. Y. Liu, H. Schwetman, I. Shubin, H. D. Thacker, X. Zheng, J. E. Cunningham, and A. V. Krishnamoorthy, "Silicon photonic interconnects for large-scale computer systems," *IEEE Micro*, vol. 33, no. 1, pp. 68–78, Jan./Feb. 2013.
- [9] H. J. S. Dorren, P. Duan, O. Raz, and R. P. Luijten, "Fundamental bounds for photonic interconnects," in *Proc. 16th Opto-Electron. Commun. Conf.*, Kaohsiung, Taiwan, Jul. 4–8, 2011.
- [10] J. Miller, J. Psota, G. Kurian, N. Beckmann, J. Eastep, J. Liu, M. Beals, J. Michel, L. Kimerling, and A. Agarwal, "ATAC: A manycore processor with on-chip optical network," *Massachusetts Institute of Technology, Cambridge, MA, USA, MIT-CSAIL-TR-2009-018*, May 5, 2009.
- [11] C. Batten, A. Joshi, V. Stojanović, and K. Asanović, "Designing chip-level nanophotonic interconnection networks," in *Integrated Optical Interconnect Architectures for Embedded Systems*. New York, NY: Springer, 2013, pp. 81–135.
- [12] M. F. L. De Volder, S. H. Tawfick, R. H. Baughman, and A. J. Hart, "Carbon nanotubes: Present and future commercial applications," *Science*, vol. 339, no. 6119, pp. 535–539, 2013.
- [13] R. G. Beausoleil, M. McLaren, and N. P. Jouppi, "Photonic architectures for high-performance data centers," *IEEE J. Sel. Topics Quantum Electron.*, vol. 19, no. 2, p. 3700109, Mar./Apr. 2013.
- [14] M. J. R. Heck, H.-W. Chen, A. W. Fang, B. R. Koch, D. Liang, H. Park, M. Sysak, and J. E. Bowers, "Hybrid silicon photonics for optical interconnects," *IEEE J. Sel. Topics Quantum Electron.*, vol. 17, no. 2, pp. 333–346, Mar./Apr. 2011.
- [15] M. J. R. Heck, J. F. Bauters, M. L. Davenport, J. K. Doylend, S. Jain, G. Kurczveil, S. Srinivasan, Y. Tang, and J. E. Bowers, "Hybrid silicon photonic integrated circuit technology," *IEEE J. Sel. Topics Quantum Electron.*, vol. 19, no. 4, p. 6100117, Jul./Aug. 2013.
- [16] B. R. Koch, E. J. Norberg, B. Kim, J. Hutchinson, J.-H. Shin, G. Fish, and A. Fang, "Integrated silicon photonic laser sources for telecom and datacom," presented at the Nat. Fiber Opt. Eng. Conf., Anaheim, CA, pp. 1–3, Mar. 17–21 2013.
- [17] A. M. Jones, C. T. DeRose, A. L. Lentine, D. C. Trotter, A. L. Starbuck, and R. A. Norwood, "Ultra-low crosstalk, CMOS compatible waveguide crossings for densely integrated photonic interconnection networks," *Opt. Exp.*, vol. 21, no. 10, pp. 12002–12013, 2013.
- [18] C. A. Batten, A. Joshi, V. Stojanovic, and K. Asanovic, "Designing chip-level nanophotonic interconnection networks," *IEEE J. Emerging Sel. Topics Circuits Syst.*, vol. 2, no. 2, pp. 137–153, Jun. 2012.
- [19] M. Georgas, J. Leu, B. Moss, C. Sun, and V. Stojanovic, "Addressing link-level design tradeoffs for integrated photonic interconnects," presented at the IEEE Custom Integr. Circuits Conf., San Jose, CA, pp. 1–8, Sep. 19–21, 2011.
- [20] J. Ahn, M. Fiorentino, R. G. Beausoleil, N. Binkert, A. Davis, D. Fattal, N. P. Jouppi, M. McLaren, C. M. Santori, R. S. Schreiber, S. M. Spillane, D. Vantrease, and Q. Xu, "Devices and architectures for photonic chip-scale integration," *Appl. Phys. A*, vol. 95, no. 4, pp. 989–997, 2009.
- [21] Y. Barbarin, E. A. J. M. Bente, M. J. R. Heck, Y. S. Oei, R. Nötzel, and M. K. Smit, "Characterization of a 15 GHz integrated bulk InGaAsP passively modelocked ring laser at 1.53 μm," *Opt. Exp.*, vol. 14, no. 21, pp. 9716–9727, 2006.
- [22] G.-H. Duan, A. Shen, A. Akrotin, F. V. Dijk, F. Lelarge, F. Pommereau, O. LeGouezigou, J.-G. Provost, H. Gariah, F. Blache, F. Mallecot, K. Merghem, A. Martinez, and A. Ramdane, "High performance InP-based quantum dash semiconductor mode-locked lasers for optical communications," *Bell Labs Tech. J.*, vol. 14, no. 3, pp. 63–84, 2009.
- [23] A. Gubenko, S. Mikhrin, V. Mikhrin, I. Krestnikov, and D. Livshits, "Low-power monolithic COMB laser for short-reach WDM optical interconnects," in *Proc. IEEE Photon. Conf.*, Burlingame, CA, pp. 62–63, Sep. 23–27, 2012.
- [24] M. J. R. Heck, E. Salumbides, A. Renault, E. A. J. M. Bente, Y. S. Oei, M. K. Smit, R. van Veldhoven, R. Nötzel, K. S. E. Eikema, and W. Ubachs, "Analysis of hybrid mode-locking of two-section quantum dot lasers operating at 1.5 μm," *Opt. Exp.*, vol. 17, pp. 18063–18075, 2009.
- [25] M. J. R. Heck, A. Renault, E. A. J. M. Bente, Y. S. Oei, M. K. Smit, K. S. E. Eikema, W. Ubachs, S. Anantathanasarn, and R. Nötzel, "Passively mode-locked 4.6 GHz and 10.5 GHz quantum dot laser diodes around 1.55 μm with large operating regime," *IEEE J. Sel. Topics Quantum Electron.*, vol. 15, no. 3, pp. 634–643, May/Jun. 2009.
- [26] M. J. R. Heck and J. E. Bowers, "Integrated fourier domain mode-locked lasers: Analysis of a novel coherent WDM comb laser," *IEEE J. Sel. Topics Quantum Electron.*, vol. 18, no. 1, pp. 201–209, Jan./Feb. 2012.
- [27] N. K. Fontaine, R. P. Scott, and S. J. B. Yoo, "Dynamic optical arbitrary waveform generation and detection in InP photonic integrated circuits for Tb/s optical communications," *Opt. Commun.*, vol. 284, no. 15, pp. 3693–3705, 2011.
- [28] A. Mishra, R. Schmogrow, I. Tomkos, D. Hillerkuss, C. Koos, W. Freude, and J. Leuthold, "Flexible RF-based comb generator," *IEEE Photon. Technol. Lett.*, vol. 25, no. 7, pp. 701–704, Apr. 2013.
- [29] A. Akrotin, A. Shen, R. Brenot, F. Van Dijk, O. Legouezigou, F. Pommereau, F. Lelarge, A. Ramdane, and G.-H. Duan, "Separate error-free transmission of eight channels at 10 Gb/s using comb generation in a quantum-dash-based mode-locked laser," *IEEE Photon. Technol. Lett.*, vol. 21, no. 23, pp. 1746–1748, Dec. 2009.
- [30] L. Chen, C. R. Doerr, Y.-K. Chen, and T.-Y. Liow, "Low-loss and broadband cantilever couplers between standard cleaved fibers and high-index-contrast Si<sub>3</sub>N<sub>4</sub> or Si waveguides," *IEEE Photon. Technol. Lett.*, vol. 22, no. 23, pp. 1744–1746, Dec. 2010.
- [31] G. Roelkens, P. Dumon, W. Bogaerts, D. Van Thourhout, and R. Baets, "Efficient silicon-on-insulator fiber coupler fabricated using 248-nm-deep UV lithography," *IEEE Photon. Technol. Lett.*, vol. 17, no. 12, pp. 2613–2615, Dec. 2005.
- [32] A. Mekis, S. Gloeckner, G. Masini, A. Narasimha, T. Pinguet, S. Sahni, and P. De Dobbelaere, "A grating-coupler-enabled CMOS photonics platform," *IEEE J. Sel. Topics Quantum Electron.*, vol. 17, no. 3, pp. 597–608, May/Jun. 2011.
- [33] D. Vermeulen, P. Verheyen, G. Lepage, W. Bogaerts, P. Absil, D. Van Thourhout, and G. Roelkens, "High-efficiency fiber-to-chip grating couplers realized using an advanced CMOS-compatible silicon-on-insulator platform," *Opt. Exp.*, vol. 18, no. 17, pp. 18278–18283, 2010.
- [34] G. Kurczveil, P. Pintus, M. J. R. Heck, J. D. Peters, and J. E. Bowers, "Characterization of insertion loss and back reflection in passive hybrid silicon tapers," *IEEE Photon. J.*, vol. 5, no. 2, p. 6600410, Apr. 2013.
- [35] A. A. M. Starling, L. H. Spielman, J. M. J. Binsma, E. J. Jansen, T. Van Dongen, P. J. A. Thijss, M. K. Smit, and B. H. Verbeek, "A compact nine-channel multiwavelength laser," *IEEE Photon. Technol. Lett.*, vol. 8, no. 9, pp. 1139–1141, Sep. 1996.
- [36] G. Kurczveil, M. J. R. Heck, J. D. Peters, J. M. Garcia, D. Spencer, and J. E. Bowers, "An integrated hybrid silicon multiwavelength AWG laser," *IEEE J. Sel. Topics Quantum Electron.*, vol. 17, no. 16, pp. 1521–1527, Nov/Dec. 2011.
- [37] D. T. Spencer, D. Dai, Y. Tang, M. J. R. Heck, and J. E. Bowers, "A novel 1×N power splitter scalable to large numbers of uniformly excited ports," *IEEE Photon. Technol. Lett.*, vol. 25, no. 1, pp. 36–39, Jan. 2013.

- [38] D. F. Welch, F. A. Kish, S. Melle, R. Nagarajan, M. Kato, C. H. Joyner, J. L. Pleumeekers, R. P. Schneider, J. Back, A. G. Dentai, V. G. Dominic, P. W. Evans, M. Kauffman, D. J. H. Lambert, S. K. Hurtt, A. Mathur, M. L. Mitchell, M. Missey, S. Murthy, A. C. Nilsson, R. A. Salvatore, M. F. Van Leeuwen, J. Webjorn, M. Ziari, S. G. Grubb, D. Perkins, M. Reffe, and D. G. Mehuys, "Large-scale InP photonic integrated circuits: Enabling efficient scaling of optical transport networks," *IEEE J. Sel. Topics Quantum Electron.*, vol. 13, no. 1, pp. 22–31, Jan/Feb. 2007.
- [39] J. Van Campenhout, L. Liu, P. Rojo Romeo, D. Van Thourhout, C. Seassal, P. Regreny, L. Di Cioccio, J.-M. Fedeli, and R. Baets, "A compact SOI-integrated multiwavelength laser source based on cascaded InP microdisks," *IEEE Photon. Technol. Lett.*, vol. 20, no. 16, pp. 1345–1347, Aug. 2008.
- [40] D. Liang, M. Fiorentino, S. Srinivasan, J. E. Bowers, and R. G. Beausoleil, "Low threshold electrically-pumped hybrid silicon microring lasers," *IEEE J. Sel. Topics Quantum Electron.*, vol. 17, no. 6, pp. 1528–1533, Nov./Dec. 2011.
- [41] T. Herr, K. Hartinger, J. Riemsberger, C. Y. Wang, E. Gavartin, R. Holzwarth, M. L. Gorodetsky, and T. J. Kippenberg, "Universal formation dynamics and noise of Kerr-frequency combs in microresonators," *Nature Photon.*, vol. 6, no. 7, pp. 480–487, 2012.
- [42] T. Herr, V. Brasch, J. Jost, C. Wang, N. Kondratiev, M. Gorodetsky, and T. Kippenberg, "Soliton mode-locking in optical microresonators," presented at the CLEO: 2013, OSA Technical Digest (online) (Optical Society of America, San Jose, CA, paper QTh4E.3, 2013).
- [43] H. Li, D. Groep, and L. Wolters, "Workload characteristics of a multi-cluster supercomputer," in *Job Scheduling Strategies for Parallel Processing*. Berlin, Germany: Springer, 2005, pp. 176–193.
- [44] L. A. Barroso and U. Holzle, "The case for energy-proportional computing," *Computer*, vol. 40, no. 12, pp. 33–37, 2007.
- [45] S. Borkar, "Thousand core chips: A technology perspective," in *Proc. 44th Annu. Des. Autom. Conf.*, 2007, pp. 746–749.
- [46] A. Joshi, C. Batten, Y.-J. Kwon, S. Beamer, I. Shamim, K. Asanovic, and V. Stojanovic, "Silicon-photonic clos networks for global on-chip communication," in *Proc. IEEE Comput. Soc. 3rd ACM/IEEE Int. Symp. Netw.-on-Chip*, 2009, pp. 124–133.
- [47] B. R. Koch, A. W. Fang, O. Cohen, and J. E. Bowers, "Mode-locked silicon evanescent lasers," *Opt. Exp.*, vol. 15, no. 18, pp. 11225–11233, 2007.
- [48] A. W. Fang, B. R. Koch, K.-G. Gan, H. Park, R. Jones, O. Cohen, M. J. Paniccia, D. J. Blumenthal, and J. E. Bowers, "A racetrack mode-locked silicon evanescent laser," *Opt. Exp.*, vol. 16, no. 2, pp. 1393–1398, 2008.
- [49] J. Sun, E. Timurdogan, A. Yaacobi, E. S. Hosseini, and M. R. Watts, "Large-scale nanophotonic phased array," *Nature*, vol. 493, no. 7431, pp. 195–199, 2013.
- [50] iSiPP25G specifications; [www.epixfab.eu](http://www.epixfab.eu)
- [51] P. Dong, W. Qian, S. Liao, H. Liang, C. C. Kung, N. N. Feng, R. Shafiiha, J. Fong, D. Feng, A. V. Krishnamoorthy, and M. Asghari, "Low loss shallow-ridge silicon waveguides," *Opt. Exp.*, vol. 18, no. 14, pp. 14474–14479, 2010.
- [52] J. S. Orcutt, S. D. Tang, S. Kramer, K. Mehta, H. Li, V. Stojanović, and R. J. Ram, "Low-loss polysilicon waveguides fabricated in an emulated high-volume electronics process," *Opt. Exp.*, vol. 20, no. 7, pp. 7243–7254, 2012.
- [53] J. S. Orcutt, B. Moss, C. Sun, J. Leu, M. Georgas, J. Shainline, E. Zgraggen, H. Li, J. Sun, M. Weaver, S. Urosevic, M. Popovic, R. Ram, and V. Stojanovic, "Open foundry platform for high-performance electronic-photonic integration," *Opt. Exp.*, vol. 20, no. 11, pp. 12222–12232, 2012.
- [54] G. Li, J. Yao, H. Thacker, A. Mekis, X. Zheng, I. Shubin, Y. Luo, J. Lee, K. Raj, J. Cunningham, and A. Krishnamoorthy, "Ultralow-loss, high-density SOI optical waveguide routing for macrochip interconnects," *Opt. Exp.*, vol. 20, no. 11, pp. 12035–12039, 2012.
- [55] A. J. Zilkie, P. Seddighian, B. J. Bijlani, W. Qian, D. C. Lee, S. Fathololoumi, J. Fong, R. Shafiiha, D. Feng, B. J. Luff, X. Zheng, J. E. Cunningham, A. V. Krishnamoorthy, and M. Asghari, "Power-efficient III-V/silicon external cavity DBR lasers," *Opt. Exp.*, vol. 20, no. 21, pp. 23456–23462, 2012.
- [56] S. Tanaka, S. Jeong, S. Sekiguchi, T. Akiyama, T. Kurahashi, Y. Tanaka, and K. Morito, "Four-wavelength silicon hybrid laser array with ring-resonator based mirror for efficient CWDM transmitter," in *Proc. Opt. Fiber Commun. Conf. Expo. Nat. Fiber Opt. Eng. Conf.*, Mar. 17–21, 2013, pp. 1–3.
- [57] C.-C. Lin, G. Yoffe, M. Emanuel, S. Rishton, D. Ton, S. Zou, B. Lu, and B. Pezeshki, "Monolithically integrated high speed DFB BH laser arrays for 10 Gbased LX4 application," in *Proc. IEEE Opt. Fiber Commun. Conf. Nat. Fiber Opt. Eng. Conf.*, 2006, p. 3.
- [58] S. R. Jain, M. N. Sysak, and J. E. Bowers, "> 200 nm gain-bandwidth hybrid silicon laser array using quantum well intermixing," in *Proc. 23rd IEEE Int. Semicond. Laser Conf.*, 2012, pp. 1–2.
- [59] A. H. Atabaki, A. A. Eftekhar, S. Yegnanarayanan, and A. Adibi, "Sub-100-nanosecond thermal reconfiguration of silicon photonic devices," *Opt. Exp.*, vol. 21, no. 13, pp. 15706–15718, 2013.
- [60] A. Lee, Q. Jiang, M. Tang, A. Seeds, and H. Y. Liu, "Continuous-wave InAs/GaAs quantum-dot laser diodes monolithically grown on Si substrate with low threshold current densities," *Opt. Exp.*, vol. 20, no. 20, pp. 22181–22187, Sep. 2012.
- [61] A. Y. Liu, C. Zhang, A. Snyder, D. Lubachev, J. M. Fastenau, A. W. K. Liu, A. C. Gossard, and J. E. Bowers, "InAs quantum dot ridge lasers on silicon," in *Proc. 30th North Amer. Molecular Beam Epitaxy Conf.*, Alberta, Canada, Oct. 5–11, 2013.
- [62] P. De Heyn, J. De Coster, P. Verheyen, G. Lepage, M. Pantouvaki, P. Absil, W. Bogaerts, J. Van Campenhout, and D. Van Thourhout, "Fabrication-tolerant four-channel wavelength-division-multiplexing filter based on collectively tuned Si microrings," in *J. Lightw. Technol.*, vol. 31, no. 16, pp. 2785–2792, 2013.
- [63] A. Le Liepvre, C. Jany, A. Accard, M. Lamponi, F. Poingt, D. Make, F. Lelarge, J.-M. Fedeli, S. Messaoudene, D. Bordel, and G.-H. Duan, "Widely wavelength tunable hybrid III–V/silicon laser with 45 nm tuning range fabricated using a wafer bonding technique," in *Proc. IEEE 9th Int. Conf. Group IV Photon.*, 2012, pp. 54–56.
- [64] J. C. Hulme, J. K. Doylend, and J. E. Bowers, "Widely tunable vernier ring laser on hybrid silicon," *Opt. Exp.*, vol. 21, no. 17, pp. 19718–19722, 2013.
- [65] A. Syrbu, V. Iakovlev, A. Caliman, P. Royo, and E. Kapon, "10 Gbps VCSELs with high single mode output in 1310 nm and 1550 nm Bands," presented at the Proc. Opt. Fiber Commun. Conf., San Diego, CA, pp. 1–3, Feb. 24–28 2008.
- [66] T. Spuesens, L. Liu, T. de Vries, P. Rojo Romeo, P. Regreny, and D. Van Thourhout, "Improved design of an InP-based microdisk laser heterogeneously integrated with SOI," in *Proc. 6th IEEE Int. Conf. Group IV Photon.*, 2009, pp. 202–204.
- [67] D. Liang, S. Srinivasan, D. A. Fattal, M. Fiorentino, Z. Huang, D. T. Spencer, J. E. Bowers, and R. G. Beausoleil, "Reflection-assisted unidirectional hybrid silicon microring lasers," in *Proc. Int. Conf. Indium Phosphide Related Mater.*, 2012, pp. 12–15.
- [68] A. Syrbu, A. Mircea, A. Mereuta, A. Caliman, C.-A. Berseth, G. Suruceanu, V. Iakovlev, M. Achtenhagen, A. Rudra, and E. Kapon, "1.5-mW single-mode operation of wafer-fused 1550-nm VCSELs," *IEEE Photon. Technol. Lett.*, vol. 16, no. 5, pp. 1230–1232, May 2004.
- [69] J.-R. Burie, P. Garabedian, C. Starck, P. Pagnod-Rossiaux, M. Bettati, M. Do Nascimento, J.-N. Reygoblet, J.-C. Bertroux, and F. Laruelle, "Extremely low losses 14xx single mode laser diode leading to 550-mW output power module with 0–75 °C case temperature and 10-W consumption," *SPIE LASE Int. Soc. Opt. Photon.*, vol. 8241, pp. 82410X–82419X, 2012.
- [70] J. S. Wang, R. S. Hsiao, G. Lin, L. Wei, Y. T. Wu, A. R. Kovsh, N. A. Maleev, A. V. Sakharov, D. A. Livshits, J. F. Chen, and J. Y. Chi, "Ridge waveguide 1310 nm lasers based on multiple stacks of InAs/GaAs quantum dots," *Physica Status Solidi (C)*, vol. 4, pp. 1339–1342, 2003.
- [71] I. Mito, M. Kitamura, K. Kobayashi, S. Murata, M. Seki, Y. Odagiri, H. Nishimoto, and M. Yamaguchi, "InGaAsP double-channel-planar-buried-heterostructure laser diode (DC-PBH LD) with effective current confinement," *J. Lightw. Technol.*, vol. 1, no. 1, pp. 195–202, 1983.
- [72] K. Shinoda, T. Kitatani, M. Aoki, M. Mukaikubo, K. Uchida, and K. Uomi, "1.3-μm InGaAlAs short-cavity DBR lasers for uncooled 10-Gb/s operation with low drive current," *IEEE Photon. Technol. Lett.*, vol. 18, no. 21–24, pp. 2383–2385, Nov. 2006.
- [73] N. Nunoya, M. Nakamura, M. Morshed, S. Tamura, and S. Arai, "High-performance 1.55-μm wavelength GaInAsP-InP distributed-feedback lasers with wirelike active regions," *IEEE J. Sel. Topics Quantum Electron.*, vol. 7, no. 2, pp. 249–258, Mar./Apr. 2001.
- [74] N. Nishiyama, C. Caneau, B. Hall, G. Guryanov, M. H. Hu, X. S. Liu, M.-J. Li, R. Bhat, and C. E. Zah, "Long-wavelength vertical-cavity surface-emitting lasers on InP with lattice matched AlGaInAs-InP DBR grown by MOCVD," *IEEE J. Sel. Topics Quantum Electron.*, vol. 11, no. 5, pp. 990–998, Sep./Oct. 2005.
- [75] J. J. Plant, P. W. Juodawlkis, R. K. Huang, J. P. Donnelly, L. J. Missaggia, and K. G. Ray, "1.5-μm InGaAsP-InP slab-coupled optical waveguide lasers," *IEEE Photon. Technol. Lett.*, vol. 17, no. 4, pp. 735–737, Apr. 2005.
- [76] M. Muller, W. Hofmann, T. Grundl, M. Horn, P. Wolf, R. Daniel Nagel, E. Ronneberg, G. Bohm, D. Bimberg, and M.-C. Amann, "1550-nm high-speed short-cavity VCSELs," *IEEE J. Sel. Topics Quantum Electron.*, vol. 17, no. 5, pp. 1158–1166, Sep./Oct. 2011.
- [77] C. S. Wang, "Short-cavity DBR lasers integrated with high-speed electroabsorption modulators using quantum well intermixing," Ph.D. thesis, Department of Electrical and Computer Engineering, University of California, Santa Barbara, CA, Sep. 2007.
- [78] Efficiency data taken from specifications of Eudyna FLD3F7CZ; Finisar FP-1310-5I-XXX; Fraunhofer-HHI HCSEL, High-power SM and BH-DFB array; Innolume LD-13XX-BF-100 and LD-1310-COMB-12; Lucent A371 and A2300; Mitsubishi FU-468SLD-1CNA1; Optilab LMD5S513; QDLaser QLF1339-AA; Sumitomo SLV4270; Thorlabs FPL1009S, SFL1550S, LPS-1550-FC, FPL1053S, LPS-1310-FC, VL-1580-1-SP-C-R5; Vertilas VL-1310-1G-P2-XX.



**John E. Bowers** (F'93) received the M.S. and Ph.D. degrees from Stanford University, Stanford, CA, USA.

He is currently a Professor in the Department of Electrical and Computer Engineering, University of California, Santa Barbara, CA, USA, where he is the Fred Kavli Chair in Nanotechnology and the Director of the Institute for Energy Efficiency. He was also with AT&T Bell Laboratories and Honeywell. He has authored or coauthored eight book chapters, 450 journal papers, and 700 conference papers. He holds 52 patents.

Dr. Bowers is a Member of the National Academy of Engineering, and a Fellow of the Optical Society of America (OSA) and the American Physical Society. He is a recipient of the OSA Holonyak Prize, the IEEE LEOS William Streifer Award, and the South Coast Business and Technology Entrepreneur of the Year Award. He was the recipient of the EE Times Annual Creativity in Electronics Award for Most Promising Technology for the hybrid silicon laser in 2007.



**Martijn J. R. Heck** (S'04–M'09) received the M.Sc. degree in applied physics and the Ph.D. degree from the Eindhoven University of Technology, Eindhoven, The Netherlands, in 2002 and 2008, respectively.

He is currently an Associate Professor with the Department of Engineering, Aarhus University, Aarhus, Denmark, where he is involved in research on photonic integration technologies and applications. From 2007 to 2008, he was a Postdoctoral Researcher at the Communication Technology: Basic Research and Applications Research Institute, Eindhoven University of Technology, where he was engaged in the development of a technology platform for active–passive integration of photonic integrated circuits. From 2008 to 2009, he was with the Laser Centre, Vrije Universiteit, Amsterdam, The Netherlands, where he was involved in the development of integrated frequency-comb generators. From 2009 to 2013, he was a Postdoctoral Researcher and Associate Director of the Silicon Photonics Center at the University of California, Santa Barbara, CA, USA, where he was involved in photonic integrated circuits based on the heterogeneous integration of silicon, silica, and III/V photonics.