

# A $4 \times 45$ Gb/s Two-Tap FFE VCSEL Driver in 14-nm FinFET CMOS Suitable for Burst Mode Operation

Mohammad Mahdi Khafaji<sup>ID</sup>, Guido Belfiore<sup>ID</sup>, Jan Pliva, Ronny Henker<sup>ID</sup>, and Frank Ellinger, *Senior Member, IEEE*

**Abstract**—This paper presents a high-swing vertical-cavity surface-emitting laser (VCSEL) driver with two-tap feed-forward equalizer (FFE) in 14-nm FinFET CMOS technology. The design features a standby mode with power-ON time below 10 ns and 86% of power saving. Based on system-level simulations, the impact of the driver swing, tap delay, and weight on the optical eye-opening is studied and it is shown that the FFE combined with high-swing driving can improve the output eye at rates beyond two times the bandwidth of the VCSEL. By applying an extra supply, the output stage is optimized to provide high swing ( $\sim 0.65$  V) without suffering from the slow response of a transistor in the triode region. In addition, the bias and modulation currents can be adjusted to accommodate different VCSELs. The designed circuit is tested with two VCSEL types. It is bonded to common-cathode 20 and 18 GHz VCSELs, and in both the cases, data transmission at 45 Gb/s with bit error rate lower than  $10^{-12}$  is measured. The total power dissipation is below 100 mW, providing a power efficiency of better than 2.11 pJ/bit. To the best of our knowledge, this design is the fastest VCSEL driver in any CMOS technology, which is also capable of burst mode operation.

**Index Terms**—Burst mode, CMOS, driver, feed-forward equalizer (FFE), FinFET, integrated circuit, laser, optical transmitter, vertical-cavity surface-emitting laser (VCSEL).

## I. INTRODUCTION

DIRECT modulation of vertical-cavity surface-emitting laser (VCSEL) arrays can enable parallel optical links in data centers with well above 100-Gb/s aggregate data rate in a compact footprint at low cost and power dissipation. However, to fully exploit the benefits of VCSEL-based optical links, high-speed drivers have to be realized in the same CMOS process as the processors. Furthermore, it should be possible to use the drivers with existing common-cathode VCSEL arrays to allow multi-channel links. These transmitters are usually optimized for the highest operation speed and, therefore, continuously dissipate their maximum power.

Manuscript received February 6, 2018; revised May 4, 2018; accepted June 14, 2018. Date of publication July 19, 2018; date of current version August 27, 2018. This work was supported in part by the German Research Foundation in the framework of the collaborative research center 912 Highly Adaptive Energy-Efficient Computing and in part by the European Union's seventh framework programme (FP7/2007-2013) through the ADDAPT Project under Grant 619197. This paper was approved by Associate Editor Azita Emami. (*Corresponding author: Mohammad Mahdi Khafaji.*)

The authors are with the Chair of Circuit Design and Network Theory, Technische Universität Dresden, 01069 Dresden, Germany (e-mail: mohammad\_mahdi.khafaji@tu-dresden.de).

Color versions of one or more of the figures in this paper are available online at <http://ieeexplore.ieee.org>.

Digital Object Identifier 10.1109/JSSC.2018.2849390

TABLE I  
SUMMARY OF RECENT HIGH-SPEED VCSEL DRIVERS WITH EQ

| Ref.              | [6]       | [7]       | [8]               | [9]       | [10]      | This work |
|-------------------|-----------|-----------|-------------------|-----------|-----------|-----------|
| EQ type           | 2-tap FFE | 3-tap FFE | CTLE <sup>1</sup> | Nonlinear | 3-tap FFE | 2-tap FFE |
| Bit rate (Gb/s)   | 71        | 50        | 40                | 20        | 32        | 45        |
| VCSEL BW (GHz)    | 26        | 18        | 19                | 11.2      | 15        | 18        |
| PD BW (GHz)       | 30        | 22        | 35                | 15        | 17        | 22        |
| Combined BW (GHz) | 19.65     | 13.93     | 16.7              | 8.97      | 11.25     | 13.93     |
| BWE (bit/Hz)      | 3.6       | 3.6       | 2.4               | 2.23      | 2.8       | 3.23      |

<sup>1</sup> Continuous time linear equalizer.

On the other hand, the peak load is rarely required in data centers. In a typical data center, the links are mostly operated in burst mode, with a utilization of only 10% [1]. Hence, a high-speed driver capable of burst mode operation is of high interest and, when used with a burst mode receiver, can lead to considerable power savings. While several receiver designs with this capability have been presented so far, e.g., in [2]–[4], a high-speed burst mode transmitter has rarely been reported. Such a burst mode driver has to deal with the prolonged wake-up time of VCSELs. This can be as long as 40 ns, which may result in an unacceptable overhead in data transmission [5].

Another important aspect in the driver design is the equalizer (EQ) architecture. Table I shows a summary of the fastest published VCSEL drivers in different processes, including the presented work. To compare the performance of different EQ types, their impact on the bandwidth efficiency (BWE) of a link can be studied. This is important because the EQ is usually employed to improve the BWE of the link by enabling higher data rate transmission at a given BW. Assuming that the data rate is limited by the utilized optical components, a combined link BW can be approximated as follows [11]:

$$BW_{\text{total}}^{-2} = BW_{\text{VCSEL}}^{-2} + BW_{\text{PD}}^{-2} \quad (1)$$

in which $BW_{\text{PD}}$ is the BW of the receiver side photodiode (PD). The BWE of each design was then calculated by dividing the reported data rate by the $BW_{\text{total}}$. The results in Table I show that using the feed-forward equalizer (FFE) led to around 25% better BWE values compared with other EQ designs. The main reason is that the non-FFE EQs are designed to deal better with ringing at the laser response, which mostly happens at lower bias currents, where the laser BW is not at its maximum. Therefore, compared with the FFE, higher extinction ratios (ERs) at lower power values and lower data rates can be achieved. Nevertheless, a multi-channel driver with an FFE potentially requires 25% fewer channels for the same aggregate data rate, although this usually comes at a cost of worse power efficiency (PE) values. In this paper, the design is optimized for the highest possible data rate, and hence, an FFE architecture is selected. The power dissipation is reduced by enabling burst mode operation, which is highly suitable for current data centers. In addition, as the current of each sub-block can be separately adjusted, the power can be scaled down at lower bit rates.
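As a cross-check of Eq. (1), the combined BW and BWE rows of Table I can be recomputed from the bit rate, VCSEL BW, and PD BW rows. The sketch below is illustrative only; the function names are not from the paper, and the input values are copied from Table I.

```python
import math


def combined_bw_ghz(bw_vcsel_ghz: float, bw_pd_ghz: float) -> float:
    """Combined link BW from Eq. (1): BW_total^-2 = BW_VCSEL^-2 + BW_PD^-2."""
    return 1.0 / math.sqrt(bw_vcsel_ghz ** -2 + bw_pd_ghz ** -2)


def bwe_bit_per_hz(bit_rate_gbps: float, bw_vcsel_ghz: float, bw_pd_ghz: float) -> float:
    """Bandwidth efficiency: reported data rate divided by the combined BW."""
    return bit_rate_gbps / combined_bw_ghz(bw_vcsel_ghz, bw_pd_ghz)


# (bit rate in Gb/s, VCSEL BW in GHz, PD BW in GHz), as listed in Table I.
TABLE_I = {
    "[6]": (71, 26, 30),
    "[7]": (50, 18, 22),
    "[8]": (40, 19, 35),
    "[9]": (20, 11.2, 15),
    "[10]": (32, 15, 17),
    "This work": (45, 18, 22),
}

for ref, (rate, bw_vcsel, bw_pd) in TABLE_I.items():
    bw_total = combined_bw_ghz(bw_vcsel, bw_pd)
    bwe = bwe_bit_per_hz(rate, bw_vcsel, bw_pd)
    print(f"{ref:>9}: combined BW = {bw_total:5.2f} GHz, BWE = {bwe:4.2f} bit/Hz")
```

Because the two bandwidths add as inverse squares, the combined BW is always below the smaller of the two, which is why an 18-GHz VCSEL with a 22-GHz PD gives a link BW of only about 14 GHz.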

In this paper, we present a four-channel driver with burst-mode capability implemented in a 14-nm bulk CMOS technology. By utilizing an extra supply at the output stage, the modulation and biasing current can be configured in a wide range allowing VCSELs with different characteristics to be driven. The chip was then measured with two different 850-nm laser types: an 18-GHz multi-mode (MM) and a 20-GHz single-mode (SM) VCSEL. In both the cases, the driver enables error-free operation with a bit error rate (BER) lower than  $10^{-12}$  at a data rate of 45 Gb/s and optical modulation amplitude (OMA) of over 1 dBm. In a burst mode operation, by employing an enhanced biasing circuitry, the VCSEL can be turned on in less than 10 ns, while the power dissipation is reduced to 14% in standby mode. The total power consumption at the highest data rate, including the VCSEL, is less than 100 mW/channel, corresponding to an energy per bit of lower than 2.2 pJ/bit. To the best of our knowledge, the presented design works at the highest data rate among CMOS VCSEL drivers and is the fastest one in any technology suitable for burst mode operation.

## II. FFE STUDY

The block diagram of the driver with a two-tap FFE and a typical time diagram of the driver currents are shown in Fig. 1. The EQ is functioning as a pre-emphasis and its corresponding current ( $I_{D1}$ ) is subtracted from the main one ( $I_{D0}$ ), resulting in a reduction of the driving current ( $I_{Mod}$ ) by a factor of  $(G1/G0) \cdot I_{D0}$  after a time delay of  $\Delta\tau$ . To study the effect of the EQ on the optical output, system-level simulations including passives of the main driver stage and a nonlinear model of the VCSEL have been carried out. At the output stage, a  $50\ \Omega$  load resistor is connected to the laser via a T-coil inductor, which is used for BW extension. The chosen value of the T-coil compensates the pole generated by parasitics at the input of the VCSEL, enabling the pre-emphasized signal to reach the laser. The model of an MM laser and the derivation and accuracy of the parameters were shown in [12]. The bias current is set to achieve the maximum BW of the laser (18 GHz). The pad parasitic capacitance has also been added, as presented in Fig. 2.
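The subtraction performed by the two-tap FFE can be sketched in discrete time. This is a minimal behavioral model and not the simulation setup used in the paper: it assumes an ideal NRZ main-tap current (zero for a 0, $I_{D0}$ for a 1), ignores all bandwidth limits, and uses hypothetical names (`two_tap_ffe`, `samples_per_bit`, `delay_samples`).

```python
def two_tap_ffe(bits, i_d0_ma, rel_gain, delay_samples, samples_per_bit):
    """Return I_Mod samples: main-tap current minus a delayed, scaled copy.

    rel_gain plays the role of G1/G0; delay_samples plays the role of the
    tap delay, expressed in samples.
    """
    # Ideal NRZ main-tap current: i_d0_ma for a 1, zero for a 0.
    main = [i_d0_ma * b for b in bits for _ in range(samples_per_bit)]
    out = []
    for n, sample in enumerate(main):
        # Before the delayed copy exists, assume the input idled at its first level.
        delayed = main[n - delay_samples] if n >= delay_samples else main[0]
        out.append(sample - rel_gain * delayed)
    return out


if __name__ == "__main__":
    # 7 mA main tap, 7.5% tap weight, delay of 6/8 of a bit period
    # (about 16.7 ps at 45 Gb/s, close to the 17-ps setting discussed in the text).
    wave = two_tap_ffe([0, 0, 1, 1, 1, 0, 0], 7.0, 0.075, 6, 8)
    for n in (16, 22, 40, 46):
        print(f"sample {n}: {wave[n]:.3f} mA")
```

Right after a rising edge the full main-tap current is delivered; once the delayed tap arrives, the level settles lower by $(G1/G0) \cdot I_{D0}$. After a falling edge the model briefly undershoots by the same amount. In the real circuit the modulation current rides on the laser bias current, so the total laser current stays positive.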

Fig. 3 shows the normalized vertical eye-opening of the optical output for different EQ settings at data rates



Fig. 1. Block diagram of a two-tap FFE laser driver and a typical time diagram.  $T$  is the bit period.



| Symbol    | Description                   | Value          |
|-----------|-------------------------------|----------------|
| $I_{Mod}$ | Modulation current            | 2.5–10 mA      |
| $R_{L,o}$ | Driver output resistance      | 50 $\Omega$    |
| $L_T$     | Driver peaking T-coil         | 200 pH         |
| $k$       | T-coil coupling factor        | 0.5            |
| $C_{par}$ | Output parasitic capacitance  | 80 fF          |
| $R_m$     | VCSEL mirror resistance       | 35 $\Omega$    |
| $R_a$     | VCSEL active area resistance  | 50–80 $\Omega$ |
| $C_a$     | VCSEL active area capacitance | 140–160 fF     |
| $P_{opt}$ | VCSEL optical output          | 1.5–5 mW       |

Fig. 2. Simplified system including the output passives and nonlinear VCSEL model used to study the effect of the FFE and the definition of utilized parameters. The values of VCSEL parameters are related to the MM laser used in this paper.

from 40 to 50 Gb/s. As presented in Fig. 3(a) and (b), the eye-opening at 40 Gb/s was not improved by applying the EQ, while increasing the swing of the modulation current from 5 to 7 mA resulted in 30% higher eye-opening. This shows that at a data rate of approximately double the BW of the VCSEL, the optical eye is still vertically open and pre-emphasis can even have a negative impact. The reason is the reduced swing of the driving current after applying the EQ, $(1 - 2(G1/G0)) \cdot I_{D1}$, as shown in Fig. 1. This was confirmed in Fig. 3(c), in which the swing of the modulation current after applying the pre-emphasis was kept constant and independent of the EQ setting. For the two-tap FFE at 40 Gb/s, the eye-opening was improved by more than 10%. At a higher rate of 45 Gb/s (2.5 times the laser BW), the eye was closed by more than 60% compared with 40-Gb/s operation, as shown



Fig. 3. Simulation results of normalized vertical eye-opening of the optical output for modulation current with (a) 5-mA<sub>pp</sub> swing at 40 Gb/s, (b) 7-mA<sub>pp</sub> swing at 40 Gb/s, (c) fixed 7-mA<sub>pp</sub> swing after equalization at 40 Gb/s, (d) 5-mA<sub>pp</sub> swing at 45 Gb/s, (e) 7-mA<sub>pp</sub> swing at 45 Gb/s, and (f) 7-mA<sub>pp</sub> swing at 50 Gb/s.

in Fig. 3(d). Increasing the modulation current resulted in only 10% improvement in the eye-opening. Applying the pre-emphasis further enhanced the eye-opening by 15% [Fig. 3(e)]. The optimum delay and relative gain of the tap are 17 ps and 0.075 (7.5%), respectively. At 50 Gb/s, the eye is completely closed when the low-swing 5-mA modulation current is applied. At the high-swing setting, the EQ can increase the eye-opening by a factor of two; however, the eye-opening is still lower than the one with low-swing current at 45 Gb/s. The optimal tap delay and weight in this case are 15 ps and 0.175 (17.5%), as shown in Fig. 3(f). It has to be noted that one main reason for eye degradation at higher relative gain values is the excessive jitter added by applying a stronger pre-emphasis signal. For the same simulation conditions as in Fig. 3, the horizontal eye-opening is reduced by approximately 10% for the highest amount of FFE.

It can be concluded that over-driving the VCSEL is an effective way to improve the eye-opening up to data rates close to double the BW of the VCSEL. At higher data rates, i.e., 2.5 times the VCSEL BW, a combination of high-swing modulation current and an EQ provides better results. At data rates closer to three times the VCSEL BW, a two-tap FFE may not be able to open the optical eye to a level sufficient for error-free transmission. Therefore, applying an EQ at the receiver side or increasing the FFE order is recommended. Based on our simulations, optimum values for the tap delay and relative gain are 13–17 ps and 7%–20%, respectively.

## III. DRIVER CIRCUITS

Each channel of the driver array consists of a 50-$\Omega$ termination and three differential pairs working as a limiting amplifier (LA) at the input, followed by a pre-driver, pre-emphasis, and main driver stage. A simplified block diagram is shown in Fig. 4. The LA is required to ensure that the output is independent of the amplitude level and the rise time of the provided input signal. The LA output is applied to a delay cell and pre-driver. Fig. 5 shows the schematic of the main driver, including the pre-emphasis and the pre-driver circuitry. To avoid loading the output node, the pre-emphasis signal is applied to the input of the main driver. This also allowed lowering the current of the pre-emphasis stage to provide the same amount of equalization, since $R_{L1}$ is larger than the load at the output node. On the other hand, the main driver should not be limiting; otherwise, the FFE signal will not reach the VCSEL. To increase the linear operation range of the main driver, a degeneration resistor, $R_s$, was added. The current of the main driver is fully steered only when the maximum swing of the pre-driver signal is applied. However, the main driver need not be very linear, because the pre-emphasis signal at the output node is a function of the gain of both the driver and the pre-emphasis stages. By adjusting the gain of the pre-emphasis stage via a control voltage, the required amount of peaking can be achieved. It has to be noted that this modification does not change the main driver power dissipation, but there is a slight power penalty in the pre-driver stage. In parallel to $R_s$, a degeneration capacitor $C_s$ is also added. Together with this resistor, it generates a zero at a frequency of $f_z = (2\pi R_s C_s)^{-1}$ in the transfer function of the stage [13]. The value of $C_s$ is chosen to place the zero at 19 GHz, compensating the first pole that determines the BW of the VCSEL. The delay cell consists of three cascaded differential amplifiers, with a minimum delay of 4 ps per stage determined by post-layout simulations. Without noticeably increasing the jitter, the delay can be varied from 12 to 17 ps by tuning the current of these amplifiers. The delayed signal is then connected to the pre-emphasis stage, which shares the output with the pre-driver stage. By adjusting the current of this stage, the required amount of peaking can be generated.

Fig. 4. Simplified block diagram of one channel of the implemented driver.

Fig. 5. Schematic of the main driver circuit, the pre-driver, and the pre-emphasis stage. The parasitics of the connection to VCSEL chip consisting of the output pad capacitance and bond wire inductance are also shown.
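The relation $f_z = (2\pi R_s C_s)^{-1}$ can be turned around to size the degeneration capacitor for the 19-GHz zero. The paper does not state the value of $R_s$, so the 20 $\Omega$ used below is purely a hypothetical placeholder, and the function names are illustrative.

```python
import math


def zero_frequency_hz(r_s_ohm: float, c_s_farad: float) -> float:
    """Zero introduced by R_s in parallel with C_s: f_z = 1 / (2*pi*R_s*C_s)."""
    return 1.0 / (2.0 * math.pi * r_s_ohm * c_s_farad)


def degeneration_cap_farad(f_zero_hz: float, r_s_ohm: float) -> float:
    """Capacitance that places the zero at f_zero_hz for a given R_s."""
    return 1.0 / (2.0 * math.pi * r_s_ohm * f_zero_hz)


R_S_OHM = 20.0      # hypothetical value; R_s is not given in the paper
F_ZERO_HZ = 19e9    # zero location stated in the text

c_s = degeneration_cap_farad(F_ZERO_HZ, R_S_OHM)
print(f"C_s = {c_s * 1e15:.1f} fF for R_s = {R_S_OHM:.0f} ohm")
print(f"check: f_z = {zero_frequency_hz(R_S_OHM, c_s) / 1e9:.2f} GHz")
```

Since the product $R_s C_s$ is fixed by the target zero, a larger degeneration resistor (more linear range) calls for a proportionally smaller capacitor.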

### A. Output Stage Design

Since the design is targeted for high-speed operation, the effect of the drain–source voltage of the transistor ($V_{DS}$) on its speed was studied. Fig. 6 shows the normalized transit frequency ($f_T$) of an nMOS transistor at various drain–source voltages. When $V_{DS}$ is lower than the saturation voltage $V_{sat}$, the transistor is in the triode region, where it presents considerably lower $f_T$ values compared with the saturation region. This can be seen in the dramatic change of the slope around $(V_{DS}/V_{sat}) = 1$ in Fig. 6. When $V_{DS}$ drops from $V_{sat}$ to 0.67 $V_{sat}$, the transistor $f_T$ is approximately halved. Hence, to achieve the highest speed, operation in the triode region has to be avoided. This is more critical in highly scaled processes, because scaling to shorter channel lengths brings hardly any improvement in the $f_T$ of an nMOS transistor [14], while the design becomes more challenging due to lower breakdown voltage and higher gate resistance.

Fig. 6. Post-layout simulations of normalized $f_T$ versus normalized $V_{DS}$, when the current, transistor size, and $V_{GS}$ are kept fixed.

In a basic differential pair functioning as a current-mode logic inverter, e.g., $M_{D3} - M_{D4}$, if we assume the gate voltage of $M_{D3}$ is $V_{DD}$, and the drain–source voltage of the current source $M_{13}$ is also $V_{sat}$, then

$$V_{DD} - V_{sat} - V_{TH} = V_{sat} \quad (2)$$

in which $V_{TH}$ is the threshold voltage of $M_{D3}$. Therefore,

$$V_{sat} = \frac{V_{DD} - V_{TH}}{2}. \quad (3)$$

The maximum swing ( $V_{SW,max}$ ) at the drain of  $M_{D3}$  when it is still in the saturation region can also be obtained as follows:

$$V_{DD} - V_{SW,max} = 2V_{sat}. \quad (4)$$

Substituting (3) into (4) yields

$$V_{SW,max} = V_{TH}. \quad (5)$$

This shows that in a simple differential pair, the maximum output swing without entering the triode region is  $V_{TH}$ .
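The chain from (2) to (5) can be checked numerically. The sketch below assumes a 1-V supply, as mentioned later in the text, and a hypothetical threshold voltage of 0.3 V that is not taken from the paper.

```python
def max_saturated_swing(v_dd: float, v_th: float):
    """Return (V_sat, V_SW,max) for a simple differential pair, per Eqs. (3) and (4)."""
    v_sat = (v_dd - v_th) / 2.0        # Eq. (3)
    v_sw_max = v_dd - 2.0 * v_sat      # Eq. (4), rearranged for the swing
    return v_sat, v_sw_max


V_DD = 1.0   # supply voltage in volts
V_TH = 0.3   # hypothetical threshold voltage in volts

v_sat, v_sw_max = max_saturated_swing(V_DD, V_TH)
print(f"V_sat = {v_sat:.2f} V, V_SW,max = {v_sw_max:.2f} V")
```

Whatever threshold is assumed, the returned swing equals $V_{TH}$, which is the content of (5).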

To overdrive the VCSEL, a maximum swing of ~0.65 V at the output node was targeted. This is well above the threshold voltage of transistor $M_6$, so with a 1-V supply, $M_6$ would be forced into the triode region and the output signal would slow down. To avoid this condition, a second, higher supply voltage (VDH) was introduced. Depending on the required laser bias current ($I_{B,L}$) and modulation current ($I_{Mod,L}$), a proper value for this voltage can be selected.

Fig. 7. Post-layout simulation of (a) and (c) electrical output voltage and (b) and (d) optical power without EQ (top) and with EQ (bottom) at 45 Gb/s. Applied sources include jitter.

The procedure is described as follows. If we assume that the drain–source saturation voltage ($V_{sat}$) of transistors $M_6$ and $M_8$ is the same, then the minimum voltage of the $M_6$ drain ($V_{D,M_6}$) is $2V_{sat}$. The maximum $V_{D,M_6}$ is determined by the drain–source breakdown voltage of $M_6$ ($BV_{DS,M_6}$), and it can be assumed as $BV_{DS,M_6} + V_{sat}$. Therefore, the maximum swing at the output node is $BV_{DS,M_6} - V_{sat}$, if operation in the triode region for $M_6$ is avoided. The lowest $V_{D,M_6}$ is observed when $M_5$ is in the OFF state, and the total driver stage current ($I_{DRV}$) is steered via $M_6$. The laser current is also reduced to $I_{B,L} - (I_{\text{Mod},L}/2)$. The maximum $V_{D,M_6}$ is reached when $M_6$ is turned off, so the laser current increases to $I_{B,L} + (I_{\text{Mod},L}/2)$. If the voltage drop over $M_7$ and $R_{\text{BiasH}}$ is neglected, the following equations can be written:

$$\text{VDH} - \left( I_{\text{DRV}} + I_{B,L} - \frac{I_{\text{Mod},L}}{2} \right) R_{L,o} = 2V_{\text{sat}} \quad (6)$$

$$\text{VDH} - \left( I_{B,L} + \frac{I_{\text{Mod},L}}{2} \right) R_{L,o} = \text{BV}_{DS,M_6} + V_{\text{sat}}. \quad (7)$$

Solving these equations for $I_{\text{DRV}}$ and $\text{VDH}$ yields

$$I_{\text{DRV}} = \frac{\text{BV}_{DS,M_6} - V_{\text{sat}}}{R_{L,o}} + I_{\text{Mod},L} \quad (8)$$

$$\text{VDH} = \left( I_{B,L} + \frac{I_{\text{DRV}}}{2} \right) R_{L,o} + \frac{3V_{\text{sat}}}{2} + \frac{\text{BV}_{DS,M_6}}{2}. \quad (9)$$

The value of $R_{L,o}$ is determined based on the output BW and also impedance matching to the VCSEL. In our design, a value of $50\ \Omega$ was selected. For a modulation current of $7\text{ mA}_{\text{pp}}$ and a laser bias current of $4\text{ mA}$, (8) and (9) yield a driving stage current of $21\text{ mA}$ and a $\text{VDH}$ of $1.66\text{ V}$. Therefore, the chosen architecture can accommodate different bias and modulation currents by adjusting $I_{\text{DRV}}$ and $\text{VDH}$ accordingly. With a maximum $I_{\text{DRV}}$ of $25\text{ mA}$ and a $\text{VDH}$ of $2\text{ V}$, the designed output stage can provide up to $12\text{ mA}_{\text{pp}}$ and $8\text{ mA}$ for $I_{\text{Mod},L}$ and $I_{B,L}$, respectively. The negative supply required to bias the lasers can also be directly applied from the chip. On-chip decoupling capacitors to ground are placed very close to the output pads. Post-layout simulations of the output voltage and the resulting optical eye diagrams at $45\text{ Gb/s}$ with and without pre-emphasis are presented in Fig. 7. The optical eye-opening was improved by 20% after applying the EQ, which is in agreement with the system-level simulations.
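The design procedure of (6) to (9) can be sketched as a small calculation. The paper does not list $V_{sat}$ or $BV_{DS,M_6}$; the values of 0.29 V and 0.99 V used below are assumptions back-solved so that the sketch lands near the quoted 21 mA and 1.66 V, and should not be read as process data.

```python
def output_stage_design(i_mod_a, i_bias_a, r_load_ohm, v_sat, bv_ds):
    """Return (I_DRV, VDH) from Eqs. (8) and (9)."""
    i_drv = (bv_ds - v_sat) / r_load_ohm + i_mod_a                           # Eq. (8)
    vdh = (i_bias_a + i_drv / 2.0) * r_load_ohm + 1.5 * v_sat + bv_ds / 2.0  # Eq. (9)
    return i_drv, vdh


def drain_voltage_limits(vdh, i_drv, i_mod_a, i_bias_a, r_load_ohm):
    """Left-hand sides of Eqs. (6) and (7): lowest and highest M6 drain voltage."""
    v_min = vdh - (i_drv + i_bias_a - i_mod_a / 2.0) * r_load_ohm   # Eq. (6)
    v_max = vdh - (i_bias_a + i_mod_a / 2.0) * r_load_ohm           # Eq. (7)
    return v_min, v_max


# Operating point from the text: 7 mA_pp modulation, 4 mA bias, 50-ohm load.
I_MOD, I_BIAS, R_LOAD = 7e-3, 4e-3, 50.0
V_SAT, BV_DS = 0.29, 0.99   # assumed values, not stated in the paper

i_drv, vdh = output_stage_design(I_MOD, I_BIAS, R_LOAD, V_SAT, BV_DS)
v_min, v_max = drain_voltage_limits(vdh, i_drv, I_MOD, I_BIAS, R_LOAD)
print(f"I_DRV = {i_drv * 1e3:.1f} mA, VDH = {vdh:.3f} V")
print(f"M6 drain voltage range: {v_min:.2f} V to {v_max:.2f} V")
```

The second function substitutes the result back into (6) and (7), confirming that the drain of $M_6$ swings between $2V_{sat}$ and $BV_{DS,M_6} + V_{sat}$ for the chosen supply.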

As the design is targeted for burst mode operation, the laser should not be completely turned off during the standby mode; otherwise, it shows a turn-on time on the order of hundreds of nanoseconds. This is done via the thick-gate pMOS transistor $M_7$ and a parallel $2\text{-k}\Omega$ resistor, which keep the laser biased close to its threshold current and avoid a lengthy turn-on time.

Fig. 8. Schematic of the bias circuit used to enable burst mode operation.

### B. Bias Circuit

To enable short recovery of the circuits after a standby time, all supply voltages are kept intact. This prevents the large on-chip decoupling capacitors from being discharged during disable mode; recharging them to their nominal values would otherwise considerably delay the point at which the circuit can operate. However, to reduce the power, the bias currents are substantially lowered. This is done in the bias network, as shown in Fig. 8. When the enable (EN) signal is triggered, the current of the main mirror, $I_{\text{ref}}$, switches from $I_{\text{mir}} + I_{\text{SB}}$ to $I_{\text{SB}}$, which is less than 10% of its nominal value. This forces all bias currents to decrease by more than 90%, but no turn-off happens. In post-layout simulations, the turn-on time, including the VCSEL, was less than 5 ns.

## IV. MEASUREMENT RESULTS

This chip was fabricated in a 14-nm bulk FinFET CMOS technology. The die photograph of the four-channel driver is shown in Fig. 9. The chip dimensions are  $2\text{ mm} \times 0.5\text{ mm}$ , but the active area of each channel is only  $350\text{ }\mu\text{m} \times 250\text{ }\mu\text{m}$ . The die was glued to a high-frequency board to which all the dc controlling voltages and data input can be applied. The chip was tested with two laser types. Both the VCSEL arrays, supplied by VI-systems, are emitting at a wavelength of  $850\text{ nm}$ . One of them is an MM laser with a BW of  $18\text{ GHz}$  and the other is an SM laser with  $20\text{-GHz}$  BW. Since there is a reduced number of modes present at the optical output of an SM VCSEL compared with an MM VCSEL, the impairments by chromatic and modal dispersion of MM fiber would be less. This allows a link based on SM laser to reach longer distances [15]. The VCSEL die was glued and bonded to the driver on the same printed circuit board (PCB). Each VCSEL array was coupled to a multi-channel MM fiber via PRIZM LightTurn coupling optics. With the help of an automated device, the fiber was aligned to couple the maximum optical power. The coupling efficiency was approximately 75%.



Fig. 9. Assembled four-channel board with high-frequency inputs, aligned fiber, and packaged driver chip and a VCSEL array. A view of the bonding before fiber alignment is also shown.



Fig. 10. Block diagram of the optical measurement setup.

Fig. 9 shows the assembled board, bonded chip to each of the VCSEL arrays, as well as layout diagram of one channel of the driver chip. To measure the optical signal, a 2-m fiber was connected to a commercially available photoreceiver module with 22-GHz BW and a linear gain of  $-70$  V/W. The input signal was a pseudo-random bit sequence (PRBS) with a length of  $2^7 - 1$  provided by the SHF 12100B generator. This pattern length is selected to provide a direct comparison with the state-of-the-art designs. Because of using non-phase-matched cables with a length of 0.9 m from the pattern generator to the board, differential signals at frequencies higher than 35 GHz showed considerable skew and attenuation affecting the measurement. Therefore, a D-flip-flop (D-FF) was utilized close to the test PCB to provide a better input signal. In contrast to on-wafer measurements via probing, there were reflections caused by the input transmission lines, on-board ac coupling, and also bond wires. These factors degraded the input signal quality. For the designed PCB, the measured jitter at the input of the chip was more than 50% higher than that of the signal generator. Fig. 10 shows the measurement setup.

Fig. 11 shows the measured optical eye diagrams for both the VCSELs and at different data rates. During measurements, all channels and VCSELs of the array were in ON state and consumed the nominal power. However, the signal was applied to one channel at each measurement, other than crosstalk measurements. The outputs at 40 Gb/s without applying the EQ are presented in Fig. 11(a) and (f). The high-swing driver showed error-free ( $\text{BER} < 10^{-12}$ ) operation, as suggested by simulations. The difference between these eye diagrams lies in the output power and group delay response of the two VCSELs. While the MM VCSEL has lower BW, it can reach higher power levels and, therefore, was affected less by the noise. In addition, it has different group-delay response, and the driver is better matched to the MM laser. Utilizing only one VCSEL and driver, the error-free transmission was possible up to 42 Gb/s without employing the EQ [16]. In the case of the multi-channel driver, due to the lowering of VCSEL efficiency at higher temperatures, this was not possible. At any rate higher than 40 Gb/s, utilizing the EQ was required for error-free operation. Increasing the bit rate to 45 Gb/s caused considerable reduction in the eye-opening and the transmission was no longer error-free. The received eye-opening was lowered from 22 to 9 mV and from 34 to 7 mV for the 20 and 18 GHz VCSELs, respectively. By applying the pre-emphasis, the eye was opened to 23 and 20 mV, respectively, enabling the data transmission with  $\text{BER} < 10^{-12}$ . While this behavior is in agreement with the presented simulation results, the amount of eye closure at higher frequencies and the EQ impact seemed to be underestimated. The reason is that the effect of the limited BW<sub>PD</sub> at the receiving side was not included in the simulations. At data rates close to double of the BW<sub>PD</sub>, its impact cannot be excluded from the measurement results. 
To achieve the highest eye-opening for the 20-GHz VCSEL, the OMA dropped to 0 dBm; however, it was possible to increase the swing of the driver and apply weaker pre-emphasis to optimize the output for a high OMA of 1.4 dBm, as shown in Fig. 11(d). The eye-opening dropped to 17 mV, but the measured recovered pattern was error free ($\text{BER} < 10^{-12}$). In this configuration, the chip drew 200 and 54 mA from the 1- and 1.7-V supplies, respectively. The VCSEL array was biased from a $-2$-V supply and required 17.3 mA. Therefore, the total power is 326.4 mW (81.6 mW per channel), yielding a PE of 1.81 pJ/bit. At 46 Gb/s, the eye was still open, though not error free, as shown in Fig. 11(e). For the 18-GHz VCSEL case, the chip drew 248 and 60 mA from the 1- and 1.65-V supplies, respectively. The negative supply of the VCSEL array was tied to a $-1.53$-V supply and drew 22 mA, leading to a total power of 380.7 mW (95.2 mW per channel) and a PE of 2.11 pJ/bit. For this VCSEL, the EQ could keep an eye-opening of 13 mV at 47 Gb/s, while the received signal was not error free due to the sensitivity level at the input of the BER tester [Fig. 11(i)]. The driver was also tested with a longer PRBS pattern ($2^{31} - 1$), and an eye-opening reduction of approximately 1 dB was observed. This is in agreement with what was previously reported in [17]. A side-by-side comparison of the eye diagrams at 46 Gb/s is presented in Fig. 12.
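The power and PE figures quoted above follow from the supply voltages and currents by simple arithmetic, reproduced in the sketch below. The helper names are illustrative, and the VCSEL rail is entered by its magnitude.

```python
def total_power_mw(rails):
    """Sum of |V| * I over all supply rails; volts times milliamps gives milliwatts."""
    return sum(abs(volts) * milliamps for volts, milliamps in rails)


def energy_per_bit_pj(power_mw, bit_rate_gbps):
    """mW divided by Gb/s is pJ/bit."""
    return power_mw / bit_rate_gbps


CHANNELS = 4
BIT_RATE_GBPS = 45.0

# (supply voltage in V, current in mA) as reported in the text.
SUPPLIES = {
    "20-GHz VCSEL": [(1.0, 200.0), (1.7, 54.0), (2.0, 17.3)],
    "18-GHz VCSEL": [(1.0, 248.0), (1.65, 60.0), (1.53, 22.0)],
}

for name, rails in SUPPLIES.items():
    total = total_power_mw(rails)
    per_channel = total / CHANNELS
    pe = energy_per_bit_pj(per_channel, BIT_RATE_GBPS)
    print(f"{name}: {total:.1f} mW total, {per_channel:.1f} mW/channel, {pe:.2f} pJ/bit")
```

The PE is computed per channel, since each of the four channels carries 45 Gb/s.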

As the bias current of each block can be independently adjusted, the total power dissipation could be reduced at lower data rates to achieve enhanced energy per bit values. The PE for both the VCSEL types can be kept below 2.1 pJ/bit at lower rates down to 15 Gb/s. The results are presented in Fig. 13. The bathtub curve was measured at the highest operation speed as well as low power (LP) settings, as shown in Fig. 14. For both the tested VCSEL types, the eye-opening is higher than 0.15 unit interval (UI) at 45 Gb/s, which is equal to 3.3 ps. For LP operation, the eye-opening is not worsened, and in the 18-GHz VCSEL case, due to lower jitter of the laser, there is even an improvement. This suggests that the constraints on the receiving side can also be relaxed.

Fig. 11. Measured optical eye diagrams. 20 GHz VCSEL at (a) 40 Gb/s with no EQ, (b) 45 Gb/s with no EQ, (c) 45 Gb/s with EQ optimized for the largest eye-opening, (d) 45 Gb/s with EQ optimized for high OMA, and (e) 46 Gb/s with EQ. 18 GHz VCSEL at (f) 40 Gb/s with no EQ, (g) 45 Gb/s with no EQ, (h) 45 Gb/s with EQ, and (i) 47 Gb/s with EQ.

Fig. 12. Measured optical eye diagrams of the SM VCSEL for two different PRBS lengths. (a) 46 Gb/s with PRBS7. (b) 46 Gb/s with PRBS31.

Fig. 13. Measured PE of both VCSELs at different data rates.
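The conversion between unit intervals and absolute time used for the bathtub result is a one-line calculation; the helper below is only an illustration of that arithmetic.

```python
def ui_to_ps(fraction_of_ui: float, bit_rate_gbps: float) -> float:
    """Convert a fraction of the unit interval to picoseconds at a given bit rate."""
    bit_period_ps = 1000.0 / bit_rate_gbps   # 1 / (Gb/s) expressed in ps
    return fraction_of_ui * bit_period_ps


print(f"UI at 45 Gb/s: {ui_to_ps(1.0, 45.0):.1f} ps")
print(f"0.15 UI at 45 Gb/s: {ui_to_ps(0.15, 45.0):.1f} ps")
```

The same 0.15 UI corresponds to a wider absolute window at the lower LP rates, because the bit period grows as the rate drops.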

Fig. 14. Measured bathtub test at 45, 30, and 15 Gb/s for (a) 20-GHz VCSEL and (b) 18-GHz VCSEL. For the latter two measurements, LP settings were used, as indicated in Fig. 13. UI stands for unit interval, which is equivalent to the bit period.

To measure the turn-on time after applying the EN signal, a real-time scope was required. Therefore, a low-speed 200-MHz “1010” pattern was provided at the input. One of the channels was turned on and off, and the time from applying the enable signal until a fully recovered signal appeared at the output was measured. As shown in Fig. 15(a), this time was less than 10 ns, while it was possible to save 86% of the maximum power dissipation when no data are to be sent. This is a threefold improvement compared with our previous design in [16]. The amount of power saving for a given duty cycle of the EN signal is presented in Fig. 15(b). With such a fast reaction time, an adjustable output stage operating at over 40 Gb/s, and high power saving during standby, the presented design is suitable for burst-mode operation.
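The dependence of power saving on the EN duty cycle can be approximated with a time-weighted average of the active and standby power. This is a simplified model, not the measured curve of Fig. 15(b): it assumes the standby power is 14% of the active power (the complement of the 86% saving) and neglects the energy spent during the sub-10-ns turn-on transitions.

```python
def average_power_fraction(duty_cycle: float, standby_fraction: float = 0.14) -> float:
    """Average power relative to always-on operation for a given EN duty cycle."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return duty_cycle + (1.0 - duty_cycle) * standby_fraction


def power_saving_percent(duty_cycle: float, standby_fraction: float = 0.14) -> float:
    """Power saved compared with continuous full-power operation, in percent."""
    return 100.0 * (1.0 - average_power_fraction(duty_cycle, standby_fraction))


for duty in (0.0, 0.1, 0.5, 1.0):
    print(f"duty cycle {duty:4.0%}: saving {power_saving_percent(duty):5.1f}%")
```

Under these assumptions, the 10% link utilization cited in the introduction would correspond to a saving of roughly 77% relative to an always-on driver.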

It is of high importance to measure the crosstalk between adjacent channels in a multi-channel system. At first, the BER at the highest bit rate was measured when the neighboring channel was also transmitting an uncorrelated pattern, but virtually no effect on the bathtub test was observed. Therefore, two sinusoidal signals at 19 and 21 GHz were applied at the input of the two transmitters. This is equivalent to “1010” pattern at 38 and 42 Gb/s. The output was then measured in the frequency domain using a spectrum analyzer. The worst case crosstalk is lower than  $-22$  dB, as presented in Fig. 16.




Fig. 15. (a) Power-on time measurement with low speed “0101” sequence. (b) Reduction of the power dissipation during burst-mode operation.



Fig. 16. Crosstalk measurement of two adjacent channels.

## V. CONCLUSION

In this paper, the design of a burst-mode capable, 45-Gb/s VCSEL driver in a 14-nm bulk CMOS technology was presented. By utilizing enhanced biasing circuitry, the turn-on time of the complete transmitter was kept below 10 ns. The four-channel chip is based on a two-tap FFE architecture with a high-swing output stage. As a key difference from other CMOS designs, a high-swing modulation current was provided while the driving transistor at the output stage was kept in the saturation region to avoid slow response. This was done by introducing an extra supply voltage. In addition, the output stage can be adjusted to accommodate a wide range of bias and modulation currents, enabling it to drive various common-cathode lasers. With the chip bonded on a PCB to two different VCSELs with BWs of 18 and 20 GHz, error-free (BER < 10<sup>-12</sup>) transmission with a PE of 2.11 pJ/bit or better was measured. When the chip is in standby mode, a power reduction of up to 86% can be achieved. To the best of our knowledge, this is the fastest CMOS design with the highest BWE capable of burst-mode operation.

TABLE II  
COMPARISON WITH THE STATE-OF-THE-ART DESIGNS

| Ref.                               | [6]           | [7]         | [8]                      | [10]          | [16]        | This work               | This work   |
|------------------------------------|---------------|-------------|--------------------------|---------------|-------------|-------------------------|-------------|
| Technology node                    | 130 nm SiGe   | 130 nm SiGe | 28 nm CMOS               | 14 nm CMOS    | 14 nm CMOS  | 14 nm CMOS              | 14 nm CMOS  |
| Bit rate (Gb/s)                    | 71            | 50          | 40<sup>1</sup>           | 32            | 42          | 45                      | 45          |
| PE (pJ/bit)                        | 13.4          | 3.8         | 0.5 (0.33)<sup>2,3</sup> | 3.28          | 1.94        | 1.81 (0.56)<sup>3</sup> | 2.11        |
| OMA (dBm)                          | ~1.8          | 0.7         | 1.3                      | 1.23          | 1.4         | 1.4                     | 1           |
| VCSEL BW (GHz)                     | 26            | 18          | 19                       | 15            | 20          | 20                      | 18          |
| I<sub>mod</sub> (mA<sub>pp</sub>)  | 20            | 8           | 3                        | 5             | 6           | 7                       | 7           |
| Driver type                        | Cathode drive | Anode drive | Anode drive              | Cathode drive | Anode drive | Anode drive             | Anode drive |
| Burst mode                         | No            | No          | No                       | No            | Yes         | Yes                     | Yes         |

<sup>1</sup> No BER is reported.

<sup>2</sup> No limiting amplifier or on-chip signal generator was included.

<sup>3</sup> Values in parentheses refer to the PE at 40 Gb/s when only the same sub-blocks in the driver chips are considered and the VCSEL power is excluded.

A comparison with the state-of-the-art designs is presented in Table II. When the PE is compared against [8], several points have to be considered. First, the presented work offers a higher bit rate at a lower combined BW, as shown in Table I. Second, the VCSEL utilized in this paper requires a higher bias voltage and modulation current to operate. Third, the design in [8] does not feature an LA and, therefore, is sensitive to the shape and the amplitude of the electrical input signal. If only the core functionality of the driver, excluding the VCSEL bias, is considered, the power dissipation of the proposed driver exceeds that of [8] by only 9 mW at 40 Gb/s. It should also be added that the presented work has the highest current driving capability among the CMOS designs.
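The power figures quoted above can be cross-checked from the PE entries of Table II, since power equals PE multiplied by bit rate, and 1 pJ/bit × 1 Gb/s = 1 mW. The sketch below is only a consistency check on the tabulated values; the function and variable names are hypothetical.

```python
def power_mw(pe_pj_per_bit: float, bit_rate_gbps: float) -> float:
    """Power in mW from power efficiency and bit rate.

    1 pJ/bit * 1 Gb/s = 1e-12 J/bit * 1e9 bit/s = 1e-3 W = 1 mW.
    """
    return pe_pj_per_bit * bit_rate_gbps


# Total power of this work at 45 Gb/s (PE values from Table II).
total_20ghz_mw = power_mw(1.81, 45.0)  # with the 20-GHz VCSEL
total_18ghz_mw = power_mw(2.11, 45.0)  # with the 18-GHz VCSEL

# Core-driver comparison at 40 Gb/s (parenthesized PE values, footnote 3).
core_this_work_mw = power_mw(0.56, 40.0)
core_ref8_mw = power_mw(0.33, 40.0)
core_delta_mw = core_this_work_mw - core_ref8_mw

if __name__ == "__main__":
    print(f"total, 20-GHz VCSEL: {total_20ghz_mw:.2f} mW")
    print(f"total, 18-GHz VCSEL: {total_18ghz_mw:.2f} mW")
    print(f"core driver vs. [8] at 40 Gb/s: +{core_delta_mw:.1f} mW")
```

Both totals stay below 100 mW, and the core-driver difference comes out at about 9 mW, in line with the figure stated in the text.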

#### ACKNOWLEDGMENT

The authors would like to thank IBM Research Zurich, Rüschlikon, Switzerland, for the chip fabrication, helpful discussions, and technology support, Argotech, Trutnov, Czech Republic, for board design, manufacturing, and packaging, and VI-Systems, Berlin, Germany, for providing the VCSEL chips. M. Khafaji thanks Abo-Al-Fadhl for his help.

#### REFERENCES

- [1] A. Roy, H. Zeng, J. Bagga, G. Porter, and A. C. Snoeren, "Inside the social network's (datacenter) network," *ACM SIGCOMM Comput. Commun. Rev.*, vol. 45, no. 4, pp. 123–137, 2015. [Online]. Available: <http://doi.acm.org/10.1145/2785956.2787472>
- [2] A. Rylyakov *et al.*, "A 25Gb/s burst-mode receiver for rapidly reconfigurable optical networks," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2015, pp. 1–3.
- [3] J. Chen *et al.*, "An energy efficient 56 Gbps PAM-4 VCSEL transmitter enabled by a 100 Gbps driver in 0.25 μm InP DHBT technology," *J. Lightw. Technol.*, vol. 34, no. 21, pp. 4954–4964, Nov. 1, 2016.
- [4] K. Hara, S. Kimura, H. Nakamura, N. Yoshimoto, and H. Hadama, "New AC-coupled burst-mode optical receiver using transient-phenomena cancellation techniques for 10 Gbit/s-class high-speed TDM-PON systems," *J. Lightw. Technol.*, vol. 28, no. 19, pp. 2775–2782, Oct. 1, 2010.
- [5] T. Morf *et al.*, "VCSEL-based optical links in burst-mode slow optical power ramp-up and how to achieve ultra-short wake-up times," *Electron. Lett.*, vol. 53, no. 19, pp. 1325–1327, Sep. 2017.
- [6] D. M. Kuchta *et al.*, "A 71-Gb/s NRZ modulated 850-nm VCSEL-based optical link," *IEEE Photon. Technol. Lett.*, vol. 27, no. 6, pp. 577–580, Mar. 15, 2015.
- [7] G. Belfiore, M. Khafaji, R. Henker, and F. Ellinger, "A 50 Gb/s 190 mW asymmetric 3-tap FFE VCSEL driver," *IEEE J. Solid-State Circuits*, vol. 52, no. 9, pp. 2422–2429, Sep. 2017.
- [8] A. Sharif-Bakhtiar, M. G. Lee, and A. C. Carusone, "A 40-Gbps 0.5-pJ/bit VCSEL driver in 28 nm CMOS with complex zero equalizer," in *Proc. IEEE Custom Integr. Circuits Conf. (CICC)*, Apr. 2017, pp. 1–4.
- [9] M. Raj, M. Monge, and A. Emami, "A modelling and nonlinear equalization technique for a 20 Gb/s 0.77 pJ/b VCSEL transmitter in 32 nm SOI CMOS," *IEEE J. Solid-State Circuits*, vol. 51, no. 8, pp. 1734–1743, Aug. 2016.
- [10] J. Proesel *et al.*, "A 32 Gb/s, 4.7 pJ/bit optical link with −11.7 dBm sensitivity in 14 nm FinFET CMOS," in *Proc. Symp. VLSI Circuits*, Jun. 2017, pp. C318–C319.
- [11] G. D. Brown, "Bandwidth and rise time calculations for digital multimode fiber-optic data links," *J. Lightw. Technol.*, vol. 10, no. 5, pp. 672–678, May 1992.
- [12] G. Belfiore, M. Khafaji, R. Henker, and F. Ellinger, "A compact electro-optical VCSEL model for high-speed IC design," in *Proc. 12th Conf. Ph.D. Res. Microelectron. Electron. (PRIME)*, Jun. 2016, pp. 1–4.
- [13] B. Razavi, *Design of Integrated Circuits for Optical Communications*, 2nd ed. Hoboken, NJ, USA: Wiley, 2012.
- [14] J. Singh *et al.*, "14 nm FinFET technology for analog and RF applications," in *Proc. Symp. VLSI Technol.*, Jun. 2017, pp. T140–T141.
- [15] R. Puerta *et al.*, "Effective 100 Gb/s IM/DD 850-nm multi- and single-mode VCSEL transmission through OM4 MMF," *J. Lightw. Technol.*, vol. 35, no. 3, pp. 423–429, Feb. 1, 2017.
- [16] M. Khafaji, J. Pliva, R. Henker, and F. Ellinger, "A 42-Gb/s VCSEL driver suitable for burst mode operation in 14-nm bulk CMOS," *IEEE Photon. Technol. Lett.*, vol. 30, no. 1, pp. 23–26, Jan. 1, 2018.
- [17] J. E. Proesel, B. G. Lee, C. W. Baks, and C. L. Schow, "35-Gb/s VCSEL-based optical link using 32-nm SOI CMOS circuits," in *Proc. Opt. Fiber Commun. Conf. Expo. Nat. Fiber Opt. Eng. Conf. (OFC/NFOEC)*, Mar. 2013, pp. 1–3.



**Mohammad Mahdi Khafaji** was born in Tehran, Iran, in 1982. He received the Ph.D. degree (*summa cum laude*) from the Technische Universität Dresden (TUD), Dresden, Germany, in 2015.

From 2008 to 2012, he was with Innovations for High Performance (IHP) Microelectronics, Frankfurt (Oder), Germany, where he was involved in high-speed digital-to-analog converters. He is currently with the Chair for Circuit Design and Network Theory, TUD. His current research interests include high-speed data converters and broadband circuits for optical communication.



**Guido Belfiore** received the B.Sc. and M.Sc. degrees in electrical engineering from the University of L'Aquila, L'Aquila, Italy, in 2009 and 2012, respectively. He is currently pursuing the Ph.D. degree with the Chair for Circuit Design and Network Theory, Technische Universität Dresden, Dresden, Germany. His M.Sc. thesis was on interface circuit design for resistive sensors.

In 2011, he participated in an exchange program at the University of Glasgow, Glasgow, U.K., where he investigated the dry etch processes for high-k dielectric material layers. His current research interests include broadband high-speed analog driver IC design for optical components.



**Jan Pliva** was born in 1988 in Liberec, Czech Republic. He received the Dipl.-Ing. degree in electrical engineering from the Technische Universität Dresden (TUD), Dresden, Germany, in 2014.

He joined the Chair for Circuit Design and Network Theory, TUD, as a Research Associate in 2014. He is focusing on broadband amplifier design for opto-electronic data receivers and mixed-signal circuit design. He was also involved in high-speed successive-approximation-register analog-to-digital converter design with IBM Research Zurich, Rüschlikon, Switzerland, and TUD from 2011 to 2013.



**Ronny Henker** received the Dipl.-Ing. degree in communications engineering from the Hochschule für Telekommunikation Leipzig (HfT Leipzig), Leipzig, Germany, in 2006, and the Ph.D. degree from the Dublin Institute of Technology, Dublin, Ireland, in 2010.

From 2006 to 2010, he was with the RF Institute, HfT Leipzig, where his research focused on the applications of nonlinear effects in optical communications. Before joining the Chair of Circuit Design and Network Theory (CCN), Technische Universität Dresden (TUD), Dresden, Germany, he was with the Fraunhofer Institute for Photonic Microsystems, Dresden, where he was involved in the characterization of MOEMS and micro-mirror arrays. He has been the Group Leader and Project Manager at CCN, TUD, since 2011. He also acts as the Operational Manager of the EU FP7 and H2020 Projects ADDAPT and DIMENSION. His current research interests include adaptive low-power high-speed and broadband integrated circuits for optical communications.



**Frank Ellinger** (S'97–M'01–SM'06) was born in Friedrichshafen, Germany, in 1972. He received the Diploma degree in electrical engineering from the University of Ulm, Ulm, Germany, in 1996, and the M.B.A. and Ph.D. degrees in electrical engineering and the Habilitation degree in high-frequency circuit design from ETH Zürich (ETHZ), Zürich, Switzerland, in 2001 and 2004, respectively.

Since 2006, he has been a Full Professor and Head of the Chair for Circuit Design and Network Theory, Technische Universität Dresden, Dresden, Germany. From 2001 to 2006, he was the Head of the RFIC Design Group, Electronics Laboratory, ETHZ, and the Project Leader of the IBM/ETHZ Competence Center for Advanced Silicon Electronics hosted at IBM Research, Rüschlikon, Switzerland. He has been coordinator of the projects RESOLUTION, MIMAX, DIMENSION, ADDAPT, and FLEXIBILITY funded by the European Union. He coordinates the cluster project FAST with over 90 partners (most of them from industry) and the Priority Program FFlexCom of the German Research Foundation (DFG), Germany. He has been a member of the Management Board of the German Excellence Cluster Cool Silicon, Germany. He has authored or co-authored over 430 refereed scientific papers, and authored the lecture book *Radio Frequency Integrated Circuits and Technologies* (Springer, 2008).

Prof. Ellinger was an elected IEEE Microwave Theory and Techniques Society Distinguished Microwave Lecturer from 2009 to 2011. He has received several awards, including the IEEE Outstanding Young Engineer Award, the Vodafone Innovation Award, the Alcatel-Lucent Science Award, the ETH Medal, the Denzler Award, the Rohde & Schwarz/Agilent/Gerotron EEEfCOM Innovation Award (twice), and the ETHZ Young Ph.D. Award.