

# **EE 437/538B: Integrated Systems**

## **Capstone/Design of Analog Integrated Circuits and Systems**

### **Lecture 7: Optical Rx (Part 2)**

Prof. Sajjad Moazeni

[smoazeni@uw.edu](mailto:smoazeni@uw.edu)

Spring 2022

# A Full Photonic Link



# Transimpedance Amplifier (TIA)



Note: low-parasitic bonding.

Transimpedance gain $Z_T$ is also expressed in units of dBΩ as $20\log_{10}(|Z_T|)$

- Key design objectives
  - ✓ High transimpedance gain
  - ✓ Low input resistance for high bandwidth and efficient gain
- For large input currents, the TIA gain can compress and pulse-width distortion/jitter can result

- Also desirable: low power and low noise

# Resistive Front-End

[Razavi]

$$C_D = I_{off}$$

$$R_L = 1k\Omega$$

$$BW \approx 5 \text{ GHz}$$

$$\cancel{25 \text{ GHz}}$$



$$R_T = R_{in} = \underline{\underline{R_L}}$$

$$BW_{3dB} = \omega_p = \frac{1}{R_{in}C_D} = \frac{1}{R_L C_D}$$



$$\overline{V_{n,out}^2} = \int_0^\infty \overline{I_n^2}\,|Z_T|^2 df = \int_0^\infty \frac{4kT}{R_L} \left| \frac{R_L}{1 + j2\pi f R_L C_D} \right|^2 df = \frac{kT}{C_D}$$

$$\overline{I_{n,in}^2} = \frac{\overline{V_{n,out}^2}}{R_L^2} = \frac{kT}{R_L^2 C_D}$$

$$I_{n,in,rms} = \frac{\sqrt{kT/C_D}}{R_L}$$

- Direct trade-offs between transimpedance, bandwidth, and noise performance
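The trade-off can be made concrete with a short numerical sketch of the three expressions above. The 30 fF detector capacitance, the 300 K temperature, and the function name are illustrative assumptions, not values taken from the slide.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def resistive_front_end(r_load, c_det, temp_k=300.0):
    """Return (R_T [ohm], f_3dB [Hz], i_n_in_rms [A]) for a load R_L on a detector C_D."""
    r_t = r_load                                    # R_T = R_in = R_L
    f_3db = 1.0 / (2.0 * math.pi * r_load * c_det)  # omega_p / (2*pi)
    v_n_out_rms = math.sqrt(K_B * temp_k / c_det)   # sqrt(kT/C_D), independent of R_L
    i_n_in_rms = v_n_out_rms / r_load               # referred to the input through R_L
    return r_t, f_3db, i_n_in_rms

C_D = 30e-15  # assumed detector capacitance (illustrative only)
for r_load in (250.0, 500.0, 1e3, 2e3):
    r_t, f_3db, i_n = resistive_front_end(r_load, C_D)
    print(f"R_L={r_load:6.0f} ohm  BW={f_3db / 1e9:6.2f} GHz  i_n,rms={i_n * 1e6:5.2f} uA")
```

Doubling $R_L$ doubles the transimpedance and halves the input-referred rms noise current, but it also halves the bandwidth.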

[Sam Palermo]

# T-Coil/Inductive Input Stage



# Common-Gate TIA

[Razavi]



$$R_{in} = \frac{r_o + R_D}{1 + (g_m + g_{mb})r_o} \approx \frac{1}{g_m}$$

$$R_T = R_D$$

- Input resistance (input bandwidth) and transimpedance are decoupled

[Sam Palermo]

# Common-Gate TIA Frequency Response

[Razavi]



Neglecting transistor  $r_o$  :

$$\frac{v_{out}}{i_{in}} = \frac{R_D}{\left(1 + s \frac{C_{in}}{g_{m1} + g_{mb1}}\right)(1 + sR_DC_{out})}$$

- Often the input pole may dominate due to large photodiode capacitance (100 – 500fF)
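As a rough check of which pole dominates, the two pole frequencies of the transfer function above can be evaluated directly. All device values below are assumptions chosen for illustration (the 300 fF input capacitance sits inside the 100 to 500 fF range quoted on the slide); the function names are hypothetical.

```python
import math

def cg_tia_poles(gm, gmb, c_in, r_d, c_out):
    """Return (f_in, f_out) in Hz for the two real poles, neglecting r_o."""
    f_in = (gm + gmb) / (2.0 * math.pi * c_in)    # input pole: (g_m1 + g_mb1) / C_in
    f_out = 1.0 / (2.0 * math.pi * r_d * c_out)   # output pole: 1 / (R_D * C_out)
    return f_in, f_out

def cg_tia_gain(f, gm, gmb, c_in, r_d, c_out):
    """Magnitude of v_out / i_in in ohms at frequency f (Hz)."""
    s = 2j * math.pi * f
    z = r_d / ((1 + s * c_in / (gm + gmb)) * (1 + s * r_d * c_out))
    return abs(z)

# Assumed example values: g_m = 20 mS, g_mb = 4 mS, C_in = 300 fF, R_D = 500 ohm, C_out = 20 fF
params = dict(gm=20e-3, gmb=4e-3, c_in=300e-15, r_d=500.0, c_out=20e-15)
f_in, f_out = cg_tia_poles(**params)
print(f"input pole  = {f_in / 1e9:.2f} GHz")
print(f"output pole = {f_out / 1e9:.2f} GHz")
print(f"|Z_T| at DC = {cg_tia_gain(0.0, **params):.0f} ohm")
```

With these assumed values the input pole is the lower of the two, which matches the slide's remark about large photodiode capacitance.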

[Sam Palermo]

# Common-Gate TIA Noise

[Razavi]



Neglecting transistor  $r_o$ :

$$\overline{V_{n, out}^2} = \left( \overline{I_{n, M2}^2} + \overline{I_{n, RD}^2} \right) R_D^2 = 4kT \left( \frac{2}{3} g_{m2} + \frac{1}{R_D} \right) R_D^2 \quad \left( \frac{\text{V}^2}{\text{Hz}} \right)$$

$$\overline{I_{n, in}^2} = 4kT \left( \frac{2}{3} g_{m2} + \frac{1}{R_D} \right) \quad \left( \frac{\text{A}^2}{\text{Hz}} \right)$$

Both the bias current source and  $R_D$  contribute to the input noise current

$R_D$  can be increased to reduce noise, but voltage headroom can limit this

- Common-gate TIAs are generally not for low-noise applications
- However, they are relatively simple to design with high stability
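The input-referred noise density above is easy to evaluate for a candidate bias point. This is a minimal sketch; the bias-source transconductance, load resistance, and 300 K temperature are assumed example values.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def cg_input_noise_density(gm2, r_d, temp_k=300.0):
    """Input-referred noise current density in A/sqrt(Hz): sqrt(4kT(2/3*g_m2 + 1/R_D))."""
    psd = 4.0 * K_B * temp_k * ((2.0 / 3.0) * gm2 + 1.0 / r_d)  # A^2/Hz
    return math.sqrt(psd)

# Assumed example: g_m2 = 10 mS for the bias current source, R_D swept
for r_d in (250.0, 500.0, 1e3):
    i_n = cg_input_noise_density(10e-3, r_d)
    print(f"R_D={r_d:5.0f} ohm  i_n = {i_n * 1e12:.2f} pA/sqrt(Hz)")
```

Raising $R_D$ lowers only the resistor term; the bias-source term sets a floor that $R_D$ cannot remove.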

[Sam Palermo]



# Regulated Cascode (RGC) TIA

- Input transistor gm is boosted by common-source amplifier gain, resulting in reduced input resistance
- Requires additional voltage headroom
- Increased input-referred noise from the common-source stage

[Park ESSCIRC 2000]



$$Z_{in}(0) \approx \frac{1}{g_{m1} \underbrace{(1 + g_{mB} R_B)}_{\text{Gain Boosting}}}$$

[Sam Palermo]

# CMOS 20GHz TIA

- An additional common-gate stage in the feedback provides further gm-boosting and even lower input resistance
- Shunt-peaking inductors provide bandwidth extension at zero power cost, but very large area cost

[Kromer JSSC 2004]




$$Z_i \approx \frac{1}{g_{m1} (1 + |A_2 A_3|) + j\omega C_{i,tot}}$$

$$A_2 = g_{m2} R_2 \quad A_3 = -g_{m3} R_3$$


[Sam Palermo]

# Feedback TIA w/ Ideal Amplifier



With Infinite Bandwidth Amplifier:

$$Z_T(s) = -R_T \left( \frac{1}{1 + s/\omega_p} \right)$$

$$R_{in} = \frac{R_F}{A+1}$$

$$R_T = \frac{A}{A+1} R_F$$

$$\omega_p = \frac{1}{R_{in} C_T} = \frac{A+1}{R_F (C_D + C_I)}$$

- Input bandwidth is extended by the factor  $A+1$
- Transimpedance is approximately  $R_F$
- Can make  $R_F$  large without worrying about voltage headroom considerations
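A quick numerical sketch of the ideal-amplifier expressions above. The gain, feedback resistor, and capacitances are assumed example values, and the function name is hypothetical.

```python
import math

def feedback_tia_ideal(a, r_f, c_d, c_i):
    """Return (R_in [ohm], R_T [ohm], f_p [Hz]) for a feedback TIA with infinite-bandwidth gain A."""
    c_t = c_d + c_i                            # total input capacitance C_T
    r_in = r_f / (a + 1.0)                     # R_in = R_F / (A + 1)
    r_t = a / (a + 1.0) * r_f                  # R_T = A / (A + 1) * R_F
    f_p = 1.0 / (2.0 * math.pi * r_in * c_t)   # omega_p / (2*pi)
    return r_in, r_t, f_p

# Assumed example: A = 5, R_F = 1 kohm, C_D = 100 fF, C_I = 50 fF
r_in, r_t, f_p = feedback_tia_ideal(5.0, 1e3, 100e-15, 50e-15)
print(f"R_in = {r_in:.1f} ohm, R_T = {r_t:.1f} ohm, f_p = {f_p / 1e9:.2f} GHz")
```

Compared with terminating the same capacitance in $R_F$ directly, the pole moves up by $A+1$ while the transimpedance stays within a factor $A/(A+1)$ of $R_F$.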

# Feedback TIA w/ Finite Bandwidth Amplifier



**With Finite Bandwidth Amplifier :**

$$A(s) = \frac{A}{1 + \frac{s}{\omega_A}} = \frac{A}{1 + sT_A}$$

$$Z_T(s) = -R_T \left( \frac{1}{1 + s/(\omega_o Q) + s^2/\omega_o^2} \right)$$

$$R_T = \frac{A}{A+1} R_F$$

$$\omega_o = \sqrt{\frac{A+1}{R_F C_T T_A}}$$

$$Q = \frac{\sqrt{(A+1)R_F C_T T_A}}{R_F C_T + T_A}$$

$$R_{in} = \frac{R_F}{A+1}$$

- Finite bandwidth amplifier modifies the transimpedance transfer function to a second-order low-pass function

# Feedback TIA w/ Finite Bandwidth Amplifier

- Non-zero amplifier time constant can actually increase TIA bandwidth!!
- However, can result in peaking in frequency domain and overshoot/ringing in time domain
- Often either a Butterworth ( $Q=1/\sqrt{2}$ ) or Bessel response ( $Q=1/\sqrt{3}$ ) is used
  - Butterworth gives maximally flat frequency response
  - Bessel gives maximally flat group-delay
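The sketch below evaluates $\omega_o$ and $Q$ from the second-order expressions and solves for the amplifier time constant that gives a Butterworth response. Setting $Q = 1/\sqrt{2}$ with $x = T_A/(R_F C_T)$ gives $x^2 - 2Ax + 1 = 0$; the smaller root (a fast amplifier) is used here. The numerical values and function names are assumptions for illustration.

```python
import math

def second_order_params(a, r_f, c_t, t_a):
    """Return (omega_o [rad/s], Q) for the finite-bandwidth feedback TIA."""
    w_o = math.sqrt((a + 1.0) / (r_f * c_t * t_a))
    q = math.sqrt((a + 1.0) * r_f * c_t * t_a) / (r_f * c_t + t_a)
    return w_o, q

def butterworth_t_a(a, r_f, c_t):
    """Smaller root of Q = 1/sqrt(2): T_A = R_F*C_T*(A - sqrt(A^2 - 1)); requires A >= 1."""
    return r_f * c_t * (a - math.sqrt(a * a - 1.0))

# Assumed example: A = 5, R_F = 1 kohm, C_T = 150 fF
A, R_F, C_T = 5.0, 1e3, 150e-15
t_a = butterworth_t_a(A, R_F, C_T)
w_o, q = second_order_params(A, R_F, C_T, t_a)
w_ideal = (A + 1.0) / (R_F * C_T)  # pole of the T_A = 0 case
print(f"T_A = {t_a * 1e12:.2f} ps, Q = {q:.4f}")
print(f"bandwidth ratio vs. T_A = 0: {w_o / w_ideal:.3f}")
```

For large $A$ the exact root approaches $R_F C_T/(2A)$ and the bandwidth ratio approaches $\sqrt{2}$; for a modest gain such as $A = 5$ the improvement is smaller.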



2nd-Order TIA Frequency Response




[Sam Palermo]

# Feedback TIA Transimpedance Limit

If we assume a Butterworth response for maximally flat frequency response:

$$Q = \frac{1}{\sqrt{2}} \quad \Rightarrow \quad \omega_A = \frac{1}{T_A} \approx \frac{2A}{R_F C_T} \quad (A \gg 1)$$

For a Butterworth response:

$$\omega_{3dB} = \omega_o = \sqrt{\frac{(A+1)\omega_A}{R_F C_T}} \approx \frac{\sqrt{(A+1)2A}}{R_F C_T}$$

This is approximately $\sqrt{2}$ times larger than the $T_A = 0$ case of $\frac{A+1}{R_F C_T}$.

Plugging  $R_T = \frac{A}{A+1} R_F$  into above expression yields the maximum possible  $R_T$  for a given bandwidth

$$\sqrt{\frac{(A+1)\,\omega_A}{\left(\frac{A+1}{A}\right) R_T C_T}} \geq \omega_{3dB}$$

$$\boxed{\text{Maximum } R_T \leq \frac{A \omega_A}{C_T \omega_{3dB}^2}}$$

[Mohan JSSC 2000]

- Maximum  $R_T$  proportional to amp gain-bandwidth product
- If amp GBW is limited by technology  $f_T$ , then in order to increase bandwidth,  $R_T$  must decrease quadratically!
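The quadratic penalty can be seen by evaluating the boxed limit for a fixed amplifier gain-bandwidth product. This is a sketch under assumed numbers: a 100 GHz GBW and 150 fF total input capacitance are illustrative, and $A\omega_A$ is taken as $2\pi \cdot \text{GBW}$.

```python
import math

def max_transimpedance(gbw_hz, c_t, bw_hz):
    """Upper bound on R_T [ohm]: A*omega_A / (C_T * omega_3dB^2), with A*omega_A = 2*pi*GBW."""
    return (2.0 * math.pi * gbw_hz) / (c_t * (2.0 * math.pi * bw_hz) ** 2)

GBW = 100e9    # assumed amplifier gain-bandwidth product, Hz
C_T = 150e-15  # assumed total input capacitance, F
for bw in (5e9, 10e9, 20e9, 40e9):
    print(f"BW = {bw / 1e9:4.0f} GHz  ->  R_T,max = {max_transimpedance(GBW, C_T, bw):8.1f} ohm")
```

Each doubling of the target bandwidth cuts the achievable $R_T$ by a factor of four.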

[Sam Palermo]

# Feedback TIA



- As power supply voltages drop, there is not much headroom left for $R_D$ and the amplifier gain degrades

[Sam Palermo]

# CMOS Inverter-Based Feedback TIA



[Li JSSC 2014]



Cherry-Hooper Stage

- CMOS inverter-based TIAs allow for reduced voltage headroom operation
- A cascaded inverter-$g_m$ + TIA stage provides additional voltage gain
- Low-bandwidth feedback loop sets the amplifier output common-mode level


[Sam Palermo]

# Input-Referred Noise Current



- TIA noise is modeled with an input-referred noise current source that reproduces the TIA output noise when passed through an ideal noiseless TIA
- This noise source will depend on the source impedance, which is determined mostly by the photodetector capacitance

[Sam Palermo]

# Input-Referred Noise Current Spectrum



- Input-referred noise current spectrum typically consists of uniform, high-frequency  $f^2$ , & low-frequency  $1/f$  components
- To compare TIAs, we need to see this noise graph out to  $\sim 2X$  the TIA bandwidth
  - Recall the noise bandwidth tables

[Sam Palermo]

# Input-Referred RMS Noise Current

---

- The input-referred rms noise current can be calculated by dividing the rms output noise voltage by the TIA's midband transimpedance value

$$i_{n,TIA}^{rms} = \frac{1}{R_T} \sqrt{\int_0^{>2BW} |Z_T(f)|^2 I_{n,TIA}^2(f) df}$$

- If we integrate the output noise, the upper bound isn't too critical. Often this is infinity for derivations, or 2X the TIA bandwidth in simulation
- This rms current sets the TIA's electrical sensitivity

$$i_{sens}^{pp} = 2Q\, i_{n,TIA}^{rms}$$

- To determine the total optical receiver sensitivity, we need to consider the detector noise and responsivity
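A minimal sketch of the electrical sensitivity calculation. It assumes $Q$ is the usual BER Q-factor for Gaussian noise, $\text{BER} = \tfrac{1}{2}\,\text{erfc}(Q/\sqrt{2})$, which the slide does not state explicitly; the 1 µA rms noise current is an assumed example value.

```python
import math

def ber_from_q(q):
    """BER for Gaussian noise with Q-factor q (assumed model)."""
    return 0.5 * math.erfc(q / math.sqrt(2.0))

def q_from_ber(ber, lo=0.0, hi=40.0, iters=200):
    """Invert ber_from_q by bisection (BER decreases monotonically with Q)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if ber_from_q(mid) > ber:
            lo = mid   # BER still too high: need a larger Q
        else:
            hi = mid
    return 0.5 * (lo + hi)

def electrical_sensitivity_pp(i_n_rms, ber):
    """Peak-to-peak input current needed for the target BER: 2 * Q * i_n_rms."""
    return 2.0 * q_from_ber(ber) * i_n_rms

i_n_rms = 1e-6  # assumed input-referred rms noise current, A
for ber in (1e-9, 1e-12):
    i_pp = electrical_sensitivity_pp(i_n_rms, ber)
    print(f"BER = {ber:.0e}: Q = {q_from_ber(ber):.2f}, i_sens,pp = {i_pp * 1e6:.2f} uA")
```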

[Sam Palermo]

# Averaged Input-Referred Noise Current Density

---

- TIA noise performance can also be quantified by the averaged input-referred noise current density

$$i_{n,TIA}^{avg} = \frac{i_{n,TIA}^{rms}}{\sqrt{BW_{3dB}}}$$

This quantity has units of $\text{pA}/\sqrt{\text{Hz}}$.

Note that this is different from averaging the input-referred noise spectrum, $I_{n,TIA}^2(f)$, over the TIA bandwidth.
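As a worked number, the sketch below applies the definition to the 22 nm FinFET design cited in the state-of-the-art slide, using the 2.7 µA rms input noise from the paper title and the 45.5 GHz $Z_T$ bandwidth from the comparison table. Pairing these two figures is an assumption made here for illustration.

```python
import math

def averaged_noise_density(i_n_rms, bw_3db_hz):
    """Averaged input-referred noise current density in A/sqrt(Hz)."""
    return i_n_rms / math.sqrt(bw_3db_hz)

i_avg = averaged_noise_density(2.7e-6, 45.5e9)
print(f"{i_avg * 1e12:.2f} pA/sqrt(Hz)")
```

The result lands close to the 12.6 pA/$\sqrt{\text{Hz}}$ listed for that design in the comparison table.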

[Sam Palermo]

# FET Feedback TIA Input-Referred Noise Current Spectrum



- The feedback resistor and amplifier front-end noise components determine the input-referred noise current spectrum

$$I_{n,TIA}^2(f) = I_{n,res}^2(f) + I_{n,front}^2(f)$$

- The feedback resistor component is uniform with frequency

$$I_{n,res}^2(f) = \frac{4kT}{R_F}$$

# FET Feedback TIA Input-Referred Noise Current Spectrum



- Gate current-induced shot noise

$$I_{n,G}^2 = 2qI_G$$

This is typically small for CMOS designs

- FET channel noise

$$I_{n,D}^2 = 4kT\Gamma g_m$$

$\Gamma$  is the channel noise factor, typically 0.7 - 3 depending on the process.

[Sam Palermo]

# Input-Referring the FET Channel Noise

To do this, we could calculate  $\frac{i_{n,TIA}}{i_{n,D}} = \frac{\left(\frac{v_{out}}{i_{n,D}}\right)}{Z_T}$

But it is easier (and equivalent) to ground the output and calculate

$$i_{n,D} = g_m v_{n,TIA} = \frac{g_m i_{n,TIA}}{sC_T + \frac{1}{R_F}} = \frac{g_m R_F}{1 + sR_F C_T} i_{n,TIA}$$

where  $C_T = C_D + C_I$ , the summation of the detector and amplifier input capacitance.

$$\left(\frac{i_{n,D}}{i_{n,TIA}}\right)^{-1} = \frac{1 + sR_F C_T}{g_m R_F}$$

Using this high-pass transfer function, the input-referred FET channel noise is

$$\begin{aligned} I_{n,front,D}^2(f) &= \frac{1 + (2\pi f R_F C_T)^2}{(g_m R_F)^2} \cdot 4kT\Gamma g_m \\ &= 4kT\Gamma \left(\frac{1}{g_m R_F^2}\right) + 4kT\Gamma \left(\frac{(2\pi C_T)^2}{g_m}\right) f^2 \end{aligned} \quad \text{Uniform and } f^2 \text{ component!}$$



# Total Input-Referred FET Feedback TIA Noise



$$I_{n,TIA}^2(f) = \underbrace{\frac{4kT}{R_F}}_{\text{Feedback Resistor}} + \underbrace{2qI_G}_{\text{Gate Shot Noise}} + \underbrace{4kT\Gamma\left(\frac{1}{g_m R_F^2}\right) + 4kT\Gamma\left(\frac{(2\pi C_T)^2}{g_m}\right)f^2}_{\text{FET Channel Noise}}$$

- Note that the TIA input-referred noise current spectrum begins to rise at a frequency lower than the TIA bandwidth
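To see where the spectrum starts to rise, one can compute the corner frequency at which the $f^2$ term equals the sum of the uniform terms and compare it with the TIA bandwidth. The sketch below does this under assumed values ($R_F$, $g_m$, $C_T$, $\Gamma = 1$, negligible gate current, and $A = 5$ with an ideal amplifier for the bandwidth estimate); the function names are hypothetical.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # electron charge, C

def tia_noise_psd(f, r_f, g_m, c_t, gamma=1.0, i_g=0.0, temp_k=300.0):
    """Total input-referred noise current PSD in A^2/Hz at frequency f (Hz)."""
    four_kt = 4.0 * K_B * temp_k
    resistor = four_kt / r_f                          # feedback resistor
    gate_shot = 2.0 * Q_E * i_g                       # gate shot noise
    channel_white = four_kt * gamma / (g_m * r_f ** 2)
    channel_f2 = four_kt * gamma * (2.0 * math.pi * c_t) ** 2 / g_m * f ** 2
    return resistor + gate_shot + channel_white + channel_f2

def noise_corner_frequency(r_f, g_m, c_t, gamma=1.0, i_g=0.0, temp_k=300.0):
    """Frequency (Hz) where the f^2 term equals the sum of the uniform terms."""
    white = tia_noise_psd(0.0, r_f, g_m, c_t, gamma, i_g, temp_k)
    f2_coeff = 4.0 * K_B * temp_k * gamma * (2.0 * math.pi * c_t) ** 2 / g_m
    return math.sqrt(white / f2_coeff)

# Assumed example: R_F = 1 kohm, g_m = 20 mS, C_T = 150 fF, A = 5
R_F, G_M, C_T, A = 1e3, 20e-3, 150e-15, 5.0
f_corner = noise_corner_frequency(R_F, G_M, C_T)
f_bw = (A + 1.0) / (2.0 * math.pi * R_F * C_T)  # ideal-amplifier TIA bandwidth
print(f"noise corner = {f_corner / 1e9:.2f} GHz, TIA bandwidth = {f_bw / 1e9:.2f} GHz")
```

With these assumed values the noise corner falls below the TIA bandwidth, consistent with the note above.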

[Sam Palermo]

# Differential TIAs

- Differential circuits have superior immunity to power supply/substrate noise
- A differential TIA output allows easy use of common differential main/limiting amplifiers
  - This comes at the cost of higher noise and power
- How to get a differential output with a single-ended photocurrent input?
  - Two common approaches, based on the amount of capacitance applied at the negative input



$2\times$ power, $2\times$ noise

[Sam Palermo]

# Balanced TIA

- A balanced TIA design attempts to match the capacitance of the two differential inputs

$$C_X \approx C_D$$

- This provides the best power supply/substrate noise immunity, as the noise transfer functions are similar
- Due to double the circuitry, the input-referred rms noise current is increased by  $\sqrt{2}$



Assuming a high-bandwidth amplifier and $C_T = C_D + C_I$:

$$Z_T(s) = \frac{v_{OP} - v_{ON}}{i_i} = \frac{\left(\frac{A}{A+1}\right)R_F}{1 + \frac{sC_T R_F}{A+1}}$$

Same transfer function as the single-ended design

[Sam Palermo]

# Pseudo-Differential TIA

- A pseudo-differential TIA design uses a very large capacitor at the negative input, such that it can be approximated as an AC ground ($C_X \rightarrow \infty$)
- While it rejects power supply/substrate noise poorly, it does provide significant filtering of the $R_F'$ noise
- The differential transimpedance is approximately doubled relative to the single-ended case



Assuming a high-bandwidth amplifier and $C_T = C_D + C_I$:

$$Z_T(s) = \frac{v_{OP} - v_{ON}}{i_i} = \frac{\left(\frac{2A}{A+2}\right)R_F}{1 + \frac{sC_T R_F}{\frac{A}{2} + 1}}$$
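The balanced and pseudo-differential transfer functions can be compared numerically. This sketch evaluates the DC transimpedance and pole frequency of each; the gain, $R_F$, and $C_T$ values are assumed examples and the function names are hypothetical.

```python
import math

def balanced_tia(a, r_f, c_t):
    """Return (R_T [ohm], f_p [Hz]) for the balanced differential TIA."""
    return a / (a + 1.0) * r_f, (a + 1.0) / (2.0 * math.pi * r_f * c_t)

def pseudo_diff_tia(a, r_f, c_t):
    """Return (R_T [ohm], f_p [Hz]) for the pseudo-differential TIA (C_X -> infinity)."""
    return 2.0 * a / (a + 2.0) * r_f, (a / 2.0 + 1.0) / (2.0 * math.pi * r_f * c_t)

R_F, C_T = 1e3, 150e-15  # assumed example values
for a in (5.0, 20.0, 100.0):
    rt_b, fp_b = balanced_tia(a, R_F, C_T)
    rt_p, fp_p = pseudo_diff_tia(a, R_F, C_T)
    print(f"A={a:5.0f}: gain ratio = {rt_p / rt_b:.3f}, bandwidth ratio = {fp_p / fp_b:.3f}")
```

The gain ratio approaches 2 only for large $A$, and it comes with a pole at roughly half the frequency of the balanced design.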

# *Single-ended TIA to Differential Output*



Pseudo-Diff.

# Need for a Pre-Amp

- VGA: variable-gain amplifier
- ✓ Offset bias level (TIA → sense amp)
- ✓ Reduces the kick-back "noise"
- Ripple coupling path between clock and signal



# Offset Control

- Due to the single-ended photodetector signal, the differential output signal swings from 0 to  $V_{ppd}$ , which can limit the dynamic range
- Adding offset control circuitry can allow for an output swing of  $\pm V_{ppd}/2$




[Sam Palermo]

# Optical RX Scaling Issues

- 😊 Traditionally, TIA has high  $R_T$  and low  $R_{in}$

$$R_T = R_F \left( \frac{A}{1+A} \right)$$

$$\omega_{3dB} \approx \frac{1+A}{R_F C_{IN}}$$

- 😢 Headroom/Gain issues in 1V CMOS

- $A \approx 2$ to $3$ (i.e., $A < 10$)

- 😢 Power/Area Costs

$$\text{TIA } I_D \propto (R_T C_{IN})^2 f_{3dB}^4$$

$$\text{LA } I_D \propto f_{3dB}^2$$



$$V_A = V_{GS1} + V_{GS2} \approx 0.8 \cdot V_{DD}$$

$$A \approx g_m R_D = \frac{\alpha(V_{DD} - V_A)}{V_{OD}} \approx \frac{\alpha(0.2 \cdot V_{DD})}{V_{OD}}$$

[Sam Palermo]

# State-of-the-art TIAs

PAM-4

Ref: "A 128 Gb/s, 11.2 mW Single-Ended PAM4 Linear TIA With  $2.7 \mu\text{Arms}$  Input Noise in 22 nm FinFET CMOS", Intel JSSC 2022

- Multi-stage, inverter-based
- Series vs. shunt inductive peaking



Fig. 9. Schematic of the single-ended four-stage TIA with series and shunt inductive peaking for bandwidth enhancement.

TABLE II  
COMPARISON TO STATE-OF-THE-ART LINEAR TIAs


| Ref.                                               | This Work         | [5]<br>BCTM<br>2015 | [9]<br>TCASI<br>2018 | [26]<br>SOCC<br>2019 | [6]<br>CSICS<br>2016 | [8]<br>JSSC<br>2018 | [46]<br>ESSCIRC<br>2018 | [43]<br>JSSC<br>2019 | [17]<br>JSSC<br>2019 | [47]<br>SSCL<br>2020 |
|----------------------------------------------------|-------------------|---------------------|----------------------|----------------------|----------------------|---------------------|-------------------------|----------------------|----------------------|----------------------|
| Technology                                         | 22 nm FinFET      | 55 nm SiGe BiCMOS   | 130 nm SiGe BiCMOS   | 65 nm CMOS           | 130 nm SiGe BiCMOS   | 130 nm SiGe BiCMOS  | 28 nm CMOS              | 28 nm CMOS           | 16 nm FinFET         | 28 nm CMOS           |
| Interface                                          | SE                | SE                  | SE                   | SE                   | Diff.                | Diff.               | S2D                     | S2D                  | S2D                  | Diff.                |
| Application                                        | IM <sup>†</sup>   | IM                  | IM                   | IM                   | CD <sup>††</sup>     | CD                  | IM                      | IM                   | IM                   | CD                   |
| Supply Voltage (V)                                 | 0.8               | 2.3                 | 2.0                  | 1.0                  | 3.3                  | 3.3                 | 0.9                     | 1.2                  | 1.8                  | 2.4                  |
| S <sub>21</sub> (dB)                               | 26                | 13                  | —                    | —                    | —                    | 27                  | 24.5                    | —                    | 33                   | 39**                 |
| BW <sub>3dB-S<sub>21</sub></sub> (GHz)             | 48.7              | 92                  | —                    | —                    | —                    | 50                  | 62                      | —                    | 17                   | 10**                 |
| PD Cap. (fF)                                       | 70                | —                   | 100                  | 60                   | 60                   | —                   | 70                      | 80                   | 10                   | —                    |
| Z <sub>T</sub> (dBΩ)                               | 59.3              | —                   | 41                   | 49                   | 77                   | 65                  | 65                      | 74                   | 78                   | 78                   |
| BW <sub>3dB-Z<sub>T</sub></sub> (GHz)              | 45.5              | —                   | 50                   | 29.9                 | 34                   | 66                  | 60                      | 27                   | 27                   | 42                   |
| Data Rate (Gb/s)                                   | 128               | 120                 | 50                   | 45                   | 45                   | 100                 | 112                     | 54                   | 106.25               | —                    |
| TDECQ                                              | 1.80              | —                   | —                    | —                    | —                    | —                   | —                       | —                    | 1.78                 | —                    |
| Input Ref. Noise Density (pA/ $\sqrt{\text{Hz}}$ ) | 12.6              | —                   | 39.8                 | 30                   | 20                   | 7.6                 | 19.3                    | —                    | 16.7                 | 18                   |
| P <sub>dc</sub> (mW)                               | 11.2              | 48                  | 24                   | 16.4                 | 285                  | 150                 | 107                     | 34.6                 | 60.8                 | 319                  |
| FoM* ( $\Omega \cdot \text{GHz}/\text{mW}$ )       | 3797 <sup>§</sup> | 428 <sup>‡</sup>    | 234                  | 513                  | 845                  | 782                 | 997                     | 145                  | 3527.4               | 1046                 |

<sup>†</sup> Intensity Modulation

<sup>††</sup> Coherent Detection

<sup>\*\*</sup> Estimated from the plots

$$* \text{FoM} = \frac{Z_T[\Omega] \times \text{BW}_{3dB-Z_T}[\text{GHz}]}{P_{dc}[\text{mW}]}$$

<sup>§</sup> Does not include the impact of LDO

<sup>‡</sup> Z<sub>T</sub> is estimated from  $50 \times S_{21}$  and BW<sub>3dB-S<sub>21</sub></sub> is used for BW<sub>3dB-Z<sub>T</sub></sub>.
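The FoM column can be reproduced from the table entries by converting $Z_T$ from dBΩ back to ohms. The sketch below does this for the [17] and [9] columns; the function names are hypothetical.

```python
def dbohm_to_ohm(z_t_dbohm):
    """Convert transimpedance from dB-ohm to ohm: Z_T = 10^(dBohm / 20)."""
    return 10.0 ** (z_t_dbohm / 20.0)

def tia_fom(z_t_dbohm, bw_ghz, p_dc_mw):
    """FoM = Z_T[ohm] * BW_3dB-ZT[GHz] / P_dc[mW]."""
    return dbohm_to_ohm(z_t_dbohm) * bw_ghz / p_dc_mw

# (Z_T in dBohm, BW_3dB-ZT in GHz, P_dc in mW) copied from the table
rows = {
    "[17] JSSC 2019": (78.0, 27.0, 60.8),
    "[9] TCASI 2018": (41.0, 50.0, 24.0),
}
for name, (z_t, bw, p_dc) in rows.items():
    print(f"{name}: FoM = {tia_fom(z_t, bw, p_dc):.1f} ohm*GHz/mW")
```

Because the table lists $Z_T$ rounded in dBΩ, recomputing other columns this way may differ slightly from the listed FoM.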

# Integrating Receiver Block Diagram



[Emami VLSI 2002]

# Demultiplexing Receiver



- Demultiplexing with multiple clock phases allows higher data rate
  - Data Rate = #Clock Phases x Clock Frequency
  - Gives sense-amp time to resolve data
  - Allows continuous data resolution

# 1V Modified Integrating Receiver



## Differential Buffer

- 😊 Fixes sense-amp common-mode input for improved speed and offset performance
- 😊 Reduces kickback charge
- 😢 Cost of extra power and noise

Input Range = 0.6 – 1.1V

# Receiver Sensitivity Analysis



$$\text{Max } \Delta V_{in}(\Delta I_{AVG}) = 0.6\text{mV}$$

$$\sigma_{samp} = \sqrt{\frac{2kT}{C_{samp}}} = 0.92\text{mV} \quad \sigma_{buffer} = 1.03\text{mV} \quad \sigma_{SA} = 0.45\text{mV}$$

$$\text{Clock Jitter Noise } \sigma_{clk} = \left( \frac{\sigma_j}{T_b} \right) \Delta V_b \approx 0.65\text{mV at 16Gb/s}$$

$$\text{Total Input Noise } \sigma_{tot} = \sqrt{\sigma_{samp}^2 + \sigma_{buffer}^2 + \sigma_{SA}^2 + \sigma_{clk}^2} = 1.59\text{mV}$$

$$\Delta V_b \text{ for BER of } 10^{-10} = 6.4\sigma_{tot} + \text{Offset} = 11.9\text{mV}$$

$$P_{avg} = \frac{\Delta V_b (C_{pd} + C_{in})}{\rho T_b}$$

| Gb/s | $P_{avg}$ (dBm) |
|------|-----------------|
| 10   | -9.8            |
| 16   | -7.8            |
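The noise budget above can be checked with a few lines of arithmetic. The sketch sums the four noise terms in quadrature, back-calculates the offset implied by the 11.9 mV figure (the slide does not state the offset, so this value is inferred), and estimates how $P_{avg}$ scales with data rate assuming $\Delta V_b$, capacitance, and responsivity stay fixed.

```python
import math

def total_noise_mv(sigmas_mv):
    """Root-sum-square of independent noise sources, in mV."""
    return math.sqrt(sum(s ** 2 for s in sigmas_mv))

def implied_offset_mv(delta_vb_mv, sigma_tot_mv, q_factor=6.4):
    """Offset left over after subtracting q_factor * sigma_tot from delta_Vb."""
    return delta_vb_mv - q_factor * sigma_tot_mv

def power_change_db(rate_new_gbps, rate_old_gbps):
    """P_avg is proportional to 1/T_b, so the change in dB is 10*log10 of the rate ratio."""
    return 10.0 * math.log10(rate_new_gbps / rate_old_gbps)

sigma_tot = total_noise_mv([0.92, 1.03, 0.45, 0.65])  # samp, buffer, SA, clock
print(f"sigma_tot      = {sigma_tot:.2f} mV")
print(f"implied offset = {implied_offset_mv(11.9, sigma_tot):.2f} mV")
print(f"10 -> 16 Gb/s  = {power_change_db(16.0, 10.0):+.2f} dB")
```

The rate scaling of about 2 dB is consistent with the difference between the two table entries.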


[Sam Palermo]

# Integrating RX with Dynamic Threshold



# Low-BW TIA & CTLE Front-End



- Improved sensitivity is possible by increasing the first stage feedback resistor, resulting in a high-gain low-bandwidth TIA
- The resultant ISI is cancelled by a subsequent CTLE

[Sam Palermo]

# Active CTLE Example



# Low-BW TIA & CTLE Front-End



$$\overline{I_{n,in,SF}^2(f)} = \frac{4kT}{R_F} + \frac{4kT\gamma}{g_m R_F^2} + 4kT\gamma \frac{(2\pi C_{tot,in})^2}{g_m} f^2 + \frac{4kT\gamma}{g_{m,post} R_F^2} + \frac{4kT\gamma}{g_{m,post} R_F^2} \left( \frac{f}{BW} \right)^4$$

$$\begin{aligned} \overline{I_{n,in,TSFE}^2(f)} &= \frac{4kT}{R_F n^2} + \frac{4kT\gamma}{g_m R_F^2 n^4} \\ &\quad + 4kT\gamma \frac{(2\pi R_F n^2 C_{tot,in})^2}{g_m R_F^2 n^4} f^2 \\ &\quad + \frac{4kT\gamma}{g_{m,eq} R_F^2 n^4} + \frac{4kT\gamma}{g_{m,eq} R_F^2 n^4} \left( \frac{f}{BW/n} \right)^4 \\ &= \frac{4kT}{R_F n^2} + \frac{4kT\gamma}{g_m R_F^2 n^4} + 4kT\gamma \frac{(2\pi C_{tot,in})^2}{g_m} f^2 \\ &\quad + \frac{4kT\gamma}{g_{m,eq} R_F^2 n^4} + \frac{4kT\gamma}{g_{m,eq} R_F^2} \left( \frac{f}{BW} \right)^4 \quad (5) \end{aligned}$$

[Li JSSC 2014]



- Significant reduction in feedback resistor noise
- Low-frequency input and post amplifier noise is also reduced

# Low-BW TIA & CTLE Front-End

[Li JSSC 2014]



✗ Power cost of the CTLE

25Gb/s Eye Diagram



[Sam Palermo]

# Low-BW TIA & DFE RX

[Ozkaya JSSC 2017]



- In a similar manner, a high-gain low-bandwidth TIA is utilized
- The resultant ISI is cancelled by a subsequent 1-tap loop-unrolled DFE

[Sam Palermo]

# Low-BW TIA & DFE RX

[Ozkaya JSSC 2017]

1<sup>st</sup> order TIA model pulse response (56Gb/s)



- As $R_F$ is increased, the main cursor increases and the SNR improves as ISI is cancelled by a DFE
- Large performance benefit with a low-complexity 1-tap DFE

[Sam Palermo]

# Low-BW TIA & DFE RX

[Ozkaya JSSC 2017]



- Self-referenced TIA is used for differential generation
- Actual 64Gb/s pulse response has a significant pre-cursor ISI tap, which requires a 2-tap TX FFE

**64Gb/s Pulse Response & Timing Margin**




[Sam Palermo]