

# A 370-pJ/b Multichannel BFSK/QPSK Transmitter Using Injection-Locked Fractional-N Synthesizer for Wireless Biotelemetry Devices

Kok-Hin Teng, *Student Member, IEEE*, and Chun-Huat Heng, *Senior Member, IEEE*

**Abstract**—This paper presents a 401–428-MHz BFSK/QPSK transmitter (TX) with two types of fractional-injection-locking techniques for multichannel transmission capabilities. A  $\Delta\Sigma$ -based injection-locked ring oscillator is proposed to achieve fine frequency tuning with a frequency resolution of 1.3 kHz. The proposed method facilitates multichannel BFSK modulation. For high data rate wideband QPSK modulation, frequency tuning is achieved through sequential injection locking. The TX performs 550 Kb/s for BFSK and 11 Mb/s for band-shaped QPSK with EVM of 4.4% and 4.9%, respectively. It also achieves a settling time of less than 0.8  $\mu$ s. This helps to save operating power of the wireless medical devices with duty-cycling protocol employed in the TX. Fabricated in 130-nm process technology, the TX achieves an energy efficiency of 370 pJ/b while delivering –13 dBm of output power with 1-V supply.

**Index Terms**—Delta-sigma, energy efficient, fractional frequency-shift keying (FSK), injection-locked oscillator, low power, multichannel, phase-shift keying, transmitter (TX).

## I. INTRODUCTION

TRANSCEIVERS operating in the Medical Device Radio Communications Service (MedRadio) or Medical Implant Communication Service (MICS) band in the range of 401–406 MHz have drawn a lot of interest due to the low body loss in this band. However, the limited channel spacing of 100–300 kHz constrains the achievable data rate.

For low or moderate data rates, frequency-shift keying (FSK) modulation, with its constant amplitude, improves the power amplifier (PA) efficiency, and a simple noncoherent receiver architecture can be employed. However, for higher data rates, quadrature phase-shift keying (QPSK) or quadrature amplitude modulation (QAM) exhibits better spectral efficiency. Although it requires a coherent receiver architecture, such a receiver is usually implemented off-body and does not face the same size and energy constraints as the on-body transmitter (TX), owing to the asymmetric nature of the uplink and downlink data for biotelemetry.

Open-loop digitally controlled oscillators have been commonly adopted for FSK modulation [1], [2]. The *LC* oscillator employed could incur an area penalty. In addition, due to the open-loop nature, a wide frequency deviation is needed to ensure good SNR, which in turn reduces the spectral efficiency. Open-loop ring oscillators are seldom used due to their poorer phase noise. In [3], an injection-locked oscillator coupled with an edge combiner was proposed, where crystal oscillator (XO) frequency pulling is used to achieve frequency modulation. However, the achievable data rate is limited. In addition, it can only support one channel due to the nature of injection locking. Recently, [4] and [5] adopted a dual-injection technique to incorporate frequency tuning capability into the injection locking architecture. However, they require external inductors, and the dual injection complicates the design.

Manuscript received August 24, 2016; revised November 10, 2016 and December 15, 2016; accepted January 3, 2017. Date of publication January 31, 2017; date of current version March 3, 2017. This paper was approved by Associate Editor Wooguen Rhee. This work was supported by the National Research Foundation of Singapore under Grant NRF-CRP8-2011-01.

The authors are with the Department of Electrical and Computer Engineering, National University of Singapore, Singapore 117583.

Color versions of one or more of the figures in this paper are available online at <http://ieeexplore.ieee.org>.

Digital Object Identifier 10.1109/JSSC.2017.2650407

In this paper, we propose a reconfigurable multichannel TX that supports both binary frequency-shift keying (BFSK) and QPSK modulation without the issues mentioned above. The BFSK simplifies the off-body receiver architecture, while the QPSK improves the spectral efficiency and attains a higher data rate. An injection-locked ring oscillator (ILRO) is employed here to achieve the area efficiency and good phase noise needed for the FSK and QPSK modulation. Multichannel support is achieved by the proposed $\Delta\Sigma$-based injection locking architecture for BFSK and the sequential injection technique [6] for QPSK. The $\Delta\Sigma$-based injection locking allows fine frequency tuning of 1.3 kHz, which enables accurate control of the frequency deviation for FSK as well as the channel spacing. For the QPSK, due to its relatively wider bandwidth, sequential injection locking (SIL) with its coarser frequency tuning is adopted.

With injection locking techniques, the TX can achieve a settling time of less than 0.8 $\mu$s. The fast settling time allows the use of a low duty-cycling ratio to achieve more power saving. This is critical in many on-body sensor applications where the bulk of the power is consumed by the TX, and a fast turn-on TX minimizes the power lost during TX on/off transitions. This paper is organized as follows. Section II describes the concept of $\Delta\Sigma$-based injection locking with the phase interpolation (PI) technique. Section III presents the proposed system architecture, followed by the critical building blocks in Section IV. Noise analysis of the proposed TX is explained in Section V, and the measurement results are shown in Section VI. Finally, Section VII concludes our findings.

## II. CONCEPT OF $\Delta\Sigma$ -BASED INJECTION LOCKING WITH PHASE INTERPOLATION

### A. $\Delta\Sigma$ -Based Injection Locking

Fig. 1. Fixed, sequential, and proposed $\Delta\Sigma$-based injection locking mechanism.

The proposed injection locking technique is best illustrated in Fig. 1 using a four-stage pseudodifferential ring oscillator (RO). For subharmonic fixed injection locking, the pulse train ($P_{\text{inj}}$) generated from the clean reference clock signal (REF) is injected into the same RO delay cell during each reference period ($T_{\text{ref}}$) to realign the RO phase edge to the REF. If the RO's free-running frequency ($F_{\text{RO}}$) is close to the $N^{\text{th}}$ harmonic of the REF, its output frequency will be locked to

$$F_{\text{FIL}} = N \times F_{\text{ref}} \quad (1)$$

where  $F_{\text{ref}}$  is the frequency of REF.

In [6], an SIL technique is proposed where the  $P_{\text{inj}}$  is sequentially injected into the neighboring RO delay cells. For RO with  $M$  number of phase edges, this generates fractional frequency of

$$F_{\text{SIL}} = \left( N + \frac{k}{M} \right) \times F_{\text{ref}} \quad (2)$$

where $k = 1, 2, \dots, (M - 1)$ is the number of neighboring delay cells between two sequential injections. Because $k$ and $M$ are finite, the achievable frequency resolution and tuning range are limited. For instance, as shown in Fig. 1(b), with eight phase edges and a minimum $k$ value of 1, $P_{\text{inj}}$ can only be shifted by one delay cell for injection in every reference clock period. This limits the frequency resolution to $1/8 \times F_{\text{ref}}$. In addition, due to the mismatch of the delay stages and the periodicity of SIL, fixed spurs could appear in the output spectrum.

Fig. 2. Dynamic element matching analogy for the synthesized period.

For our proposed $\Delta\Sigma$-based injection locking technique, a 2-b $\Delta\Sigma$ modulator together with an injection pulse position accumulator is used to determine the delay cell stage that will be injected by $P_{\text{inj}}$. As an example, for an RO with eight phase edges, a previous accumulator value of 4 will choose E4 as the previous delay edge for injection. When the current $\Delta\Sigma$ modulator output of 3 is added to the previous accumulator value of 4, the updated accumulator value becomes 7, and E7 is chosen as the next delay stage for injection. This gives rise to a clock period of $(N + 3/8) \times T_{\text{RO}}$, where $T_{\text{RO}}$ is the period of the RO. A subsequent $\Delta\Sigma$ modulator output of 2 added to the current accumulator value of 7 results in an overflow ($7 + 2 = 9 > 8$) and thus chooses E1 (9 modulo 8 = 1) as the next delay stage for injection, which gives rise to a clock period of $(N + 2/8) \times T_{\text{RO}}$. Therefore, the $\Delta\Sigma$ modulator will randomly determine the clock period from $[N \times T_{\text{RO}}]$ to $[(N + 3/8) \times T_{\text{RO}}]$. On average, this will generate an output frequency of

$$F_{\Delta\Sigma-IL} = \left[ \left( N + \frac{k}{M} \right) + \frac{K}{M \times 2^j} \right] \times F_{\text{ref}} \quad (3)$$

where the  $k$  and  $M$  in this case are equal to 1 and 8, respectively,  $K$  is the input control word of the  $\Delta\Sigma$  modulator and  $j$  is the bit-width of the input control word.  $M$  of 8 is used in the example shown in Fig. 1(c) because of the fact that eight phase edges are engaged for the proposed  $\Delta\Sigma$  injection. It should be pointed out that the use of  $\Delta\Sigma$  injection is to cater for the generation of fractional output frequency. For integer mode, fixed injection as shown in Fig. 1(a) would be used.
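The edge-selection example described earlier in this subsection can be traced in a few lines of Python. This is a minimal sketch, assuming the edges are labelled E1 to E8 and that the position accumulator simply wraps modulo $M$; the function names are illustrative, and $N = 9$ is borrowed from the value quoted later for the implementation.

```python
from fractions import Fraction

M = 8  # number of RO phase edges in the Fig. 1(c) example

def next_edge(prev_edge: int, dsm_out: int, m: int = M) -> int:
    """Return the edge index (1..m) injected next, wrapping modulo m."""
    return (prev_edge + dsm_out - 1) % m + 1

def period_multiplier(n: int, dsm_out: int, m: int = M) -> Fraction:
    """Clock period of this cycle in units of T_RO: N + dsm_out/m."""
    return n + Fraction(dsm_out, m)

edge = 4  # previous injection at E4, as in the text
for out in (3, 2):  # two consecutive modulator outputs from the example
    edge = next_edge(edge, out)
    print(f"DSM output {out} -> inject at E{edge}, "
          f"period = {period_multiplier(9, out)} x T_RO")
```

The sketch only reproduces the bookkeeping (E4 to E7 to E1); it does not model the modulator itself.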

With this technique, a fractional frequency with very fine resolution can be accomplished. In addition, the random injection also helps eliminate the periodicity due to the delay stage mismatches, in a manner similar to dynamic element matching [7]. This is because a period of $[N + i/8] \times T_{\text{RO}}$ can be formed in different ways, i.e., by selecting E1 and E$(1 + i)$, $\dots$, or E8 and E$[(8 + i) \bmod 8]$. Hence, a group of time intervals can be chosen to represent a particular period, as shown in Fig. 2, which forms a band of the transfer function. Averaged over time, this approximates a linear transfer characteristic. It should also be noted that the resulting noise shaping is one order lower than the $\Delta\Sigma$ modulator order because the additional accumulator acts as a phase integrator.
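The averaging argument can be checked numerically. The sketch below is an idealization: it assumes every starting edge is used equally often, which the actual $\Delta\Sigma$ sequence only approximates, and the mismatch magnitude (up to 20% of one edge spacing) is a made-up value for illustration.

```python
import random

def mean_interval(edges, i, period=1.0):
    """Average delay from edge s to edge s+i over all M starting edges s."""
    m = len(edges)
    total = 0.0
    for s in range(m):
        e = s + i
        # Edges past the end of the ring belong to the next RO period.
        total += edges[e % m] + period * (e // m) - edges[s]
    return total / m

random.seed(1)
M = 8
# Ideal edge k sits at k/M of the RO period; perturb each edge independently.
edges = [(k + random.uniform(-0.2, 0.2)) / M for k in range(M)]

single = edges[3] - edges[0]        # one particular way to form i = 3
averaged = mean_interval(edges, 3)  # all eight ways, averaged
print(f"single pair: {single:.4f}, averaged: {averaged:.4f}, ideal: {3 / M:.4f}")
```

Because the individual edge errors cancel when summed around the ring, the averaged interval equals $i/8$ of the RO period regardless of the mismatch, while any single edge pair carries the full error.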

### B. Phase Interpolation

Fig. 3. Simplified $\Delta\Sigma$-based injection locking with 4-PI mechanism.

To further reduce the quantization noise without using an excessive number of delay stages within the RO, a PI technique is also introduced, as shown in Fig. 3. As an example, a simplified $\Delta\Sigma$-based injection locking scheme with 4-PI using a two-stage pseudodifferential RO is illustrated in Fig. 3. A 3-b accumulator (Accum2) is employed to accumulate the 2-b output of the $\Delta\Sigma$ modulator. The two least significant bits of the accumulator, S[1:0], take a value from 0 to 3 and are used to select the outputs of the phase interpolator. These values represent REF time delays of 0, $T_{RO}/16$, $2 \times (T_{RO}/16)$, and $3 \times (T_{RO}/16)$, respectively. When the Accum2 output exceeds three, an overflow signal (Ovf) is set high, and four is subtracted from the Accum2 output before the new value from the $\Delta\Sigma$ modulator is added in the next clock cycle. The Ovf is added to the offset value (AF) and sent to the injection pulse position accumulator to shift $P_{inj}$ to the neighboring delay cell. For simplicity, the timing diagram with no AF added is shown in Fig. 3. The modulator output will randomly select among the 4-PI and the four RO phase edges such that, on average, the resulting clock period falls between $[(N+1/16) \times T_{RO}]$ and $[(N+2/16) \times T_{RO}]$ with a fractional time resolution of $T_{RO}/(16 \times 2^j)$.
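The Accum2 behavior described above can be sketched as a small state machine. This is an illustrative model only: the function names and the input sequence are hypothetical, AF is taken as zero as in Fig. 3, and the subtraction of four is applied in the same step as the overflow rather than one clock later.

```python
def accum2_step(sel: int, dsm_out: int, p: int = 4):
    """One reference cycle of the PI accumulator.

    sel     : current interpolator select S[1:0], 0..p-1
    dsm_out : 2-b modulator output, 0..3
    Returns (new_sel, ovf).
    """
    total = sel + dsm_out
    ovf = 1 if total > p - 1 else 0  # overflow when the sum exceeds three
    return total - p * ovf, ovf      # subtract four after an overflow

def run(dsm_sequence, sel=0):
    """Feed a modulator sequence and count the RO edge shifts (Ovf pulses)."""
    shifts = 0
    for out in dsm_sequence:
        sel, ovf = accum2_step(sel, out)
        shifts += ovf
        print(f"dsm={out} -> S={sel}, Ovf={ovf}")
    return sel, shifts

final_sel, edge_shifts = run([1, 2, 1, 3, 2, 1])
```

In this model no delay is lost: four times the number of Ovf pulses plus the final select value equals the sum of the modulator outputs, all in units of $T_{RO}/16$.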

Through the PI, the RO with four phase edges behaves similarly to a $\Delta\Sigma$-based injection-locked RO with eight delay cells. Hence, the PI can result in significant power saving by using 1.6 times fewer delay cells. With the AF added, an additional delay of $(AF \times T_{RO}/4)$ is included in every $T_{ref}$. On average, the $\Delta\Sigma$-based ILRO output frequency with the PI can be defined as

$$F_{\Delta\Sigma-IL\_PI} = \left[ N + \frac{AF}{M} + \frac{1}{M \times p} + \frac{K}{(M \times p) \times 2^j} \right] \times F_{ref} \quad (4)$$

where $p$ is the number of PI phases between two adjacent RO phase edges, and $M$ is the total number of RO phase edges. Both $p$ and $M$ are equal to four in this example. In our actual implementation, $M = 8$ and $p = 4$ are chosen.

## III. SYSTEM ARCHITECTURE

The TX system architecture is shown in Fig. 4. It supports multichannel BFSK and QPSK with the proposed $\Delta\Sigma$-based and SIL techniques. The raw data are first sent to the digital baseband with a media access control (MAC) layer implementation to form a data packet with forward error correction. The duty-cycling protocol can be applied to the TX through the EN\_TX signal to improve its energy efficiency. The data packet then goes through a digital controller where the modulation type and the transmission channel can be chosen. Eight pulse generators (PGs) are directly connected to different delay stages, and only one PG is activated during each injection.

Fig. 4. System block diagram for proposed TX with digital baseband.

Fig. 5. Operating sequence for biomedical sensor transmission.

The injection controller can operate with or without the phase interpolator. Its operation is described in detail in Section IV. Although the phase quantization can be reduced by simply increasing the number of RO delay stages, doing so would incur a significant power penalty. For example, sixteen pseudodifferential delay stages are needed to create 32 phase edges. In comparison, with the proposed phase interpolator, only four additional delay stages (with each delay cell generating a $T_{RO}/32$ delay) are needed between two neighboring RO phase edges to emulate 32 phase edges. This significantly reduces the number of delay stages needed and improves the energy efficiency. A digital PA (DPA) is employed such that both the phase and the amplitude can be controlled to perform BFSK or band-shaped QPSK modulation. To ensure that $F_{RO}$ falls within the injection locking range, an SAR frequency calibration is adopted for both the BFSK and QPSK modulations [8].

### A. Modulations

Fig. 6. Proposed injection-locked synthesizer architecture with four-stage pseudodifferential RO.

Fig. 7. Timing diagram without/with PI and PG with single-to-differential converter.

As the QPSK modulation targets data rates of tens of Mb/s with a bandwidth in the MHz range, only coarse frequency tuning is needed. Hence, SIL [6] is adopted here, similar to [9]. Without the noise shaping introduced by the $\Delta\Sigma$ modulator, its phase noise at higher offset frequencies can be 20 dB better than that of the $\Delta\Sigma$-based injection locking. The QPSK modulation is performed by selecting one of the four phase-shifted ($0^\circ$, $90^\circ$, $180^\circ$, and $270^\circ$) carrier signals to represent the 2-b digital information per symbol. To reduce the adjacent channel power ratio (ACPR), a pulse-shaping root-raised-cosine filter based on a lookup-table ROM (LUTROM) design is adopted [4]. The serial data from the digital controller are first converted to parallel I/Q data before being sent to the LUTROM, which maps them to the predefined amplitude and phase control values. The values are then decoded and sent to the DPA. With a 5-b amplitude control per phase in the DPA, multiple amplitude levels can be achieved for each quadrature phase output. Multichannel selection is achieved through SIL, where the $k$ value in (2) is determined by the frequency channel selection through the position shift block. Here, the $k$ value can be set to 2, 4, or 6, which corresponds to a 407-, 418-, or 429-MHz carrier frequency, respectively. During QPSK operation, the phase interpolator is disabled to achieve a power saving of 0.7 mW.
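The three QPSK carrier frequencies follow directly from (2). The short check below assumes $N = 9$ and $F_{\text{ref}} = 44$ MHz, the values quoted for the implementation in Section IV, together with $M = 8$ phase edges; the function name is illustrative.

```python
N, M = 9, 8        # integer multiple and number of RO phase edges
F_REF_MHZ = 44.0   # reference frequency in MHz

def sil_carrier_mhz(k: int) -> float:
    """Carrier frequency from (2): (N + k/M) * F_ref."""
    return (N + k / M) * F_REF_MHZ

for k in (2, 4, 6):
    print(f"k = {k}: {sil_carrier_mhz(k):.1f} MHz")
```

Each step of 2 in $k$ moves the carrier by $2 \times F_{\text{ref}}/8 = 11$ MHz, which is the QPSK channel spacing implied by these settings.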

For BFSK modulation, its instantaneous frequency can be represented as [10]

$$f_i(t) = f_c + \Delta F \times m(t) \quad (5)$$

where  $f_c$  is the carrier frequency,  $\Delta F$  is the frequency deviation, and  $m(t)$  is the digital modulation signal [ $-1$  or  $1$ ]. In our proposed architecture, the frequency channel selection information is first sent to a LUT. The LUT consists of various sets of frequency control words (FCWs) that are mapped to the BFSK data [ $f_c + \Delta F$ ,  $f_c - \Delta F$ ] for different frequency channels ( $f_c$ ).
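As a rough illustration of (5), the sketch below quantizes a requested frequency deviation to an integer number of FCW steps and maps data bits to the two BFSK tones. The carrier (403.5 MHz), the deviation (100 kHz), and the bit-to-$m(t)$ convention are hypothetical; the step size uses $M = 8$, $p = 4$, a 10-b FCW, and $F_{ref} = 44$ MHz from elsewhere in the paper. The actual LUT contents are not given in the text, and the sketch assumes $f_c$ itself is exactly synthesizable.

```python
F_REF_HZ = 44e6
M, P, J = 8, 4, 10                     # phase edges, PI phases, FCW bit-width
F_LSB_HZ = F_REF_HZ / (M * P * 2**J)   # smallest frequency step implied by (4)

def bfsk_tones(fc_hz: float, dev_hz: float):
    """Quantize the deviation to whole FCW steps.

    Returns (f_low, f_high, steps) following f_i = f_c + dF * m, m = -1 or +1.
    """
    steps = round(dev_hz / F_LSB_HZ)
    dev_q = steps * F_LSB_HZ
    return fc_hz - dev_q, fc_hz + dev_q, steps

def instantaneous_freq(bits, fc_hz, dev_hz):
    """Map bits to tones with 0 -> m = -1 and 1 -> m = +1 (assumed convention)."""
    lo, hi, _ = bfsk_tones(fc_hz, dev_hz)
    return [hi if b else lo for b in bits]

lo, hi, steps = bfsk_tones(403.5e6, 100e3)
print(f"{steps} FCW steps, tones at {lo / 1e6:.4f} and {hi / 1e6:.4f} MHz")
```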



Fig. 8. Detailed schematic of (a) DCCS and (b) RO delay cell.

### B. Duty Cycling

Duty cycling of the TX helps improve its energy efficiency. However, this technique relies heavily on the fast turn-on/turn-off capability of the TX. As illustrated in Fig. 5, the sensor data are only communicated during the “TX Active” time slot, while the TX remains in the “Low Power” mode most of the time. However, it takes a certain “Wake Up” time for the TX to change from the “Low Power” mode to the “TX Active” mode. For a conventional phase-locked-loop (PLL)-based TX, the “Wake Up” time is in the range of hundreds of $\mu\text{s}$, which could limit the achievable duty cycling. In contrast, our proposed TX architecture benefits from the fast start-up time of injection locking, and the “Wake Up” period is only 0.8 $\mu\text{s}$. During the “Low Power” mode, only the XO and the interface circuitry handling the raw data are kept running to ensure continuous collection of the sensor data.
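The impact of the wake-up time on a duty-cycled burst can be estimated with simple arithmetic. In the sketch below, the 1000-bit burst length and the 200-$\mu$s PLL wake-up time are hypothetical values chosen only for illustration, and the TX is assumed to draw the same power while waking up as while transmitting.

```python
def wakeup_overhead(t_wake_s: float, t_active_s: float) -> float:
    """Fraction of each TX-on interval spent waking up rather than transmitting."""
    return t_wake_s / (t_wake_s + t_active_s)

# Hypothetical burst: 1000 bits at the 11-Mb/s QPSK rate.
t_active = 1000 / 11e6

pll = wakeup_overhead(200e-6, t_active)  # assumed 200-us PLL wake-up
il = wakeup_overhead(0.8e-6, t_active)   # 0.8-us injection-locked wake-up
print(f"PLL-based: {pll:.1%} overhead, injection-locked: {il:.1%} overhead")
```

For short bursts like this one, a wake-up time of hundreds of microseconds dominates the on-time, whereas a sub-microsecond wake-up is a small fraction of it.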

## IV. BUILDING BLOCKS

### A. Fractionally Injection-Locked Synthesizer Architecture

Fig. 6 shows the injection-locked synthesizer architecture. The RO is similar to [8] where the pseudodifferential delay cell with mismatch filtering resistor is employed. The controller is able to perform  $\Delta\Sigma$ -based or sequential ILRO for BFSK/QPSK modulation, respectively, with multichannel transmission capability.

Fig. 9. Types of phase interpolator.

All phases from the RO are first sent to the phase position identifier, which identifies the phase edge closest to REF and initializes the ring counter (RC) with the desired flip-flop setting through RC\_L[7:0]. The RC is an 8-b one-hot counter (SEL[7:0]) with only one ‘1’ at any time. The position of the ‘1’ is shifted according to the input Shift\_Amt. For example, if Shift\_Amt is p, the ‘1’ is shifted to the right by p bits. Since the RC remembers its previous ‘1’ position and shifts from it according to Shift\_Amt, it inherently implements an accumulator function. The RC output, SEL[7:0], then de-multiplexes PI\_out to one of the eight PGs. For QPSK modulation, the $k$ value of 2, 4, or 6 is assigned to Shift\_Amt, and Del1 is selected as PI\_out.
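The accumulator-like behavior of the one-hot ring counter can be sketched as a bit rotation. The model below is an assumption-laden illustration: the bit ordering is not specified in the text, so a "right shift" is modelled as moving the ‘1’ from SEL[n] to SEL[(n + Shift\_Amt) mod 8], and the function names are hypothetical.

```python
WIDTH = 8  # SEL[7:0]

def rc_shift(sel: int, shift_amt: int) -> int:
    """Rotate the one-hot word by shift_amt positions, wrapping around."""
    shift_amt %= WIDTH
    mask = (1 << WIDTH) - 1
    return ((sel << shift_amt) | (sel >> (WIDTH - shift_amt))) & mask

def position(sel: int) -> int:
    """Index of the single '1' in a one-hot word."""
    assert sel != 0 and sel & (sel - 1) == 0, "SEL must be one-hot"
    return sel.bit_length() - 1

sel = 1 << 0  # initial position loaded through RC_L[7:0] (arbitrary here)
for cycle in range(5):
    sel = rc_shift(sel, 2)  # QPSK mode with k = 2 every reference period
    print(f"cycle {cycle}: SEL = {sel:08b}, PG index = {position(sel)}")
```

Because the new position always depends on the previous one, the rotation reproduces the modulo-8 accumulation of Shift\_Amt without a separate adder.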

For the BFSK modulation, if the PI is disabled, the $\Delta\Sigma$ modulator output is sent to the RC directly to determine the number of right shifts. This, in turn, activates the desired PGs, which directly control the corresponding delay stage within the ILRO. The accumulator (Accum) is also disabled with its output reset to 0. Hence, the first reference pulse (Del1) is always chosen to drive the PG. The timing diagrams without and with PI in Fig. 7(a) and (b) show the value of SEL[7:0] that corresponds to the $\Delta\Sigma$ modulator output. Therefore, by changing the FCW value, the $\Delta\Sigma$-based injection-locked synthesizer can achieve the fractional frequency resolution defined in (3). iN[7:0] and iP[7:0] are the differential $P_{\text{inj}}$ signals at the output of each PG after passing through the single-ended-to-differential converter (S/D), as shown in Fig. 7(b). The width of $P_{\text{inj}}$ can be tuned through the programmable delay inverter in the PG to obtain the desired injection strength. To minimize any delay mismatch between the differential $P_{\text{inj}}$ signals, two-stage cross-coupled inverters are incorporated.

When the PI is activated, the phase interpolator provides an additional 4-PI between two neighboring delay edges within the ILRO. This lowers the phase noise by 12 dB. The $\Delta\Sigma$ modulator output is now sent to the Accum instead of the RC, and only the overflow (Ovf) is sent to the RC directly. From (4), the output frequency now becomes $[N + \text{offset}/8 + 1/32 + K/(32 \times 2^j)] \times F_{\text{ref}}$, where the offset is introduced to cover different segments of delay edges. In this implementation, $N = 9$ and $F_{\text{ref}} = 44$ MHz are chosen to cover the MICS/MedRadio frequency band.
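To make the frequency expression concrete, the sketch below evaluates it with exact rational arithmetic. It uses $N = 9$, $F_{\text{ref}} = 44$ MHz, $M = 8$, $p = 4$, and $j = 10$ (the 10-b FCW mentioned later in the paper); the function name and the example settings (offset = 1, $K$ = 0 or 1) are illustrative choices, not values taken from the design.

```python
from fractions import Fraction

N, M, P, J = 9, 8, 4, 10   # values quoted in the text
F_REF_HZ = 44_000_000

def ilro_frequency_hz(offset: int, k_word: int) -> Fraction:
    """Average output frequency from (4) for a given offset (AF) and FCW K."""
    ratio = (N + Fraction(offset, M) + Fraction(1, M * P)
             + Fraction(k_word, M * P * 2**J))
    return ratio * F_REF_HZ

f0 = ilro_frequency_hz(1, 0)
f1 = ilro_frequency_hz(1, 1)
print(f"offset=1, K=0: {float(f0) / 1e6:.3f} MHz")
print(f"one FCW step:  {float(f1 - f0):.2f} Hz")
```

With these settings the output lands at 402.875 MHz, inside the 401–406-MHz band, and one FCW step changes the frequency by $F_{\text{ref}}/32768$.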

### B. RO Delay Cell

A digitally controlled pseudodifferential current-starved inverter is adopted for the delay cell implementation. The schematics of the digitally controlled current source (DCCS) and the RO delay cell are shown in Fig. 8. The DCCS consists of a 4-b digitally programmable current source that provides a discrete control voltage to the 7-b current DAC in the delay cell. As shown in Fig. 8(b), M1-M4 form the two inverters used as one of the pseudodifferential RO delay stages. The differential injection $P_{\text{inj}}$ is converted to a current signal through M5-M8 before being injected into the RO outputs.

### C. Phase Interpolator

In the phase interpolator discussion earlier, the REF clock is delayed to generate four fractionally spaced REF clocks with a delay difference of $T_d$. Here, $T_d = T_{\text{RO}}/32$ is chosen to obtain four interpolated phase edges between adjacent phase edges of the RO delay stages. Two different implementations, shown in Fig. 9, can be used to realize the four fractional delay paths (DEL1–DEL4). Configuration (a) provides greater flexibility, as each delay path can be independently varied through its own delay cells to achieve the desired fractional delay. However, it requires more delay cells. In configuration (b), on the other hand, each delay path depends on the previous delay path. This reduces flexibility in return for far fewer delay cells. Considering area and energy efficiency, configuration (b) has been chosen. In both cases, provided the delay cells are properly matched, neighboring delay paths differ in delay by $T_d = T_{\text{RO}}/32$. Each delay cell consists of a buffer constructed from two cascaded inverters. A tuning configuration similar to that of the RO delay stages is implemented here to fine-tune the delay $T_d$ to the desired value.

Fig. 10. Time calibration for PI. (a) Block diagram. (b) Schematic of time amplifier. (c) Timing diagram.

Fig. 11. Programmable second-order 2-b $\Delta\Sigma$ modulator with dithering.

### D. Time Calibration for PI

To achieve an accurate fractional PI of $1/32 \times T_{\text{RO}}$, a delay cell calibration is needed. One popular method for delay cell calibration is to incorporate the delay cell within an RO structure and estimate $F_{\text{RO}}$ through a simple counter. Though simple, this technique leads to a long calibration time if a fractional delay accuracy in the range of a few ps is needed. In this paper, we adopt a direct phase comparison to achieve the desired phase calibration.

As shown in Fig. 10(a), two neighboring phase edges, $RO_N$ and $RO_{N+1}$, are input to the calibration block. The signal alignment block captures their phase edges and lowers their frequency to that of REF for subsequent processing ($R_N$, $R_{N+1}$). $R_N$ then goes through four delay cells (TD1–TD4) and is compared with $R_{N+1}$, which does not go through the additional four delay cells. If TD1–TD4 are calibrated correctly with the desired fractional delay, we expect the rising edge of the delayed $R_N$ ($D_{\text{start}}$) to be aligned with the rising edge of the delayed $R_{N+1}$ ($S_{\text{stop}}$). $C_{\text{det}}$ is the synchronized signal for the resulting phase edge comparison between $D_{\text{start}}$ and $S_{\text{stop}}$. $C_{\text{det}} = 1$ means that $D_{\text{start}}$ and $S_{\text{stop}}$ are very close to each other, within the coarse delay resolution. For the fine rising edge comparison between $D_{\text{start}}$ and $S_{\text{stop}}$, a timing amplifier is employed, and the fine edge comparison is performed through an arbiter. The result then goes through a D-type flip-flop (DFF) for synchronization to produce $F_{\text{det}}$. $F_{\text{det}} = 1$ implies that $D_{\text{start}}$ and $S_{\text{stop}}$ are very close to each other, within the fine delay resolution. The status of $C_{\text{det}}$ and $F_{\text{det}}$ is fed back to adjust the propagation delay of TD1–TD4 such that the rising edge of $D_{\text{start}}$ closely approaches that of $S_{\text{stop}}$.

The calibration cycle can be described as follows, with its corresponding timing diagram shown in Fig. 10(c). Initially, the delay cells are set to the largest delay setting, which causes $S_{\text{stop}}$ to lead $D_{\text{start}}$. The resulting $C_{\text{det}}$ is low, indicating that the coarse phase error is large and the delay cell setting needs to be reduced. Once the $S_{\text{stop}}$ rising edge is in close proximity to that of $D_{\text{start}}$, the arbiter causes $C_{\text{det}}$ to go high, and the fine calibration phase begins. A timing amplifier is employed here to amplify the small time difference of $\pm 50$ ps between $S_{\text{stop}}$ and $D_{\text{start}}$ by 16 times [11]. When the rising edges of $D_{\text{start}}$ and $S_{\text{stop}}$ are close to each other, the SR latch enters a metastable state, and the regeneration through positive feedback takes longer to complete. The two inverters right after $D_{\text{start}}$ and $S_{\text{stop}}$ are added to provide a timing offset ($T_{\text{OFF}}$) that sets the gain and linear range of the timing amplifier. The timing amplifier gain ($A_T$) can be derived as [11]

$$A_T = \frac{2C}{g_m T_{\text{OFF}}} \quad (6)$$

where $C$ is the output capacitance of the NAND gate, and $g_m$ is the transconductance of the NAND gate in the metastable state. The signals with the amplified timing difference (TS and TD) are then sent to another arbiter. Similarly, if TS leads TD, $F_{\text{det}}$ is set low to indicate the need to reduce the delay. Once TS is detected lagging TD, both $C_{\text{det}}$ and $F_{\text{det}}$ are set high to indicate the completion of the calibration cycle. $D_{\text{clk}}$ is the inverted REF signal. A simple comparator similar to [11] is used as the arbiter, whose output indicates whether one of its inputs lags or leads the other.

As mentioned earlier, the key function of the signal alignment block is to capture the neighboring phase edges ($RO_N$ and $RO_{N+1}$) and perform the comparison only approximately once every reference period. The signal alignment block mainly consists of two flip-flops, which capture the REF clock on the rising edge and the falling edge of the phase edges ($RO_N$ and $RO_{N+1}$). The outputs of the two flip-flops are then used to regenerate clocks ($R_N$ and $R_{N+1}$) with a lower frequency for the subsequent delay calibration [12].

Fig. 12. Phase noise model for $\Delta\Sigma$-based injection locking TX without PI.

Fig. 13. (a) Pulsed ILRO transfer functions. (b) Phase noise spectrum for TX with/without PI.

In our design, one-shot calibration is implemented for the PI. The four most significant bits of the delay control word are increased linearly for the coarse rising edge comparison, while the remaining seven least significant bits are tuned by SAR for the fine rising edge comparison. The time window is set to 10 $\mu$s to allow sufficient settling time for the DAC switching in the buffers. The calibration takes at most 23 cycles, or 230 $\mu$s, to obtain the desired propagation delay. One-shot calibration is used so that power is not wasted on constantly updating the PI. If needed, calibration can always be performed in the interval between transmission bursts.
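The 23-cycle bound follows from a linear sweep of the four MSBs (at most 16 comparisons) plus one SAR comparison per LSB (7 comparisons). The sketch below illustrates this with a hypothetical, ideal delay model in femtoseconds in which a larger code gives a shorter delay; the real delay-versus-code curve, the direction of the code, and the two-level $C_{\text{det}}$/$F_{\text{det}}$ detection are simplified into a single comparison.

```python
def calibrate(target_fs, delay_of, msb_bits=4, lsb_bits=7):
    """Find the largest control code whose delay is still >= target.

    Returns (code, cycles), where each comparison costs one time window.
    """
    cycles = 0
    coarse = 0
    # Coarse phase: step the MSBs linearly until the delay drops below target.
    for m in range(2**msb_bits):
        cycles += 1
        if delay_of(m << lsb_bits) < target_fs:
            break
        coarse = m
    code = coarse << lsb_bits
    # Fine phase: successive approximation on the LSBs, one bit per window.
    for bit in reversed(range(lsb_bits)):
        cycles += 1
        trial = code | (1 << bit)
        if delay_of(trial) >= target_fs:
            code = trial
    return code, cycles

def delay_model(code):
    """Hypothetical delay in fs: 150 ps at code 0, 50 fs shorter per LSB."""
    return 150_000 - 50 * code

WINDOW_US = 10
code, cycles = calibrate(77_520, delay_model)
print(f"code={code}, cycles={cycles}, time={cycles * WINDOW_US} us")
```

In the worst case the coarse sweep visits all 16 MSB settings before the 7 SAR steps, giving 23 windows of 10 $\mu$s each.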

### E. Other Blocks

A single loop second-order  $\Delta\Sigma$  modulator with 2-b output is chosen as shown in Fig. 11 [13]. As the output level is directly mapped to the injected phase edges, a 2-b quantizer will allow full coverage of the region within the two neighboring phase edges. A 25-b pseudorandom sequence generator is also incorporated to provide better randomization. With PI, by choosing 10-b FCW and  $F_{ref}$  of 44 MHz as clock frequency, the frequency resolution becomes

$$F_{LSB} = \left( \frac{1}{32 \times 2^{10}} \right) \times F_{ref} \cong 1.3 \text{ kHz} \quad (7)$$
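As a quick check of (7), the arithmetic can be written out explicitly using only the values quoted above ($M \times p = 32$, a 10-b FCW, and $F_{ref} = 44$ MHz):

$$F_{LSB} = \frac{44 \times 10^{6}\ \text{Hz}}{32 \times 1024} = \frac{44 \times 10^{6}\ \text{Hz}}{32\,768} \approx 1342.8\ \text{Hz}$$

The full 10-b FCW range therefore spans $2^{10} \times F_{LSB} = F_{ref}/32 = 1.375$ MHz.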

For the DPA design, the architecture is similar to [4] and [8]. The four phases from the ILRO and the 5-b amplitude can be digitally controlled through the decoder to perform the BFSK and band-shaped QPSK modulations. The combination of nMOS and pMOS branches helps to lower the dc output offset current so that the close-in clock spur around the output spectrum can be reduced.

## V. NOISE ANALYSIS OF THE PROPOSED TX

The power spectral density (PSD) of the proposed TX is mainly affected by the phase noise of the injection-locked fractional-N synthesizer. Employing a $\Delta\Sigma$ modulator to randomize the injection pulse position for BFSK modulation introduces additional phase quantization noise into the system. In this section, a phase noise model for the proposed TX is presented. In addition, the effect of the injection locking loop bandwidth (LBW) selection on the phase noise is analyzed. Finally, the impact of PI delay mismatch on the phase noise and error vector magnitude (EVM) performance is shown.

### A. Phase Noise Modeling for the Proposed TX

Fig. 14. $\Delta\Sigma$-based ILRO phase noise with $\beta$.

Fig. 15. RO delay cells with random phase error.

Fig. 12 shows a phase noise model for the $\Delta\Sigma$-based injection locking TX without the PI. There are three key noise sources in the system: the REF noise ($S_{REF}$), the free-running RO phase noise ($S_{RO}$), and the $\Delta\Sigma$ modulator quantization noise ($S_{\Delta\Sigma M}$). A good $S_{\text{REF}}$ is critical to the subharmonic injection locking technique because, by injecting $P_{\text{inj}}$ from REF into the RO for periodic phase alignment, the in-band phase noise of the RO output is worsened to [$S_{\text{REF}} + 20\log(N)$]. The phase noise of the pulsed injection-locked subharmonic oscillator can be modeled as

$$S_{\text{ILRO}}(f) = S_{\text{REF}}(f) \cdot H_{\text{UP}}(f) + S_{\text{RO}}(f) \cdot H_{\text{INJ}}(f) \quad (8)$$

where the two noise transfer functions can be expressed as [14]

$$H_{\text{UP}}(f) = \frac{N\beta}{1 + (\beta - 1)e^{(-j2\pi f/F_{\text{ref}})}} e^{(-j\pi f/F_{\text{ref}})} \frac{\sin(\pi f/F_{\text{ref}})}{\pi f/F_{\text{ref}}} \quad (9)$$

$$H_{\text{INJ}}(f) = 1 - \frac{\beta}{1 + (\beta - 1)e^{(-j2\pi f/F_{\text{ref}})}} \times e^{(-j\pi f/F_{\text{ref}})} \frac{\sin(\pi f/F_{\text{ref}})}{\pi f/F_{\text{ref}}} \quad (10)$$

where $\beta$ is the realignment strength, with $\beta = 1$ representing perfect phase alignment and $\beta = 0$ representing no phase alignment.

The transfer functions $H_{\text{UP}}(f)$ and $H_{\text{INJ}}(f)$ for $\beta$ values of 0.6, 0.8, and 1.0 are shown in Fig. 13(a). It should be highlighted that the cutoff frequency of the filtering characteristic decreases as $\beta$ decreases. This cutoff frequency is normally interpreted as the injection locking LBW ($f_{\text{LBW}}$). In addition, as reported in [15], $f_{\text{LBW}}$ also varies with the injection pulsewidth.
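The behavior of (9) and (10) can be explored numerically. The sketch below evaluates both expressions as written, assuming $N = 9$ and $F_{\text{ref}} = 44$ MHz from the implementation; the helper names are illustrative, and the evaluation offset of $F_{\text{ref}}/10$ is an arbitrary choice.

```python
import cmath
import math

F_REF = 44e6
N = 9

def _common(f, beta):
    """Factor shared by (9) and (10): realignment filter times the hold term."""
    x = math.pi * f / F_REF
    sinc = 1.0 if x == 0 else math.sin(x) / x
    denom = 1 + (beta - 1) * cmath.exp(-2j * x)
    return beta / denom * cmath.exp(-1j * x) * sinc

def h_up(f, beta):
    """Reference-noise up-conversion transfer function, (9)."""
    return N * _common(f, beta)

def h_inj(f, beta):
    """Oscillator-noise transfer function, (10)."""
    return 1 - _common(f, beta)

for beta in (0.6, 0.8, 1.0):
    mag_db = 20 * math.log10(abs(h_up(F_REF / 10, beta)))
    print(f"beta={beta}: |H_UP| at F_ref/10 = {mag_db:.2f} dB")
```

At zero offset, $|H_{\text{UP}}|$ equals $N$ (about 19.1 dB for $N = 9$, matching the $20\log(N)$ in-band degradation) and $H_{\text{INJ}}$ vanishes, while a smaller $\beta$ attenuates $H_{\text{UP}}$ more at a given offset.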



Fig. 16. Actual period for RO delay cells selection with random mismatch.



Fig. 17. Actual period for PI delay cells selection with systematic mismatch.

Due to the introduction of the $\Delta\Sigma$ modulator, $S_{\Delta\Sigma M}$ is added to $S_{\text{REF}}$ before being subjected to the low-pass filtering action of $H_{\text{UP}}(f)$. The quantization noise is determined by the quantized phase step and the $\Delta\Sigma$ modulator. Because of the phase accumulation, the effective noise-shaping order is one less than the order of the $\Delta\Sigma$ modulator. Hence, for the second-order $\Delta\Sigma$ modulator deployed here, the low-frequency phase noise is expected to be shaped with a 20-dB/decade slope. The resulting noise spectra are given by [16], [17]

$$S_{\Delta\Sigma M}(f) = \frac{\Delta f^2}{12 \times F_{\text{ref}}} \times \left( 2 \sin \left( \frac{\pi f}{F_{\text{ref}}} \right) \right)^4 \quad (11)$$

$$S_{\Delta\Sigma\_acc}(f) = \frac{\Delta f^2}{12 \times F_{\text{ref}}} \times \frac{1}{(2\pi f)^2} \times \left( 2 \sin \left( \frac{\pi f}{F_{\text{ref}}} \right) \right)^4 \quad (12)$$

where the frequency step is $\Delta f = (F_{\text{ref}} - F_{\text{ref}}/(1 + 1/8)) = F_{\text{ref}}/9$ [17]. Fig. 13(b) shows the representative calculated PSD plot of the $\Delta\Sigma$-based injection locking TX with and without the PI. The 2-b quantizer can produce four different clock periods. To limit the required locking range, beyond which the injection locking may fail, $F_{\text{RO}}$ is set such that the maximum frequency difference between $F_{\text{RO}}$ and the four possible clock periods is $0.5 \times (3/8) \times F_{\text{ref}}$. Therefore, $\beta$ is chosen to be 0.8 in the simulation. The off-chip LC resonator employed in the matching network, with its inherent band-pass characteristic, provides further suppression at higher frequency offset. With the 4-PI, the quantized phase step in (12) is reduced by a factor of four, which results in a 12-dB phase noise improvement.
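The arithmetic behind (11), (12), and the 12-dB figure can be checked with a short sketch. It assumes $F_{\text{ref}}$ = 44 MHz (the XO frequency quoted in Section VI) and models the 4-PI simply as dividing $\Delta f$ by four; the function and parameter names are hypothetical.

```python
import math

F_REF = 44e6  # Hz; assumed equal to the 44-MHz on-chip XO frequency


def delta_f(f_ref=F_REF, pi_steps=1):
    """Frequency step used in (11) and (12); pi_steps=4 models the 4-PI."""
    return (f_ref - f_ref / (1 + 1 / 8)) / pi_steps


def s_dsm(f, f_ref=F_REF, pi_steps=1):
    """Eq. (11): second-order shaped quantization noise."""
    df = delta_f(f_ref, pi_steps)
    return df ** 2 / (12 * f_ref) * (2 * math.sin(math.pi * f / f_ref)) ** 4


def s_dsm_acc(f, f_ref=F_REF, pi_steps=1):
    """Eq. (12): eq. (11) after phase accumulation, divided by (2*pi*f)^2."""
    return s_dsm(f, f_ref, pi_steps) / (2 * math.pi * f) ** 2


def db(ratio):
    return 10 * math.log10(ratio)


if __name__ == "__main__":
    print("delta_f / F_ref =", delta_f() / F_REF)  # close to 1/9
    slope = db(s_dsm_acc(1e5) / s_dsm_acc(1e4))
    print(f"low-frequency slope: {slope:.2f} dB/decade")
    gain = db(s_dsm_acc(1e6) / s_dsm_acc(1e6, pi_steps=4))
    print(f"improvement with 4-PI: {gain:.2f} dB")
```

Because the noise power scales with $\Delta f^2$, a fourfold smaller step gives $10\log_{10}(16) \approx 12$ dB, and the $f^4/f^2$ dependence at low offsets gives the 20-dB/decade slope stated above.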



Fig. 18. MATLAB model for systematic and random delay cells mismatch for RO and phase interpolator.



Fig. 19. (a) Phase noise of RO delay cells random mismatch. (b) Normalized phase noise versus RO delay cells mismatch.



Fig. 20. (a) Phase noise of delay cells systematic mismatch. (b) Normalized phase noise versus PI delay cells mismatch.

### B. Design Consideration on Phase Noise for Injection Locking

Fig. 21. Die micrograph.

Marker 1&2: sequential ILRO; Marker 3&4: $\Delta\Sigma$-based ILRO w/o PI; Marker 5&6: $\Delta\Sigma$-based ILRO with PI

Fig. 22. Measured phase noise for sequential ILRO, $\Delta\Sigma$-based ILRO with and without PI.

As shown earlier, the realignment strength $\beta$ plays an important role in determining $f_{LBW}$. As shown in Fig. 13(a), at $\beta = 1$, an $f_{LBW}$ of half the injection frequency is obtained. With $\beta$ reduced to 0.8 and 0.6, $f_{LBW}$ becomes 11 and 6.4 MHz, respectively, as determined by (9) and (10). At high-frequency offset, two noise components are critical. First, the shaped high-frequency noise due to the $\Delta\Sigma$ modulator needs to be suppressed earlier, which calls for a lower $f_{LBW}$. Second, poorer RO phase noise favors a larger $f_{LBW}$ so that the in-band phase noise at lower frequency offset can be reduced. This is clearly illustrated in the phase noise plot of Fig. 14, where different $\beta$ values, and thus different $f_{LBW}$, are used. As shown, with lower $\beta$, the in-band noise is elevated due to the higher gain of $H_{UP}(f)$. At the same time, lower $\beta$ helps suppress the phase noise at higher frequency offset because the shaped high-frequency noise from the $\Delta\Sigma$ modulator is filtered earlier. On the other hand, larger $\beta$ reduces the in-band noise but leaves larger phase noise at higher frequency offset due to the modulator. In this design, $\beta = 0.8$ is chosen to achieve a reasonable tradeoff between in-band phase noise and high-frequency offset phase noise.
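The dependence of $f_{LBW}$ on $\beta$ can be illustrated with a small bisection search. This sketch assumes that $f_{LBW}$ is read off as the $-3$-dB point of $|H_{UP}(f)|/N$ in (9) and that $F_{\text{ref}}$ equals the 44-MHz XO frequency quoted in Section VI; both are assumptions, and the function names are hypothetical.

```python
import cmath
import math

F_REF = 44e6  # Hz; assumed equal to the 44-MHz on-chip XO frequency


def h_up_norm(f, beta, f_ref=F_REF):
    """|H_UP(f)| / N from (9), for f > 0 and beta > 0."""
    x = f / f_ref
    iir = beta / (1 + (beta - 1) * cmath.exp(-2j * math.pi * x))
    sinc = math.sin(math.pi * x) / (math.pi * x)
    return abs(iir * sinc)


def f_lbw(beta, f_ref=F_REF, level=1 / math.sqrt(2)):
    """Bisect for the -3 dB point of |H_UP|/N between 1 Hz and F_ref/2.

    The magnitude falls monotonically over this range for 0 < beta <= 1,
    so a single crossing exists.
    """
    lo, hi = 1.0, f_ref / 2
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if h_up_norm(mid, beta, f_ref) > level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for beta in (1.0, 0.8, 0.6):
        print(f"beta={beta:.1f}  f_LBW ~ {f_lbw(beta) / 1e6:.2f} MHz")
```

Under these assumptions, the search lands close to the 11 and 6.4 MHz quoted above for $\beta$ of 0.8 and 0.6, and somewhat below half the injection frequency for $\beta = 1$.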

### C. Effect of Delay Mismatch on the Phase Noise and EVM

Fig. 23. Measured output spectrum of frequency resolution with $\Delta\Sigma$-based ILRO.

The proposed architecture relies heavily on delay cells, such as the pseudodelay cell for the RO and the delay cell for the PI. Mismatch among these delay cells could impact the overall performance of the system and requires closer examination. In general, two types of mismatch arise from these delay cells. The first type is the random delay cell mismatch due to process variation and layout parasitics, as shown in Fig. 15. As illustrated, an RO with four delay stages generates eight phases with slight mismatch. Each phase edge could be slightly earlier or later than the desired nominal phase edge because the mismatch is random. This in turn gives rise to phase nonlinearity, which could impact the system performance. Fortunately, the proposed architecture provides a way to randomize such nonlinearity. It should be noted that the $\Delta\Sigma$ modulator is employed to randomize the instantaneous period of the RO, which is formed by the current phase edge and the previous phase edge. Hence, there are eight different ways to create an interval of $(N+1/8)T_{\text{RO}}$. For example, $(\phi_0, \phi_1)$, $(\phi_1, \phi_2)$, $\dots$, or $(\phi_7, \phi_0)$ can be chosen as the previous and current phase edges to generate $(N + 1/8 + \Delta\phi_0)T_{\text{RO}}$, $(N + 1/8 + \Delta\phi_1)T_{\text{RO}}, \dots, (N + 1/8 + \Delta\phi_7)T_{\text{RO}}$. As the $\Delta\Sigma$ modulator randomizes the interval, it also randomizes the previous phase edge. Hence, eight slightly different $(N + 1/8)T_{\text{RO}}$ values are randomly selected, forming a group of selections as shown in Fig. 16. The same argument applies to the other intervals, such as $(N + 2/8)T_{\text{RO}}$ and so on. The interval selection is therefore very similar to the dynamic element matching method in DAC design: to represent one interval, any of eight different interval representations can be randomly picked. On average, this forms a linear characteristic, which removes the phase nonlinearity originally observed in the ring oscillator phase edges.
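The averaging argument can be illustrated with a small numerical sketch. It assumes that each of the eight RO edges carries an independent Gaussian timing error and that every starting edge is selected equally often; the integer part `N_INT = 9`, the 5% mismatch, and all names are illustrative assumptions. In this model the interval error $\Delta\phi_i$ is the difference between the errors of the two edges that bound the interval.

```python
import random

N_INT = 9        # illustrative integer part N (assumption)
NUM_PHASES = 8   # RO with four delay stages gives eight phase edges


def edge_positions(sigma, rng):
    """Edge k nominally sits at k/8 of T_RO; add a random error to each edge."""
    return [k / NUM_PHASES + rng.gauss(0.0, sigma / NUM_PHASES)
            for k in range(NUM_PHASES)]


def interval(edges, start, step):
    """Interval in units of T_RO from edge `start` to the edge `step` later."""
    wraps, idx = divmod(start + step, NUM_PHASES)
    return N_INT + wraps + edges[idx] - edges[start]


def mean_interval(edges, step):
    """Average over the eight possible starting edges for one nominal interval."""
    values = [interval(edges, s, step) for s in range(NUM_PHASES)]
    return sum(values) / NUM_PHASES


if __name__ == "__main__":
    rng = random.Random(1)
    edges = edge_positions(sigma=0.05, rng=rng)  # 5% mismatch, illustrative
    for step in (1, 2, 3):
        spread = [interval(edges, s, step) for s in range(NUM_PHASES)]
        print(f"step={step}: min={min(spread):.5f} max={max(spread):.5f} "
              f"mean={mean_interval(edges, step):.5f} "
              f"nominal={N_INT + step / NUM_PHASES:.5f}")
```

Each individual interval deviates from its nominal value, but the edge errors cancel when summed over all eight starting edges, so the mean equals $(N + k/8)T_{\text{RO}}$. This is the same cancellation that dynamic element matching relies on.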

The second type of mismatch is mainly due to the calibration error of the PI. As described earlier, the two phase edges ($R_N$ and $R_{N+1}$) are chosen to form $T_{\text{RO}}/8$ for the calibration of the delay cell within the PI. When DEL1~DEL4 are calibrated correctly, they should correspond to 0, $T_{\text{RO}}/32$, $2T_{\text{RO}}/32$, $3T_{\text{RO}}/32$. Due to the finite digital control and accuracy of the calibration loop, there could be deviation from the desired nominal delay of $T_{\text{RO}}/32$. Hence, we could end up with 0, $T_{\text{RO}}/32 + \Delta T$, $2T_{\text{RO}}/32 + 2\Delta T$, $3T_{\text{RO}}/32 + 3\Delta T$, where $\Delta T$ could be either positive or negative. This mismatch is quite different from the first type, as it exhibits a deterministic pattern due to the miscalibration, as shown in Fig. 17. As before, we are interested in the instantaneous time interval formed between the previous and current phase edges rather than the phase edge itself. Although the PI delay cells also suffer from random mismatch, here we assume the systematic mismatch due to limited calibration resolution to be dominant in order to investigate its impact. If the random mismatch is dominant, its effect will be similar to that of the RO delay cell mismatch.

Fig. 24. Measured output spectrum of three adjacent channels and EVM results for BFSK and QPSK at 550 Kb/s and 11 Mb/s, respectively.

To study the impact of these two types of mismatch, a MATLAB model is constructed as shown in Fig. 18. First, the control word goes through a second-order $\Delta\Sigma$ modulator to generate the corresponding output. Given that the modulator output determines the instantaneous oscillator interval, the $\Delta\Sigma$ modulator output is sent to an accumulator to determine the current phase edge for the PI control (0, $T_{\text{RO}}/32$, $2T_{\text{RO}}/32$, $3T_{\text{RO}}/32$). Any overflow is passed to another accumulator to determine the RO phase edge (0, $T_{\text{RO}}/8, \dots, 7T_{\text{RO}}/8$). In combination, the two accumulators output the current phase edge selection with a resolution of $T_{\text{RO}}/32$, covering 0 to $31T_{\text{RO}}/32$. The instantaneous oscillator time interval is then obtained as the difference between the current and previous phase edges. The two types of mismatch can be easily incorporated into the phase edge selection, that is, $(0 + \Delta\phi_0, T_{\text{RO}}/8 + \Delta\phi_1, \dots, 7T_{\text{RO}}/8 + \Delta\phi_7)$ and $(0, T_{\text{RO}}/32 + \Delta T, \dots, 3T_{\text{RO}}/32 + 3\Delta T)$. It should be noted that the model captures only the nonlinearity of the instantaneous oscillator interval; it does not model the actual injection locking behavior. Nevertheless, this is sufficient, as the injection locking can be treated as a phase filter that mainly imposes a first-order PLL behavior on the resulting instantaneous oscillator interval.
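The signal flow described above can be sketched in a few lines. This is not the MATLAB model of Fig. 18: the modulator topology (a second-order error-feedback loop with a four-level output), the control-word scaling (in units of one $T_{\text{RO}}/32$ step), and the mismatch values are all assumptions chosen for illustration, and every name is hypothetical.

```python
import math
import random

PI_STEPS = 4                    # PI settings per RO phase
RO_PHASES = 8                   # RO phase edges per period
STEPS = PI_STEPS * RO_PHASES    # 32 selectable edges per T_RO


def dsm2(x, n):
    """Second-order error-feedback modulator (assumed topology), 0 <= x < 1.

    y = x - (e[n] - 2e[n-1] + e[n-2]); the output takes values in {-1, 0, 1, 2}.
    """
    e1 = e2 = 0.0
    out = []
    for _ in range(n):
        v = x + 2.0 * e1 - e2
        y = math.floor(v)
        e2, e1 = e1, v - y
        out.append(y)
    return out


def edge_time(index, ro_err, pi_dt):
    """Time of edge `index` (0..31) in units of T_RO with both mismatch types."""
    ro, pi = divmod(index, PI_STEPS)
    return ro / RO_PHASES + ro_err[ro] + pi / STEPS + pi * pi_dt


def intervals(x, n, ro_err, pi_dt):
    """Fractional part of each oscillator interval (add N for the full value)."""
    acc = 0
    prev = edge_time(0, ro_err, pi_dt)
    result = []
    for y in dsm2(x, n):
        acc += y                                  # phase accumulation
        cycles, index = divmod(acc, STEPS)        # overflow adds a full period
        now = cycles + edge_time(index, ro_err, pi_dt)
        result.append(now - prev)
        prev = now
    return result


if __name__ == "__main__":
    rng = random.Random(7)
    ro_err = [rng.gauss(0.0, 0.05 / RO_PHASES) for _ in range(RO_PHASES)]
    pi_dt = 0.002   # roughly 5 ps if T_RO is near 2.5 ns (illustrative)
    x = 0.3
    ideal = intervals(x, 4096, [0.0] * RO_PHASES, 0.0)
    real = intervals(x, 4096, ro_err, pi_dt)
    print("target mean interval    :", x / STEPS)
    print("mean ideal interval     :", sum(ideal) / len(ideal))
    print("mean mismatched interval:", sum(real) / len(real))
    print("peak interval error     :",
          max(abs(a - b) for a, b in zip(ideal, real)))
```

As in the text, the sketch captures only the interval nonlinearity; the injection-locked oscillator that would filter these intervals is deliberately left out. A spectrum of the `real` sequence, for example via an FFT, would be the next step toward plots such as Figs. 19 and 20.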

The results are shown in Figs. 19 and 20. As illustrated, with the first type of mismatch, the phase noise floor rises when the delay cell mismatch exceeds 1%. Similarly, with the second type of mismatch, we can tolerate a calibration inaccuracy of up to 10 ps before a 1.35% increase in normalized phase noise is observed. With 5% RO random mismatch and a 5-ps offset error in the PI delay cells, the normalized phase noise increases by 2.85%. From the simulation, we can also obtain the peak frequency error $\Delta\text{Ferr}$. Given that the FSK modulation has a frequency deviation of 272.17 kHz, the resulting peak frequency error $\Delta\text{Ferr}$ corresponds to an EVM of 6.8%.

## VI. MEASUREMENT RESULTS

The proposed TX was implemented in a 130-nm standard CMOS technology with an active area of $0.643 \text{ mm}^2$, as shown in Fig. 21. Packaged in a Quad Flat No-lead package, the chip consumes a total current of 4.08 mA. The measured phase noise using SIL is less than $-100 \text{ dBc/Hz}$ at 1-MHz offset, as illustrated in Fig. 22. The in-band phase noise of the $\Delta\Sigma$-based ILRO is around $-90 \text{ dBc/Hz}$. In addition, the $\Delta\Sigma$-based ILRO exhibits first-order noise shaping compared with the SIL. At around 20 MHz, the injection locking reaches its locking range limit ($f_{LBW}$), and the high-frequency shaped noise is filtered beyond $f_{LBW}$. When the PI is activated, smaller high-frequency noise is observed due to the smaller phase quantization. It should be noted that the measured in-band phase noise of the $\Delta\Sigma$-based ILRO is about 10 dB worse than the simulation result in Fig. 13. The simulation uses an ideal reference, which exhibits better phase noise. In the actual implementation, the reference oscillator phase noise is limited by the power supplied to the XO. In [8], we have shown that the overall phase noise of an injection locking TX can be improved by 7 dB by supplying more current to the XO.

Fig. 25. Power consumption breakdown (in mW).

Fig. 26. (a) Measured phase noise of $\Delta\Sigma$-based ILRO with PI of various LBW. (b) Measured timing waveform of larger LBW. (c) Measured timing waveform of smaller LBW.

Fig. 23 shows that, by changing the LSB of the FCW, a tunable frequency resolution of 1.3 kHz is obtained. Fig. 24 shows the measured three-channel output spectra and the EVM results for the BFSK and QPSK modulations. The $f_{LBW}$ of the output transmission signal for both modulations is similar to the value shown in Fig. 22, which is around 17.6 MHz. For BFSK, each channel can support up to 550 Kb/s without violating the spectral mask in the MICS/MedRadio band. With a frequency deviation of 272 kHz, the measured EVM of the BFSK is less than 4.4%. For the QPSK, with the pulse-shaping technique, the output spectrum achieves an ACPR of around $-26$ dBc. The measured EVM for QPSK is less than 4.9% at a data rate of 11 Mb/s. It should be noted that the PI is disabled in the QPSK modulation to avoid unnecessary power consumption and additional jitter on the transmission signal. In our implementation, the data clock is derived directly from the 44-MHz on-chip XO, which limits the maximum achievable data rate. The proposed architecture should be able to support BFSK data rates beyond 1 Mb/s and QPSK data rates beyond 20 Mb/s, as demonstrated by another ILRO-based architecture [5]. The power breakdown for each of the subblocks is shown in Fig. 25.

Fig. 27. Measured wake-up time of TX from sleep mode to data transmission mode.

Fig. 28. Measured average power consumption with duty-cycling protocol.

Various injection strengths can be obtained by tuning the width of the injection pulse $P_{inj}$ [15], which also varies $f_{LBW}$. For the $\Delta\Sigma$-based ILRO, reducing $f_{LBW}$ from 15 to 4 MHz reduces the shaped quantization noise at high frequency, resulting in less jitter in the time-domain waveform, as shown in Fig. 26. A difference of around 10 dB is observed at high-frequency offset between the two measurements. At high-frequency offset, two noise components dominate: the high-frequency shaped noise from the modulator and the RO phase noise. Reducing $f_{LBW}$ suppresses the shaped noise from the modulator earlier, which results in the observed 10-dB difference at high-frequency offset.

TABLE I  
PERFORMANCE COMPARISON

|                                | [19]                  | [20]                           | [21]                           | [3]             | [5]                   | <b>This Work</b>                                  |
|--------------------------------|-----------------------|--------------------------------|--------------------------------|-----------------|-----------------------|---------------------------------------------------|
| Data Rate (Mb/s)               | 0.1518–3              | 0.455                          | 4.5 (PSK)<br>0.1875<br>(GMSK)  | 0.2             | 5 (FSK)<br>20 (PSK)   | 0.55 (BFSK)**<br>11 (QPSK)**                      |
| Channel Selection              | Yes                   | Yes                            | Yes                            | No              | Yes                   | Yes                                               |
| Dig. Baseband                  | Yes                   | Yes                            | No                             | No              | No                    | Yes                                               |
| P <sub>DC</sub> (mW)           | 13.2                  | 2.9                            | 2.27 (FSK)<br>2.28 (PSK)       | 0.09            | 2.2                   | 4.06 (BFSK)<br>4.08 (QPSK)                        |
| VDD (V)                        | 1.55                  | 1.0/2.5                        | 1.0                            | 1.0             | 0.9                   | 1.0                                               |
| Energy Eff. (nJ/b)             | 4.4                   | 6.37                           | 0.51                           | 0.45            | 0.11–0.44             | 7.42 (BFSK)<br>0.37 (QPSK)                        |
| Frequency (MHz)                | 0.36–0.51<br>MICS/ISM | 402–405<br>420–450<br>MICS/ISM | 402–405<br>420–450<br>MICS/ISM | 400<br>MICS/ISM | 400–436.4<br>WBAN/ISM | 401–428<br>MICS/MedRadio                          |
| Modulation                     | GMSK/PSK              | GMSK/PSK                       | GMSK/PSK                       | FSK             | FSK/PSK               | BFSK/QPSK                                         |
| TX power (dBm)                 | +4.7                  | -16                            | -17 (PSK)<br>-10 (GMSK)        | -17             | -8                    | -13                                               |
| Phase Noise (dBc/Hz)           | -121.5@1 MHz          | NA                             | NA                             | -105.2@300 kHz  | -107@1 MHz            | -100@1 MHz (SIL)<br>-91@1 MHz ( $\Delta\Sigma$ IL) |
| Process (nm)                   | 180                   | 40                             | 40                             | 130             | 65                    | 130                                               |
| Active Area (mm <sup>2</sup> ) | 6.1 <sup>++</sup>     | 4.73 <sup>++</sup>             | 3.06 <sup>++</sup>             | 0.04            | 0.23                  | 0.643*                                            |
| FOM <sup>#</sup> (nJ/(bit-mW)) | 1.49                  | 253.7                          | 25.4                           | 22.6            | 0.7                   | 7.4**                                             |

\* Includes digital baseband with data buffer memory. \*\* Limited by on-chip data clock implementation. <sup>++</sup> Area includes transmitter and receiver. <sup>#</sup> $FOM = P_{DC}/(DataRate \times P_{out})$.

In Fig. 27, the measured TX wake-up time from sleep mode to data transmission mode (EN\_TX from low to high) is around 0.8 $\mu$s. With clock gating in the digital block and power gating in the TX subblocks, the wake-up time is mainly due to the propagation delay of EN\_TX driven by the pre-buffer and the parasitic gate capacitance of the power switches. In addition, with the injection locking technique applied, the wake-up time of the proposed TX is about 500 times shorter than that of TXs that employ a PLL as the synthesizer [18]. The power consumption versus TX operating time under the duty-cycling protocol has also been measured by controlling the EN\_TX signal, as shown in Fig. 28. This information is important because the duty cycle can be chosen based on the targeted average power consumption.
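A first-order estimate of how the duty cycle maps to average power is sketched below. It takes the 4.08-mW active power from the measurement above, treats the sleep power as a hypothetical parameter because its value is not stated here, and ignores the wake-up energy on the assumption that a 0.8-$\mu$s wake-up is short compared with a transmission burst. The function names are illustrative.

```python
def average_power_mw(duty, p_active_mw=4.08, p_sleep_mw=0.0):
    """Time-weighted average of active and sleep power for duty in [0, 1].

    p_sleep_mw is a hypothetical placeholder; the measured value is not given.
    """
    if not 0.0 <= duty <= 1.0:
        raise ValueError("duty must be between 0 and 1")
    return duty * p_active_mw + (1.0 - duty) * p_sleep_mw


def duty_for_target(target_mw, p_active_mw=4.08, p_sleep_mw=0.0):
    """Invert average_power_mw: duty cycle that meets a target average power."""
    if p_active_mw <= p_sleep_mw:
        raise ValueError("active power must exceed sleep power")
    duty = (target_mw - p_sleep_mw) / (p_active_mw - p_sleep_mw)
    return min(1.0, max(0.0, duty))


if __name__ == "__main__":
    for duty in (0.01, 0.1, 0.5, 1.0):
        print(f"duty={duty:4.0%}  average power ~ "
              f"{average_power_mw(duty):.3f} mW")
    print("duty for 0.5 mW target:", duty_for_target(0.5))
```

The measured curve in Fig. 28 remains the reference; this linear model is only a way to reason about the choice of duty cycle for a given power budget.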

The performance of the proposed TX is benchmarked against other works in Table I. The older technology employed and the incorporation of a full MAC layer and data buffer account for its slightly larger area and higher power consumption. Nevertheless, the achieved FOM of 7.4 is much better than the others in Table I, except for [19] and [5]. The favorable FOM of the former is mainly due to its higher output power of 4.7 dBm. Although the latter achieves the best FOM, the adopted *LC* oscillator-based QPSK modulation requires phase calibration, which is not included in the implementation. Most reported works are based on a PLL, which has a slow settling time due to the finite LBW. This paper has demonstrated an alternative way of achieving fast frequency tuning with comparable frequency resolution, which allows more aggressive duty cycling to save power. It should also be pointed out that our measured data rate is currently limited by the on-chip data clock.
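The FOM definition in the footnote of Table I, $FOM = P_{DC}/(DataRate \times P_{out})$, can be checked against a few table entries with the sketch below. The choice of which data rate and output power to pair for [19] and [5] (their highest rates) is an assumption; the function names are illustrative.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (p_dbm / 10)


def energy_per_bit_nj(p_dc_mw, rate_mbps):
    """mW divided by Mb/s gives nJ/b."""
    return p_dc_mw / rate_mbps


def fom(p_dc_mw, rate_mbps, p_out_dbm):
    """FOM in nJ/(bit-mW): energy per bit normalized to the output power."""
    return energy_per_bit_nj(p_dc_mw, rate_mbps) / dbm_to_mw(p_out_dbm)


if __name__ == "__main__":
    rows = {
        "This work (QPSK)": (4.08, 11.0, -13.0),
        "[5] (PSK)": (2.2, 20.0, -8.0),
        "[19]": (13.2, 3.0, 4.7),
    }
    for name, (p_dc, rate, p_out) in rows.items():
        print(f"{name:18s} energy = {energy_per_bit_nj(p_dc, rate):.3f} nJ/b, "
              f"FOM = {fom(p_dc, rate, p_out):.2f} nJ/(bit-mW)")
```

For the QPSK mode this gives about 0.37 nJ/b and an FOM of about 7.4, in line with the table, and it makes explicit how a higher output power lowers the FOM for the same energy per bit.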

## VII. CONCLUSION

This paper presents a 130-nm CMOS multichannel BFSK/QPSK TX using either the $\Delta\Sigma$-based injection locking or the SIL technique. The proposed $\Delta\Sigma$-based injection locking, together with the PI implementation, achieves fine frequency tuning with 1.3-kHz resolution and reduces the phase noise by 12 dB. The TX achieves 550 Kb/s for BFSK and 11 Mb/s for band-shaped QPSK with EVM of 4.4% and 4.9%, respectively. With the injection locking mechanism, a wake-up time of 0.8 $\mu$s is achieved, which is about 500 times shorter than that of PLL-based TXs. Consuming 4.08 mW from a 1-V supply, the TX achieves an energy efficiency of 370 pJ/b.

## REFERENCES

- [1] J. L. Bohorquez, A. P. Chandrakasan, and J. L. Dawson, "A 350  $\mu$ W CMOS MSK transmitter and 400  $\mu$ W OOK super-regenerative receiver for medical implant communications," *IEEE J. Solid-State Circuits*, vol. 44, no. 4, pp. 1248–1259, Apr. 2009.
- [2] J. Bae, L. Yan, and H.-J. Yoo, "A low energy injection-locked FSK transceiver with frequency-to-amplitude conversion for body sensor applications," *IEEE J. Solid-State Circuits*, vol. 46, no. 4, pp. 928–937, Apr. 2011.
- [3] J. Pandey and B. P. Otis, "A sub-100  $\mu$ W MICS/ISM band transmitter based on injection-locking and frequency multiplication," *IEEE J. Solid-State Circuits*, vol. 46, no. 5, pp. 1049–1058, May 2011.
- [4] X. Liu *et al.*, "A 103 pJ/bit multi-channel reconfigurable GMSK/PSK/16-QAM transmitter with band-shaping," in *Proc. ASSCC*, Nov. 2014, pp. 269–272.
- [5] S.-J. Cheng, Y. Gao, W.-D. Toh, Y. Zheng, M. Je, and C.-H. Heng, "A 110pJ/b multichannel FSK/GMSK/QPSK/$\pi$/4-DQPSK transmitter with phase-interpolated dual-injection DLL-based synthesizer employing hybrid FIR," in *IEEE ISSCC Dig. Tech. Papers*, Feb. 2013, pp. 450–451.
- [6] P. Park, J. Park, H. Park, and S. Cho, "An all-digital clock generator using a fractionally injection-locked oscillator in 65nm CMOS," in *IEEE ISSCC Dig. Tech. Papers*, Feb. 2012, pp. 336–337.
- [7] C.-H. Heng, "A 1.8-GHz CMOS fractional-N frequency synthesizer with randomized multiphase VCO," Ph.D. dissertation, Elect. Eng., Univ. Illinois, Urbana–Champaign, Champaign, IL, USA, 2003.
- [8] X. Liu, M. M. Izad, L. Yao, and C.-H. Heng, "A 13 pJ/bit 900 MHz QPSK/16-QAM band shaped transmitter based on injection locking and digital PA for biomedical applications," *IEEE J. Solid-State Circuits*, vol. 49, no. 11, pp. 2408–2421, Nov. 2014.
- [9] K.-H. Teng, T. Wu, Z. Yang, and C.-H. Heng, "A 400-MHz wireless neural signal processing IC with 625 $\times$  on-chip data reduction and reconfigurable BFSK/QPSK transmitter based on sequential injection locking," in *Proc. ASSCC*, Nov. 2015, pp. 1–4.
- [10] L. W. Couch, M. Kulkarni, and U. S. Acharya, *Digital and Analog Communication Systems*, vol. 6. Englewood Cliffs, NJ, USA: Prentice-Hall, 1997.
- [11] M. Lee and A. A. Abidi, "A 9 b, 1.25 ps resolution coarse–fine time-to-digital converter in 90 nm CMOS that amplifies a time residue," *IEEE J. Solid-State Circuits*, vol. 43, no. 4, pp. 769–777, Apr. 2008.
- [12] J. Gu, J. Wu, D. Gu, M. Zhang, and L. Shi, "All-digital wide range precharge logic 50% duty cycle corrector," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 20, no. 4, pp. 760–764, Apr. 2012.
- [13] R. Schreier, G. C. Temes, and S. R. Norsworthy, *Delta-Sigma Data Converters: Theory, Design, and Simulation*. Piscataway, NJ, USA: IEEE Press, 1997.
- [14] S. Ye, L. Jansson, and I. Galton, "A multiple-crystal interface PLL with VCO realignment to reduce phase noise," *IEEE J. Solid-State Circuits*, vol. 37, no. 12, pp. 1795–1803, Dec. 2002.
- [15] C. L. Wei, T. K. Kuan, and S. I. Liu, "A subharmonically injection-locked PLL with calibrated injection pulsewidth," *IEEE Trans. Circuits Syst. II, Express Briefs*, vol. 62, no. 6, pp. 548–552, Jun. 2015.
- [16] R. B. Staszewski, C.-M. Hung, N. Barton, M.-C. Lee, and D. Leipold, "A digitally controlled oscillator in a 90 nm digital CMOS process for mobile phones," *IEEE J. Solid-State Circuits*, vol. 40, no. 11, pp. 2203–2211, Nov. 2005.
- [17] Y.-H. Liu and T.-H. Lin, "A wideband PLL-based G/FSK transmitter in 0.18  $\mu$ m CMOS," *IEEE J. Solid-State Circuits*, vol. 44, no. 9, pp. 2452–2462, Sep. 2009.
- [18] V. Peiris *et al.*, "A 1 V 433/868 MHz 25 kb/s-FSK 2 kb/s-OOK RF transceiver SoC in standard digital 0.18  $\mu$ m CMOS," in *IEEE ISSCC Dig. Tech. Papers*, Feb. 2005, pp. 258–259.
- [19] L. Zhang *et al.*, "A reconfigurable sliding-IF transceiver for 400 MHz/2.4 GHz IEEE 802.15.6/ZigBee WBAN hubs with only 21% tuning range VCO," *IEEE J. Solid-State Circuits*, vol. 48, no. 11, pp. 2705–2716, Nov. 2013.
- [20] C. Bachmann *et al.*, "A 3.5mW 315/400MHz IEEE802.15.6/proprietary mode digitally-tunable radio SoC with integrated digital baseband and MAC processor in 40nm CMOS," in *VLSI Circuits Symp. Dig.*, Jun. 2015, pp. C94–C95.
- [21] M. Vidojkovic *et al.*, "9.7 A 0.33nJ/b IEEE802.15.6/proprietary-MICS/ISM-band transceiver with scalable data-rate from 11kb/s to 4.5Mb/s for medical applications," in *IEEE ISSCC Dig. Tech. Papers*, Feb. 2014, pp. 170–171.



**Kok-Hin Teng** (S'13) received the Bachelor's degree in electrical engineering from the National University of Singapore, Singapore, in 2010, where he is currently pursuing the Ph.D. degree.

He was with GlobalFoundries, Singapore, developing various EDA related tech files using PERL and UNIX programming scripts. In 2012, he joined the Institute of Microelectronics, Singapore, as a Research Engineer, where he was involved in temperature sensor circuits design. His current research interests include frequency synthesizers and low-power transceiver circuits design.



**Chun-Huat Heng** (S'96–M'04–SM'13) received the B.Eng. and M.Eng. degrees from the National University of Singapore, Singapore, in 1996 and 1999, respectively, and the Ph.D. degree from the University of Illinois at Urbana–Champaign, Champaign, IL, USA, in 2003.

From 2001 to 2004, he was with Wireless Interface Technologies, San Diego, CA, USA, which was later acquired by Chrontel. Since 2004, he has been with the National University of Singapore, where he is currently an Associate Professor. His current research interests include CMOS integrated circuits involving synthesizers, delay-locked loops, and transceiver circuits.

Dr. Heng is currently a Technical Program Committee Member of the International Solid-State Circuits Conference and the Asian Solid-State Circuits Conference. He was a recipient of the NUS Annual Teaching Excellence Award in 2008, 2011, and 2013, and the Faculty Innovative Teaching Award in 2009. He was an Associate Editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II. He was on the ATEA Honor Roll in 2014.