

# A CMOS Polar Class-G Switched-Capacitor PA With a Single High-Current Supply, for LTE NB-IoT and eMTC

Elbert Bechthum<sup>1</sup>, Mohieddine El Soussi, *Member, IEEE*, Johan F. Dijkhuis, *Member, IEEE*, Paul Mateman, Gert-Jan van Schaik, Arjan Breeschoten, Yao-Hong Liu<sup>1</sup>, *Senior Member, IEEE*, and Christian Bachmann

**Abstract**—Class-G efficiency enhancement of switched-capacitor (SC) power amplifiers (PAs) requires the power management unit (PMU) to have two high-current power supplies. This paper presents a Class-G efficiency enhancement that only requires one single high-current power supply with a lower bandwidth requirement, significantly simplifying the PMU. The system analysis shows that a transmitter with a resolution of 13 bits and a sample rate of  $F_{RF}/2$  meets the requirements of the Cat-M1 and Cat-NB1 standard. The presented PA consists of a number of parallel SC cells. Each cell can be configured in two output-power modes, using only a single supply. At 807 MHz, the peak output power is 27.1 dBm, with an efficiency (PAE) of 33.3%. The efficiency at  $-6$  and  $-12$  dBFS is 22.5% and 14.4%, respectively, which is an improvement of 1.3x and 1.7x compared to Class-B. The 13-bit resolution enables a power-control range for Cat-M1 and Cat-NB1 signals of  $>63$  dB. In that range, the error vector magnitude (EVM) is  $<4.2\%$ .

**Index Terms**—Cat-M1, Cat-NB1, Class-G, enhanced machine type communication (MTC), long term evolution (LTE), narrow band (NB)-IoT, power amplifier (PA), single-supply Class-G, switched-capacitor (SC) PA.

## I. INTRODUCTION

THE 3rd Generation Partnership Project (3GPP) has introduced two cellular IoT standards based on Long Term Evolution (LTE): Cat-M1 [subtype of enhanced machine type communication (eMTC)] [1] and Cat-NB1 [subtype of narrow band (NB)-IoT] [2]. Massive deployment of the IoT sensor nodes requires low-cost hardware, i.e., small hardware footprint and low-cost battery. A small footprint can be achieved by integrating as much as possible on one CMOS die with few external components. Hence, an integrated CMOS power amplifier (PA) is crucial. Simplifying the power management unit (PMU) function helps to integrate the PMU on the same die, reducing system cost. Reducing the power consumption of the sensor node helps to drive the battery cost down. The energy consumption of a sensor node is studied in [3], where it is shown that the energy consumption of the transmitter (TX) limits the battery life. Hence, for long battery lifetime and low battery cost, the efficiency of the PA is crucial. The 3GPP defines the TX power control range

Manuscript received November 20, 2018; revised February 5, 2019 and March 26, 2019; accepted April 1, 2019. Date of publication May 30, 2019; date of current version June 26, 2019. This paper was approved by Guest Editor Kaushik Sengupta. (*Corresponding author: Elbert Bechthum.*)

The authors are with the Holst Centre, IMEC, 5605 KN Eindhoven, The Netherlands (e-mail: elbert.bechthum@imec.nl).

Color versions of one or more of the figures in this paper are available online at <http://ieeexplore.ieee.org>.

Digital Object Identifier 10.1109/JSSC.2019.2910407

from 23 dBm down to  $-40$  dBm. The Cat-M1 and Cat-NB1 signals can have high peak-to-average power ratio (PAPR) values ( $>6$  dB). Therefore, the average and back-off efficiency is just as important as the peak efficiency. This paper will focus on the uplink (transmit) hardware in the sensor node.

Since the Cat-M1 and Cat-NB1 uplink signals are relatively NB, a polar transmitter (TX) can be used to achieve high efficiency [4], [5]. Based on the characteristics of the signals and based on modeling of the TX, transmitter specifications can be derived (see Section II).

For I/Q transmitters specifically, the (backoff) efficiency can be improved by I/Q sharing [6] or reducing the phase difference between the I and Q vectors [7]. Other techniques for improving backoff efficiency include outphasing [8], Doherty [9]–[11], dynamic impedance modulation [12], supply modulation [13], or switched-capacitor (SC) Class-G [9], [11], [14]. Since SC Class-G is based on switching behavior, it is compatible with standard CMOS processes and is easily portable to smaller technology nodes. Furthermore, Class-G efficiency enhancement does not have a limited bandwidth. One of the disadvantages of Class-G is the need for two high-current supply voltages. These dual supplies require a complex PMU, increasing PMU cost (see Section III).

The Class-G efficiency-enhancement technique, which requires only one single high-current supply voltage, is further discussed in Section IV. This technique is validated in a 40-nm CMOS PA for LTE Cat-M1 and Cat-NB1 for subgigahertz frequencies (699–915 MHz). The focus of the validation is on the differentiating characteristics of the proposed architecture. Key features of the PA include: single-supply Class-G efficiency enhancement technique, fully digital control with  $>60$  dB power-control range, and high spectral purity.

This paper is organized as follows: Section II discusses system considerations and derives key specifications for the PA. PMU challenges for the traditional and proposed Class-G topologies are discussed in Section III. In Section IV, the operation of the proposed single-supply Class-G efficiency enhancement is explained. In Section V, the full PA schematic is explained, and Section VI presents the measurement results.

This paper elaborates further on [15]. The additional contributions of this paper are the analysis of system requirements and PMU considerations, the translation of the results of this analysis into design parameters and choices, the mapping of input code to the usage of the various back-off modes of the PA unit cells, the analysis of the passive power combiner, and more measurement results.



Fig. 1. Block diagram of transmitter system simulation model.

## II. SYSTEM ANALYSIS

For IoT connections based on LTE, range is of high importance (i.e., large link budget). Therefore, the transmitter operating-frequency in this work is limited to the subgigahertz bands that are widely used: 699–915 MHz. Furthermore, only Cat-M1 and Cat-NB1 are considered, using either binary phase-shift keying (BPSK), quadrature phase-shift keying (QPSK), or 16-QAM. The bandwidth ranges from 3.75 kHz for single-carrier Cat-NB1 to 1.08 MHz for Cat-M1 with six resource blocks (6 RB).

This transmitter is designed to comply with the specifications of the 3GPP LTE standard [1], [2]. The main requirements from the LTE standard that are important for the transmitter are error vector magnitude (EVM), spectral mask, and spurious emissions (in particular for coexistence). The EVM should be below 12.5% for 16-QAM or 17.5% for the other modulation types ( $-18$  and  $-15$  dB, respectively). The requirements on spurious emissions in a coexistence scenario go down to  $-50$  dBm/1 MHz, which puts stringent requirements on the digital PA.
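The percentage and decibel forms of the EVM limits quoted above are related by $20\log_{10}$ of the EVM ratio. The following minimal sketch only checks that arithmetic; the function name is illustrative and not taken from the paper.

```python
import math


def evm_percent_to_db(evm_percent: float) -> float:
    """Convert an EVM given in percent to decibels (20*log10 of the ratio)."""
    return 20.0 * math.log10(evm_percent / 100.0)


# EVM limits quoted in the text for the considered modulation types.
for modulation, limit_percent in (("16-QAM", 12.5), ("BPSK/QPSK", 17.5)):
    limit_db = evm_percent_to_db(limit_percent)
    print(f"{modulation}: {limit_percent}% corresponds to {limit_db:.1f} dB")
```

Rounded to whole decibels, the two limits come out at the $-18$ and $-15$ dB used in the remainder of this section.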

### A. Transmitter System Model

The system model shown in Fig. 1 is used to derive the requirements for the sub-blocks of the Cat-M1 and Cat-NB1 transmitter. The baseband signals are generated at 1.92 MHz because of the subcarrier spacing and fast Fourier transform (FFT) size defined in the LTE Cat-M1 standard. The generated baseband signals are up-sampled by a factor of  $M$  ( $F_{BB} = M \cdot 1.92$  MHz), and then filtered by a root-raised cosine filter of order  $12 \cdot M$  and a rolloff factor of 0.22. The signal is then converted into envelope and phase components.

The amplitude [amplitude modulation (AM)] data are first fractionally up-sampled to  $F_{RF}/K$ , where  $F_{RF}$  is the carrier frequency and  $K$  is an integer. The signal is then filtered using a finite impulse response (FIR) filter to suppress the clock images of the AM data, and finally it is quantized with an  $n_{AM}$ -bit quantizer.

The phase data are processed using a simple phase modulator model: the data are differentiated, wrapped to limit the frequency deviation range, and then quantized to  $n_{pha}$  bits and truncated. The accumulated phase error due to most significant bit (MSB) truncation is minimized by a 1-bit sigma-delta quantization-error feedback to ensure long-term phase coherency. The resulting data are up-sampled to  $F_{RF}$ , integrated, and then fed to the PA.
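The phase-path steps above (differentiate, wrap, quantize, feed back the error, integrate) can be sketched as follows. This is a hedged illustration only: the paper does not give the exact structure of the feedback loop, so a generic first-order error-feedback quantizer is assumed here, the up-sampling to $F_{RF}$ is omitted, and the names `phase_path`, `max_dev`, and `error_feedback` are hypothetical.

```python
import math


def wrap(x: float) -> float:
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (x + math.pi) % (2.0 * math.pi) - math.pi


def phase_path(phase, n_pha, max_dev=math.pi, error_feedback=True):
    """Differentiate, wrap, quantize to n_pha bits, and re-integrate a phase sequence.

    Assumption: the error between the wanted and the realizable frequency word
    is fed back into the next sample (first-order error feedback).
    """
    step = 2.0 * max_dev / (1 << n_pha)                      # size of one frequency LSB
    lo, hi = -(1 << (n_pha - 1)), (1 << (n_pha - 1)) - 1     # representable code range
    out, acc, err, prev = [], 0.0, 0.0, 0.0
    for p in phase:
        d = wrap(p - prev)                                   # differentiate and wrap
        prev = p
        v = d + (err if error_feedback else 0.0)             # add previous residue
        code = min(max(math.floor(v / step), lo), hi)        # truncate and range-limit
        q = code * step
        err = v - q                                          # residue for the next sample
        acc += q                                             # integrate back to phase
        out.append(acc)
    return out


# Slow phase ramp: 0.01 rad per sample, well below one LSB for n_pha = 8.
ramp = [0.01 * k for k in range(1, 1001)]
with_fb = phase_path(ramp, n_pha=8)
without_fb = phase_path(ramp, n_pha=8, error_feedback=False)
print("final phase error with feedback   :", abs(ramp[-1] - with_fb[-1]))
print("final phase error without feedback:", abs(ramp[-1] - without_fb[-1]))
```

In this sketch the error without feedback grows with every sample, whereas with feedback it stays below one frequency LSB, which is the long-term phase coherency property the text refers to.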

### B. AM Path

Resolution and sample rate of the AM path have a major impact on the performance of the transmitter. The resolution is



Fig. 2. EVM depends on output power and AM resolution, the limit is  $-15$  dB for QPSK (Cat-M1, 6RB, QPSK, 860.16 MHz).



Fig. 3. Co-existence spurious emissions limit is met for 9-bit quantization noise and a sample rate of  $F_{RF}/2$  (Cat-M1, 6RB, QPSK, 860.16 MHz).

mainly dictated by the requirements on the EVM at the lowest output power and the requirements on the spurious emissions at the highest output power.

1) **AM Resolution:** Fig. 2 shows the simulated EVM for a 6-RB Cat-M1 QPSK signal for various output-power levels and AM resolution values. It shows that, to meet the EVM targets of  $-15$  dB for QPSK and  $-18$  dB for 16-QAM, the resolution of the PA should be 13 bits. An alternative to reduce the required resolution is to use a sigma-delta modulator. However, a sigma-delta modulator increases the quantization noise at frequencies where the stringent coexistence spurious emissions limit applies while only marginally improving EVM. Hence, for this design, no sigma-delta modulator is used and the required PA resolution is set to 13 bits. The quantization-noise levels for 9- and 13-bit PAs are shown in Fig. 3. It can be seen that a 9-bit PA is sufficient to meet the spectral mask of the coexistence scenario. If necessary, techniques exist to further reduce the quantization noise in specific frequency bands, e.g., analog FIR filtering [16], [17].

2) **AM Sample Rate:** The LTE spectral mask and spurious emissions limits require high spectral purity for the PA. Since the AM data is sampled data, signal images occur in the output spectrum around the output frequency at multiples of the AM data rate. A high sample rate positions these images far enough from the output frequency and makes them low enough. Transitions in the AM data should be synchronous



Fig. 4. For hard-clipping up to  $-4$  dB, no violations of the co-existence spurious emissions limit occur (Cat-M1, 6RB, QPSK, 860.16 MHz).

with transitions in the local oscillator (LO) signal to prevent glitches in the output signal which would degrade the spectral purity [18]. Hence, a divided phase-locked loop (PLL) output can be used. Fig. 3 shows the output of a 13-bit PA at a sample rate of  $F_{RF}/2$ . If a lower sample rate were used, e.g.,  $F_{RF}/4$ , signal images would occur at integer multiples of  $F_{RF}/4$  from the output signal, violating the spurious-emissions limit of the co-existence scenario. Hence, the sample rate is chosen to be  $F_{RF}/2$  ( $\sim 400$  MHz). At integer multiples of  $F_{RF}/2$  from the output frequency, no coexistence requirements are present. Since the complexity of the AM processing is very limited, the high AM sample rate does not impact the total power consumption significantly for higher output powers. For lower output powers, the AM processor could be bypassed, eliminating the additional power consumption of the AM processor.
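The position of the AM sampling images for a given divider $K$ follows directly from the statement that they appear at integer multiples of $F_{RF}/K$ from the carrier. The sketch below lists them for the 860.16 MHz carrier used in the simulations; the helper name and the number of images listed are arbitrary choices, and the code does not model which bands carry coexistence limits.

```python
def am_image_frequencies(f_rf_mhz: float, k: int, n_images: int = 2):
    """Return the frequencies (MHz) of the first AM sampling images around the carrier.

    Images are located at f_rf + m * (f_rf / k) for nonzero integers m.
    """
    f_s = f_rf_mhz / k  # AM sample rate in MHz
    return [f_rf_mhz + m * f_s for m in range(-n_images, n_images + 1) if m != 0]


f_rf = 860.16  # carrier frequency used in the system simulations, in MHz
for k in (2, 4):
    images = am_image_frequencies(f_rf, k)
    nearest_offset = min(abs(f - f_rf) for f in images)
    print(f"K = {k}: nearest image offset {nearest_offset:.2f} MHz, images at",
          ", ".join(f"{f:.2f}" for f in images))
```

With $K=4$ the nearest images sit half as far from the carrier as with $K=2$, which is why the lower sample rate is the more critical case for the coexistence limits.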

3) **AM Clipping:** The PAPR has a significant impact on the efficiency of the PA, since it forces the PA to operate in backoff where the PA efficiency is lower than at its peak output power. The PAPR value of a 6RB Cat-M1 signal easily exceeds 9 dB if no clipping is allowed. However, the spectrum only degrades if significant clipping is introduced. Fig. 4 shows the close-in spectrum of a 6RB QPSK Cat-M1 signal for various hard-clipping levels. With respect to no clipping, the spectrum of the signal clipped to  $\text{PAPR}=6$  dB is indistinguishable. Clipping to a PAPR of 4 dB reduces the margin with respect to the spectral mask, but still does not violate it. A lower PAPR value (e.g., 2 dB) clearly violates the spectral mask. Hence, clipping down to a PAPR of 4 dB is acceptable for a 6RB Cat-M1 signal. The same holds for the other signal options that are considered.
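Hard-clipping of the envelope to a target PAPR can be illustrated with a few lines of code. This is a minimal sketch, assuming the clip level is set relative to the RMS value of the unclipped envelope; the paper does not state how the clip level is referenced, and the toy envelope below is not an LTE signal.

```python
import math


def papr_db(envelope) -> float:
    """Peak-to-average power ratio of a non-negative envelope, in dB."""
    mean_power = sum(a * a for a in envelope) / len(envelope)
    peak_power = max(a * a for a in envelope)
    return 10.0 * math.log10(peak_power / mean_power)


def hard_clip(envelope, target_papr_db: float):
    """Clip the envelope at target_papr_db above the RMS of the unclipped signal."""
    rms = math.sqrt(sum(a * a for a in envelope) / len(envelope))
    limit = rms * 10.0 ** (target_papr_db / 20.0)
    return [min(a, limit) for a in envelope]


# Toy envelope with one large peak (illustrative only).
envelope = [1.0] * 99 + [10.0]
clipped = hard_clip(envelope, target_papr_db=4.0)
print(f"PAPR before clipping: {papr_db(envelope):.1f} dB")
print(f"PAPR after clipping : {papr_db(clipped):.1f} dB")
```

Because clipping also lowers the RMS value, the PAPR after a single pass ends up somewhat above the target; an implementation that needs an exact value would have to rescale or iterate.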

## III. POWER MANAGEMENT UNIT

One of the disadvantages of Class-G is the need for two high-current supply voltages. The current in these two power supplies depends on the output signal in a non-linear way, increasing the requirements on the power supply load regulation and dynamic response to prevent distortion of the output signal.

This effect is illustrated in Fig. 5. The left waveform shows the amplitude control for a polar PA at the start of a Cat-M1 6RB QPSK packet, with an average power of  $-6$  dB<sub>FS</sub> of the PA. A simple representation of the traditional Class-G PA (e.g., [14]) is shown in Fig. 5 (top), along with the current in the two separate power supplies. In this example,  $V_{dd2}=2 \cdot V_{dd1}$ . The PA uses current from one of the two supplies or from both supplies, depending on the AM data. The supply currents depend on the AM data in a non-linear way. The energy of the packet is spread over a large frequency range, putting stringent requirements on the static and dynamic behaviors of the PMU and the matching between the two power supplies to guarantee linear modulation. The two power supplies are typically decoupled on the printed circuit board (PCB). However, between the PA core and this decoupling are the bond wires, package, and PCB traces. This degrades the dynamic behavior of the PMU and causes imbalance between the two supplies, degrading the performance of the typical Class-G PA.

A simple representation of a Class-G PA which uses only one power supply, as is proposed in this paper, is shown in Fig. 5 (bottom). The supply current of this PA also depends on the AM, but in such a way as to not increase the bandwidth of the spectrum significantly.

Fig. 6 shows the spectrum of the PA supply currents for the two supplies of the traditional Class-G and the proposed single-supply Class-G. Components up to the signal bandwidth of roughly 1 MHz are present in both PA topologies. The spectral content of the supply current for the proposed PA decreases rapidly for higher frequencies, especially between 1 and 2 MHz. The spectral content of the supply current for the traditional Class-G PA still contains significant energy up to 2 MHz. The difference between these two topologies is more than 15 dB at some frequencies, enabling a less complex PMU solution for a single-supply Class-G PA.

The traditional Class-G PA requires two high-current power supplies, whereas the proposed single-supply Class-G PA only requires one high-current power supply. Furthermore, the two power supplies of the traditional solution require a higher bandwidth and require matching of the static and dynamic responses to guarantee linear modulation. Hence, a single-supply solution clearly has advantages over the traditional dual-supply Class-G PA.

## IV. SINGLE-SUPPLY CLASS-G OPERATION PRINCIPLE

The output stages of the two cores are based on an SC Class-D topology [19]. The switched-capacitor PA (SCPA) offers very linear AM and high efficiency, and allows easy implementation of the Class-G efficiency enhancement. The proposed SCPA consists of a number of parallel cells. Each cell can be configured individually in three different modes: “full-amplitude,” “half-amplitude,” and “off,” but always takes its power from the same supply source  $V_{dd}$ . The principles of these three modes are shown in Fig. 7. In “full-amplitude,” the outputs of the output stage ( $Actp$  and  $Actn$ ) are both switched between  $V_{dd}$  and  $V_{ss}$ . When all cells are in “full-amplitude,” the output power of the PA is maximal. In “half-amplitude,”  $Actp$  and  $Actn$  should have a voltage swing of  $V_{dd}/2$ . In traditional Class-G PAs, this is implemented using a second supply voltage [9], [14], complicating the PMU. In this work,



Fig. 5. Traditional Class-G requires two high-current supply voltages ( $V_{dd2} = 2 \cdot V_{dd1}$ ); the proposed architecture requires only one.



Fig. 6. Comparison of spectrum of supply current (Cat-M1 6RB).

a novel switching topology is proposed which eliminates the need for the second high-power supply voltage. In “half-amplitude,” during the first phase of the LO signal,  $Actp$  is connected to  $V_{dd}$  and  $Actn$  to  $V_{ss}$ . In the second LO-phase,  $Actp$  and  $Actn$  are shorted together. Due to the current loop through the balun, these nodes both settle to  $V_{dd}/2$ . Hence, the swing on both  $Actp$  and  $Actn$  is  $V_{dd}/2$ . When all cells of the PA are in “half-amplitude,” the output power is  $-6$  dB<sub>FS</sub>. In “off,”  $Actp$  and  $Actn$  are static, producing no output power.
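The $-6$ dB<sub>FS</sub> level of the “half-amplitude” mode can be checked with idealized node waveforms. The sketch below is a simplification under stated assumptions: ideal two-level waveforms with 50% duty cycle, $Actp$ and $Actn$ switching in antiphase in “full-amplitude,” both nodes resting at $V_{ss}$ in “off,” and instantaneous settling to $V_{dd}/2$ when the nodes are shorted. The function names are illustrative.

```python
import cmath
import math


def differential_waveform(mode: str, vdd: float = 2.2, samples: int = 1000):
    """Return one LO period of the idealized differential voltage Actp - Actn."""
    half = samples // 2
    if mode == "full":        # both nodes toggle between Vdd and Vss, in antiphase
        actp = [vdd] * half + [0.0] * half
        actn = [0.0] * half + [vdd] * half
    elif mode == "half":      # phase 1: Vdd/Vss, phase 2: nodes shorted at Vdd/2
        actp = [vdd] * half + [vdd / 2.0] * half
        actn = [0.0] * half + [vdd / 2.0] * half
    elif mode == "off":       # static nodes (assumed at Vss), no RF output
        actp = [0.0] * samples
        actn = [0.0] * samples
    else:
        raise ValueError(f"unknown mode: {mode}")
    return [p - n for p, n in zip(actp, actn)]


def fundamental_amplitude(waveform) -> float:
    """Amplitude of the fundamental component over one period (single-bin DFT)."""
    n = len(waveform)
    s = sum(v * cmath.exp(-2j * math.pi * i / n) for i, v in enumerate(waveform))
    return 2.0 * abs(s) / n


full = fundamental_amplitude(differential_waveform("full"))
half = fundamental_amplitude(differential_waveform("half"))
print(f"half/full fundamental ratio: {half / full:.3f} "
      f"({20.0 * math.log10(half / full):.2f} dB)")
```

Under these assumptions the fundamental of the “half-amplitude” waveform is half that of the “full-amplitude” waveform, in line with the $-6$ dB<sub>FS</sub> stated above.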

AM is implemented by selecting how many cells are in each mode. Assuming the array is unary weighted, the ideal output power during AM, similar to [14], is

$$P_{out} = P_{peak} \cdot \left( \frac{n_f + n_h/2}{N} \right)^2 \quad (1)$$

where  $n_f$  is the number of cells in the full-amplitude mode,  $n_h$  is the number of cells in the half-amplitude mode, and  $N$  is the total number of cells. The proposed PA consists of two active cores (see Section V). A simplified overview of the usage of



Fig. 7. Principle of single-supply backoff efficiency enhancement.

the PA modes, using both PA cores to implement linear AM, is shown in Fig. 8(a). When sweeping the output power from zero to maximum, the PA starts with all cells “off.” Increasing output power (AM code) is achieved by changing the mode of some cells from “off” to “half-amplitude.” Only when all cells are in “half-amplitude” and more output power is required, some cells are configured in “full-amplitude” instead of “half-amplitude.” With this strategy, highest efficiency is obtained for all output-power levels. Interleaving the usage of the cells of core A and core B cancels the impact of gain mismatch between the two cores.
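The cell-usage strategy described above, combined with (1), can be written as a small mapping from AM code to mode counts. This is a sketch under simplifying assumptions: a purely unary array, an AM code that counts half-amplitude steps from 0 to $2N$, and no distinction between core A and core B (so the interleaving is not modeled). The function names are hypothetical.

```python
import math


def cell_modes(code: int, n_cells: int):
    """Map an AM code (0 .. 2*n_cells) to (n_f, n_h).

    Cells first go from "off" to "half-amplitude"; only when all cells are in
    "half-amplitude" are they promoted to "full-amplitude".
    """
    if not 0 <= code <= 2 * n_cells:
        raise ValueError("code out of range")
    if code <= n_cells:
        return 0, code                      # only half-amplitude cells active
    n_f = code - n_cells
    return n_f, n_cells - n_f               # remaining cells stay in half-amplitude


def relative_power_db(n_f: int, n_h: int, n_cells: int) -> float:
    """Output power relative to peak according to (1), in dB."""
    return 20.0 * math.log10((n_f + n_h / 2.0) / n_cells)


N = 6  # e.g., three unit cells per core and two cores, as in the Fig. 8 example
for code in (3, 6, 9, 12):
    n_f, n_h = cell_modes(code, N)
    print(f"code {code:2d}: n_f = {n_f}, n_h = {n_h}, "
          f"P_out = {relative_power_db(n_f, n_h, N):.2f} dB relative to peak")
```

With this mapping the normalized amplitude $(n_f + n_h/2)/N$ equals the code divided by $2N$, so the amplitude remains linear in the code across the transition between the two modes.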

For low output powers, only few cells will be active. In that case, one of the active cores can be disabled, saving the power



Fig. 8. Cell usage for highest efficiency for all values of AM code, example for three unit cells per PA core, for using both active cores (a) and a single core (b).

consumption of the local LO buffers of that core. The usage of the power modes of the remaining core is shown in Fig. 8(b).

## V. ARCHITECTURE

The proposed PA architecture is shown in Fig. 9. The required peak output power of the PA depends on the average output power (23 dBm) and the PAPR of the Cat-M1 and Cat-NB1 signals. With a PAPR value of 4 dB the target peak output power is 27 dBm. To achieve the desired peak output power of 27 dBm, three techniques are employed: power combining, impedance transformation, and cascoding.

The equation for the maximum output power of a single SCPA of [20] can easily be extended to cover a differential SCPA with power combiner

$$P_{out} = \frac{2}{\pi^2} \cdot \frac{(V_{dd} \cdot N_{stage} \cdot N_{turn})^2}{R_{load}} \quad (2)$$

where  $V_{dd}$  is the supply voltage (2.2 V),  $N_{stage}$  is the number of single-ended active stages (4),  $N_{turn}$  is the turn ratio of the transformers (2), and  $R_{load}$  is the load impedance ( $100\ \Omega$ ). Incorporating (1), the ideal output power during AM is

$$P_{out} = \frac{2}{\pi^2} \cdot \left( \frac{n_f + n_h/2}{N} \right)^2 \cdot \frac{(V_{dd} \cdot N_{stage} \cdot N_{turn})^2}{R_{load}}. \quad (3)$$

The ideal output power of 28.0 dBm accommodates 1.0 dB of implementation loss to achieve the required 27 dBm. Each differential active core ideally generates 25 dBm in its  $12.5\text{-}\Omega$  load. The external load impedance of  $100\ \Omega$  is chosen to enable a design of the external matching network with sufficient suppression at the third harmonic (HD3). For the matching network, a  $\pi$  structure is chosen as a compromise between component count and suppression at HD3.
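The numbers quoted above can be reproduced by evaluating (2) with the stated design values. In the sketch below the per-core figure is obtained by applying the same expression on the primary side of one transformer (two single-ended stages, no turn ratio, $12.5\ \Omega$ load); that reading of the text is an assumption, and the function names are illustrative.

```python
import math


def scpa_peak_power_w(vdd: float, n_stage: int, n_turn: int, r_load: float) -> float:
    """Ideal peak output power of the differential SCPA with power combiner, per (2)."""
    return 2.0 / math.pi ** 2 * (vdd * n_stage * n_turn) ** 2 / r_load


def watt_to_dbm(p_watt: float) -> float:
    """Convert a power in watts to dBm."""
    return 10.0 * math.log10(p_watt * 1e3)


# Complete PA: 2.2 V supply, 4 single-ended stages, 1:2 transformers, 100-ohm load.
p_total = scpa_peak_power_w(vdd=2.2, n_stage=4, n_turn=2, r_load=100.0)

# One differential core seen at the transformer primary (assumed interpretation).
p_core = scpa_peak_power_w(vdd=2.2, n_stage=2, n_turn=1, r_load=12.5)

print(f"ideal peak power, complete PA: {watt_to_dbm(p_total):.2f} dBm")
print(f"ideal peak power, one core   : {watt_to_dbm(p_core):.2f} dBm")
print(f"target after 1.0 dB loss     : {watt_to_dbm(p_total) - 1.0:.2f} dBm")
```

The two cores together deliver the same power as the complete-PA expression, which is a useful consistency check on the $12.5\ \Omega$ per-core load: $100\ \Omega$ split over two series-connected secondaries and divided by the square of the turn ratio.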

### A. Power Combining and Impedance Transformation

The output powers of two active cores (Core A and Core B) are combined using transformers with 1:2 turn ratio. The secondary sides of the transformers are series connected, implementing a power combiner [21]. The disadvantages of using on-chip transformers are that they require a large chip area and cause a loss of 1–2 dB in the PA output. However, they save a bulky on-PCB power combiner and simplify the electrostatic discharge (ESD) protection of the PA output: applying antiparallel diodes at the ground terminal of the power combiner output is sufficient.

The layout of the power combiner can be seen in Fig. 9. The majority of the tracks are in ultrathick copper (in red); the crossing connections are in either thick copper (green) or the aluminum top layer (gray). To maximize the inductance while minimizing the used area, the primary side consists of two turns, whereas the secondary side consists of four turns, achieving the desired 1:2 turn ratio. This turn ratio implements an impedance transformation which further lowers the impedance seen by the active stages. The power transfer of the power combiner is given in Fig. 10. The balun is used at the lower corner frequency of the wideband transfer. Moving this corner frequency even lower would cost considerable additional chip area. The wideband transfer enables future dual-band operation at the 900 and 1800 MHz bands [10]. In this design, the external matching network will limit the bandwidth (see Section VI).

### B. Schematic

Each active core (see Fig. 9) consists of a number of parallel cells. Each cell contains an output capacitor ( $C_p$  and  $C_n$ ), an output stage ( $M_1$ – $M_{12}$ ), buffers, and control logic. The capacitors implement a controllable capacitive voltage divider to enable AM by driving some of the cells with the RF frequency while shorting others to (ac) ground. The output stage uses cascoding to double the maximum supply voltage [14]. The levels of the control signals  $A$  to  $D$  and  $A'$  to  $D'$  are designed to guarantee robust operation, similar to [14]. The LO signals are shifted to the high-side switches using self-biased ac-coupled buffers [22]. These buffers feature duty-cycle calibration (indicated with  $DuCy$  in Fig. 9) to suppress the second harmonic of the output signal [23].

Power losses in this PA occur in the balun (finite coupling), the output stage (on-resistance, gate capacitance, overlap currents), and the LO distribution (parasitic capacitance and buffering).

### C. Supply Considerations

The main PA supply voltage  $V_{dd}$  is 2.2 V.  $V_{ddDig}$  and  $V_{ddLO}$  are both 1.1 V, and the current consumption of these supplies is dominated by the input buffers for the  $LO$  and the  $AM$  signals. These input buffers will not be present in the final application where the  $LO$  and  $AM$  signals are generated on-chip. Therefore, the current consumption on  $V_{ddDig}$  and  $V_{ddLO}$  is not taken into account for the PAE calculation.

The current consumption between  $V_{dd}$  and  $V_{dd}/2$  is not identical to the current consumption between  $V_{dd}/2$  and  $V_{ss}$ , mainly because of the size difference between the pMOS and nMOS transistors of the PA output stage, which creates imbalance in the current consumption. Therefore, the  $V_{dd}/2$  power domain needs to be stabilized with a low-current regulator, see Fig. 9. For this design, the regulator is implemented with an on-PCB opamp (supplied by  $V_{dd}$ ), but can easily be integrated on chip in future implementations. This regulator



Fig. 9. Schematic of PA (green: input buffers and digital processing, blue: LO distribution, black: PA core circuits). Layout of the power combiner, consisting of two 2:4 transformers.



Fig. 10. Efficiency of power combiner ( $P_{out}/P_{in}$ ).

only needs to supply a low current ( $|I| < 20 \text{ mA}$ ), and hence, its implementation is far less complicated than the high-current power supply of  $V_{dd}$ . The power consumption of this regulator is included in the PAE calculation. Hence, the total power consumption for the PAE calculation is  $I_{main} \cdot V_{dd}$ , see Fig. 9.

### D. Resolution and Bandwidth

The target dynamic range of the PA is determined by the power control range (63 dB), the maximum PAPR ( $\sim 4 \text{ dB}$ ), and the required EVM (12.5% for Cat-M1 16-QAM), dictating the 13-bit resolution. The resolution is divided over the 2 PA cores (1 bit), 2 PA power modes (1 bit), 127 unary cells (7 bits), and 4 binary cells (4 bits). This segmentation is a tradeoff between layout complexity, area, and linearity.
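The bit budget of this segmentation can be tallied as follows. The sketch only adds up the contributions listed above and converts the total to an ideal full-scale-to-LSB amplitude ratio; treating the PA as an ideal uniform 13-bit quantizer is an assumption made here for illustration, and the actual resolution requirement follows from the EVM simulation of Fig. 2.

```python
import math

# Bits contributed by each part of the segmentation, as listed in the text.
segments = {
    "PA cores (2)": 1,
    "power modes (2)": 1,
    "unary cells (127)": math.ceil(math.log2(127 + 1)),  # 127 cells give 128 levels
    "binary cells (4)": 4,
}

total_bits = sum(segments.values())
lsb_dbfs = -20.0 * math.log10(2 ** total_bits)   # ideal LSB amplitude vs. full scale

power_control_range_db = 63.0   # from the text
max_papr_db = 4.0               # from the text
required_span_db = power_control_range_db + max_papr_db

print(f"total resolution: {total_bits} bits")
print(f"ideal LSB level : {lsb_dbfs:.1f} dB relative to full scale")
print(f"span of power control plus PAPR: {required_span_db:.0f} dB")
```

In this idealized view roughly 11 dB separate the LSB from the lowest average level that must be reached, which gives a feel for why the EVM at the lowest output power is the limiting case.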

The input AM signal has a sample rate of 61.44 MS/s and is up-sampled to  $F_{RF}/2$  using the on-chip AM processor (see Fig. 9). After up-sampling, a linear interpolation FIR filter suppresses the signal images of the input sample rate. The spurious emissions limit of  $-30 \text{ dBm}$  mainly concerns the second and third harmonic (HD2 and HD3). The HD2 is determined by the symmetry in the differential layout. To counteract any systematic or random mismatch, duty-cycle calibration is implemented in the global LO buffers [23] (see Fig. 9). The HD3 is filtered by the matching network.

Because of the high sample rate of  $F_{RF}/2$ , the PA core can in principle transmit signals with a bandwidth of  $F_{RF}/4 \approx 200 \text{ MHz}$ . Because of the lack of anti-aliasing filtering, this would result in large aliases. The digital AM processor is designed for a maximum signal bandwidth of 1.4 MHz, limiting the maximum demonstrable signal bandwidth.

### E. Layout

The proposed PA is implemented in a 40-nm CMOS process. The chip photograph is shown in Fig. 11. The area of the PA is  $1.8 \text{ mm}^2$ , the area of the complete die ( $5.0 \text{ mm}^2$ ) is dominated by the required number of I/O. The chip photograph indicates the main blocks of the PA, and the PA output and inputs. The other bumps are used for the configuration interface, auxiliary power supplies, and some are unconnected (e.g., the bump between the two output bumps). These unconnected bumps are left in for mechanical support and for thermal considerations.

For maximum efficiency, the delay differences between all unit cells (unary and binary) should be minimized. The unit cells of each active core are arranged in an array of  $8 \times 17$  cells, containing the 128 unary cells, 4 binary cells, and 4 dummy cells. To guarantee minimal difference between the response of the unary cells and the binary cells, the binary cells are embedded inside the array and only differ from the unary cells in the size of the output capacitor. The total delay of the LO distribution and of the output recombination routing is the same for each unit cell [24].

## VI. MEASUREMENT RESULTS

For measurements, the flip-chip Wafer-Level Chip-Scale Package (WLCSP) is directly mounted on the PCB.



Fig. 11. Chip micrograph.

The reference plane for the output power ( $P_{out}$ ) measurements is the SubMiniature version A (SMA) connector at the edge of the PCB; hence, the on-PCB loss and the loss of the matching network are not de-embedded. In this work,  $PAE = P_{out}/(I_{main} \cdot V_{dd})$ , where  $I_{main}$  includes the current consumption of the  $V_{dd}$  and  $V_{dd}/2$  power domains. The power consumption of  $V_{ddLO}$  and  $V_{ddDig}$  is not included in the PAE, since it is dominated by input buffers that are not present in the final application, where the signal generation is on-chip. Thanks to the high linearity of the SC topology, no digital predistortion (DPD) is necessary to comply with the standard. Hence, no DPD is used.

The peak output power and PAE at 807 MHz (middle of subgigahertz LTE bands) are 27.1 dBm and 33.3%, respectively. When compensating for the on-PCB loss, the output power and efficiency are 27.3 dBm and 34.9%, respectively. In the remainder of this paper, the SMA connector at the edge of the PCB is used as reference plane, and hence, the on-PCB loss is not de-embedded.
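The relation between the two pairs of numbers above can be verified from the PAE definition used in this work. The sketch below derives the dc power from the measured peak point and re-evaluates the PAE for the de-embedded output power; the implied supply current is a derived value that is not quoted in the paper.

```python
def dbm_to_mw(p_dbm: float) -> float:
    """Convert a power in dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


V_DD = 2.2                 # main PA supply voltage in volts
P_OUT_SMA_DBM = 27.1       # measured at the SMA connector
PAE_SMA = 0.333            # PAE referred to the SMA connector
P_OUT_DEEMBEDDED_DBM = 27.3

# PAE = P_out / (I_main * V_dd), so the dc power follows from the measured point.
p_dc_mw = dbm_to_mw(P_OUT_SMA_DBM) / PAE_SMA
i_main_a = p_dc_mw / 1e3 / V_DD

# Same dc power, output power corrected for the on-PCB loss.
pae_deembedded = dbm_to_mw(P_OUT_DEEMBEDDED_DBM) / p_dc_mw

print(f"dc power          : {p_dc_mw:.0f} mW")
print(f"implied I_main    : {i_main_a * 1e3:.0f} mA")
print(f"PAE, de-embedded  : {pae_deembedded * 100.0:.1f} %")
```

The de-embedded PAE obtained this way agrees with the 34.9% stated above, which confirms that the 0.2 dB on-PCB loss is the only difference between the two figures.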

The load-pull characteristics of the PA are shown in Fig. 12. For a VSWR of 1:3, the worst-case output power is 3 dB lower than at  $50\ \Omega$ , and the worst case efficiency is half the efficiency at  $50\ \Omega$ .

At  $-6\ dB_{FS}$ , the PAE is 22.5%, which is an improvement of  $1.3\times$  with respect to a typical Class-B PAE curve. At  $-12\ dB_{FS}$ , the PAE is 14.4%, which is an improvement of  $1.7\times$ . The efficiency curve as a function of the output power is shown in Fig. 13. Comparison to a typical Class-B PAE curve clearly shows the improvement, thanks to the three-mode output stage. Due to suboptimal implementation, overlap currents limit the efficiency in the “half-amplitude” mode, which can easily be improved by altering the design



TABLE I  
SUMMARY AND COMPARISON

|                          | This work       | [10]                  | [25]           | [8]                  | [9]               | [14]       |
|--------------------------|-----------------|-----------------------|----------------|----------------------|-------------------|------------|
| Technology               | 40nm            | 55nm                  | 65nm           | 45nm                 | 65nm              | 65nm       |
| Area                     | mm <sup>2</sup> | 5.0 / 1.8*            | 1.11           | 1.62                 | 1.21              | 3.2        |
| Freq. range              | MHz             | 699-915               | 850 / 1700     | 750-1015             | 900 / 2400        | ~2900-4800 |
| Standards                |                 | Cat-M1 / NB1          | Cat-NB1 / WLAN | 802.11ac             | LTE               | -          |
| Incl. on-PCB loss        | Y               | ?                     | Y              | N                    | N                 | Y          |
| Integrated balun         | Y               | Y                     | N              | N                    | Y                 | N          |
| DPD-less                 | Y               | N (LUT)               | N (LUT)        | N                    | N (AM-PM)         | Y          |
| Back-off PAE improvement | PAE             | Single-supply Class-G | Doherty        | Voltage-mode Doherty | Class-G & Doherty | Class-G    |
| Supply                   | V               | 2.2***                | 1.2/2.4        | 2.4/1.2              | 1.7/1.2           | 3.0/1.65   |
| Freq                     | MHz             | 807                   | ~850           | 900                  | 900               | 3710       |
| Peak Pout                | dBm             | 27.1                  | 28.9           | 24.0                 | 24.4              | 26.7       |
| PAE@peak                 | %               | 33.3                  | 36.8           | 45                   | 55                | 40.2**     |
| PAE@-6dB <sub>FS</sub>   | %               | 22.5                  | 29.9           | 34                   | 32                | 37.0**     |
| PAE@-12dB <sub>FS</sub>  | %               | 14.4                  | ~15            | ~17                  | 11                | 26.2**     |
| Modulation               |                 | Cat-M1 1.4MHz 16QAM   | Cat-NB1 180kHz | 40MHz 256QAM         | 10MHz             | 1MHz 16QAM |
| PAPR                     | dB              | 4.5                   | 4.5            | 9                    | 6                 | 5.4        |
| PAE                      | %               | 21.2                  | 29.5           | 22                   | 32                | 28.8       |
| EVM                      | %               | 3.1                   | 8.3            | 1.8                  | -                 | 6.3        |
| M1&NB1 Pout range        | dBm             | -42 to +22            | -              | -                    | -                 | -          |

\*PA only; \*\*Drain efficiency; \*\*\*The PA also uses a 1.1V supply, generated by an on-PCB regulator. The power consumption of this regulator is included in the PAE.



Fig. 13. PAE as a function of output power at 807 MHz (continuous wave).
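The back-off PAE values in Table I can be put in perspective with a short calculation. The sketch below assumes a simple Class-B reference whose efficiency scales linearly with output amplitude, starting from the measured peak PAE; this reference model and the function names are illustrative assumptions, not taken from the measurement setup.

```python
# Measured PAE values at 807 MHz, taken from Table I.
PAE_PEAK = 33.3                        # % at peak output power
PAE_BACKOFF = {-6: 22.5, -12: 14.4}    # % at the given back-off in dBFS


def class_b_pae(pae_peak, backoff_db):
    """Assumed Class-B reference: efficiency proportional to output amplitude."""
    return pae_peak * 10 ** (backoff_db / 20)


def improvement(backoff_db):
    """Ratio of the measured back-off PAE to the assumed Class-B reference."""
    return PAE_BACKOFF[backoff_db] / class_b_pae(PAE_PEAK, backoff_db)


for level in (-6, -12):
    print(f"{level} dBFS: Class-B reference {class_b_pae(PAE_PEAK, level):.1f}%, "
          f"improvement {improvement(level):.2f}x")
```

Under this assumption, the ratios come out close to the 1.3x and 1.7x improvements over Class-B quoted for $-6$ and $-12$ dBFS.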



Fig. 14. AM/AM distortion at 807 MHz.

Cat-NB1 performance is validated using a  $12 \times 15$  kHz QPSK signal at 807 MHz. The maximum (average) output power is 22.8 dBm, and the corresponding PAE is 23.1%.
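As a rough illustration of what these numbers imply for the power budget, the sketch below converts output power from dBm to mW and estimates the DC power drawn. It assumes that the RF input power of the digital PA is negligible, so that PAE is approximately $P_{out}/P_{DC}$; the helper names are hypothetical.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (p_dbm / 10)


def dc_power_mw(pout_dbm, pae_percent):
    """Estimate DC power in mW, assuming PAE ~= Pout / Pdc (negligible RF input power)."""
    return dbm_to_mw(pout_dbm) / (pae_percent / 100)


# Cat-NB1, 12 x 15 kHz QPSK: 22.8 dBm average output power at 23.1% PAE.
nb1_dc = dc_power_mw(22.8, 23.1)

# Continuous-wave peak from Table I: 27.1 dBm at 33.3% PAE.
peak_dc = dc_power_mw(27.1, 33.3)

print(f"Cat-NB1: Pout = {dbm_to_mw(22.8):.0f} mW, estimated Pdc = {nb1_dc:.0f} mW")
print(f"CW peak: Pout = {dbm_to_mw(27.1):.0f} mW, estimated Pdc = {peak_dc:.0f} mW")
```

The estimate covers only what the PAE definition captures; power consumed elsewhere in the transmitter is not included.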



Fig. 15. Frequency dependence of output power and efficiency (spikes in PAE are due to variations in output-power measurements).

The spectral mask compliance is shown in Fig. 16, together with the spectral mask compliance of a  $1 \times 3.75$  kHz signal. The worst-case EVM is 4.0% at  $-40$  dBm, well below the required 17.5% (see Fig. 17). This output power sweep demonstrates a large dynamic range, enabled by the 13-bit resolution. A wideband spectrum is shown in Fig. 18. The only violation is at HD3, which can be suppressed by slightly changing the matching network.

For a 6RB 16-QAM Cat-M1 signal, the maximum output power is 21.8 dBm with a PAE of 21.2%. The worst-case EVM is 4.22%, which is much better than the required 12.5%. The best-case EVM for Cat-NB1 is worse than for Cat-M1 due to the tighter baseband filter for Cat-NB1, necessary to meet the spectral mask. The adjacent channel leakage ratio (ACLR) for the Cat-M1 signal is  $<-33$  dB (see Fig. 19), which is better than the required  $-30$  dB. The spectral mask compliance is shown in Fig. 16. The output power for Cat-M1 and Cat-NB1 signals ranges from  $-42$  dBm to  $+22$  dBm.

Fig. 16. PA meets all requirements on the spectral emissions mask. (a) Cat-M1, 6RB, 16QAM, at 807 MHz. (b) Cat-NB1, 1  $\times$  3.75 kHz at position 47, QPSK, at 807 MHz. (c) Cat-NB1, 12  $\times$  15 kHz, QPSK, at 807 MHz.

Fig. 17. EVM versus output power at 807 MHz for Cat-NB1 and Cat-M1 signals.
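The measured values reported above can be checked against their limits with a few lines of arithmetic. The sketch below is illustrative only: the dictionary layout and function names are assumptions, and the ACLR entry uses $-33$ dB as an upper bound because the measurement is reported as "$<-33$ dB".

```python
import math

# (measured, limit) pairs taken from the text; lower is better for all entries.
MEASURED = {
    "Cat-NB1 worst-case EVM (%)": (4.0, 17.5),
    "Cat-M1 worst-case EVM (%)": (4.22, 12.5),
    "Cat-M1 ACLR (dB)": (-33.0, -30.0),   # upper bound of the measured ACLR
}


def passes(measured, limit):
    """A requirement is met when the measured value does not exceed the limit."""
    return measured <= limit


def evm_margin_db(measured_pct, limit_pct):
    """EVM margin expressed in dB (ratio of two amplitude-like percentages)."""
    return 20 * math.log10(limit_pct / measured_pct)


def control_range_db(p_min_dbm, p_max_dbm):
    """Output power control range in dB."""
    return p_max_dbm - p_min_dbm


for name, (measured, limit) in MEASURED.items():
    print(f"{name}: measured {measured}, limit {limit}, pass = {passes(measured, limit)}")

print(f"Cat-NB1 EVM margin: {evm_margin_db(4.0, 17.5):.1f} dB")
print(f"Power control range: {control_range_db(-42, 22)} dB")
```

The last line shows that an output power range of $-42$ dBm to $+22$ dBm corresponds to 64 dB, consistent with the stated control range of $>63$ dB.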



Fig. 18. Spurious-emissions test for Cat-NB1 12  $\times$  15 kHz QPSK. HD3 can be reduced by optimizing the filtering of the matching network.



Fig. 19. ACLR  $< -33$  dB for Cat-M1 6RB 16-QAM (requirement:  $< -30$  dB).

A performance summary and comparison with state-of-the-art PAs is given in Table I. The main highlights are: the PA only requires a single supply voltage; the PA features an on-chip balun, and the measurement results include all on-PCB losses; the specifications of Cat-M1 and Cat-NB1 with respect to EVM, ACLR, and spectral emission masks are met without using DPD; and the output power control range is  $>63$  dB.

## VII. CONCLUSION

RF PAs with Class-G RF efficiency enhancement require a complex power-supply setup because of the two high-power supply domains. The proposed integrated CMOS PA demonstrates a novel Class-G efficiency enhancement that requires only a single high-current power supply. This significantly reduces the PMU complexity with respect to traditional Class-G efficiency enhancement methods. The proposed architecture is validated in a Cat-M1 and Cat-NB1 PA.

The presented single-supply efficiency enhancement paves the way for single-chip Cat-NB1 and Cat-M1 transceivers with high average efficiency, simple PMU, and high dynamic range, enabling low-cost implementation of cellular sensor nodes.

## REFERENCES

- [1] A. Rico-Alvarino *et al.*, "An overview of 3GPP enhancements on machine to machine communications," *IEEE Commun. Mag.*, vol. 54, no. 6, pp. 14–21, Jun. 2016.
- [2] Y.-P. E. Wang *et al.*, "A primer on 3GPP narrowband Internet of Things," *IEEE Commun. Mag.*, vol. 55, no. 3, pp. 117–123, Mar. 2017.
- [3] M. El Soussi, P. Zand, F. Pasveer, and G. Dolmans, "Evaluating the performance of eMTC and NB-IoT for smart city applications," in *Proc. IEEE Int. Conf. Commun. (ICC)*, May 2018, pp. 1–7.

- [4] E. Bechthum, G. Radulov, J. Briaire, G. Geelen, and A. van Roermund, "Classification for synthesis of high spectral purity current-steering mixing-DAC architectures," *Analog Integr. Circuits Signal Process.*, vol. 85, no. 3, pp. 497–504, Dec. 2015.
- [5] P. Reynaert and M. S. J. Steyaert, "A 1.75-GHz polar modulated CMOS RF power amplifier for GSM-EDGE," *IEEE J. Solid-State Circuits*, vol. 40, no. 12, pp. 2598–2608, Dec. 2005.
- [6] S.-W. Yoo, S.-C. Hung, and S.-M. Yoo, "A 1W quadrature class-G switched-capacitor power amplifier with merged cell switching and linearization techniques," in *Proc. IEEE Radio Freq. Integr. Circuits Symp. (RFIC)*, Jun. 2018, pp. 124–127.
- [7] W. Yuan and J. S. Walling, "A multiphase switched capacitor power amplifier," *IEEE J. Solid-State Circuits*, vol. 52, no. 5, pp. 1320–1330, May 2017.
- [8] L. Ding, J. Hur, A. Banerjee, R. Hezar, and B. Haroun, "A 25 dBm outphasing power amplifier with cross-bridge combiners," *IEEE J. Solid-State Circuits*, vol. 50, no. 5, pp. 1107–1116, May 2015.
- [9] S. Hu, S. Kousai, and H. Wang, "A broadband mixed-signal CMOS power amplifier with a hybrid class-G Doherty efficiency enhancement technique," *IEEE J. Solid-State Circuits*, vol. 51, no. 3, pp. 598–613, Mar. 2016.
- [10] Y. Yin, L. Xiong, Y. Zhu, B. Chen, H. Min, and H. Xu, "A compact dual-band digital Doherty power amplifier using parallel-combining transformer for cellular NB-IoT applications," in *ISSCC Dig. Tech. Papers*, Feb. 2018, pp. 408–410.
- [11] V. Vorapipat, C. S. Levy, and P. M. Asbeck, "A class-G voltage-mode Doherty power amplifier," *IEEE J. Solid-State Circuits*, vol. 52, no. 12, pp. 3348–3360, Dec. 2017.
- [12] L. Ye, J. Chen, L. Kong, E. Alon, and A. M. Niknejad, "Design considerations for a direct digitally modulated WLAN transmitter with integrated phase path and dynamic impedance modulation," *IEEE J. Solid-State Circuits*, vol. 48, no. 12, pp. 3160–3177, Dec. 2013.
- [13] J. S. Walling, S. S. Taylor, and D. J. Allstot, "A class-G supply modulator and class-E PA in 130 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 44, no. 9, pp. 2339–2347, Sep. 2009.
- [14] S.-M. Yoo *et al.*, "A class-G switched-capacitor RF power amplifier," *IEEE J. Solid-State Circuits*, vol. 48, no. 5, pp. 1212–1224, May 2013.
- [15] E. Bechthum *et al.*, "A CMOS polar single-supply class-G SCPA for LTE NB-IoT and Cat-M1," in *Proc. IEEE 44th Eur. Solid State Circuits Conf. (ESSCIRC)*, Sep. 2018, pp. 30–33.
- [16] S. M. Taleie, T. Copani, B. Bakkaloglu, and S. Kiaei, "A bandpass $\Delta\Sigma$ RF-DAC with embedded FIR reconstruction filter," in *ISSCC Dig. Tech. Papers*, Feb. 2006, pp. 2370–2379.
- [17] R. Bhat, J. Zhou, and H. Krishnaswamy, "Wideband mixed-domain multi-tap finite-impulse response filtering of out-of-band noise floor in watt-class digital transmitters," *IEEE J. Solid-State Circuits*, vol. 52, no. 12, pp. 3405–3420, Dec. 2017.
- [18] A. Ba *et al.*, "A 1.3 nJ/b IEEE 802.11ah fully-digital polar transmitter for IoT applications," *IEEE J. Solid-State Circuits*, vol. 51, no. 12, pp. 3103–3113, Dec. 2016.
- [19] S.-M. Yoo, J. S. Walling, E. C. Woo, and D. J. Allstot, "A switched-capacitor power amplifier for EER/polar transmitters," in *Proc. IEEE Int. Solid-State Circuits Conf.*, Feb. 2011, pp. 428–430.
- [20] S.-M. Yoo, J. S. Walling, E. C. Woo, B. Jann, and D. J. Allstot, "A switched-capacitor RF power amplifier," *IEEE J. Solid-State Circuits*, vol. 46, no. 12, pp. 2977–2987, Dec. 2011.
- [21] P. Haldi, G. Liu, and A. M. Niknejad, "CMOS compatible transformer power combiner," *Electron. Lett.*, vol. 42, no. 19, pp. 1091–1092, Sep. 2006.
- [22] V. Grygorenko, *Offset Compensation for High Gain AC Amplifiers*. San Jose, CA, USA: Cypress, 2005.
- [23] A. Ba, V. K. Chilla, Y. Liu, H. Kato, K. Philips, and R. B. Staszewski, "A 2.4 GHz class-D power amplifier with conduction angle calibration for -50dBc harmonic emissions," in *Proc. IEEE Radio Freq. Integr. Circuits Symp.*, Jun. 2014, pp. 239–242.
- [24] T. Chen and G. G. E. Gielen, "The analysis and improvement of a current-steering DAC's dynamic SFDR-I: The cell-dependent delay differences," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 53, no. 1, pp. 3–15, Jan. 2006.
- [25] V. Vorapipat, C. Levy, and P. Asbeck, "A wideband voltage mode Doherty power amplifier," in *Proc. IEEE Radio Freq. Integr. Circuits Symp. (RFIC)*, May 2016, pp. 266–269.



**Elbert Bechthum** received the B.Sc. and M.Sc. degrees in electrical engineering and the Ph.D. degree (*cum laude*) from the Eindhoven University of Technology, Eindhoven, The Netherlands, in 2006, 2008, and 2015, respectively. His Ph.D. thesis was focused on wideband mixing-DACs with high spectral purity.

He is currently with IMEC, Eindhoven, where he is involved in ultralow-power radios and cellular IoT. He has authored or coauthored ten publications. He holds one patent. His research interests include high-performance data converters, low-power radios, and digital PAs.



**Mohieddine El Soussi** (M'09) received the M.Sc. degree in communication engineering from the Technische Universität München, Munich, Germany, in 2007, and the Ph.D. degree from the Université catholique de Louvain (UCL), Louvain-la-Neuve, Belgium, in 2014.

Since 2016, he has been with IMEC, Eindhoven, The Netherlands. His research interests include cooperative networks, physical layer security, MIMO, localization, network coding, and optimization.



**Johan F. Dijkhuis** (M'10) received the M.S. degree in electrical engineering from the University of Twente, Enschede, The Netherlands, in 1998.

Starting in 1998, he worked as an RF and Analog Design Engineer with Philips Semiconductor, Nijmegen, The Netherlands; NXP, Nijmegen; ST-Ericsson, Nijmegen; and NVIDIA, Nijmegen. Since 2014, he has been with the Holst Centre, IMEC, Eindhoven, The Netherlands. His research interests are RF circuit design and power management circuits for ultralow-power radios.



**Paul Mateman** received the bachelor's degree in computer technology from the Poly Technical School, Eindhoven, The Netherlands, in 1988, and the M.S. degree in electrical engineering from the Delft University of Technology, Delft, The Netherlands, in 1993.

He was with Philips Research, Eindhoven; Rockwell, Sophia Antipolis, France; NXP, Nijmegen, The Netherlands; ST-Ericsson, Nijmegen; and NVIDIA, Nijmegen. In 2014, he joined the Holst Centre, IMEC, Eindhoven, where he is currently involved in mixed-signal RF blocks (e.g., ADPLL).



**Gert-Jan van Schaik** received the bachelor's degree from The Hague University of Applied Sciences, The Hague, The Netherlands, in 1995.

He was with Agere Systems, Nieuwegein, The Netherlands, and Motorola B.V., Nieuwegein. In 2011, he joined the Holst Centre, IMEC, Eindhoven, The Netherlands, where he is currently involved in digital design.



**Arjan Breeschoten** received the B.Sc. degree in computer technology from HTS Windesheim, Zwolle, The Netherlands, in 1994.

Until 2008, he worked in the semiconductor industry on research and development of ASICs for the security, WiFi, WiMax, and DECT application domains. In 2009, he joined the Holst Centre, IMEC, Eindhoven, The Netherlands, where he focused on system integration and project management for ultralow-power communications solutions.



**Yao-Hong Liu** (S'04–M'09–SM'17) received the Ph.D. degree from National Taiwan University, Taipei, Taiwan, in 2009.

From 2002 to 2010, he was with Terax, Hsinchu, Taiwan, Intel, Taipei, and Mobile Devices, Taipei, where he was involved in Bluetooth, WiFi, and cellular wireless SoC products. Since 2010, he has been with IMEC, Eindhoven, The Netherlands. He is currently a Principal Member of Technical Staff, and he is leading the development of the ultralow-power wireless IC design. His research focuses on energy-efficient RF transceivers and radars for the IoT and healthcare applications.

Dr. Liu serves on the technical program committees of the IEEE ISSCC and the RFIC Symposium.



**Christian Bachmann** received the Ph.D. degree in electrical engineering from the Graz University of Technology, Graz, Austria, in 2011.

He was involved in various wireless communication solutions for 802.11ah WiFi, Bluetooth LE, 802.15.4 (Zigbee), ultrawideband impulse radio, and so on. He was with Infineon Technologies, Graz, and Graz University of Technology, where he was involved in the research of hardware-accelerated power estimation for VLSI systems. In 2011, he joined IMEC, Eindhoven, The Netherlands, where he is involved in ultralow-power wireless communication systems, digital baseband processing, and hardware/software codesign. He is currently a Program Manager of IMEC's ULP Wireless Systems and Secure Proximity Programs.