

# A Fully Synthesizable All-Digital Dual-Loop Distributed Low-Dropout Regulator

Xiangyu Mao<sup>1</sup>, Member, IEEE, Yan Lu<sup>1</sup>, Senior Member, IEEE, and Rui P. Martins<sup>2</sup>, Life Fellow, IEEE

**Abstract**—Distributed low-dropout voltage regulators (LDOs) can mitigate the global IR drop and improve the local transient performance for a high-current large-area power delivery network. However, they also face integration and current-sharing challenges. To tackle these challenges, this article presents an all-digital dual-loop distributed LDO with one global controller (GC) and multiple scalable local voltage regulators (LVRs). For full synthesizability and easy integration, all the control circuits are implemented using standard digital cells. In each LVR, we use a 5-bit time-to-digital converter (TDC) for fast local voltage sensing. The global integral loop provides the dynamic reference bits for all the LVRs, compensating for the TDC PVT variations in the LVRs. This all-digital comparator-TDC quantizer, combined with coarse-fine tuning and asynchronous window control, enables the proposed LDO architecture to obtain high output accuracy and one-cycle transient response. For current balancing in distributed scenarios, we introduce a digital primary-secondary one-time calibration scheme to tackle the mismatches among the local TDCs. A distributed LDO prototype with one GC and nine LVRs is implemented in a 28-nm bulk CMOS process. Measurements with one LVR and multiple LVRs demonstrate the stability and scalability of the proposed architecture. With nine LVRs, the measured droop is 54 mV under a 1.35-A/10-ns sharp load step. We also obtain a good load regulation of 3 mV/A, a peak current efficiency of 99.67%, and a current density of 16.7 A/mm<sup>2</sup>.

**Index Terms**—Asynchronous window control, distributed LDO, dual loop, fully synthesizable, low-dropout regulator, primary-secondary calibration.

## I. INTRODUCTION

THE number of cores in a microprocessor grows substantially to meet the huge demand for high-performance computing. Per-core dynamic voltage and frequency scaling (DVFS) can significantly improve the system energy efficiency of multicore processors [1]. Fully integrated LDOs

Manuscript received 11 February 2023; revised 27 October 2023; accepted 29 November 2023. Date of publication 22 December 2023; date of current version 29 May 2024. This article was approved by Associate Editor Sanu K Mathew. This work was supported in part by the National Natural Science Foundation of China under Grant 62122001, in part by the Hetao Shenzhen-Hong Kong Science and Technology Innovation Cooperation Zone Project under Grant HTHZQSWS-KCCYB-2023030, and in part by the Macau Science and Technology Development Fund under Grant SKL-AMSV(UM)-2023-2025. (*Corresponding author: Yan Lu*)

Xiangyu Mao and Yan Lu are with the State Key Laboratory of Analog and Mixed-Signal VLSI, the Institute of Microelectronics, and FST-DECE, University of Macau, Macau, China (e-mail: xymao@um.edu.mo; yanlu@um.edu.mo).

Rui P. Martins is with the State Key Laboratory of Analog and Mixed-Signal VLSI, the Institute of Microelectronics, and FST-DECE, University of Macau, Macau, China, on leave from the Instituto Superior Técnico, Universidade de Lisboa, 1049-001 Lisbon, Portugal.

Color versions of one or more figures in this article are available at <https://doi.org/10.1109/JSSC.2023.3340008>.

Digital Object Identifier 10.1109/JSSC.2023.3340008



Fig. 1. Distributed LDO for a large-area microprocessor.

can create individual voltage domains for each core due to the advantages in cost, power density, accuracy, and transient response [2], [3].

Traditionally, designers would place an LDO on one side or on a corner of the load block. For high-current applications, especially when the physical domain is large ( $>1 \text{ mm}^2$ ), the IR drop across the supply rail becomes considerable due to the sophisticated power delivery network. A large IR drop may cause trimming failures [4]. A large supply voltage guard band is then required to guarantee performance, but it wastes energy in proportion to $V_{DD}^2$ [5]. In this situation, a distributed regulator architecture becomes attractive [6], [7], [8], [9], [10], [11], [12], [13]. Fig. 1 shows a large-area system with multiple regulators that are distributed spatially within the load blocks. The distributed power network helps to reduce the current redistribution through the grid and thus can decrease the IR drop. In addition, due to the multiple sensing points and their short distances to the load circuits, the distributed regulators can respond much faster to sudden local load transients.

In addition to the typical requirements of a single-point LDO, distributed LDOs face further challenges. First, a distributed LDO scheme has a stronger need for an all-digital solution, as this makes it much easier to insert the regulators into the digital system and to migrate them together to future advanced process nodes. Second, since multiple regulators with different offsets share the same input–output power grid, we should pay attention to the current-sharing/-balancing problem. Severely unbalanced currents may lead to local hot spots and limit the local transient response [7].



Fig. 2. (a) Prior parallel distributed LDOs, (b) dual-loop distributed LDOs, and (c) proposed all-digital dual-loop distributed LDO.

Third, beyond the stability of a single LDO, it is also critical to analyze the stability of the distributed multiregulator system. In addition, the LDO output capability should be scalable without system stability issues.

We can categorize distributed LDOs into two types: parallel architectures [8], [9], [10] and dual-loop architectures [11], [12]. Fig. 2(a) and (b) exhibit the block diagrams of prior parallel and dual-loop distributed LDOs, respectively. The biggest challenge of the parallel architectures is the current-sharing problem. For a digital LDO (DLDO) with a proportional-integral-derivative (PID) controller, we can calculate the control word $M$ of the power switches (PSs) as follows:

$$M = K_P \times e[n] + K_I \times \Sigma e[n] + K_D \times (e[n] - e[n - 1]) \quad (1)$$

where $e[n]$ represents the difference between $V_{OUT}$ and $V_{REF}$, and $K_P$, $K_I$, and $K_D$ are the coefficients of the controller. For the parallel distributed DLDOs, since each DLDO requires its own quantizer and digital controller, the quantization error among different DLDOs may lead to imbalanced current sharing. According to (1), we have

$$\Delta M = K_P \times \Delta e[n] + K_I \times \Sigma \Delta e[n] \quad (2)$$

where $\Delta M$ represents the difference in the number of “on” PSs. Due to the integral term $\Sigma \Delta e[n]$, even a small quantization error between two DLDOs can lead to a large difference in their final control codes. In [8], the measured maximum unbalanced current error can be almost 100%.
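To illustrate why the integral term in (2) dominates, the following minimal sketch accumulates a constant error difference $\Delta e$ between two parallel DLDOs. The gains `k_p` and `k_i`, the one-LSB mismatch, and the cycle count are illustrative assumptions and are not taken from [8].

```python
def delta_m(delta_e, k_p, k_i, cycles):
    """Control-word difference from (2) after each clock cycle,
    assuming a constant error difference delta_e between two DLDOs."""
    integral = 0.0
    history = []
    for _ in range(cycles):
        integral += delta_e                       # running sum of delta_e[n]
        history.append(k_p * delta_e + k_i * integral)
    return history


# Hypothetical gains and a one-LSB quantization mismatch.
trace = delta_m(delta_e=1, k_p=2, k_i=1, cycles=20)
print(trace[0], trace[-1])   # the difference keeps growing with time
```

In a real regulator the control word would saturate at the number of available PSs, so the sketch only shows the trend: the proportional part stays bounded while the integral part grows without limit.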

Besides the current-sharing issue, the parallel-distributed DLDOs with comparator-based quantizers [9], [10] require multiple global analog reference signals routing in a noisy digital environment, imposing design difficulties.

The dual-loop architectures have a global controller (GC) and multiple local voltage regulators (LVRs). Owing to the shared GC, the prior dual-loop architectures [11], [12] can achieve current balancing more easily. However, the prior dual-loop works [11], [12], [13] adopt many analog circuit blocks, such as a charge-pump (CP)-based integrator, an analog comparator (CMP), and a current-mode flash ADC (CMFADC). These analog circuits may lengthen the design cycle and the time for process migration. Also, we need to integrate these circuits carefully in the digital environment to avoid interference. Besides, they all require a large output capacitor (750 nF in [11], 481 nF in [12], and 1000 nF in [13]), which considerably increases the cost.

According to the above analysis, a dual-loop and fully synthesizable distributed DLDO could be a better choice to achieve current balancing and easier integration. Fig. 2(c) shows our proposed dual-loop all-digital distributed LDO architecture. Section II presents the advantages of the single-point version of the proposed dual-loop LDO architecture, its three key techniques (comparator-TDC quantizer, coarse-fine tuning, and asynchronous window control), and the loop stability analysis. In Section III, we investigate how to distribute the proposed LDO and suggest the principles for scaling. In addition, we introduce the digital primary-secondary calibration for current balancing among the LVRs. Section IV demonstrates the distributed DLDO system prototype with one GC and nine LVRs. The measurement results confirm the efficacy of the proposed techniques. Finally, we draw conclusions in Section V.

## II. DUAL-LOOP ARCHITECTURE AND WORKING PRINCIPLE

### A. Dual-Loop Architecture for Digital Control LDO

Fig. 3 shows the prior DLDO architectures and the proposed dual-loop architecture. Fig. 3(a) is the shift-register-based control DLDO [14]. The comparator (CMP) compares $V_{REF}$ and $V_{OUT}$ in each clock cycle, determining the shift direction of the bidirectional shift register (Bi-SR). Since an error of only a few millivolts is sufficient to change the comparator output, the dc load regulation error can be very small, giving high output accuracy. However, in each cycle, only one power transistor unit can be turned on or off, leading to a slow transient response.

Fig. 3(b) illustrates the prior DLDO with ADC-based control. A high-accuracy fast-response DLDO usually requires a high-resolution and high-speed ADC, which is complex and power hungry. Moreover, the output accuracy is limited by the ADC's resolution, and increasing the resolution complicates the digital controller design exponentially, with higher power consumption and area. Clearly, there are design tradeoffs among power, speed, accuracy, and complexity.

Fig. 3(c) presents the proposed dual-loop digital control architecture. In this work, we separate the ADC's high-accuracy and high-speed requirements first and then combine



Fig. 3. (a) Shift-register-based DLDO, (b) ADC-based DLDO, and (c) proposed dual-loop digital control architecture.



Fig. 4. (a) CMP-based ADC, (b) TDC-based ADC, and (c) proposed CMP-TDC structure with closed-loop regulation.

them with a proportional and integral control. The comparator and the binary shift register form an accurate integral loop, which ensures high output accuracy. The ADC, subtractor, and poststage controller form a high-speed proportional part. Here, we only need a coarse-resolution high-speed ADC, which can significantly reduce the design complexity of the ADC and controller.

The proposed CMP-ADC dual-loop architecture combines the advantages of the SR-based and the ADC-based DLDOs and has obvious advantages compared to each of them. For the distributed LDOs, we need to find a suitable ADC structure for easy integration and fast transient response. This ADC in the LVRs needs to detect  $V_{OUT}$  changes quickly.

We can categorize the commonly used ADCs into the voltage domain ones [15], [16] and time domain ones [17], [18]. The voltage-domain ADC [Fig. 4(a)] uses multiple comparators and voltage references to detect  $V_{OUT}$  variations. In a single LDO, [15] has five comparators, and [16] has 13 comparators. For the distributed architecture, it means that multiple analog references are required for global long-distance routing in a noisy digital environment. Moreover, tens or even hundreds of comparators need calibration, which significantly increases the complexity and cost.

The time-domain ADCs [Fig. 4(b)] using voltage-controlled oscillators (VCOs) or time-to-digital converters (TDCs) can obtain a multibit detection in one cycle and do not need



Fig. 5. (a) NAND-based comparator. (b) Inverter-chain TDC.

multiple references. However, they are sensitive to PVT variations. Mahajan et al. [17] utilized a pair of VCOs and an analog voltage-to-current converter to resist PVT variations, at twice the power consumption and area cost. On the other hand, Bang et al. [8] used only one 6-bit TDC but required a complex active calibration for the target code.

In the proposed dual-loop architecture, we adopt a TDC as the voltage sensor [Fig. 4(c)]. Although the TDC output is sensitive to PVT and frequency variations, the integral loop automatically tracks these variations and adjusts the corresponding $Q_B$ value to maintain output accuracy. The blue dotted line in Fig. 4(c) shows the process of the closed-loop tracking of $Q_B$.

If the TDC output increases due to the PVT and frequency variations, the difference $D_B$ decreases. After being processed by the controller and the power stage, $V_{OUT}$ will decrease. Then, since $V_{OUT} < V_{REF}$, the comparator output remains high. Therefore, the shift-register output $Q_B$ increases, the difference $D_B$ increases, and $V_{OUT}$ rises back to the target value. Since the variations are either slowly changing (temperature, aging) or relatively fixed (process, frequency), while the regulation speed of the integral loop is at the nanosecond level, it is easy to track these variations and to guarantee stability.

Fig. 5 plots the schematics of the comparator and the TDC. We implement all the circuits with standard digital library cells, making them compatible with digital design flows and easily scalable with advanced process technologies.

### B. Working Principle of the Proposed DLDO

We exhibit in Fig. 6 the single-point version of the proposed DLDO architecture with asynchronous floating window control. The integral loop adopts a 10-bit binary-output Bi-SR



$Q_B<9:4>$ : Binary code, higher bit output of Bi-SR.

$Q_B<3:0>$ : Binary code, lower bit output of Bi-SR.

$T_B<4:0>$ : Binary code, output of TDC.

$D_B<5:0>$ : Binary code,  $D_B = Q_B - T_B$ .

B2T: Binary code to thermometer code converter.

$M_B<3:0>$ : Binary code, output of the asynchronous window.

$C_T<12:1>$ : Thermometer code, control signals of the coarse switches.

$F_T<15:1>$ : Thermometer code, control signals of the fine switches.

Fig. 6. Proposed DLDO structure with asynchronous window control.



Fig. 7. Working principle of the proposed DLDO.

rather than a conventional barrel shift register to reduce the number of DFFs. In addition, the binary output is good for subsequent digital signal processing and layout distribution. Both the CMP and Bi-SR use a divided-by-2 clock. The higher bits  $Q_B<9:4>$  serve as a reference code for the fast proportional loop, while the lower bits  $Q_B<3:0>$  are converted to thermometer code  $F_T<15:1>$  for the fine switches. The fast proportional loop employs an inverter-chain-based TDC to quantize  $V_{OUT}$  into  $T_B<4:0>$ . Then, the asynchronous logics perform a subtraction between  $Q_B<9:4>$  and  $T_B<4:0>$ , obtaining their difference  $D_B<5:0>$ , with  $D_B<5>$  as the sign bit.

The asynchronous window has both small- and large-signal hybrid responses. When  $D_B<5:0>$  is within the window, its output  $M_B<3:0> = D_B<3:0>$ . If it exceeds the upper (1100) or lower (0000) limits, the coarse switches will be all-on or all-off.

$$M_B<3:0> = \begin{cases} 0000, & D_B < 0 \\ D_B<3:0>, & 0 \leq D_B \leq 12 \\ 1100, & D_B > 12. \end{cases} \quad (3)$$
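The window function in (3) can be sketched in a few lines of Python. This is a behavioral illustration only: the function names are hypothetical, $D_B$ is treated as a plain signed integer rather than a 6-bit word with a sign bit, and the thermometer list ordering is an assumption.

```python
def async_window(d_b, w=12):
    """Map the signed difference D_B to the coarse control word M_B per (3)."""
    if d_b < 0:
        return 0          # lower limit 0000: all coarse switches off
    if d_b > w:
        return w          # upper limit 1100: all coarse switches on
    return d_b            # inside the window: M_B<3:0> = D_B<3:0>


def binary_to_thermometer(m_b, width=12):
    """B2T conversion: C_T<width:1> as a list, index 0 holding C_T<1>."""
    return [1 if i < m_b else 0 for i in range(width)]


for d_b in (-3, 5, 20):
    m_b = async_window(d_b)
    print(d_b, m_b, binary_to_thermometer(m_b))
```

Inside the window the loop behaves proportionally (small-signal response), while outside it the output clamps to all-off or all-on (large-signal response).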

Fig. 7 illustrates the abovementioned working principle. In the steady state,  $V_{OUT}$  is stable, and the TDC output  $T_B<4:0>$  would not change, despite the inherent limit-cycle oscillation



Fig. 8. Simulated load transient waveforms under different corners.

for all the digitally controlled systems [19]. The proposed DLDO works with only the Bi-SR, and the integral loop regulates all the PSs with  $Q_B<9:0>$ . When all the fine PSs are on ( $Q_B<3:0> = 1111$ ), if the load current increases, the carry bit increases  $Q_B<9:4>$  by 1, thus turning on a coarse power switch while resetting  $Q_B<3:0>$  to 0000. Since the width ratio of a coarse switch to a fine switch is 16:1, it is equivalent to turning on one fine switch. This is a smooth and high-accuracy regulation. When a sudden load transient occurs, since the subtractor and control window are asynchronous circuits and the TDC can detect a  $V_{OUT}$  droop in one cycle, the fast local loop can turn on the coarse PSs proportionally to prevent further voltage drop. Then, the integral loop with  $Q_B<9:0>$  will regulate  $V_{OUT}$  to the target value.
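The carry behavior described above can be checked with a small sketch that expresses the enabled switch strength in units of one fine switch. The helper names and the TDC code `t_b = 2` are hypothetical, and the sketch assumes that $D_B$ stays inside the window of (3).

```python
COARSE_TO_FINE = 16   # width ratio of a coarse switch to a fine switch


def split_code(q_b):
    """Split the 10-bit Bi-SR word into Q_B<9:4> and Q_B<3:0>."""
    return (q_b >> 4) & 0x3F, q_b & 0xF


def strength_in_fine_units(q_b, t_b):
    """Total enabled strength in units of one fine switch.
    Assumes D_B = Q_B<9:4> - T_B stays inside the window of (3)."""
    high, low = split_code(q_b)
    coarse_on = high - t_b        # D_B, passed through unchanged in-window
    return COARSE_TO_FINE * coarse_on + low


t_b = 2                           # hypothetical, constant TDC code
before = strength_in_fine_units(0b0001011111, t_b)   # Q_B<3:0> = 1111
after = strength_in_fine_units(0b0001100000, t_b)    # carry into Q_B<9:4>
print(before, after, after - before)
```

The difference of one fine unit across the carry is what makes the handover between fine and coarse switches smooth.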

Fig. 8 displays the simulated load transient waveforms of the single DLDO. When  $V_{IN} = 1$  V,  $V_{OUT} = 0.9$  V, and  $C_L = 1.5$  nF, the load capability  $I_{MAX}$  is about 250 mA. For a load step of 30–180 mA with a 0.1-ns edge time, the typical undershoot is about 96 mV. We also give the simulated waveforms under other corners (FF, 125 °C and SS, -40 °C). The zoomed-in waveforms on the right show that the proposed DLDO works well and maintains a high output accuracy under different simulation corners.



Fig. 9. Simulated transient droop comparison between  $W = 12$  and  $W = 6$ .

The digital window size  $W$  relates to the loop stability and transient performance. It determines the number of coarse PSs. Here,  $I_{\text{MAX}}$  represents the maximum load current,  $I_{\text{UNIT\_C}}$  is the current of a single coarse power switch, and  $I_{\text{UNIT\_F}}$  is the current of a single fine power switch. We have

$$I_{\text{UNIT\_C}} = \frac{I_{\text{MAX}}}{W}, \quad I_{\text{UNIT\_F}} = \frac{I_{\text{UNIT\_C}}}{16} = \frac{I_{\text{MAX}}}{16 \times W}. \quad (4)$$
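As a quick numerical illustration of (4), take the simulated single-DLDO load capability $I_{\text{MAX}} \approx 250$ mA quoted above and the two window sizes compared in Fig. 9. These worked numbers are an illustration rather than reported values, and they assume the same $I_{\text{MAX}}$ for both window sizes and ideal unit matching:

$$W = 12: \quad I_{\text{UNIT\_C}} \approx \frac{250\ \text{mA}}{12} \approx 20.8\ \text{mA}, \quad I_{\text{UNIT\_F}} \approx \frac{20.8\ \text{mA}}{16} \approx 1.3\ \text{mA}$$

$$W = 6: \quad I_{\text{UNIT\_C}} \approx \frac{250\ \text{mA}}{6} \approx 41.7\ \text{mA}, \quad I_{\text{UNIT\_F}} \approx \frac{41.7\ \text{mA}}{16} \approx 2.6\ \text{mA}.$$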

For a given total strength of the PSs, the smaller the $W$ value, the greater the strength of each coarse power switch. When a load transient occurs, the same voltage droop then generates a larger output current, which helps to improve the transient performance.

Fig. 9 compares the transient droop $V_{\text{DROOP}}$ for $W = 12$ and $W = 6$ across edge times $T_{\text{EDGE}}$ from 0.1 to 20 ns. With $V_{\text{IN}} = 1$ V, $V_{\text{OUT}} = 0.9$ V, and $C_L = 3$ nF, the LDO is stable under both conditions, while $V_{\text{DROOP}}$ is reduced by 15–25 mV when $W = 6$.

### C. Standalone LDO Stability Analysis

Fig. 10 shows the simplified model of the proposed DLDO. To simplify the analysis, we neglect the small delay of the asynchronous controller. Meanwhile, the power switch can be modeled as a constant current source for fixed $V_{\text{IN}}$ and $V_{\text{OUT}}$. In addition, we can further simplify the stability analysis according to the working state of the DLDO. There are two scenarios for the steady state, as follows.

For steady-state case 1, the  $V_{\text{OUT}}$  value is within the TDC (coarse) input boundaries, the TDC output  $T_B$  would be fixed, and thus, we can ignore the proportional loop. The DLDO works in a shift-register mode [14], as shown in Fig. 10(b). This is a stable integral loop with a dominant pole at dc.

For steady-state case 2, the $V_{\text{OUT}}$ value is close to the TDC coarse sensing thresholds, and the TDC output oscillates between two adjacent codes ( $N$, $N - 1$ ) due to the output ripple. In this case, $Q_B<3:0>$ and $T_B<4:0>$ both change, so the fine-switch and coarse-switch regulations are both involved. However, since $Q_B<9:4>$ is fixed, the integral and proportional loops are effectively connected in parallel. The bandwidth of the proportional loop far exceeds that of the integral loop. According to the analysis in [20] and [21], once the



Fig. 10. (a) Simplified model of the standalone DLDO and that in steady-state (b) case 1 and (c) case 2.

integral loop and proportional loop are connected in parallel, a zero locates near the intersection of the loop gain curves, as shown in Fig. 10 (top right). This zero cancels a dc pole from the integral loop. Therefore, the DLDO stability depends on the proportional loop.

In the special case where $Q_B<3:0> = 1111$ and a carry is needed, one coarse power switch is turned on and $Q_B<3:0>$ is reset to 0000. Since one coarse power switch turns on and 15 fine PSs turn off, the variation in the output current is equivalent to turning on one fine power switch. Therefore, in this situation, $Q_B<9:0>$ can also be treated as effectively unchanged for the steady-state analysis.

According to the decomposition of working states above, we can greatly simplify the stability analysis. The open-loop transfer function of the proportional loop is given by

$$H(s) = K_{\text{TDC}} \times \frac{R_O}{1 + sR_O C_L} \times \text{ZOH}(s) \times I_{\text{UNIT\_C}} \quad (5)$$

where $K_{\text{TDC}}$ is the TDC gain, equal to $1/(\text{TDC resolution})$, and $\text{ZOH}(s)$ is the transfer function of the zero-order hold [22]. We can rewrite the equation as follows:

$$H(s) = \frac{K_{\text{TDC}} R_O I_{\text{MAX}}}{W(1 + sR_O C_L)} \times \frac{1 - e^{-Ts}}{sT}. \quad (6)$$

We can analyze the stability by drawing the Bode plot according to (6). We have explored the loop stability with respect to the following parameters: load resistance $R_L$, asynchronous window size $W$, output capacitor $C_L$, and load capability $I_{\text{MAX}}$ (representing the total power transistor strength). With $V_{\text{IN}} = 1$ V, $V_{\text{OUT}} = 0.9$ V, $I_{\text{MAX}} = 0.25$ A, $C_L = 1.5$ nF, and $W = 12$, we sweep $R_L$ to



Fig. 11. Stability across (a) load current, (b) window size, (c)  $C_L$ , and (d) PSs strength  $I_{MAX}$ .

evaluate the impact of load current [Fig. 11(a)]. The phase margin is above 65°, and the loop is stable. As shown in Fig. 11(b) and (c), the loop becomes more stable with a larger window size and a larger output capacitor. A larger output capacitor can improve the transient performance, but a larger $W$ will increase the overshoot and undershoot during load transients (Fig. 9). Also, a smaller $I_{MAX}$ makes the system more stable [see Fig. 11(d)].
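For readers who want to reproduce the trend with respect to $W$, the sketch below evaluates (6) on the $j\omega$ axis and finds the unity-gain frequency by bisection. Several inputs are assumptions that the text does not specify: a TDC resolution of 20 mV (so `k_tdc = 50` per volt), a sampling period `t_clk` of 1.25 ns, and an output resistance `r_o` of 9 Ω. The resulting numbers are therefore indicative only and are not the values plotted in Fig. 11.

```python
import math


def loop_gain(f, k_tdc, r_o, c_l, i_max, w, t_clk):
    """Magnitude and phase (degrees) of H(j*2*pi*f) from (6).
    Valid for 0 < f < 1/t_clk, where the ZOH sinc term is positive."""
    omega = 2.0 * math.pi * f
    x = omega * t_clk / 2.0
    dc_gain = k_tdc * r_o * i_max / w
    mag = dc_gain / math.hypot(1.0, omega * r_o * c_l) * abs(math.sin(x) / x)
    phase = -math.atan(omega * r_o * c_l) - x      # pole lag + ZOH lag
    return mag, math.degrees(phase)


def phase_margin(k_tdc, r_o, c_l, i_max, w, t_clk):
    """Return (unity-gain frequency, phase margin in degrees).
    Assumes the dc gain exceeds 1 so that a crossover exists."""
    lo, hi = 1.0e3, 0.999 / t_clk          # |H| > 1 at lo, < 1 at hi
    for _ in range(200):
        mid = math.sqrt(lo * hi)           # bisect on a log-frequency axis
        if loop_gain(mid, k_tdc, r_o, c_l, i_max, w, t_clk)[0] > 1.0:
            lo = mid
        else:
            hi = mid
    _, phase = loop_gain(lo, k_tdc, r_o, c_l, i_max, w, t_clk)
    return lo, 180.0 + phase


# Assumed values (not from the paper): k_tdc, r_o, t_clk.
params = dict(k_tdc=1 / 0.020, r_o=9.0, c_l=1.5e-9, i_max=0.25, t_clk=1.25e-9)
for w in (6, 12):
    f_c, pm = phase_margin(w=w, **params)
    print(f"W = {w}: f_c = {f_c / 1e6:.0f} MHz, PM = {pm:.1f} deg")
```

Under these assumptions a larger $W$ lowers the dc gain, moves the crossover to a lower frequency, and reduces the ZOH phase lag there, which is consistent with the qualitative trend stated for Fig. 11(b).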

## III. DISTRIBUTED LDO ARCHITECTURE

### A. Architecture of the Distributed LDO

To supply a large-area digital load, we duplicate the fast proportional loop as LVRs and distribute them spatially over the load. Fig. 12 presents the overall architecture of the proposed distributed DLDO prototype with one GC and nine LVRs. The GC consists of the integral loop, a ring oscillator (RO), and all the fine PSs. As the currents provided by the fine PSs are small, we centralize them all in the GC. The GC outputs  $Q_B \langle 9:4 \rangle$  to control all the LVRs. Fig. 12 (upper left) depicts the floor plan of the prototype. The nine LVRs with coarse PSs are distributed across different areas of the digital load. Each LVR has its own sensing point, which can quickly respond to the sudden load change in local and surrounding areas.

With the LVRs distributed over a large area, there are mismatches among the TDCs, resulting in different $T_{BN} \langle 4:0 \rangle$. The mismatch leads to an unbalanced current, as all the LVRs share the same reference code $Q_B \langle 9:4 \rangle$. To solve this, we introduce a digital primary-secondary calibration in each LVR to remove the TDC mismatches. The output code $T_{B1}$ of the TDC in LVR1 is used as the calibration code for the TDCs in the other LVRs. We will introduce the details of the TDC calibration in Section III-B.

DLDOs for microprocessors usually need to support a wide output voltage range for dynamic voltage scaling (DVS). When $V_{OUT}$ is low, the power switch unit current $I_{UNIT}$ increases significantly [7], [17]. However, the digital load's current demand decreases at low voltage. At this point, the load capability $I_{MAX}$ is far beyond the maximum load demand. According to the stability analysis in Section III-C, an excessively large $I_{MAX}$ may make the loop unstable. To optimize the LDO performance over a wide input–output voltage range, we refer to the power strength calibration technique in [11] and [12]. The PSs in each LVR and the fine PSs in the GC are divided into eight parts. When $V_{IN} - V_{OUT}$ is large, we can reduce the number of active parts $P_T \langle 8:1 \rangle$ to improve the loop stability. Table I shows the breakdown of the quiescent current.

### B. TDC Primary-Secondary Calibration

In the steady state, the control code  $M$  of the coarse switches in each LVR can be calculated as

$$M_N = D_{BN} = Q_B - T_{BN} \quad (7)$$

where  $Q_B$  refers to  $Q_B \langle 9:4 \rangle$ . As mentioned above, there are local mismatches among TDCs. Taking the output of the TDC in LVR1 as a reference, the mismatch is

$$C_{BN} = T_{BN} - T_{B1}. \quad (8)$$

According to (7) and (8), we can obtain the difference in the number of enabled coarse switches among LVRs

$$\Delta M_N = T_{B1} - T_{BN}. \quad (9)$$

Compared with (2) (parallel distributed architecture), the TDC quantization error here causes only a proportional current imbalance because there is no integral term. We use the signal TR to trigger the calibration and latch the mismatch value at the TR rising edge. Then, each latched mismatch value is added to the corresponding local loop to obtain balanced current sharing

$$M_N = (Q_B - T_{BN}) + (T_{BN}|_{TR} - T_{B1}|_{TR}). \quad (10)$$
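The calibration in (7)-(10) amounts to a latch-and-add operation, sketched below at the code level. The function names and the TDC codes are hypothetical; the sketch ignores the window clamp of (3) and assumes that the mismatch of each TDC stays constant after the TR edge.

```python
def coarse_codes(q_b, t_b, cal=None):
    """Coarse control codes M_N = Q_B - T_BN (+ latched correction), per (7)/(10)."""
    cal = cal or [0] * len(t_b)
    return [q_b - t + c for t, c in zip(t_b, cal)]


def latch_calibration(t_b_at_tr):
    """At the TR rising edge, latch C_BN = T_BN - T_B1 for every LVR, per (8)."""
    t_b1 = t_b_at_tr[0]            # LVR1 is the primary
    return [t - t_b1 for t in t_b_at_tr]


# Hypothetical TDC codes of four LVRs sensing the same V_OUT.
t_b = [14, 15, 13, 14]
q_b = 20
before = coarse_codes(q_b, t_b)                 # unequal codes
cal = latch_calibration(t_b)
after = coarse_codes(q_b, t_b, cal)             # every LVR follows LVR1
print(before, after)

# A later common shift of all TDC codes (e.g., a V_OUT change) stays balanced.
shifted = coarse_codes(q_b, [t + 1 for t in t_b], cal)
print(shifted)
```

Because the correction is latched once, the scheme is a one-time calibration: it removes static offsets between TDCs but does not track mismatches that drift differently after the TR edge.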

We built a simulation model to verify the current-sharing calibration function, as shown in Fig. 13. We added a dc voltage source $V_{OS}$ with a different value to each LVR to imitate different quantization errors, making the TDC outputs differ. Moreover, all the outputs of the LVRs are directly shorted together to form an extremely unbalanced current scenario. Fig. 14 shows the simulated waveforms of current-sharing calibration, where $V_{OS1}$ to $V_{OS9}$ are 0, -15, -30, -45, -60, 15, 30, 45, and 60 mV, respectively. We can see that, when $I_{LOAD} = 1.8$ A, without calibration, the output currents of the LVRs are 200, 104, 128, 152, 176, 224, 248, 272, and 296 mA, respectively. Relative to the average current, the unbalanced current of each LVR is proportional to the offset voltage. After the rising edge of the calibration signal TR, all the output currents are almost equal to 200 mA.

In the above analysis, we ignored the coarse switch strength variation between LVRs. Since the LVRs are placed in different locations of the chip, even with the same control code value, variations in the strength of the switches will cause some unbalanced current. Usually, such current sharing accuracy is acceptable. To further improve the sharing accuracy, we provide a reference method. First, we put the LDO in a stable light-load condition (idle mode), with only one LVR on at a time while the other LVRs



Fig. 12. Overall architecture of the proposed distributed DLDO.

TABLE I  
BREAKDOWN OF THE QUIESCENT CURRENT

| Main Modules            | Submodule           | Quiescent Current ( $\mu\text{A}$ ) |
|-------------------------|---------------------|-------------------------------------|
| Global Controller       | Comparator          | 72                                  |
|                         | 10-Bit Binary Bi-SR | 90.8                                |
|                         | Other Logic         | 60                                  |
| Local Voltage Regulator | TDC                 | 529                                 |
|                         | Subtractors + Adder | 4.2\*                               |
|                         | Window Logic        | 0.5\*                               |
|                         | Other Logic         | 10                                  |

\*In steady state, since $T_B<4:0>$ and $Q_B<9:4>$ remain unchanged, the main current consumption is the leakage current.



Fig. 13. Digital primary-secondary calibration.

are off, and the fine switches are also off. Then, the load current is provided by the coarse switches of the particular LVR, and we have

$$C_{TN\_AVG} \times I_{CTN} = I_{LOAD} \quad (11)$$



Fig. 14. Simulated waveforms of the calibration function.

where  $I_{CTN}$  is the current through a single coarse switch unit in the  $N$ th LVR, representing the strength of the coarse switch.  $C_{TN\_AVG}$  is the average value of “ON” coarse switches in the  $N$ th LVR, averaged from multiple measurements. For a digital control LDO, we can easily get the  $C_{TN}$  information.

For a fixed load, different  $I_{CTN}$  values mean different  $C_{TN\_AVG}$  values. We can judge the relative strength of the local coarse switches based on the values of  $C_{TN\_AVG}$ , and then, we can calibrate the strength in the PMOS strength calibration table.
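The strength comparison based on (11) can be sketched as follows. The averaged "ON" counts and the 40-mA idle load are hypothetical numbers, and how the result maps into the PMOS strength calibration table is not specified here, so the sketch stops at the relative strength of each LVR.

```python
def unit_currents(i_load, c_tn_avg):
    """Per-LVR coarse unit current I_CTN = I_LOAD / C_TN_AVG, from (11)."""
    return [i_load / c for c in c_tn_avg]


def relative_strength(i_ctn):
    """Unit currents normalized to their mean (1.0 = average strength)."""
    mean = sum(i_ctn) / len(i_ctn)
    return [i / mean for i in i_ctn]


# Hypothetical averaged "ON" counts, one LVR enabled at a time, 40-mA idle load.
c_avg = [4.0, 4.4, 3.8]
i_ctn = unit_currents(0.040, c_avg)
rel = relative_strength(i_ctn)
print([round(r, 3) for r in rel])
```

An LVR that needs more "ON" switches for the same load has weaker unit switches, so its relative strength falls below 1.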

### C. Distributed LDO Stability Analysis

The distributed LDO forms a multi-input and multi-output (MIMO) system [10], [23]. Using state-space representation to investigate the system stability requires a substantial level of



Fig. 15. Circuit model of two-LVR connections with (a)  $R_G = 0$ , (b)  $R_G = \infty$ , and (c)  $R_G$  is a finite value.



Fig. 16. System stability across (a)  $I_{MAX}/C_L$  and (b)  $R_G$ .

mathematical calculations. For simplicity, we build a system model with two LVRs (Fig. 15) and analyze the stability of one LVR in this system. Each LVR has a local load ( $R_{L1}$  and  $R_{L2}$ ) and capacitor ( $C_{L1}$  and  $C_{L2}$ ).  $R_G$  is the parasitic resistance between the two LVRs.

When  $R_G$  is large, there is no interaction between the two LVRs. We can consider each LVR as a separate regulator [Fig. 15(b)]. The stability analysis is the same as that in Section II-C.

When $R_G$ is small, since the two LVRs have the same controller and $V_{OUT}$ response, the system can be treated as one LVR with twice the output capability ( $I_{MAX1} + I_{MAX2}$ ) and twice the output capacitance ( $C_{L1} + C_{L2}$ ). According to the stability analysis in Section II-C, a large $C_L$ makes the LVR more stable, while a large $I_{MAX}$ may make the LVR unstable. Let us fix the ratio of $I_{MAX}/C_L$ and evaluate the stability under different $I_{MAX}$ and $C_L$ values. Fig. 16(a) presents the corresponding Bode plot. The bandwidth is almost the same with different $I_{MAX}$ and $C_L$ values. The phase margin is around 72°, and the system is stable when we maintain or reduce the total $I_{MAX}/C_L$ ratio.
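As a brief check of this equivalence (a sketch derived from (6), not an additional result), note that well above the output pole the loop-gain magnitude reduces to

$$|H(j\omega)| \approx \frac{K_{TDC}}{W} \cdot \frac{I_{MAX}}{\omega C_L} \cdot \left| \frac{\sin(\omega T/2)}{\omega T/2} \right|, \quad \omega \gg \frac{1}{R_O C_L}$$

which depends on $I_{MAX}$ and $C_L$ only through the ratio $I_{MAX}/C_L$. Merging two identical LVRs doubles both quantities, so the unity-gain frequency, and with it the ZOH phase lag $\omega T/2$ at crossover, is unchanged to first order.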

When  $R_G$  is intermediate, each LVR will help other LVRs respond to its local load transient. The change of  $R_{L2}$  and the regulation of LVR2 can be equivalent to a dynamic load current for LVR1. In this way,  $C_{L2}$  becomes the far-end capacitor of LVR1, while  $R_G$  is the equivalent series resistance, as shown in Fig. 15(c). According to (6), we obtain the new



Fig. 17. Simulated waveforms of load transient response with different propagation delays.



Fig. 18. (a) Baseline version of the proposed LDO, (b) distributed PSs using single LVRC, and (c) distributed PSs and LVRCs.

TABLE II  
ENABLED PSS OF EACH LVR WHEN  $I_{LOAD} = 1.2 \text{ A}$

| LVR (Tile ID)    | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|------------------|---|---|---|---|---|---|---|---|---|
| w/o calibration  | 7 | 6 | 6 | 5 | 6 | 6 | 5 | 7 | 6 |
| with calibration | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 |

transfer function

$$H(s) = \frac{K_{TDC} R_{L1} I_{MAX} (1 + s R_G C_{L2}) \times \frac{1}{sT}}{W [1 + s(R_{L1}(C_{L1} + C_{L2}) + R_G C_{L2}) + s^2 R_{L1} C_{L1} R_G C_{L2}]} \quad (12)$$

where $C_{L1} = C_{L2} = 1.5 \text{ nF}$, $I_{MAX} = 250 \text{ mA}$, $R_{L1} = 9\ \Omega$, and we sweep $R_G$ from 10 to 100 Ω. The Bode plot in Fig. 16(b) shows that the bandwidth increases with a larger $R_G$ value and that all the phase margins are more than 70°. For the proposed architecture, when all individual LVRs are stable, the parasitic resistance $R_G$ will not affect the stability of the distributed system.

Another thing to consider is the impact of the propagation delay from the GC to the local controller. When the LVRs are far away from the GC, the propagation delay of $Q_B \langle 9:4 \rangle$ may be several nanoseconds or even more.

In a steady state, the output ripple generated by the LVR's coarse switches should be within a controllable range for loop stability. This requires $Q_B(9:4)$ to remain unchanged or to exhibit at most a 1-bit limit cycle oscillation (LCO). The frequency of the GC is 800 MHz, i.e., a clock period of 1.25 ns. Since $Q_B(9:4)$ comprises the higher bits, a change in $Q_B(9:4)$ requires the four lower bits to traverse 16 codes, so the interval time $T_D$ between changes of $Q_B(9:4)$ is

$$T_D = 16 \times 1.25 \text{ ns} = 20 \text{ ns}. \quad (13)$$



Fig. 19. Die micrograph of the proposed distributed DLDO.



Fig. 20. Load regulation performance of the proposed DLDO.



Fig. 21. Fine PSs enable or disable (one LVR).



Fig. 22. Measured load transient responses under dropout voltage.

If the propagation delay of  $Q_B(9:4)$  is less than 20 ns, during this period, the value of  $Q_B(9:4)$  will not change. However, if the delay is greater than 20 ns, since the control signals of the coarse switches  $C_T(12:1)$  have not changed during the 20-ns period,  $Q_B(9:4)$  will further increase or decrease, resulting in more than a 2-bit LCO in  $Q_B(9:4)$ . This will obviously increase the output ripple, leading to instability.

Fig. 23. Measured load transient responses with different numbers of LVRs and different load steps.

Fig. 24. Measured (a) DVS with PMOS strength adjustment and (b) DVS with frequency variation.

Fig. 17 shows the simulated waveforms of load transient response under different propagation delays. The difference in ripple and transient performance between 0 and 18 ns ( $<20$  ns) is very small, but when the delay is 22 ns ( $>20$  ns), the output ripple increases significantly. Therefore, for distributed applications, we need to ensure the propagation delay of  $Q_B(9:4)$  is less than the interval time  $T_D$ .
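The delay constraint can be written as a small check. This is a minimal sketch, not part of the design flow. It assumes that the four bits below $Q_B(9:4)$ advance by one code per GC clock cycle, which is how the factor of 16 in (13) is read here. The helper names `update_interval_ns` and `delay_is_safe` are hypothetical.

```python
def update_interval_ns(f_gc_hz, n_lower_bits=4):
    """Interval T_D between possible changes of the upper bits, in ns.

    Assumes the n_lower_bits below Q_B(9:4) step one code per GC clock cycle.
    """
    t_clk_ns = 1e9 / f_gc_hz
    return (2 ** n_lower_bits) * t_clk_ns


def delay_is_safe(delay_ns, t_d_ns):
    """True if the Q_B(9:4) propagation delay is shorter than T_D."""
    return delay_ns < t_d_ns


if __name__ == "__main__":
    t_d = update_interval_ns(800e6)
    for delay in (0.0, 18.0, 22.0):  # delays simulated in Fig. 17
        print(f"delay = {delay:4.1f} ns, T_D = {t_d:.1f} ns, "
              f"safe = {delay_is_safe(delay, t_d)}")
```

For an 800-MHz GC clock this gives $T_D = 20$ ns, so the 0- and 18-ns cases of Fig. 17 satisfy the constraint and the 22-ns case does not.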

According to the analysis above, the proposed DLDO architecture can easily expand to other load scenarios. For more flexibility and scalability, we divide a single DLDO into three parts: GC, LVR controller (LVRC), and PS, as shown

TABLE III  
PERFORMANCE SUMMARY AND COMPARISON WITH PRIOR ARTS

| Publication                                               | [8]<br>Intel<br>ISSCC 2020                         | [9]<br>HKUST<br>ISSCC 2018 | [10]<br>Columbia U.<br>JSSC 2021  | [11]<br>IBM<br>ISSCC 2014 | [12]<br>IBM<br>JSSC 2020 | [13]<br>Samsung<br>ISSCC 2021 | This Work                                    |
|-----------------------------------------------------------|----------------------------------------------------|----------------------------|-----------------------------------|---------------------------|--------------------------|-------------------------------|----------------------------------------------|
| Process                                                   | 10nm                                               | 65nm                       | 65nm                              | 22nm                      | 14nm                     | 5nm                           | <b>28nm</b>                                  |
| Distributed structure                                     | <b>Parallel</b>                                    |                            |                                   | <b>Dual-loop</b>          |                          |                               | <b>Dual-loop</b>                             |
| Sensor type                                               | Digital                                            | Analog                     | Analog                            | Analog                    | Analog                   | Analog                        | <b>Digital</b>                               |
| Control type                                              | Digital                                            | Analog-Digital             | Digital                           | Switching                 | Switching                | Digital-Switching             | <b>Digital</b>                               |
| Load-sharing balance scheme                               | No, depends on $R_G$ and $\Delta V$ * <sup>1</sup> |                            |                                   | Duty cycle balance        | Duty cycle balance       | Analog calibration loop       | <b>Digital primary-secondary calibration</b> |
| # of LDOs                                                 | 9                                                  | 9                          | 9                                 | 1 Global + 64 Local       | 4 Global + 16 Local      | 1 Global + 16 Local           | <b>1 Global + 9 Local</b>                    |
| Area (mm <sup>2</sup> )                                   | 0.126                                              | 0.776                      | 0.373                             | 0.355                     | 0.155                    | 0.16                          | <b>0.12</b>                                  |
| V <sub>IN</sub> (V)                                       | 0.7-1.05                                           | 0.6-1.2                    | 0.5-1                             | 0.68-1.1                  | 0.64-1.1                 | 0.55-0.8                      | <b>0.7-1.05</b>                              |
| V <sub>OUT</sub> (V)                                      | 0.65-0.95                                          | 0.55-1.15                  | 0.45-0.9                          | 0.61-1.03                 | 0.6-1.06                 | 0.5-0.75                      | <b>0.65-0.95</b>                             |
| I <sub>MAX</sub> (A)                                      | 2.74                                               | 0.5                        | 0.417                             | 11.9                      | 12                       | 6.4                           | <b>2</b>                                     |
| I <sub>Q</sub> (mA)                                       | 21 to 57                                           | 0.5                        | 0.683                             | NA                        | NA                       | 7.3                           | <b>6.5</b>                                   |
| C <sub>L</sub> (nF)                                       | 5.4                                                | 0.9                        | 0.9                               | 750                       | 481                      | 1000                          | <b>11.5</b>                                  |
| ΔV <sub>OUT</sub> , ΔI <sub>LOAD</sub> @T <sub>Edge</sub> | 200mV, 0.17A@0.1ns                                 | 125mV, 0.45A@20ns          | 180mV, * <sup>2</sup> 0.407A@10ns | NA                        | 20mV, 4.9A@10ns          | 20mV, 1A@1μs                  | <b>54mV*<sup>4</sup>, 1.35A@10ns</b>         |
| Max Current Eff.                                          | 98.6                                               | 99.9                       | 99.8                              | 96.7                      | 99.1                     | 99.89                         | <b>99.67</b>                                 |
| Load Reg.(mV/A)                                           | 8.5                                                | >55                        | >80                               | 0.5                       | 1.1                      | NA                            | <3                                           |
| Current Density (A/mm <sup>2</sup> )                      | 21.75                                              | 0.64                       | 1.12                              | 33.5                      | 77.4                     | 40                            | <b>16.67</b>                                 |
| FoM(ps) * <sup>3</sup>                                    | 3.04                                               | 0.28                       | 0.67                              | NA                        | NA                       | 146                           | <b>2.38</b>                                  |

\*<sup>1</sup> R<sub>G</sub> is the parasitic resistance of the power network and ΔV is the output voltage error of different LDOs.

\*<sup>2</sup> Observed from Fig. 26 in [12].

\*<sup>3</sup> FoM=(I<sub>Q</sub>×C<sub>L</sub>×ΔV<sub>OUT</sub>)/ΔI<sub>LOAD</sub><sup>2</sup>.

\*<sup>4</sup> @ V<sub>IN</sub> = 1.05 V, V<sub>OUT</sub> = 0.85 V.

in Fig. 18(a). If the total $I_{MAX}/C_L$ ratio is appropriate, we can directly increase the number of PSs to expand the load capability [Fig. 18(b)]. Alternatively, we can connect multiple LVRCs and PSs in parallel to form a distributed LDO [Fig. 18(c)]. The choice of arrangement depends on the top-metal and package routing resources.

## IV. MEASUREMENT RESULTS

A prototype chip of the distributed DLDO is fabricated in a 28-nm CMOS process, as shown in Fig. 19. It comprises nine tiles, each with an LVR and a testing load. Each LVR has an output capacitor of 1.25 nF. All the control blocks are implemented using the standard library cells. The active area of each LVR is 0.011 mm<sup>2</sup>, and the total active area of the distributed DLDO prototype is 0.12 mm<sup>2</sup>. The DLDO prototype can support a 2-A load current with an 80-mV dropout voltage, achieving a current density of 16.67 A/mm<sup>2</sup>.
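The area figures above can be reproduced with simple arithmetic. The sketch below uses only the numbers quoted in this paragraph, and the function names are illustrative.

```python
def current_density_a_per_mm2(i_max_a, area_mm2):
    """Maximum load current divided by the total active area."""
    return i_max_a / area_mm2


def combined_lvr_area_mm2(n_lvr, area_per_lvr_mm2):
    """Summed active area of the local regulators only."""
    return n_lvr * area_per_lvr_mm2


if __name__ == "__main__":
    print(round(current_density_a_per_mm2(2.0, 0.12), 2))  # A/mm^2
    print(round(combined_lvr_area_mm2(9, 0.011), 3))       # mm^2
```

The first value reproduces the reported 16.67 A/mm<sup>2</sup>. The second shows that the nine LVRs account for 0.099 mm<sup>2</sup> of the 0.12-mm<sup>2</sup> total.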

Fig. 20 shows the dc load regulation measurements. When V<sub>IN</sub> = 1 V, according to V<sub>OUT</sub> codes, we adjust the active PMOS strength to avoid large oversizing and to improve loop stability. The LDO achieves a good dc load regulation of 1.6–3 mV/A.

We can enable one, two, three, six, or nine LVRs for testing. First, we measured the steady-state performance of one LVR, as shown in Fig. 21. When V<sub>IN</sub> = 1 V, V<sub>OUT</sub> = 0.8 V, and I<sub>LOAD</sub> = 0.3 A, with and without the fine switches, the output ripples are 3.5 and 11 mV, respectively. The proposed DLDO can work even without the fine switches, but with the fine switches, it can obtain higher accuracy. Moreover, when the fine switches regulate V<sub>OUT</sub> within the coarse resolution boundaries, it can eliminate the coarse switching, thereby reducing the output ripple and switching loss.

Fig. 22 shows the measured load transient waveforms under dropout voltage. The input bond-wire parasitic inductance causes the V<sub>IN</sub> droop during the load transient, which worsens the transient performance to a certain degree. To reduce the effect of V<sub>IN</sub> droop, we measure the load transient when the dropout voltage is 200 mV (V<sub>IN</sub> = 1.05 V and V<sub>OUT</sub> = 0.85 V). As shown in Fig. 23, for the load step of 30–270 mA with 10-ns edges, the undershoots with one, two, and three LVRs are 58, 34, and 22 mV, respectively. For the same load, the more LVRs, the larger the load capability, thus the smaller the V<sub>OUT</sub> droop. When the load step and the LVR number change proportionally (150 mA for each LVR), the undershoots with three, six, and nine LVRs are 36, 45, and 54 mV, respectively.
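To compare the two measurement series on a common scale, the undershoot can be normalized by the load step. The sketch below uses only the values quoted above. The mV/A normalization is an illustrative metric introduced here, not one reported in the measurements.

```python
FIXED_STEP_MA = 240                      # 30 mA -> 270 mA
FIXED_STEP_DROOP_MV = {1: 58, 2: 34, 3: 22}

PER_LVR_STEP_MA = 150                    # step scaled with the LVR count
SCALED_STEP_DROOP_MV = {3: 36, 6: 45, 9: 54}


def total_step_ma(n_lvr):
    """Total load step when each enabled LVR sees a 150-mA step."""
    return n_lvr * PER_LVR_STEP_MA


def droop_per_amp(droop_mv, step_ma):
    """Undershoot normalized to the load step, in mV/A."""
    return droop_mv / (step_ma / 1000.0)


if __name__ == "__main__":
    for n, droop in FIXED_STEP_DROOP_MV.items():
        print(n, FIXED_STEP_MA, round(droop_per_amp(droop, FIXED_STEP_MA), 1))
    for n, droop in SCALED_STEP_DROOP_MV.items():
        step = total_step_ma(n)
        print(n, step, round(droop_per_amp(droop, step), 1))
```

With nine LVRs, the total step is 1.35 A. The normalized undershoot falls from 80 to 40 mV/A as the system scales from three to nine LVRs.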

The transient measurements demonstrate the stability and scalability of the proposed distributed DLDO, consistent with the analyses in Sections II-C and III-C.

Fig. 24(a) displays the DVS together with PMOS strength adjustment. When  $V_{IN} = 1$  V,  $I_{LOAD} = 0.6$  A, and  $P_T = 4$ ,  $V_{OUT}$  changes from 0.7 to 0.9 V in 285 ns smoothly, and the LDO loop is stable with a small output ripple. When  $V_{OUT}$  rises to 0.9 V, the active PMOS strength  $P_T$  needs to be gradually increased to guarantee load capability. At  $V_{OUT} = 0.9$  V, the variations are close to 20 mV during the  $P_T$  value changes. We also compare the logic sequences of the  $P_T$  adjustment and DVS. For the voltage-rising edge,  $P_T$  should change later, and for the falling edge,  $P_T$  should change earlier. If we increase  $P_T$  to 6 at  $V_{OUT} = 0.7$  V, the loop will become unstable with a large output ripple. The DVS measurement waveforms also verify the stability analysis of  $I_{MAX}$  in Section II-C.

The local voltage sensors (TDCs) are sensitive to PVT and frequency variations. To visually display the automatic variation tracking feature of the proposed architecture, we changed the clock frequency from 1.6 to 1.3 GHz. The proposed DLDO can work properly with different clock frequencies, and the DVS rising time changes from 285 to 350 ns, as shown in Fig. 24(b).
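One way to read these two rise times is to express them in clock cycles. The sketch below does that conversion. The function name is illustrative, and the interpretation that follows is an inference from the numbers rather than a statement from the measurements.

```python
def rise_time_in_cycles(rise_time_ns, f_clk_ghz):
    """Number of clock cycles spanned by the DVS rising edge (ns x GHz)."""
    return round(rise_time_ns * f_clk_ghz)


if __name__ == "__main__":
    print(rise_time_in_cycles(285, 1.6))  # at 1.6 GHz
    print(rise_time_in_cycles(350, 1.3))  # at 1.3 GHz
```

The two cases correspond to 456 and 455 cycles, respectively. This suggests that the DVS ramp takes a nearly constant number of clock cycles, so its duration scales with the clock period.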

Table II shows the measured number of enabled coarse PSs in each LVR when  $I_{LOAD} = 1.2$  A. Without digital primary-secondary calibration, the maximum difference is 2. After calibration, the difference becomes 0 in our measurement, indicating a good current balancing.
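The balance figures can be recomputed directly from Table II. The sketch below copies the two rows of the table. The helper names are illustrative.

```python
WITHOUT_CAL = [7, 6, 6, 5, 6, 6, 5, 7, 6]   # enabled coarse PSs, tiles 1..9
WITH_CAL = [6, 6, 6, 6, 6, 6, 6, 6, 6]


def spread(enabled):
    """Maximum difference in enabled PSs between any two LVRs."""
    return max(enabled) - min(enabled)


def total_enabled(enabled):
    """Total number of enabled coarse PSs across all LVRs."""
    return sum(enabled)


if __name__ == "__main__":
    for name, row in (("w/o calibration", WITHOUT_CAL),
                      ("with calibration", WITH_CAL)):
        print(name, "spread =", spread(row), "total =", total_enabled(row))
```

The spread drops from 2 to 0, while the total number of enabled PSs stays at 54 in both rows. The calibration therefore redistributes the same load among the LVRs.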

Table III summarizes the LDO performances and compares the proposed work with prior distributed LDO designs [8], [9], [10], [11], [12], [13]. Only Bang et al. [8] and the proposed DLDO realize all-digital solutions. Unlike [8], which has no load-sharing balance scheme, the proposed LDO obtains current balancing by adopting the dual-loop structure and digital primary-secondary calibration. The proposed DLDO achieves a balanced performance in terms of accuracy, transient response, quiescent current, and output capacitor. Compared with [8], [9], and [10], we achieve better dc load regulation by using the comparator-based integral loop and fine-switch regulation. Our load capability is lower than that of the prior dual-loop designs [11], [12], [13] due to the limited available silicon area. However, according to the discussion in Section III-C, we can increase $I_{MAX}$ and $C_L$ proportionally. For a 12-A load capability, i.e., six times the 2 A of the prototype, the proposed architecture only needs a 69-nF output capacitor (6 × 11.5 nF), which is far less than the prior requirements (481 nF–1 $\mu$F). With an output capacitor as large as those of the prior designs, we could greatly reduce the control window size to improve the load transient performance. Owing to the small-area all-digital control circuit, the current density is 16.67 A/mm<sup>2</sup>. If we adopt a more advanced process and design for a larger load capability, we expect the current density to increase at least several times.
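The FoM definition in the footnote of Table III can be evaluated directly. The sketch below applies it to three of the prior-work columns, using only the values listed in the table. Other columns may rely on operating points not listed there, so they are not reproduced here. The function and dictionary names are illustrative.

```python
def fom_ps(i_q_a, c_l_f, dv_out_v, di_load_a):
    """FoM = (I_Q x C_L x dV_OUT) / dI_LOAD^2, returned in picoseconds."""
    return i_q_a * c_l_f * dv_out_v / di_load_a**2 * 1e12


# (I_Q [A], C_L [F], dV_OUT [V], dI_LOAD [A]) taken from Table III
TABLE_III_ROWS = {
    "[9]": (0.5e-3, 0.9e-9, 0.125, 0.45),
    "[10]": (0.683e-3, 0.9e-9, 0.180, 0.407),
    "[13]": (7.3e-3, 1000e-9, 0.020, 1.0),
}

if __name__ == "__main__":
    for ref, row in TABLE_III_ROWS.items():
        print(ref, round(fom_ps(*row), 2), "ps")
```

These three cases evaluate to 0.28, 0.67, and 146 ps, which match the corresponding FoM entries of Table III.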

## V. CONCLUSION

This article presents a fully synthesizable dual-loop distributed DLDO for high-current large-area digital loads. The proposed DLDO features a CMP-TDC all-digital voltage quantizer, coarse-fine tuning, an asynchronous window control, and a digital primary-secondary calibration, obtaining high output accuracy, one-cycle transient response, and current-balancing among the distributed LVRs. In addition, the proposed architecture is highly modular and can easily expand to other load scenarios without redesigning the LDO. We investigate the system stability and provide the design principles for distributed expansion. The prototype with one GC and nine LVRs designed in a 28-nm process demonstrates the efficacy of the proposed techniques.

## REFERENCES

- [1] S. T. Kim et al., “Enabling wide autonomous DVFS in a 22 nm graphics execution core using a digitally controlled fully integrated voltage regulator,” *IEEE J. Solid-State Circuits*, vol. 51, no. 1, pp. 18–30, Jan. 2016.
- [2] R. Muthukaruppan et al., “A digitally controlled linear regulator for per-core wide-range DVFS of atom™ cores in 14 nm tri-gate CMOS featuring non-linear control, adaptive gain and code roaming,” in *Proc. 43rd IEEE Eur. Solid State Circuits Conf. (ESSCIRC)*, Leuven, Belgium, Sep. 2017, pp. 275–278.
- [3] Z. Wang, S. J. Kim, K. Bowman, and M. Seok, “Review, survey, and benchmark of recent digital LDO voltage regulators,” in *Proc. IEEE Custom Integr. Circuits Conf. (CICC)*, Apr./May 2022, pp. 1–8.
- [4] D. Khalil and Y. Ismail, “Optimum sizing of power grids for IR drop,” in *Proc. IEEE Int. Symp. Circuits Syst.*, May 2006, pp. 481–484.
- [5] M. Cho et al., “Post-silicon voltage-guard-band reduction in a 22 nm graphics execution core using adaptive voltage scaling and dynamic power gating,” in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Jan. 2016, pp. 152–153.
- [6] I. Vaisband, B. Price, S. Köse, Y. Kolla, E. G. Friedman, and J. Fischer, “Distributed LDO regulators in a 28 nm power delivery system,” *Anal. Integr. Circuits Signal Process.*, vol. 83, no. 3, pp. 295–309, Jun. 2015.
- [7] J. F. Bulzacchelli et al., “Dual-loop system of distributed microregulators with high DC accuracy, load response time below 500 ps, and 85-mV dropout voltage,” *IEEE J. Solid-State Circuits*, vol. 47, no. 4, pp. 863–874, Apr. 2012.
- [8] S. Bang et al., “A fully synthesizable distributed and scalable all-digital LDO in 10 nm CMOS,” in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2020, pp. 380–381.
- [9] Y. Lu, F. Yang, F. Chen, and P. K. T. Mok, “A 500 mA analog-assisted digital-LDO-based on-chip distributed power delivery grid with cooperative regulation and IR-drop reduction in 65 nm CMOS,” in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2018, pp. 310–312.
- [10] S. J. Kim, D. Kim, Y. Pu, C. Shi, S. B. Chang, and M. Seok, “0.5–1-V, 90–400-mA, modular, distributed, 3 × 3 digital LDOs based on event-driven control and domino sampling and regulation,” *IEEE J. Solid-State Circuits*, vol. 56, no. 9, pp. 2781–2794, Sep. 2021.
- [11] Z. Toprak-Deniz et al., “Distributed system of digitally controlled microregulators enabling per-core DVFS for the POWER8™ microprocessor,” in *Proc. IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers (ISSCC)*, San Francisco, CA, USA, Feb. 2014, pp. 98–99.
- [12] M. E. Perez, M. A. Sperling, J. F. Bulzacchelli, Z. Toprak-Deniz, and T. E. Diemoz, “Distributed network of LDO microregulators providing submicrosecond DVFS and IR drop compensation for a 24-core microprocessor in 14-nm SOI CMOS,” *IEEE J. Solid-State Circuits*, vol. 55, no. 3, pp. 731–743, Mar. 2020.
- [13] D.-H. Jung et al., “A distributed digital LDO with time-multiplexing calibration loop achieving 40 A/mm<sup>2</sup> current density and 1 mA-to-6.4 A ultra-wide load range in 5 nm FinFET CMOS,” in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC)*, Feb. 2021, pp. 414–415.
- [14] Y. Okuma et al., “0.5-V input digital LDO with 98.7% current efficiency and 2.7- $\mu$ A quiescent current in 65 nm CMOS,” in *Proc. IEEE Custom Integr. Circuits Conf.*, Sep. 2010, pp. 1–4.
- [15] J. Oh, J.-E. Park, Y.-H. Hwang, and D.-K. Jeong, “A 480 mA output-capacitor-free synthesizable digital LDO using CMP-triggered oscillator and droop detector with 99.99% current efficiency, 1.3 ns response time, and 9.8 A/mm<sup>2</sup> current density,” in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2020, pp. 148–149.
- [16] X. Sun, A. Boora, W. Zhang, V. R. Pamula, and V. Sathe, “A 0.6-to-1.1 V computationally regulated digital LDO with 2.79-cycle mean settling time and autonomous runtime gain tracking in 65 nm CMOS,” in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2019, pp. 230–231.

- [17] T. Mahajan, R. Muthukaruppan, D. M. Shetty, S. Mangal, and H. K. Krishnamurthy, “Digitally controlled voltage regulator using oscillator-based ADC with fast-transient-response and wide dropout range in 14 nm CMOS,” in *Proc. IEEE Custom Integr. Circuits Conf. (CICC)*, Apr. 2017, pp. 1–4.
- [18] S. Kundu, M. Liu, S.-J. Wen, R. Wong, and C. H. Kim, “A fully integrated digital LDO with built-in adaptive sampling and active voltage positioning using a beat-frequency quantizer,” *IEEE J. Solid-State Circuits*, vol. 54, no. 1, pp. 109–120, Jan. 2019.
- [19] M. Huang, Y. Lu, S.-W. Sin, U. Seng-Pan, R. P. Martins, and W.-H. Ki, “Limit cycle oscillation reduction for digital low dropout regulators,” *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 63, no. 9, pp. 903–907, Sep. 2016.
- [20] M. Huang, Y. Lu, and R. P. Martins, “An analog-proportional digital-integral multiloop digital LDO with PSR improvement and LCO reduction,” *IEEE J. Solid-State Circuits*, vol. 55, no. 6, pp. 1637–1650, Jun. 2020.
- [21] Z. Guo et al., “Topological classification-based splitting-combining methodology for analysis of complex multi-loop systems and its application in LDOs,” *IEEE Trans. Power Electron.*, vol. 34, no. 7, pp. 7025–7039, Jul. 2019.
- [22] J. Ma, X. Wang, F. Blaabjerg, L. Harnefors, and W. Song, “Accuracy analysis of the zero-order hold model for digital pulse width modulation,” *IEEE Trans. Power Electron.*, vol. 33, no. 12, pp. 10826–10834, Dec. 2018.
- [23] S. B. Nasir, Y. Lee, and A. Raychowdhury, “Modeling and analysis of system stability in a distributed power delivery network with embedded digital linear regulators,” in *Proc. 15th Int. Symp. Quality Electron. Design*, Mar. 2014, pp. 68–75.



**Xiangyu Mao** (Member, IEEE) received the B.Eng. and M.Sc. degrees in electronic engineering from Xidian University, Xi'an, China, in 2009 and 2012, respectively, and the Ph.D. degree in electronic and computer engineering from the University of Macau, Macau, China, in 2022.

From 2012 to 2019, he was an IC Design Engineer and then a Project Manager with Hisilicon Corporation, Shenzhen, China, mainly responsible for various analog IPs from 40- to 7-nm CMOS technology. He is currently a Post-Doctoral Fellow with the State Key Laboratory of Analog and Mixed-Signal VLSI, University of Macau. His research interests include power management systems for multicore processors, including dc-dc converters, fully integrated voltage regulators, and power integrity.



**Yan Lu** (Senior Member, IEEE) received the B.Eng. and M.Sc. degrees in microelectronics from the South China University of Technology, Guangzhou, China, in 2006 and 2009, respectively, and the Ph.D. degree in electronic and computer engineering from The Hong Kong University of Science and Technology (HKUST), Hong Kong, China, in 2013.

In 2014, he joined the State Key Laboratory of Analog and Mixed-Signal VLSI, University of Macau, Macau, China, where he is currently an Associate Professor. He has authored or coauthored more than 150 peer-reviewed technical articles and two books. His research interests include wireless power transfer circuits and systems, high-density integrated power converters, and voltage regulators.

Dr. Lu is serving as a TPC Member for ISSCC and CICC. He was a recipient/co-recipient of the NSFC Excellent Young Scientist Fund (HK-Macau) in 2021, the Macau Science and Technology Award Second Prizes in 2018 and 2020, the IEEE Solid-State Circuits Society Pre-Doctoral Achievement Award from 2013 to 2014, the IEEE CAS Society Outstanding Young Author Award in 2017, and the ISSCC 2017 Takuo Sugano Award for Outstanding Far-East Paper. He has served as a Guest Editor for the IEEE JOURNAL OF SOLID-STATE CIRCUITS (JSSC) in 2022, IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS (TCAS-I) in 2019, and IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: EXPRESS BRIEFS (TCAS-II) from 2018 to 2019. He has been a Young Editor of the *Journal of Semiconductors* since 2021. He is an IEEE SSCS Distinguished Lecturer 2022–2023.



**Rui P. Martins** (Life Fellow, IEEE) received the Ph.D. degree in electrical engineering and computers from the Department of Electrical and Computer Engineering, Instituto Superior Técnico, University of Lisbon, Lisbon, Portugal, in 1992.

He was the Founding Director of the State Key Laboratory of Analog and Mixed-Signal VLSI, from 2011 to 2022. He has been with the Department of Electrical and Computer Engineering, since October 1980. Since 1992, he has been on leave from the University of Lisbon. He is currently with the Department of Electrical and Computer Engineering, Faculty of Science and Technology, University of Macau, Macau, China, where he has been a Chair-Professor, since August 2013. He is also the Director of the Institute of Microelectronics, University of Macau. Since July 2010, he has been an Academician with the Lisbon Academy of Sciences, Lisbon. He has authored or coauthored more than 900 publications, including ten books, 12 book chapters, 50 patents, more than 300 articles in scientific journals, and more than 400 papers in conference proceedings. His research interests include analog and mixed-signal VLSI design.

Prof. Martins received the Author Recognition Award at the 70 years of ISSCC in 2023, as a Top Contributor with more than 50 papers, and three Medals from Macau Government in 1999, 2001, and 2021.