

# Power Delivery for High-Performance Microprocessors—Challenges, Solutions, and Future Trends

Kaladhar Radhakrishnan<sup>✉</sup>, Senior Member, IEEE, Madhavan Swaminathan, Fellow, IEEE, and Bidyut K. Bhattacharyya, Fellow, IEEE

**Abstract**—The power delivery requirements for the early microprocessors were fairly rudimentary due to the relatively low power levels. However, several decades of exponential scaling powered by Moore’s law have greatly increased the power requirements and the complexity of the power delivery scheme. The breakdown in Dennard scaling in the mid-2000s has ushered in the multicore era which has increased the number of cores and the power consumption in microprocessors. The steady growth in the power levels and the number of power rails in high-performance microprocessors have increased the power delivery challenges. Integrated voltage regulators (IVRs) have emerged as a key power delivery technology to address these challenges. There are a number of IVR schemes implemented on-die ranging from the simple power gate to fully integrated switching regulators. After covering the fundamentals of power delivery, this article discusses the pros and cons of different types of IVR as well as the technology ingredients required to meet future IVR requirements. This article concludes with a section on advanced packaging technologies that are being developed and needed to enable heterogeneous integration and their impact on power delivery.

**Index Terms**—Decoupling capacitors, heterogeneous integration, integrated voltage regulator (IVR), magnetic inductors, power delivery.

## I. INTRODUCTION

MICROPROCESSORS have undergone a significant evolution in complexity and capability from their introduction in the early 1970s to the present day. The exponential increase in microprocessor performance and affordability can be attributed to the semiconductor industry’s adherence to Moore’s law which posits that the transistor count in a chip will double every two years [1]. Robert Dennard proposed a set of MOSFET scaling guidelines [2] that would enable transistors to achieve improved performance while reducing area and power. The traditional scaling approach as described

Manuscript received November 2, 2020; revised February 4, 2021; accepted February 19, 2021. Date of publication March 12, 2021; date of current version April 26, 2021. Recommended for publication by Associate Editor M. Cases upon evaluation of reviewers’ comments. (*Corresponding author: Kaladhar Radhakrishnan*)

Kaladhar Radhakrishnan is with Intel Corporation, Chandler, AZ 85226 USA (e-mail: kaladhar.radhakrishnan@intel.com).

Madhavan Swaminathan and Bidyut K. Bhattacharyya are with the Georgia Institute of Technology, Atlanta, GA 30332 USA (e-mail: madhavan.swaminathan@ece.gatech.edu; bbhattach6@gatech.edu).

Color versions of one or more figures in this article are available at <https://doi.org/10.1109/TCPMT.2021.3065690>.

Digital Object Identifier 10.1109/TCPMT.2021.3065690



Fig. 1. Scaling trends for some key microprocessor metrics.

by Dennard was very effective until the early 2000s in keeping the power density constant even as the transistors got progressively smaller each generation. However, as the gate oxide thickness scaled down to a handful of atomic layers, leakage current due to electron tunneling through the oxide became an appreciable fraction of the overall current. As a result, process engineers had to resort to alternate methods through innovations in materials and transistor structure [3] to achieve the necessary area scaling to keep pace with Moore’s law. This can be inferred from Fig. 1, which plots the scaling trends for some key microprocessor metrics for the past fifty years [4]. While the nontraditional scaling methods have been mostly successful in scaling the transistor area while improving performance, they were not as effective in reducing power.

The power density of microprocessors started to go up with the breakdown of Dennard scaling in the early 2000s. Furthermore, while Dennard scaling provided a means to reduce the gate delay, scaling the interconnect dimensions does not translate to a reduction in the *RC* interconnect delay. As the interconnect delay approaches a significant fraction of the clock period, it becomes another bottleneck in increasing the processor frequency. The slowdown in frequency scaling



Fig. 2. Power rail scaling trends on Intel client and server microprocessors.

since the early 2000s can be clearly seen in Fig. 1 as well. While architectural improvements have resulted in an improvement in the instructions per clock (IPC), this is not enough to overcome the lack of frequency scaling. This has resulted in a slowdown in single core performance scaling. As the single core performance levels off, microprocessor architects have been resorting to the use of multiple cores and parallelizing the workloads to maximize performance. Fig. 2 plots the trend in power rails seen on Intel microprocessors in the client and server segments. The increase in power rails due to increased core count can add complexity to the problem of delivering power to the microprocessors. As different cores in a general-purpose microprocessor can be subject to disparate workloads, it is often advantageous to run each core at its optimal voltage and frequency to minimize the overall power consumption. This implies that the number of power supply domains will also go up with the processor core count.

Another factor that poses an additional power delivery challenge is the current trend of scaling thermal design power (TDP). Ongoing enhancements in the thermal cooling capability of datacenter and graphics processors have resulted in a steady growth in the TDP of the microprocessors in these segments. Microprocessors in these high-power segments will draw currents in excess of 1000 A in the near future. When it comes to lower power mobile processors, the primary emphasis has been on reducing the overall form factor of the device and maximizing battery life. As a result, the area occupied by the microprocessor, the memory, and the voltage regulators has been forced to shrink to make room for a bigger battery. In addition, the push for thinner devices has meant that the heights of the microprocessor and of power delivery components such as inductors and capacitors have all had to shrink. All of these trends introduce several unique challenges in designing a power delivery network (PDN) that meets the requirements of the microprocessors.

In this article, we will provide an overview of the power delivery architectures that are being used in today's high-performance microprocessors. There is a wide variety of solutions ranging from power management integrated circuits (PMICs), which are popular in smaller handheld devices, to integrated regulators implemented on the processor. We will also cover the evolution of decoupling capacitor solutions including on-die decoupling solutions such as metal-insulator-metal (MIM) capacitors or deep trench capacitors (DTCs). Different types of integrated voltage regulator (IVR) solutions and their pros and cons will be described in detail. This article also looks at some future trends to determine where the power delivery requirements are headed and what types of solutions are being worked on to address these needs.

## II. POWER DELIVERY FUNDAMENTALS

The role of the PDN is to deliver the optimal voltage for different circuit blocks in a microprocessor. A good power delivery design will ensure that the voltage seen by the transistors is always within a certain tolerance band (typically 10%) of the nominal voltage. A voltage that drops too low can cause timing issues resulting in blue screen failure. Conversely, a voltage that is too high can result in excessive power consumption and will compromise device reliability. The power delivery requirements for a consumer product such as a laptop can be quite different from that of a server in a data center. In a consumer device, the designer has to optimize the design for meeting the form-factor requirements, maximizing battery life while keeping the cost low. These factors are often prioritized ahead of system performance. On the other hand, in a data center system, designers are willing to pay a premium to achieve maximum performance. Despite these differences, the fundamentals for power delivery still remain the same, no matter which application is targeted. In this section, we provide an overview of the fundamentals associated with power delivery.

### A. Power Consumption in a Microprocessor

Modern microprocessor chips are fabricated with several billions of transistors and at each clock cycle, an appreciable fraction of these toggle their state. Every time a transistor is switched ON or OFF, there is a small parasitic capacitance that is charged or discharged. The energy associated with the charging and discharging of this parasitic capacitance is derived from the power supply, and this is eventually dissipated as heat

$$P_{\text{dyn}} = \text{AF} \cdot C_{\text{dyn}} V^2 f. \quad (1)$$

Equation (1) shows the relationship between the dynamic power consumed by the microprocessor, the switching capacitance ( $C_{\text{dyn}}$ ), voltage, frequency, and activity factor (AF). The AF is a value ranging from 0 to 1 and represents the fraction of the transistors that are switching for a given workload.
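As a minimal sketch of how (1) behaves, the following Python snippet evaluates the dynamic power at two operating points. The function name `dynamic_power` and all numerical values (AF, switching capacitance, voltage, frequency) are illustrative assumptions and are not taken from any specific product.

```python
def dynamic_power(af, c_dyn, v, f):
    """Dynamic power in watts from Eq. (1): P_dyn = AF * C_dyn * V^2 * f."""
    if not 0.0 <= af <= 1.0:
        raise ValueError("activity factor must lie between 0 and 1")
    return af * c_dyn * v ** 2 * f


# Hypothetical operating point (assumed values, for illustration only).
p_nominal = dynamic_power(af=0.2, c_dyn=50e-9, v=1.0, f=3.0e9)

# Same workload with voltage and frequency both lowered by 20%.
p_scaled = dynamic_power(af=0.2, c_dyn=50e-9, v=0.8, f=2.4e9)

print(f"nominal: {p_nominal:.2f} W")
print(f"scaled:  {p_scaled:.2f} W ({p_scaled / p_nominal:.3f} of nominal)")
```

Because the voltage enters (1) quadratically, lowering the voltage and the frequency together by 20% reduces the dynamic power to roughly half of its nominal value in this example.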

In addition to the dynamic switching power, microprocessors also dissipate static power due to the leakage current through the CMOS transistors. The static or quiescent power dissipated in a microprocessor can be expressed as

$$P_{\text{static}} = V(I_{\text{sub}} + I_{\text{gate}} + I_{\text{junc}}). \quad (2)$$

The leakage current is comprised of several different components as shown in (2). The dominant leakage terms are the subthreshold leakage and the gate leakage, while the junction



Fig. 3. Ingredients of a typical PDN.

leakage current is relatively low. It is important to note that the leakage currents are a strong function of voltage. As a result, the power supply to idle power domains is either gated off or dropped down to a low retention voltage to minimize static power consumption. The total power dissipated by the microprocessor is the sum of the dynamic and static power. Since the AF changes with time, the dynamic power also varies with time. The change in static power over time is relatively small and only happens in response to fluctuations in voltage and temperature.

### B. Power Delivery Network

Various ingredients of a PDN are shown in Fig. 3. Modern microprocessor systems have multiple voltage regulators on the platform to power different parts of the chip. The input to these voltage regulators can vary depending on which product segment the microprocessor is used in. For a data center server, the input voltage can be as high as 48 V, while desktop computers use the 12-V output from the power supply unit as the input to the regulator. Handheld devices and laptops typically use the battery voltage as the input to the voltage regulator. For smartphones that use a single cell lithium polymer battery, this voltage is 3.7 V. Laptops use two or three of these lithium polymer cells in series to generate an input voltage of 7.4 or 11.1 V.

The output from different voltage regulators is routed to the chip through the PCB, the socket, and the package. Due to the size of the PCB, multiple power planes can be routed in a single layer. The package is connected to the PCB through a socket in desktop and server segments. To minimize the parasitic impedance of the socket pins, multiple power and ground pins are used to deliver power. The number of power pins used on a given power rail scales with the maximum current that the rail is expected to deliver. Sockets are usually not used in mobile segments in an effort to minimize the thickness of the device. In these segments, the microprocessor package is soldered down to the PCB using an array of solder balls. Power is routed from the solder pads on the backside of the package to the die bumps on the topside of the package through the package planes and vias. The most common type of packages used for microprocessors are flip chip organic packages. These packages incorporate a relatively

thick dielectric core to provide mechanical stability and have build-up layers on either side of them. Since these packages usually have more layers than a PCB, dedicated power planes can be used to route power for high-current rails. The final stage of the PDN is on the silicon die. The microprocessor chip has several metal layers that gradually increase in thickness as they transition from the polysilicon layer on which the transistors are fabricated to the thick metal layer that connects to the package through solder bumps. The lower metal layers close to the transistors are mostly used for routing signals within the die. The thick metal layers on the far back-end are typically used for distributing power across the die. Alternating power and ground traces routed on the thick metal layers are used to form a power grid that reduces the lateral resistance on the die. Decoupling capacitors, which are not shown in Fig. 3, are typically used on the platform, the package, and the die to manage the transient response of the PDN.

### C. Voltage Regulator

The voltage regulator on the platform is designed to deliver the output voltage requested by the microprocessor. In smartphones or other handheld devices, the multiple voltage regulators are implemented in a single PMIC to minimize their area footprint. Each PMIC has the power FETs and the control logic of the regulator, while the output filter is implemented using discrete components on the printed circuit board (PCB). In desktop and higher power server segments, the voltage regulators are implemented using discrete power FETs and discrete output filter components on the PCB. The most commonly used voltage regulator topology on the platform is the synchronous buck. Most of the higher power rails use a multiphase buck regulator with interleaved phases to reduce the current ripple. The multiphase regulators also enable high efficiency operation across a wide current range through phase shedding.

Modern microprocessors do not operate at a fixed frequency at all times. Instead, they rely on dynamic voltage and frequency scaling (DVFS) [5] to reduce the dynamic power consumption. Voltage and frequency for different logic blocks are scaled based on the workload to ensure the processor always operates within a certain power budget or to minimize the power to complete a given task. To implement DVFS, the power management unit within the microprocessor sends the platform-level voltage regulator a series of bits called the voltage identification (VID), which prescribes the voltage desired by the processor. The platform switching regulator then adjusts its duty cycle to ensure that the output voltage measured at the sense-point matches the reference voltage corresponding to the VID. High-power platform regulators, such as the ones that power the cores or the integrated graphics, are usually placed close to the microprocessor to minimize the parasitic impedance in the path. However, this may not always be possible due to other system-level constraints such as breaking out high-speed signals or accommodating the socket retention mechanism.

The key metrics for comparing different voltage regulators are their conversion efficiency, output current density, and

TABLE I  
CAPACITOR TYPES AND THEIR CHARACTERISTIC PROPERTIES

| Capacitor Type          | Sub-Classification           | Form Factor    | Height Class     | Capacitance                  | ESL           | ESR / Time Constant |
|-------------------------|------------------------------|----------------|------------------|------------------------------|---------------|---------------------|
| Electrolytic Capacitors | <i>Aluminum Capacitors</i>   | 5 mm diameter  | 9 mm, 12 mm      | 560 µF, 820 µF               | 1.5 – 2.0 nH  | 4 – 6 mΩ            |
|                         |                              | 10 mm diameter | 10 mm, 21 mm     | 820 µF, 1.5 mF               | 3 – 6 nH      | 10 – 20 mΩ          |
|                         | <i>Tantalum Capacitors</i>   | 3528           | 1.1 mm, 1.9 mm   | 220 µF, 470 µF               | 500 – 1000 pH | 5 – 10 mΩ           |
|                         |                              | 7343           | 1.9 mm, 2.8 mm   | 330 µF, 680 µF               | 500 – 1500 pH | 10 – 20 mΩ          |
| Ceramic Capacitors      | <i>2-Terminal Capacitors</i> | 01005          | 0.15 mm, 0.22 mm | 0.1 µF, 0.22 µF              | 120 – 150 pH  | 4 – 6 mΩ            |
|                         |                              | 0201           | 0.15 mm, 0.22 mm | 0.1 µF, 0.47 µF              | 150 – 200 pH  | 10 – 20 mΩ          |
|                         |                              | 0402           | 0.22 mm, 0.33 mm | 1.0 µF, 2.2 µF               | 200 – 250 pH  | 5 – 10 mΩ           |
|                         |                              | 0603           | 0.5 mm, 0.7 mm   | 4.7 µF, 10 µF                | 300 – 350 pH  | 10 – 20 mΩ          |
|                         |                              | 0805           | 0.7 mm, 1.4 mm   | 22 µF, 100 µF                | 300 – 400 pH  | 4 – 6 mΩ            |
|                         | <i>RGC Capacitors</i>        | 0204           | 0.22 mm, 0.33 mm | 0.22 µF, 1 µF                | 80 – 90 pH    | 4 – 6 mΩ            |
|                         |                              | 0306           | 0.33 mm, 0.5 mm  | 1 µF, 2.2 µF                 | 80 – 100 pH   | 10 – 20 mΩ          |
|                         | <i>IDC Capacitors</i>        | 0603           | 0.5 mm           | 1 µF, 2.2 µF                 | 30 – 50 pH    | 5 – 10 mΩ           |
|                         |                              | 0805           | 0.7 mm           | 2.2 µF, 4.7 µF               | 50 – 60 pH    | 10 – 20 mΩ          |
| Silicon Capacitors      | <i>MOS Capacitor</i>         | N/A            | N/A              | 1 – 3 nF/mm<sup>2</sup>      | Negligible    | RC < 250 ps         |
|                         | <i>MIM Capacitor</i>         | N/A            | N/A              | 20 – 200 nF/mm<sup>2</sup>   | Negligible    | 250 ps < RC < 5 ns  |
|                         | <i>DTC</i>                   | N/A            | N/A              | 300 – 1500 nF/mm<sup>2</sup> | Negligible    | 2 ns < RC < 20 ns   |

transient response. The efficiency is a measure of how much power is dissipated as part of the voltage conversion process. The current density of the VR determines how much of the platform area will be taken up by the VR. The transient response characteristic is indicative of how fast the VR can respond to load transients. High bandwidth VRs can respond quickly to load transients, while low bandwidth VRs will need additional decoupling capacitors to make up for their slow response time.

In data center applications where the input voltage can be as high as 48 V, multistage VRs are a popular option. The first stage is typically designed using a fixed ratio high efficiency converter. The switched tank converter [6] and the LLC converter [7] are two popular topologies for accomplishing the fixed ratio conversion. The intermediate voltage generated by the first stage is then used as the input to the second stage, which is usually a buck regulator.
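The two-stage arrangement can be illustrated with a short Python sketch. The 4:1 conversion ratio, the 1-V output, and the per-stage efficiencies below are assumed values chosen only for illustration. The second stage is treated as an ideal buck regulator whose conversion ratio equals its duty cycle, and the overall efficiency is taken as the product of the stage efficiencies.

```python
def two_stage_conversion(v_in, ratio, v_out, eta_stage1, eta_stage2):
    """Return (intermediate voltage, buck duty cycle, overall efficiency).

    Assumes a fixed-ratio first stage and an ideal buck second stage,
    for which the conversion ratio equals the duty cycle.
    """
    v_mid = v_in / ratio              # fixed-ratio first stage
    duty = v_out / v_mid              # ideal buck: V_out = D * V_mid
    eta_total = eta_stage1 * eta_stage2
    return v_mid, duty, eta_total


# Hypothetical example: 48 V in, 4:1 first stage, 1 V at the output.
v_mid, duty, eta_total = two_stage_conversion(
    v_in=48.0, ratio=4.0, v_out=1.0, eta_stage1=0.98, eta_stage2=0.90
)
print(f"intermediate voltage: {v_mid:.1f} V")
print(f"second-stage duty cycle: {duty:.4f}")
print(f"overall efficiency: {eta_total:.3f}")
```

With these assumed numbers the second stage runs at a duty cycle of about 8%, whereas a single-stage ideal buck converting 48 V directly to 1 V would need a duty cycle of about 2%.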

### D. Decoupling Capacitors

The load current in a microprocessor can ramp up quickly within a few clock cycles which is significantly faster than the response time of the platform regulator. Most microprocessor PDNs use multiple stages of decoupling capacitors to handle short-term fluctuations in the load current. Several factors need to be considered while choosing a decoupling capacitor. The size of the capacitors, their parasitic resistance and inductance, variation in the capacitance value, and cost are some of the key factors that influence the choice of capacitors. All capacitors have an equivalent series resistance (ESR) and an equivalent series inductance (ESL) associated with them. It is important to incorporate the ESR and ESL of the capacitors into the circuit when trying to model the performance of the PDN. The actual capacitance of the capacitor could also be significantly different from the rated value when there is a voltage bias or a change in temperature. There are several types of capacitors available to choose from based on the area available as well as the amount of capacitance required. Table I provides a list of commonly used capacitors, their sizes, and typical ESL, ESR, and C values.
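The effect of the ESR and ESL on a capacitor can be sketched with the standard series R-L-C model. The Python snippet below uses values from the low end of the 0402 ceramic capacitor row in Table I (1.0 µF, 200 pH, 5 mΩ); the series model itself and the function names are assumptions made for illustration, and the snippet ignores the bias and temperature dependence of the capacitance.

```python
import math


def capacitor_impedance(f, c, esr, esl):
    """Complex impedance of a capacitor modeled as a series R-L-C branch."""
    omega = 2 * math.pi * f
    return complex(esr, omega * esl - 1.0 / (omega * c))


def self_resonant_frequency(c, esl):
    """Frequency at which the capacitive and inductive reactances cancel."""
    return 1.0 / (2 * math.pi * math.sqrt(esl * c))


# 0402 ceramic capacitor, low-end values from Table I.
C, ESR, ESL = 1.0e-6, 5e-3, 200e-12

f_res = self_resonant_frequency(C, ESL)
z_res = abs(capacitor_impedance(f_res, C, ESR, ESL))
z_high = abs(capacitor_impedance(10 * f_res, C, ESR, ESL))

print(f"self-resonant frequency: {f_res / 1e6:.2f} MHz")
print(f"|Z| at resonance: {z_res * 1e3:.2f} mOhm")
print(f"|Z| one decade above resonance: {z_high * 1e3:.2f} mOhm")
```

In this model the impedance bottoms out at the ESR near the self-resonant frequency (roughly 11 MHz for these values) and rises again above it as the ESL dominates, which is why lower-ESL capacitors are needed for higher-frequency decoupling.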

Electrolytic capacitors are most commonly used to provide the bulk output filter capacitance for the switching regulator on the platform. Electrolytic capacitors offer high capacitance density but suffer from large variations in the value of the capacitor. Since precision capacitance values are not a crucial requirement for decoupling applications, electrolytic capacitors are good candidates for low-frequency decoupling on the motherboard (MB). Aluminum electrolytic capacitors are inexpensive and are the most commonly used capacitors on desktop and server platforms which do not have constraints on the height of these capacitors. Tantalum polymer capacitors are shorter, more expensive, and are used primarily on mobile platforms which have platform height constraints.

Multilayer ceramic capacitors (MLCCs) are cheap high density capacitors that come in a variety of sizes and shapes. These capacitors are made with high permittivity ferroelectric materials such as barium titanate and use multiple layers of alternating electrodes and dielectric to maximize their capacitance. The larger capacitors typically have higher capacitance, while the terminal arrangement is modified to achieve lower ESL and ESR. For example, reverse geometry capacitors (RGCs) have terminals placed along the long edge and this helps achieve lower ESR and ESL in the same form factor. Interdigitated capacitors (IDCs) are another type of capacitor which use multiple power and ground terminals to achieve even lower ESL and ESR than RGCs. Most MLCCs are also subject to significant variation in capacitance as a function of temperature and voltage bias. Despite this drawback, MLCCs have become the most popular type of decoupling capacitor found on electronic packages and MBs. Modern smartphones have hundreds of ceramic capacitors in sizes ranging from 01005 to 0805. Microprocessor packages typically use MLCCs on the landside of the package or the die side of the package.

Silicon capacitors are used to provide high-frequency decoupling in a microprocessor PDN. The metal–oxide–semiconductor (MOS) capacitors are fabricated on the transistor layer and use the metal gate and the doped semiconductor as the two electrodes, while the gate oxide acts as the dielectric. The MOS capacitors have negligible inductance and relatively low ESR which makes them suitable for very high-frequency decoupling. However, the capacitance is a strong function of the bias voltage due to its impact on the depletion characteristics of the channel. More recently, MIM capacitors [8] have become more popular as they provide significantly higher capacitance density than MOS capacitors. The MIM capacitor electrodes are typically placed in the thicker back-end metal layers. MIM capacitors can also be stacked to generate even higher capacitance density. Unlike MOS capacitors, MIM capacitors provide a very stable capacitance that does not change as a function of bias voltage due to the absence of the semiconductor layer. Another type of silicon capacitor is the DTC [9]. In a DTC, deep trenches are etched into the bulk substrate to increase the effective surface area of the electrodes. As a result, DTCs can achieve higher capacitance density than traditional planar MIM capacitors. These silicon capacitors can be either fabricated on the microprocessor die or as a separate integrated passive device (IPD) that is mounted on the package as an alternative to MLCCs. Since IPDs can be much thinner than ceramic caps, they could be good options for providing decoupling when there are extreme  $z$ -height challenges. One such application could be package landside capacitors in a fine pitch BGA package. The BGA heights on these packages are extremely small and the capacitor height needs to be less than  $100 \mu\text{m}$  to fit in this space. 
Another possible location for the DTCs is a silicon interposer which is increasingly being used to disaggregate the microprocessor into smaller die chiplets while providing high density routing between the chiplets. Silicon capacitors like DTC and MIM capacitors have seen a steady increase in capacitance density over the years. This opens up the possibility for completely eliminating the MLCCs from the package and relying entirely on silicon capacitors on the microprocessor die and the silicon interposer to provide all of the high- and mid-frequency decoupling.

### E. Thermal Considerations

While the earlier sections discussed the importance of delivering the power to the microprocessor, it is equally important to remove the power that is being dissipated by the microprocessor to ensure the devices do not overheat. This is accomplished by using a thermal solution such as a heatsink or a heatpipe. The objective of the thermal solution is to ensure that the die junction temperature always stays under a certain maximum limit to ensure device reliability. Microprocessors have a feedback mechanism to sense the die

TABLE II  
TYPICAL TDP BY SEGMENT

| Segment                | TDP         |
|------------------------|-------------|
| Smartphones            | 1 – 3 W     |
| Tablets                | 3 – 7 W     |
| Thin and Light Laptops | 10 – 20 W   |
| Performance Laptops    | 20 – 50 W   |
| Desktop Computers      | 65 – 130 W  |
| Workstations           | 100 – 150 W |
| High Power Servers     | 200 – 400 W |
| High Power GPUs        | 300 – 600 W |

junction temperature and throttle the frequency and voltage if it exceeds the maximum limit. The TDP represents the sustained maximum power that a platform can support while keeping the die junction temperature below its maximum allowable limit, which is typically around $100^\circ\text{C}$. The following equation represents the relationship between the TDP, the maximum allowable junction temperature ($T_{j\max}$), the ambient temperature ($T_{\text{ambient}}$), and the effective thermal resistance ($\psi_{ja}$), expressed in degrees Celsius per watt:

$$\text{TDP} = \frac{T_{j\max} - T_{\text{ambient}}}{\psi_{ja}}. \quad (3)$$

The thermal resistance is a measure of how much the junction temperature rises above ambient for every watt of power dissipated through the thermal solution. The thermal resistance can vary significantly from segment to segment due to a number of reasons. For example, form-factor constraints in thin handheld devices limit the efficacy of the thermal solution. The cooling ability of such systems is further constrained by the inability to use fans or other active cooling mechanisms. On the other hand, high end servers can use expensive liquid cooling or immersion cooled systems which allow them to dramatically reduce the thermal resistance. As a result, the TDP for high power servers or GPUs can be much higher than that for smartphones. The TDP levels for microprocessors in different product segments are shown in Table II.
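The dependence of the TDP on the thermal resistance in (3) can be illustrated with a few lines of Python. The ambient temperature and the two thermal resistance values are hypothetical, chosen so that the results land within the smartphone and high power server ranges of Table II; they are not measured data.

```python
def tdp(t_j_max, t_ambient, psi_ja):
    """Sustained power limit in watts from Eq. (3)."""
    if psi_ja <= 0:
        raise ValueError("thermal resistance must be positive")
    return (t_j_max - t_ambient) / psi_ja


T_J_MAX = 100.0     # deg C, typical limit quoted in the text
T_AMBIENT = 35.0    # deg C, assumed ambient temperature

# Hypothetical thermal resistances in deg C per watt.
tdp_phone = tdp(T_J_MAX, T_AMBIENT, psi_ja=25.0)    # passively cooled handheld
tdp_server = tdp(T_J_MAX, T_AMBIENT, psi_ja=0.25)   # liquid-cooled server

print(f"handheld: {tdp_phone:.1f} W")
print(f"server:   {tdp_server:.0f} W")
```

Under these assumptions a hundredfold reduction in thermal resistance translates directly into a hundredfold increase in the sustainable power, from 2.6 W to 260 W.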

Even though the TDP is fairly small for handheld devices and laptops, they can still consume instantaneous power that is significantly higher. Today's microprocessors take advantage of any thermal headroom by dynamically scaling the voltage and frequency to higher levels to achieve better performance [10]. Furthermore, the relatively large thermal time constant allows the microprocessor to operate in a burst mode in which it can ramp up its power level for a short period of time before dropping down to its TDP level once the die junction temperature reaches its maximum allowable limit. As a result, when designing a power delivery solution, it is important to not just design for TDP, but rather design to the maximum power that the processor is expected to consume at any given time instant.
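The burst-mode behavior can be sketched with a simple lumped thermal model. The snippet below assumes that the junction temperature follows a single first-order exponential with time constant `tau`, which is a simplification introduced here and not a model taken from this article; real systems have several thermal time constants. All numerical values are hypothetical.

```python
import math


def burst_duration(p_burst, t_start, t_j_max, t_ambient, psi_ja, tau):
    """Seconds a burst at p_burst can last before T_j reaches t_j_max.

    Assumes a first-order thermal response:
    T(t) = T_ss + (t_start - T_ss) * exp(-t / tau),
    where T_ss = t_ambient + p_burst * psi_ja is the steady-state temperature.
    """
    t_ss = t_ambient + p_burst * psi_ja
    if t_ss <= t_j_max:
        return math.inf            # burst power is sustainable indefinitely
    if t_start >= t_j_max:
        return 0.0                 # no thermal headroom left
    return tau * math.log((t_ss - t_start) / (t_ss - t_j_max))


# Hypothetical laptop: psi_ja = 4 C/W gives a TDP of (100 - 35) / 4 = 16.25 W.
duration = burst_duration(
    p_burst=30.0, t_start=60.0, t_j_max=100.0,
    t_ambient=35.0, psi_ja=4.0, tau=10.0,
)
sustained = burst_duration(
    p_burst=15.0, t_start=60.0, t_j_max=100.0,
    t_ambient=35.0, psi_ja=4.0, tau=10.0,
)
print(f"30 W burst can last about {duration:.1f} s")
print(f"15 W load is sustainable: {math.isinf(sustained)}")
```

In this example the processor can draw nearly twice its TDP for several seconds, which is why the PDN has to be sized for the burst power and not only for the TDP.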

## III. POWER DELIVERY NETWORK PERFORMANCE

The effectiveness of a PDN is determined by its ability to keep the load voltage within a narrow operating range even as the load current changes or if there are fluctuations in the input power supply. The performance of a PDN can be entirely described by the effective impedance seen by the load as a function of frequency. The power delivery impedance needs to be kept low across a broad range of frequencies all the way from dc to several hundred megahertz. Since the operating bandwidth of the platform-level regulators is limited to a few hundred kilohertz, the impedance at frequencies beyond this is mostly managed by the use of decoupling capacitors. The MB capacitors are used to provide a low impedance from the VR bandwidth to a few megahertz. The ESL of the MB capacitors as well as the inductance in the MB planes and socket pins render them ineffective beyond a few megahertz. The capacitors placed on the package have a lower ESL than MB capacitors due to their smaller form factor. In addition, package capacitors can come with special terminal arrangements such as those of RGCs or IDCs to further lower their ESL. The relatively low impedance in the path from the package capacitors to the microprocessor helps extend their effectiveness to a few tens of megahertz. Frequencies higher than this are the realm of on-die capacitors. Since the dimensions on die are extremely small, the parasitic inductance is negligible even at the highest microprocessor current slew rate. The on-die capacitors do have an ESR associated with them which can limit their effectiveness at higher frequencies.

### A. IR Drop

The total effective dc resistance of the PDN, from the output of the voltage regulator to the transistors on die, is an important parameter for improving efficiency and performance. The steady-state voltage seen by the die ($V_{\text{die}}$) when there is a microprocessor load current ($I_{\text{load}}$) is given by

$$V_{\text{die}} = V_{\text{out}} - I_{\text{load}} R_{\text{dc}} \quad (4)$$

where $V_{\text{out}}$ is the voltage at the output of the voltage regulator, and $R_{\text{dc}}$ is the resistance in the path from the voltage regulator to the die. In open loop, the relationship between the on-die voltage, dc resistance, input voltage to the VR ($V_{\text{in}}$), and the steady-state load current can be written as

$$V_{\text{die}} = M(D)V_{\text{in}} - I_{\text{load}} R_{\text{dc}}. \quad (5)$$

The function  $M(D)$  represents the conversion ratio of the switching regulator as a function of the duty cycle. For a simple buck regulator, the function  $M(D)$  represents the duty cycle. Any change in the microprocessor load current or the input voltage to the VR will cause the on-die voltage to change. Each regulator has a control loop which helps regulate this change in voltage by modulating the duty cycle of the switching regulator. However, the response time of the control loop can be fairly slow on the order of microseconds for platform-level VR solutions. As a result, any increase in the microprocessor load current will cause a temporary drop in voltage seen by the die before the regulator can react to it. Similarly, any drop in the microprocessor load current will cause the voltage to rise temporarily. These voltage fluctuations as seen by the die due to the VR latency can



Fig. 4. Circuit representation of a typical PDN with the resonant loops highlighted.

jeopardize the performance as well as the reliability of the microprocessor. In the case of spatially distributed voltage domains, there could be a large gradient in voltage from one end of the domain to the other. For example, if the voltage at the far side of the power supply is much lower than the voltage at the near side, the duty cycle of the regulator will have to be increased until the far side voltage is above the minimum voltage required. This will cause the circuits on the near side to be subjected to a higher voltage resulting in excessive power dissipation and increased reliability risk. For these reasons, it is important to minimize the voltage gradient in addition to the absolute IR drop. Another factor which is impacted by the dc resistance is Joule heating or routing losses. This is particularly important in high-current rails as these losses scale quadratically with current and can hurt the overall efficiency of the system. The primary means of reducing the dc resistance are adding more power layers on the PCB and package, using a wider power corridor, adding more power pins, and using thicker power planes. A typical design target for dc drop is around 5% of the output voltage at maximum load current. This translates to a dc resistance design target of 0.5 mΩ on a 1-V domain with a load current of 100 A.
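The arithmetic behind this design target, together with (4) and (5), can be checked with a short Python sketch. The 12-V regulator input and the function names are assumptions made for illustration; the regulator is treated as an ideal buck with $M(D) = D$.

```python
def r_dc_target(v_out, i_load, drop_fraction=0.05):
    """Maximum dc resistance that keeps the IR drop within the given fraction."""
    return drop_fraction * v_out / i_load


def die_voltage(v_out, i_load, r_dc):
    """Steady-state die voltage from Eq. (4)."""
    return v_out - i_load * r_dc


V_OUT, I_LOAD = 1.0, 100.0           # 1-V domain, 100-A load (from the text)

r_dc = r_dc_target(V_OUT, I_LOAD)    # ohms
v_die = die_voltage(V_OUT, I_LOAD, r_dc)
p_routing = I_LOAD ** 2 * r_dc       # Joule heating in the power path, watts

# Eq. (5) with M(D) = D: duty cycle needed to hold 1.0 V at the die
# from an assumed 12-V input.
V_IN = 12.0
duty = (1.0 + I_LOAD * r_dc) / V_IN

print(f"R_dc target: {r_dc * 1e3:.2f} mOhm")
print(f"die voltage at full load: {v_die:.3f} V")
print(f"routing loss: {p_routing:.1f} W")
print(f"duty cycle to hold 1.0 V at the die: {duty:.4f}")
```

Even at the 0.5-mΩ target, the routing loss at 100 A is 5 W, which illustrates why these losses become a first-order efficiency concern as rail currents grow.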

### B. Transient Noise

Since the load drawn by the microprocessor has a significant ac component in addition to the dc current, the PDN needs to minimize transient noise in addition to the dc drop. The frequency-domain impedance of the PDN is a good measure of its ability to suppress transient noise. The electrical representation of the PDN is shown in Fig. 4 and is comprised of resistive, capacitive, and inductive elements. The PDN shown in Fig. 4 is a resonant network where the impedance can be represented as

$$Z(\omega) \approx A + \sum_{i=1}^{i=3} \frac{B_i}{\Gamma_i^2 + (\omega_i - \omega)^2} \quad (6)$$

where  $(\omega_i)$  [ $i = 1, 2, 3$ ] are various angular resonance frequencies in each of the loops in the PDN. Equation (6) is a first-order approximation that assumes that the resonances of each loop are isolated from each other. The frequency-domain response of a typical PDN for a consumer microprocessor is shown in Fig. 5. As seen from Fig. 5, there are multiple resonant peaks corresponding to the different loops in the



Fig. 5. Impedance profile of the PDN in the frequency domain.



Fig. 6. Time-domain voltage droops in a PDN.

PDN. The design objective is to reduce the impedance below the target value by modulating the  $R$ ,  $L$ , and  $C$  elements in the design. This is typically done by choosing the right type of capacitors or by varying the number of capacitors for each decoupling stage.

The variation in impedance seen in the frequency domain manifests itself as a voltage fluctuation with time across the power supply terminals of the transistors in the die, when the circuits switch. The response of the PDN to a 25-A step current excitation which happens when the processor wakes up from a sleep mode is shown in Fig. 6. The first negative spike that happens within the first few nanoseconds is due to the loop marked  $\omega_1$  in Fig. 4. This voltage droop is dictated by the tank resonance caused by the interaction between the effective package loop inductance  $L_{\text{pkg}}$  and on-die capacitance  $C_{\text{die}}$ . The effective package loop inductance includes the ESL of the package decoupling capacitance and other interconnect structures such as redistribution layers, vias, power/ground planes, and C4 bumps. The resistance of the die capacitor and the other elements in the loop help dampen the voltage droop [11] and any subsequent ringing. The second and third voltage droops that are shown in Fig. 6 occur due



Fig. 7. Power distribution stages in a data center.

to the remaining resonant loops  $\omega_2$  and  $\omega_3$  in that order. However, as shown in Figs. 5 and 6, the amplitude of these droops tends to be diminished as compared with the first droop. Traditionally, the second and third droops have been managed by adding more capacitors in the package and MB, respectively. However, practical limitations in the amount of capacitance available on die have made it difficult to manage the first droop. Recently, the introduction of MIM capacitors has improved the amount of capacitance available on die which helps mitigate the impact of first droop.
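The dependence of the first droop on $L_{\text{pkg}}$ and $C_{\text{die}}$ can be illustrated with the textbook relations for an ideal LC tank. The sketch below is a first-order estimate only: it ignores damping and the finite edge rate of the load step, and the component values are hypothetical numbers chosen for illustration, not data from Fig. 4 or Fig. 6.

```python
import math


def tank_resonance_hz(l_loop, c_shunt):
    """Resonant frequency (Hz) of an ideal LC tank."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_loop * c_shunt))


def undamped_droop_estimate(i_step, l_loop, c_shunt):
    """Peak droop (V) of a lossless LC tank excited by an ideal current step.

    Equals i_step * sqrt(L / C). Loop resistance and a finite edge rate
    change the real droop, so treat this as a rough estimate only.
    """
    return i_step * math.sqrt(l_loop / c_shunt)


# Hypothetical values, chosen only to illustrate the calculation.
L_PKG = 10e-12   # effective package loop inductance, 10 pH (assumed)
C_DIE = 200e-9   # on-die capacitance, 200 nF (assumed)
I_STEP = 25.0    # 25-A load step, matching the step size discussed in the text

f1 = tank_resonance_hz(L_PKG, C_DIE)
droop = undamped_droop_estimate(I_STEP, L_PKG, C_DIE)
print(f"first-droop resonance: {f1 / 1e6:.1f} MHz")
print(f"undamped droop estimate: {droop * 1e3:.0f} mV")
```

In this model the droop scales as $\sqrt{L_{\text{pkg}}/C_{\text{die}}}$, so doubling the on-die capacitance (for example with MIM capacitors) lowers the estimate by a factor of $\sqrt{2}$.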

### C. System Efficiency

An important metric in PDN design is system efficiency which has a direct impact on TDP and battery life. Power system efficiency is defined as

$$\eta = P_{\text{out}} / P_{\text{in}} \quad (7)$$

where the output power refers to the power consumed by the microprocessor, while input power is the power delivered to the input of the VR. The definition for system efficiency varies from one segment to another. For example, in a battery powered device, the system efficiency is a measure of the fraction of the power from the battery that is used by the microprocessor to do useful work. The remaining power is dissipated as heat due to inefficiencies in the VR conversion process and Joule heating losses. Improving the system efficiency in such devices has an appreciable impact on battery life. The losses on the PDN can occur either at the platform level or on the package and the die. From a system efficiency standpoint, the losses on the platform are just as important as the losses in the package and the die. However, the losses on the platform do not have a significant impact on the processor TDP, while the losses on the package and die will count toward the TDP envelope.

The overall system efficiency is becoming increasingly important in large data center systems. Data centers were estimated to consume 205 TWh in 2018 [12]. As a result, data center engineers spend considerable effort minimizing their noncomputing energy, which includes conversion losses, Joule heating losses, and other overheads like cooling. Fig. 7 shows the power delivery path in a data center. Large data centers often have a substation on site which delivers an ac voltage of 480 V. This power is fed through a rectifier which delivers a dc voltage of 400 V. A step-down dc-dc converter then takes this down to 48 V which is then delivered to all the platforms. Each rack typically has an uninterruptible power supply (UPS)



Fig. 8. Routing losses as a function of output power for different input voltages.

to ensure continuity of operation in the event of a power failure. The 48 V is converted to the processor voltage using a single-stage or dual-stage converter. Each power conversion stage in the data center PDN has an extremely high efficiency, in the mid-to-high 90% range. However, at higher power levels, the routing losses in the path from the last dc-dc converter on the platform to the microprocessor start to dominate and have a detrimental impact on overall system efficiency. One potential workaround for this problem is to bring in power at a higher voltage to the microprocessor through the use of an IVR. High-voltage power delivery allows for a reduction in current through the power delivery path from the platform regulator to the microprocessor. Since routing losses have a quadratic dependence on current, increasing the power delivery voltage to the microprocessor in the package can be quite effective at minimizing the Joule heating and routing losses on the MB. Fig. 8 plots the routing losses on the MB as a function of output power for different input voltages on the platform.
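The quadratic dependence of routing losses on current can be sketched numerically. The snippet below is an illustrative calculation in the spirit of Fig. 8, not a reproduction of it: the 1-mΩ path resistance and the voltage and power points are assumed values, and losses in the downstream converter are ignored.

```python
def routing_loss_w(p_out, v_delivery, r_path):
    """I^2 * R loss (W) in a path of resistance r_path (ohms) that carries
    p_out watts at v_delivery volts. Downstream converter losses are ignored."""
    current = p_out / v_delivery
    return current ** 2 * r_path


R_MB = 1e-3  # assumed 1-mOhm motherboard path resistance

for p_out in (100.0, 200.0, 400.0):
    losses = {v: routing_loss_w(p_out, v, R_MB) for v in (1.0, 1.8, 3.0)}
    row = ", ".join(f"{v:g} V: {loss:6.2f} W" for v, loss in losses.items())
    print(f"P_out = {p_out:5.0f} W -> {row}")
```

Under these assumptions, tripling the delivery voltage cuts the routing loss by a factor of nine at any given power level, which is why the benefit grows with processor power.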

## IV. INTEGRATED VOLTAGE REGULATOR

IVRs are broadly defined as solutions which incorporate the final stage of voltage regulation on the package or the die. IVR options have been increasing in popularity and have been implemented on a number of commercial microprocessors [13]–[15].

### A. Motivation for IVRs

Two key factors are fueling the transition from platform-level voltage regulators to IVRs implemented on the package or the die. The first factor is the proliferation of on-die power domains, as shown in Fig. 2, driven by a need for fine grain power management. It is not practical to have tens of voltage regulators on the platform due to a lack of platform-level resources. It is much more efficient to use the finite resources to have a small number of robust platform-level voltage regulators which can deliver the input power to various IVRs on the package or the die. This is



Fig. 9. Schematic illustrating the use of IVRs to generate multiple power supplies from a single platform-level VR.



Fig. 10. (a) Single-stage platform VR-based PDN. (b) Dual-stage IVR-based PDN.

illustrated in Fig. 9 where a single platform regulator is used to feed multiple IVRs implemented on the die.

The second factor that is driving the push for IVR is the steady growth in processor power levels, especially in data center CPUs and GPUs. As the power levels go up, the routing losses in the PDN can have a significant impact on the overall system efficiency. IVRs can address this problem by bringing power to the processor at a higher voltage. This reduces the current through the PDN and minimizes the routing losses, as shown in Fig. 8. At high power levels, the reduction in routing losses is more than enough to offset the conversion losses introduced by the IVR. This is best illustrated with an example which compares the system efficiency of a single-stage platform VR-based PDN with that of a dual-stage IVR-based PDN. Fig. 10(a) represents a PDN with a single-stage VR on the platform (VR1). The VR has an input voltage of 12 V and is delivering a current of 100 A to the processor at 1 V. For this example, we assume that the effective resistance in the MB is 1 mΩ, while the effective resistance in the package is 0.5 mΩ. A typical platform VR will have an efficiency of around 90% for 12–1-V conversion. For this example, the conversion losses are 12.8 W, and the routing losses are 15 W. The overall system efficiency is 78.3%. In Fig. 10(b), we have a two-stage VR to deliver the power to the same system. The first VR on the platform is a simple high efficiency fixed ratio converter that takes the 12-V input

TABLE III  
SYSTEM EFFICIENCY—SINGLE-STAGE PDN VERSUS DUAL-STAGE PDN

| Parameter                                                | Single Stage PDN | Dual Stage PDN |
|----------------------------------------------------------|------------------|----------------|
| CPU Power ( $P_{CPU}$ )                                  | 100 W            | 100 W          |
| Output Voltage ( $V_{OUT}$ )                             | 1 V              | 1 V            |
| Output Current ( $I_{OUT} = P_{CPU}/V_{OUT}$ )           | 100 A            | 100 A          |
| VR2 Input Voltage ( $V_{IN2}$ )                          | 1 V              | 3 V            |
| VR2 Efficiency ( $\eta_2$ )                              | 100%             | 88%            |
| VR2 Input Power ( $P_{IN2} = P_{CPU}/\eta_2$ )           | 100 W            | 113.6 W        |
| PDN Current ( $I_{PDN} = P_{IN2}/V_{IN2}$ or $I_{OUT}$ ) | 100 A            | 37.9 A         |
| Package Resistance ( $R_{PKG}$ )                         | 0.5 mΩ           | 0.5 mΩ         |
| Package Losses ( $P_{PKG} = I_{PDN}^2 \times R_{PKG}$ )  | 5 W              | 0.7 W          |
| MB Resistance ( $R_{MB}$ )                               | 1 mΩ             | 1 mΩ           |
| MB Losses ( $P_{MB} = I_{PDN}^2 \times R_{MB}$ )         | 10 W             | 1.4 W          |
| Routing Losses ( $P_R = P_{PKG} + P_{MB}$ )              | 15 W             | 2.1 W          |
| VR1 Output Power ( $P_{OUT1} = P_{IN2} + P_R$ )          | 115 W            | 115.7 W        |
| VR1 Efficiency ( $\eta_1$ )                              | 90%              | 97%            |
| VR1 Input Power ( $P_{IN1} = P_{OUT1}/\eta_1$ )          | 127.8 W          | 119.3 W        |
| System Efficiency ( $\eta_{sys} = P_{CPU}/P_{IN1}$ )     | 78.3%            | 83.8%          |

and generates a 3-V output. Since fixed ratio converters can be implemented very efficiently [16], we can achieve an efficiency of 97%. In this scenario, the second-stage IVR is implemented on the die and converts the 3 V to the desired output voltage. Due to the higher switching frequency required for on-die implementation, the efficiency of this VR will not be as high as that of a low-frequency platform VR. This is partially offset by reduction in the input voltage which allows for operation at higher duty cycles. For this example, we assume that the efficiency of the second stage VR is 88% for 3–1-V conversion. The second stage VR dissipates 13.6 W to deliver 100 W to the processor. However, the routing losses are only 2.1 W due to the significant reduction in the current from the platform VR. The power dissipated in the fixed ratio platform VR is only 3.6 W due to its high efficiency. The overall system efficiency for the IVR case is 83.8%. The results are summarized in Table III. In order to use the same equations for the two columns, VR2 for the single-stage converter is assumed to be a 1:1 converter with 100% efficiency.
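The rows of Table III can be stepped through in a few lines of Python. This is a minimal sketch that uses the inputs stated in the example; the function and key names are illustrative choices, not part of the article.

```python
def pdn_efficiency(p_cpu, v_in2, eta2, eta1, r_pkg=0.5e-3, r_mb=1e-3):
    """Follow the rows of Table III and return the key quantities.

    v_in2 and eta2 describe the second-stage VR (for the single-stage case,
    a 1:1 converter with 100% efficiency); eta1 is the platform VR efficiency.
    """
    p_in2 = p_cpu / eta2                      # VR2 input power
    i_pdn = p_in2 / v_in2                     # current through package and MB
    p_routing = i_pdn ** 2 * (r_pkg + r_mb)   # package + MB routing losses
    p_out1 = p_in2 + p_routing                # VR1 output power
    p_in1 = p_out1 / eta1                     # VR1 input power
    return {
        "i_pdn": i_pdn,
        "p_routing": p_routing,
        "p_in1": p_in1,
        "eta_sys": p_cpu / p_in1,
    }


single = pdn_efficiency(100.0, v_in2=1.0, eta2=1.00, eta1=0.90)
dual = pdn_efficiency(100.0, v_in2=3.0, eta2=0.88, eta1=0.97)

for name, result in (("single-stage", single), ("dual-stage", dual)):
    print(f"{name}: I_PDN = {result['i_pdn']:.1f} A, "
          f"routing = {result['p_routing']:.1f} W, "
          f"eta_sys = {result['eta_sys'] * 100:.1f}%")
```

The system efficiencies come out at 78.3% and 83.8%, matching Table III. Because the sketch carries full precision through every step, a few intermediate values can differ from the table in the last digit, since the table rounds at each row.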

### B. Types of IVR

IVRs can be broadly classified based on their topologies. The simplest on-die power delivery solution is a power gate switch shown in Fig. 11(a). Power gates are used to turn OFF the power to inactive circuits to minimize their leakage power consumption. A common application for power gates is in delivering power to multiple cores using a single platform-level power supply as described in [17]. Each core has its own power gate which is turned OFF when the core is inactive or idle to minimize leakage power. This helps save leakage power on the idle cores and allows the active core to use a higher fraction of the overall power budget for the microprocessor. Power gates are simple to implement as there is very little complexity in the design. There is a



Fig. 11. Types of IVR topologies. (a) Simple power gate. (b) LDO regulator. (c) Switching buck regulator. (d) SCVR.

small IR drop penalty in using power gates as the switches have a finite resistance associated with them. The biggest downside of power gates is their inability to regulate the output voltage. Linear or low-dropout (LDO) regulators, as shown in Fig. 11(b), address this drawback by including a control loop in their design. LDOs are relatively easy to implement on die as well due to the absence of an energy storage element. However, LDOs are usually limited to applications where the input voltage is close to the output voltage. Hence, they are not good candidates for high power rails where the motivation for using an IVR is to minimize routing losses by bringing in power at a significantly higher voltage.

The switching regulators shown in Fig. 11(c) and (d) are better suited for IVR implementations that require a higher input voltage. Switching voltage regulators use an energy storage element to achieve high efficiency voltage conversion. The energy storage element in a buck regulator is an inductor, while a switched capacitor voltage regulator (SCVR) uses a capacitor as its energy storage element. During the first part of the switching cycle, the input delivers enough power to charge the energy storage element and provide power to the output. During the second part of the switching cycle, the input is disconnected from the circuit and the energy stored in the capacitor or inductor is used to power the output. As capacitors typically have a higher energy storage density than inductors, it is possible to design high efficiency compact SCVRs. However, simple SCVRs suffer from poor regulation as they are best suited for fixed ratio conversion from input to output and often suffer from poor efficiency when the input-to-output voltage deviates significantly from the optimal ratio. More recently, new switched capacitor-based hybrid topologies have been introduced to work around these drawbacks [18]. Hybrid schemes based on buck regulators and linear regulators have also been implemented. In order to generate a scalable number of on-die power domains with fewer inductors, single-inductor multiple-output (SIMO) regulators augmented with linear voltage regulators for transient management have been implemented in [19].
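The efficiency penalty that a fixed-ratio SCVR pays away from its optimal ratio can be illustrated with a commonly used idealized model. In that model the converter acts as an ideal $n$:1 divider followed by a series output resistance, so regulating below the ideal output voltage $V_{\text{IN}}/n$ dissipates the difference, much as a linear regulator would. The model and the function below are assumptions brought in for illustration and are not taken from the article; switching and parasitic losses are ignored.

```python
def scvr_efficiency_limit(v_in, v_out, step_down):
    """Upper limit on the efficiency of an idealized step_down:1 SCVR.

    The ideal (unloaded) output is v_in / step_down; delivering a lower
    v_out caps the efficiency at v_out / (v_in / step_down).
    """
    v_ideal = v_in / step_down
    if not 0 < v_out <= v_ideal:
        raise ValueError("v_out must be positive and at most v_in / step_down")
    return v_out / v_ideal


# Hypothetical 3:1 converter fed from a 3-V rail.
for v_out in (1.0, 0.9, 0.8, 0.7):
    limit = scvr_efficiency_limit(3.0, v_out, step_down=3)
    print(f"V_OUT = {v_out:.1f} V -> efficiency limit {limit * 100:.0f}%")
```

In this simplified picture, a 3:1 converter that must deliver 0.7 V from 3 V cannot exceed 70% efficiency, which is one reason hybrid and multi-ratio topologies are attractive.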



Fig. 12. Interaction between PDN and LDO response.

### C. Linear Regulators

LDO regulators are an enhanced version of power gates that produce a regulated output voltage. Unlike power gates which can only be ON or OFF, linear regulators have a control loop to modulate the effective resistance of the power gate transistors. They can regulate the output voltage to a preset value prescribed by the control loop. LDOs can enable on-chip fine-grain power management in multicore microprocessor and system-on-chip (SoC) platforms to increase system-level energy efficiency. Digital LDOs have emerged recently as a candidate for on-chip voltage conversion and regulation of digital load circuits [20]. Because they can be synthesized from digital logic and placed and routed automatically, they can enable per-core DVFS in large microprocessors and SoCs with low design complexity and integration time [21].

Even though linear regulators offer benefits such as ease of implementation and, in some scenarios, high efficiency, they do have some limitations. A key reason for the ease of implementation is the absence of any energy storage elements which are found in a switching regulator. As a result, the input current that is fed to the linear regulator must match the sum of the load current and the quiescent current consumed by the regulator. Consequently, the efficiency of the linear regulator can be expressed as

$$\eta = \frac{I_{\text{OUT}} V_{\text{OUT}}}{(I_{\text{OUT}} + I_Q) \cdot V_{\text{IN}}}. \quad (8)$$

At higher currents, the quiescent current is negligible and, therefore, the load conversion efficiency of the linear regulator is simply the ratio of the output to the input voltage

$$\eta = V_{\text{OUT}} / V_{\text{IN}}. \quad (9)$$

From (9), it is clear that linear regulators are not well suited for applications where there is a large difference between the output and input voltage. A common use of linear regulators is when there are a large number of voltage domains whose output voltages are relatively close to each other. Linear regulators are also used to derive an isolated rail that is used to power sensitive analog circuits with poor noise immunity.
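Equations (8) and (9) are simple to evaluate numerically. The sketch below uses illustrative operating points (a 1.1-V and a 3-V input, a 10-A load, and a 10-mA quiescent current), all of which are assumed values rather than data from the article.

```python
def ldo_efficiency(v_in, v_out, i_out, i_q=0.0):
    """Linear-regulator efficiency per (8); with i_q = 0 it reduces to (9)."""
    if v_out > v_in:
        raise ValueError("a linear regulator cannot step the voltage up")
    return (i_out * v_out) / ((i_out + i_q) * v_in)


# Assumed operating points: 10-A load, 10-mA quiescent current.
near = ldo_efficiency(v_in=1.1, v_out=1.0, i_out=10.0, i_q=0.01)
far = ldo_efficiency(v_in=3.0, v_out=1.0, i_out=10.0, i_q=0.01)
print(f"1.1 V -> 1.0 V: {near * 100:.1f}%")
print(f"3.0 V -> 1.0 V: {far * 100:.1f}%")
```

With a small dropout the efficiency stays near 91%, while a 3-V input drops it to about 33%. This is why an LDO is a poor fit when the goal is to bring in power at a much higher voltage.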

One of the challenges in implementing the LDO is the power supply rejection (PSR), which is a measure of the fraction of input noise that makes it to the output. The PSR is a frequency-dependent measure and is defined as

$$\text{PSR} = \hat{v}_{\text{out}} / \hat{v}_{\text{in}}. \quad (10)$$

The voltages in (10) and Fig. 12 represent the ac component of the voltages. The PSR peaks when the control loop



Fig. 13. Transformer-based topology for 48-1-V conversion.

gain approaches 0 dB. At these frequencies, a significant portion of the input noise will pass through to the output. Since the crossover frequency for typical LDOs occurs in the 50–100-MHz range, it is important to ensure there are no PDN resonances near this frequency. One way to do this is through the use of power transmission lines as described in [22].

### D. Transformer-Based Topologies

Transformer-based converters are based on topologies that use a transformer to provide galvanic isolation between the input and output, which makes them attractive for direct operation off the power grid. The use of a transformer can also facilitate high efficiency conversion even at high input-to-output voltage ratios. However, the use of a transformer makes on-die integration of these topologies challenging. Recent developments in PCB technologies have made it possible to design high-performance transformers such as the one described in [23].

The most commonly used data center power delivery architecture uses a two-stage approach to go from 48 V to the microprocessor voltage, as shown in Fig. 7. Here, the first stage is typically an unregulated fixed ratio converter which is followed by a fully regulated point of load converter. In contrast, another approach is shown in Fig. 13 where the high efficiency fixed ratio converter is used as the point of load current multiplier, while a fully regulated buck-boost converter is used as the first stage [24]. As the power density of the current multiplier improves with time, it may become possible to implement this stage on the package which will greatly limit the routing losses on the PCB and the socket. One downside of this scheme is the inability to derive a large number of voltage rails, which could necessitate an LDO downstream of the point of load current multiplier.

### E. Integrated Buck Regulator

The key difference between buck regulators on the platform and the integrated buck voltage regulator is in the density requirements. Platform-level regulators can have a much bigger footprint than integrated regulators which allow them to use large output filter components. This enables the platform-level buck regulators to switch at relatively low frequencies of the order of 1 MHz. Since there is not enough room on the package or the die to accommodate the large output filter components, integrated buck regulators have to



Fig. 14. Types of buck IVR configurations. Separate VR die with (a) discrete inductor, (b) embedded inductor, and (c) silicon inductor. VR in CPU with (d) discrete inductor and (e) embedded inductor.

switch at much higher frequencies. Integrated buck regulators switch at tens of megahertz or even over 100 MHz to enable the miniaturization of the filter components required for on-package or on-die integration. Fig. 14 shows several types of buck IVR configurations based on the location of inductors and the VR circuits. Configurations shown in Fig. 14(a)–(c) use a separate VR chip, while the configurations shown in Fig. 14(d) and (e) incorporate the VR circuits in the CPU die [13]. In Fig. 14(a) and (d), discrete inductors are used in the package, while in Fig. 14(b) and (e), the inductors are embedded inside the package. These inductors could be either air core inductors (ACIs) [25] or implemented using magnetic material inside the package. In Fig. 14(c), the inductors are implemented on the silicon using thin-film magnetics [26].

Due to the significant improvement in transistor switching characteristics, designing power FETs on CMOS at these switching frequencies is not a problem. However, these CMOS switches are somewhat limited in the input voltage they can handle. Since the maximum voltage a CMOS logic device can handle is of the order of 1 V, they have to be stacked to enable a higher input voltage. The input voltage used on FIVR is 1.8 V and is achieved using two-stack cascode transistors for the power train [13]. Higher input voltages can be achieved by stacking even more transistors, but this does increase the conduction losses and the switching losses of the device while increasing the overall device area. Furthermore, the inductor requirements get more stringent as the input voltage is increased. On the flip side, increasing the input voltage can have significant impact on the size of the platform VR and the routing losses due to the reduction in input current. These benefits need to be weighed against the potential impact to efficiency and device area before deciding on the optimal input voltage for the integrated buck regulator.

The output capacitance of the buck regulator can be implemented using package capacitors or on-die MIM capacitors. As mentioned in the earlier section, the increase in capacitance density of silicon capacitors such as MIM or DTC makes it possible to design a high-frequency IVR that only uses silicon capacitors for its output decoupling. However, package capacitors may still be needed to address the input decoupling requirements. The biggest challenge in the design of an



Fig. 15. Commonly used inductor topologies. (a) Spiral inductor. (b) Inductor with closed magnetic loop. (c) Solenoidal inductor.

integrated buck voltage regulator is that of the inductor. These inductors can be designed either in the package or on the die.

### F. Inductors for IVR

IVR inductors implemented on the die or package need to operate at a higher frequency than what is typically seen on platform VRs. The need for high permeability and low-loss characteristics at the relatively high switching frequencies limits the material options for inductors. On the other hand, the high switching frequency used in some IVRs makes it possible to use inductors with an inductance as low as 1–2 nH. This opens up the possibility of using ACIs as an alternative to magnetic inductors. While it is possible to design on-die ACIs for RF applications, volumetric constraints make it difficult to design an on-die ACI for high density power conversion. ACIs can be designed in a package using standard package traces and vias if the switching frequency is high enough. This was a key technology enabler to develop the first integrated switching regulator in high volume manufacturing [13]. While ACIs perform well at high frequencies and are relatively simple to implement in a package, they do have some disadvantages. ACIs do not have enough inductance density to enable switching frequencies lower than 50 MHz. This makes it difficult to enable high-voltage conversion using ACIs as the high voltage devices usually operate at lower frequencies. Electromagnetic interference (EMI), radio frequency interference (RFI), and noise coupling to adjacent signal lines are another area of concern with an ACI-based design since the magnetic field from the inductors is relatively unconstrained. Another downside of ACIs is their lack of scalability to support the increased current density requirements driven by process scaling. As a result, there is a small drop in efficiency with each generation as the current density increases. One workaround to this is the use of magnetic inductors.
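The link between switching frequency and the inductance an IVR needs can be sketched with the textbook ripple relation for an ideal buck converter in continuous conduction, $\Delta I = V_{\text{OUT}}(1 - D)/(L f_{\text{sw}})$ with $D = V_{\text{OUT}}/V_{\text{IN}}$. This relation is standard but is not stated in the article, and the 1.8-V input, 1-V output, and 2-A per-phase ripple target below are assumed values for illustration.

```python
def buck_inductance_for_ripple(v_in, v_out, f_sw, ripple_pp):
    """Inductance (H) that yields a peak-to-peak ripple current ripple_pp (A)
    in an ideal buck converter operating in continuous conduction."""
    duty = v_out / v_in
    return v_out * (1.0 - duty) / (ripple_pp * f_sw)


# Assumed per-phase operating point: 1.8 V in, 1.0 V out, 2-A ripple.
for f_sw in (1e6, 10e6, 100e6):
    inductance = buck_inductance_for_ripple(1.8, 1.0, f_sw, ripple_pp=2.0)
    print(f"f_sw = {f_sw / 1e6:5.0f} MHz -> L = {inductance * 1e9:7.2f} nH")
```

Under these assumptions the required inductance falls from a few hundred nanohenries at 1 MHz to a few nanohenries at 100 MHz, which is the range where package-level ACIs become practical.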

The use of magnetic materials can help achieve much higher inductance densities than what is possible using ACIs. However, magnetic inductors have to deal with additional losses such as eddy current and core losses and can saturate at higher currents. These are factors that need to be comprehended when designing a magnetic inductor. The three most commonly used magnetic inductor topologies are shown in Fig. 15. The spiral inductor shown in Fig. 15(a) is a popular choice for on-die implementations. The inductor can



Fig. 16. Package embedded inductors. (a) Solenoidal inductor. (b) Toroidal inductor.

be placed on top of a magnetic material to increase the inductance. However, their planar geometry and the difficulties with adding a second magnetic layer limit their applicability for power conversion [26]. Another option is the structure in Fig. 15(b) [27] which consists of copper traces that form a single loop and is used commonly for implementing thin-film magnetic inductors on die. By using a high permeability magnetic material like CoZrTa or Permalloy that provides a closed path for the magnetic flux, it is possible to boost the inductance of the structure. However, high permeability magnetic materials often have eddy current losses and can saturate at relatively low currents. Eddy current losses can be mitigated by using laminations, while the saturation is addressed through the use of a coupled multiphase design where the magnetic fields of dc currents through adjacent copper traces effectively cancel. Another approach to avoid saturation is through the introduction of air gaps between the two magnetic layers as described in [28]. Solenoidal inductors shown in Fig. 15(c) [29] are another commonly used topology which can achieve high inductance density with just a single magnetic layer by increasing the number of turns. One drawback of solenoidal inductors is that they do not provide a closed magnetic path for the flux, which can reduce the $L/R_{dc}$ ratio and introduce unwanted coupling.

While on-die magnetic inductors can deliver high inductance densities, they are not as effective in achieving high-current densities. The thin-film magnetic material used on these inductors can saturate at relatively low currents. On-die magnetic inductors also tend to have lower  $L/R_{dc}$  ratio which makes it difficult to increase the inductor current. Magnetic inductors implemented on the core of the package can take advantage of the increased metal thickness and the larger volume of magnetic material to achieve high-current densities without saturation. Fig. 16 shows a couple of different inductor topologies that can be used for magnetic inductors in a package [30]. The structure in Fig. 16(a) represents a solenoidal design, which is an open-loop design for the magnetic flux. The structure in Fig. 16(b) represents a toroidal design with a closed path for the magnetic flux. This makes the latter a good choice for embedding within the package. Fig. 16 also shows



Fig. 17. (a) Backside of an Intel Core microprocessor package. (b) Zoomed-in view of the components on the backside of package. (c) Front and back of an MIA module.

the change in inductance when the structure is embedded within the package. There is a 35% drop in inductance for the solenoidal inductor due to the presence of the conducting planes in the package. In contrast, the closed-loop toroidal structure shown only sees a 5% drop in inductance when embedded inside the package.

While embedded inductors are a viable option in packages with a relatively thick core, they may not be suitable in coreless packages and are not as effective in ultra-thin core (100- $\mu$ m core) packages. In such scenarios, the magnetic inductors can be assembled on the landside of the package. This was the case with the tenth generation Intel Core microprocessor package shown in Fig. 17. A number of magnetic inductor array (MIA) modules were used to power the different voltage domains on that microprocessor [31].

## V. TECHNOLOGY INGREDIENTS FOR NEXT-GENERATION IVR

The advent of the multicore era has seen the implementation of IVRs to provide independent power supplies to each of the cores. However, these IVRs have been implemented with either an LDO [14], [15] or with a low-voltage buck regulator [13]. As power levels continue to rise, it is imperative to develop a high-voltage IVR. This section discusses some of the key technology ingredients required to enable a high-voltage IVR.

### A. Device Options

The power MOSFETs used in dc–dc converters are judged by their resistance in the ON-state, the voltage they can block in the OFF-state, and the amount of energy consumed to toggle them. The most commonly used metric to quantify the performance of the power MOSFET is the figure of merit (FOM) as shown in the following equation:

$$\text{FOM} = R_{DS(on)} Q_{GS}. \quad (11)$$

In the above expression,  $R_{DS(on)}$  represents the resistance between the source and the drain when the switch is in the ON-state and  $Q_{GS}$  represents the gate charge required to turn on the switch. The FOM is an effective metric for quantifying device performance as it encapsulates both the conduction losses and the switching losses for a given device. Another key metric for a device is the voltage rating which is a measure of the blocking voltage across the switch when it is in its



Fig. 18. FOM comparison for different device technologies as a function of voltage rating.

OFF-state. Fig. 18 plots the FOM as a function of voltage rating for a number of different device technologies.
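One reason the FOM in (11) is useful for comparing device technologies is that, to first order, it does not depend on how wide the transistor is drawn. The sketch below illustrates this with a simple scaling assumption (resistance inversely proportional to width, gate charge proportional to width). The device values and the scaling model are hypothetical and are not taken from Fig. 18.

```python
def fom(r_ds_on, q_gs):
    """Figure of merit per (11): ON-resistance times gate charge (ohm * C)."""
    return r_ds_on * q_gs


def scale_width(r_ds_on, q_gs, k):
    """First-order effect of making a FET k times wider:
    the ON-resistance falls by k and the gate charge rises by k."""
    return r_ds_on / k, q_gs * k


# Hypothetical device: 2 mOhm ON-resistance, 5 nC gate charge.
r0, q0 = 2e-3, 5e-9
r4, q4 = scale_width(r0, q0, k=4)

print(f"original: FOM = {fom(r0, q0) * 1e12:.1f} mOhm*nC")
print(f"4x wider: FOM = {fom(r4, q4) * 1e12:.1f} mOhm*nC")
```

Under this assumption, widening the device trades conduction loss for switching loss without changing the product, so a lower FOM reflects a better device technology and not just a different sizing choice.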

Power MOSFETs used for platform-level voltage regulators are bulky high-voltage devices which operate at relatively low switching frequencies. The most common device technologies for these switches are laterally diffused MOS (LDMOS) or vertically diffused MOS (VDMOS) [32]. These switches have an extended diffusion region near the drain which enables them to block relatively high dc voltages. These devices have very different characteristics than that of the CMOS FinFETs used in most logic circuits. The FinFETs used in modern processors have excellent FOM but can only block a voltage of around 1 V in their OFF-state. Most process nodes also offer IO transistors which have a thicker gate oxide to enable a higher voltage rating than logic transistors. However, these transistors with a thicker gate oxide do not perform as well as the thin gate logic transistors.

Gallium nitride (GaN)-based transistors have recently grown in popularity for high-voltage applications. GaN has a wider bandgap than silicon which allows for much more efficient operation in high-voltage, high-temperature applications [33]. GaN high electron mobility transistors (HEMTs) can achieve a much better FOM than silicon-based LDMOS at 48 V, as shown in Fig. 18. Single-stage converters with an input voltage of 48 V will see significant efficiency and density benefits from using GaN HEMTs over silicon-based switches. However, the FOM advantage of GaN HEMTs diminishes as the input voltage drops below 10 V. Ongoing research in the development of low-voltage GaN-based switches shows promise in achieving 4× lower FOM than thin gate FinFETs even at 5 V [34]. These low-voltage GaN devices could be well suited as the second stage with a greater than 5-V input for a two-stage converter.

#### B. High-Frequency Magnetic Inductors

While there are many magnetic material options for the low-frequency inductors used in platform VRs, the choice of materials at high operating frequencies is still limited.

The magnetic materials can be classified into ferromagnetic metal alloy thin films, laminated multilayers, nanogranular thin films, nanogranular multilayers, ferrite thin films, ferrite tape, and ferrite composites as described in [35]. The permeability range is 5–1000, with magnetic loss tangent in the range of 0.01–1 and ferromagnetic resonance frequency (FMR) in the range of 1–100 MHz. In general, as the permeability increases so does the loss tangent with a reduced FMR. For composite materials, an important exercise is, therefore, to optimize the volume fraction of the metal particles, thereby increasing FMR while reducing magnetic loss tangent, for a reasonable permeability.

FinFET-based low-voltage IVR solutions can switch at frequencies up to 100 MHz or higher. However, as we move to higher voltage IVRs using stacked FinFETs or GaN-based devices, the switching losses are too high to enable switching frequencies in excess of 100 MHz. The target switching frequency for high-voltage IVRs is in the range of 5–50 MHz. This allows for a relatively small inductor while keeping the switching losses manageable. From a trade-off between saturation current, required inductance density, and dc resistance, the required properties of the magnetic material can be derived using the Lorentz and Landau–Lifshitz–Gilbert equations as shown in [35]. For example, for a 48-V/1-V conversion at 10 MHz with 90% efficiency, the magnetic material should have a permeability of ~90, loss tangent less than 0.033 at 10 MHz, and stability up to 50 MHz [36]. However, most commercially available high permeability materials have a loss tangent that is unacceptably high at frequencies of 10 MHz or higher. Fig. 19 shows the comparison of the magnetic material properties of a hypothetical inductor that is required to enable high-voltage conversion at 10 MHz versus what is commercially available today.
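The link between switching frequency and inductor size can be sketched with the textbook ripple relation for an ideal single-phase buck converter in continuous conduction, $L = V_{out}(1 - D)/(\Delta I \, f_{sw})$ with $D = V_{out}/V_{in}$. This is only a first-order illustration: the function name `buck_inductance` and the 1 A peak-to-peak ripple target are assumptions, and the relation ignores losses, multiphase operation, and the tapped or coupled inductor structures used in practical high-ratio designs.

```python
def buck_inductance(v_in: float, v_out: float, f_sw: float, ripple_pp: float) -> float:
    """Inductance (H) that gives a peak-to-peak ripple current ripple_pp (A)
    in an ideal buck converter operating in continuous conduction."""
    duty = v_out / v_in
    return v_out * (1.0 - duty) / (ripple_pp * f_sw)


# Sweep the switching frequency for a 48-V to 1-V conversion.
# The 1 A ripple target is a made-up value for illustration.
for f_mhz in (5, 10, 50, 100):
    l_nh = buck_inductance(48.0, 1.0, f_mhz * 1e6, ripple_pp=1.0) * 1e9
    print(f"{f_mhz:>4} MHz -> {l_nh:7.1f} nH")
```

For a fixed ripple target, the required inductance scales inversely with frequency, which is why the 5–50 MHz range is a compromise between inductor size and switching loss.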

#### C. Advanced Decoupling Solutions

In Section II-D, we have discussed in detail the range of decoupling solutions that are used today. The increase in capacitance density of silicon capacitors like MIM capacitors has made them good candidates for filtering the output ripple in high-frequency IVRs. However, as the switching frequency drops from 100 down to 10 MHz to enable high-voltage IVRs, more capacitance density is required from the output MIM capacitors to keep the output ripple and the voltage droops manageable. Recent improvements in DTC technology could make them a good candidate for providing sufficient decoupling even down to 10 MHz.
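The growth in required output capacitance at lower switching frequency can be estimated with the standard ripple expression for an ideal buck output filter, $\Delta V = \Delta I / (8 f_{sw} C)$, which neglects capacitor ESR and ESL. The sketch below is an illustration under that assumption; the function name and the ripple targets (1 A peak-to-peak current, 10 mV peak-to-peak voltage) are hypothetical.

```python
def output_capacitance(ripple_current_pp: float, f_sw: float, ripple_voltage_pp: float) -> float:
    """Capacitance (F) needed to hold the output ripple to ripple_voltage_pp (V)
    for a triangular inductor ripple current, ignoring ESR and ESL."""
    return ripple_current_pp / (8.0 * f_sw * ripple_voltage_pp)


# Same ripple targets at two switching frequencies (illustrative values).
c_100mhz = output_capacitance(1.0, 100e6, 10e-3)
c_10mhz = output_capacitance(1.0, 10e6, 10e-3)

print(f"100 MHz: {c_100mhz * 1e9:.0f} nF")
print(f" 10 MHz: {c_10mhz * 1e9:.0f} nF")
print(f"ratio:   {c_10mhz / c_100mhz:.0f}x")
```

With the ripple current held constant, a tenfold drop in frequency calls for ten times the capacitance. If the inductance is also held constant, the ripple current itself grows tenfold and the requirement rises by roughly a hundredfold.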

One downside of currently available silicon capacitors is their inability to handle high voltage. As we target high-voltage IVRs with higher input voltage, we will need high-frequency decoupling for the input rail that can handle the high input voltage. Development of high-voltage silicon capacitors like MIM or DTC will be important to meet the high-frequency decoupling needs of the input rail. One possible alternative to high-voltage silicon capacitors is the development of high-voltage, high-frequency package capacitors such as thin-film capacitors [37] and embedded array capacitors [38].



Fig. 19. Magnetic material properties—commercially available versus what is required.



Fig. 20. Hybrid series capacitor tapped buck inductor.

#### D. Topology for High-Voltage Conversion

The efficiency of traditional buck regulators decreases as the input-to-output voltage conversion ratio increases. For example, a 48-V IVR with an output voltage of 1 V will have an extremely small duty cycle. At such low duty cycles, the energy in the harmonics of the switching frequency can be very high. This can cause significant inductor losses since significant energy remains in the harmonics that lie beyond the FMR of the magnetic inductor. One workaround is the two-stage solution described earlier. Here, a high-efficiency, fixed-ratio, unregulated first stage could bring the input voltage down to 5 V, and the second stage can handle the conversion from 5 V to the output voltage. By bringing in power at 5 V, one can still keep the routing losses relatively small, as shown in Fig. 8.
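The effect of duty cycle on harmonic content can be illustrated with the Fourier series of an idealized rectangular switch-node voltage. For a 0/1 rectangular wave with duty cycle $D$, the power in the $n$th harmonic is $2\sin^2(n\pi D)/(n\pi)^2$ and the total ac power is $D(1-D)$. The sketch below assumes ideal edges and an ideal buck with $D = V_{out}/V_{in}$; the function names are hypothetical, and a real inductor current spectrum falls off faster than the voltage spectrum shown here.

```python
import math


def duty_cycle(v_in: float, v_out: float) -> float:
    """Ideal buck duty cycle."""
    return v_out / v_in


def harmonic_power_fraction(duty: float, n_harmonics: int = 1) -> float:
    """Fraction of the ac power of an ideal 0/1 rectangular wave that is
    carried by its first n_harmonics harmonics."""
    ac_power = duty * (1.0 - duty)
    partial = sum(
        2.0 * math.sin(n * math.pi * duty) ** 2 / (n * math.pi) ** 2
        for n in range(1, n_harmonics + 1)
    )
    return partial / ac_power


for v_in in (48.0, 5.0):
    d = duty_cycle(v_in, 1.0)
    frac = harmonic_power_fraction(d)
    print(f"Vin = {v_in:4.0f} V: D = {d:.3f}, fundamental carries {frac:.1%} of ac power")
```

Under these assumptions, the fundamental carries about 4% of the ac power for the 48:1 case and about 44% for the 5:1 case, so most of the ac energy of a single-stage 48-V converter sits in higher harmonics. This is the motivation for the two-stage approach and for the duty-cycle-extending topologies.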

Another option is the use of alternate topologies to extend the duty cycle. One such example is the hybrid series capacitor tapped buck inductor shown in Fig. 20 [39]. In Fig. 20, the inductances act as current sources that can charge or discharge the output capacitance. The series capacitor and the tapped inductor extend the duty cycle for a 48:1 converter from ~2% to 10%. Despite the increase in losses from the two additional transistors, it is possible to achieve relatively high efficiency with this topology [36].

#### E. Co-Design and Thermal Considerations

Design space exploration and optimization based on chip-package co-design are critical for the implementation of IVRs due to the interactions between the power delivery ingredients and the rest of the microprocessor. The power delivery designers need to work closely with the microprocessor architects to identify the right location for the IVR. The design considerations are very different for IVRs integrated on the microprocessor die as compared with IVRs implemented as a separate chip on the package. For the latter, the location of the IVR chip and inductor could have a significant impact on the design of the rest of the package. It is important to keep noise sensitive high-speed IO signals far away from the IVR and the embedded inductors. It is also important to ensure that thermal solutions are used to cool the IVR and the inductors as well. Care should be taken to ensure that the bumps and the inductors can handle the sustained current required to power the microprocessor. It is not uncommon to have the size of the IVR chip determined by the current carrying capability of the bumps rather than the size of the power MOSFET and the control logic.

In scenarios where the IVR is integrated on the microprocessor die, the location of the IVR could have significant thermal ramifications. For example, if it is placed next to a hotspot on the microprocessor, the extra heat generated by the IVR could further increase the temperature of the hotspot, thereby limiting the TDP for the microprocessor. One benefit of integrating the IVR in a location of relatively low current density is the ability to use the bumps under the neighboring logic circuits to deliver current to and from the switches.

## VI. PRESENT AND FUTURE HETEROGENEOUS PACKAGING ARCHITECTURES AND POWER DELIVERY IMPACT

Until recently, the increase in transistor density delivered by Moore's law scaling has enabled integration of most of the system-level functionality into a single microprocessor chip. For example, computers from the early 2000s had a separate memory controller chip, graphics processor, voltage regulators, and a peripheral IO chipset. Since then, more and more of that functionality has been integrated into the main processor chip, thereby earning it the moniker SoC. While this has helped enable a dramatic reduction in the overall system footprint, this approach is not without its drawbacks. Many of the circuits implemented on the SoC do not derive significant performance benefits from being on the latest process node. For example, analog circuits like voltage regulators or legacy IO buffers do not need to be fabricated on the latest process node to achieve optimal performance. Often, implementing these circuits on a cheaper process node that is one or two generations behind could help reduce the overall cost of the system with minimal performance impact. The biggest impediment in transitioning from homogeneous integration on a single process to a heterogeneous integration of chips fabricated on different process nodes has been the lack of interconnect density on the package. Recently, that has started to change with the introduction of technologies like the silicon interposer [40], the embedded multidie interconnect bridge (EMIB) [41], and a 3-D face-to-face stacking technology called Foveros [42]. These advanced packaging technologies used to enable heterogeneous integration can be broadly classified as 2.5-D or 3-D. In the case of 2.5-D packaging technologies, two or more integrated circuits are connected to the same package using high density routing. Three-dimensional packaging technologies are those that enable stacking of two or more integrated circuits. We take a closer look at each of these technologies and discuss their impact on power delivery.

Fig. 21. Silicon interposer for disaggregation.

Fig. 22. EMIB.

#### A. 2.5-D Packaging for Heterogeneous Integration

The design rules for routing on an organic package are significantly coarser than the ones available for routing on silicon. In addition, the mismatch in the thermal expansion coefficient between the silicon substrate and the organic package makes bump pitch scaling for the package to silicon interconnect difficult. This makes it challenging to increase the interconnection routing density from one chiplet to another. In a monolithic chip where the different logic blocks are integrated, it is possible to have thousands of interconnects from one section to another. If these sections were to be disaggregated into separate chiplets, we would need to route thousands of signals through the package to connect the two chiplets. Accomplishing this interconnect routing on the package becomes prohibitively expensive due to the bump pitch constraints and the coarse design rules on the package. One workaround is the use of a silicon interposer. Fig. 21 shows a cross section of a silicon interposer that is used to disaggregate the processor into three different chiplets. The design rules for routing on silicon and the microbump pitch are much more favorable to enable the routing density required to disaggregate the processor into smaller chiplets.

From a power delivery standpoint, one downside of using silicon interposers is that all the power from the package needs to be routed through the through-silicon vias (TSVs). Since the resistance of the TSVs is significantly higher than that of a traditional bump, these can contribute to an increase in IR drop through the PDN. A potential power delivery benefit from using a silicon interposer is that it enables the possibility of using some form of MIM capacitor or DTC on the interposer. This capacitance can supplement the on-die capacitance on the top die to help suppress any high-frequency noise on the PDN.
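The IR drop contribution of the TSVs can be estimated to first order by treating the power and ground TSVs as banks of identical resistors in parallel. The sketch below is purely illustrative: the load current, the per-TSV resistance, and the TSV count are hypothetical values, not figures from any specific interposer, and the model ignores current crowding and lateral routing resistance.

```python
def parallel_resistance(r_each_ohm: float, count: int) -> float:
    """Resistance of `count` identical vertical connections in parallel."""
    return r_each_ohm / count


def ir_drop(current_a: float, r_path_ohm: float) -> float:
    """Voltage drop across a resistive path carrying current_a."""
    return current_a * r_path_ohm


LOAD_CURRENT_A = 100.0  # hypothetical load current
R_TSV_OHM = 50e-3       # hypothetical resistance of a single TSV
N_TSV_PER_RAIL = 1000   # hypothetical TSV count for each of power and ground

r_rail = parallel_resistance(R_TSV_OHM, N_TSV_PER_RAIL)
# Current flows down through the power TSVs and returns through the ground
# TSVs, so the loop sees the rail resistance twice.
drop_v = ir_drop(LOAD_CURRENT_A, 2.0 * r_rail)
print(f"TSV loop drop: {drop_v * 1e3:.1f} mV")
```

Because the drop scales inversely with the number of TSVs per rail, the TSV budget for power competes directly with the interposer area available for signal routing.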

The primary downside of a silicon interposer is the need for a large interposer that is a superset of all the chiplets on top. This can increase the overall cost, especially in a large system. The EMIB addresses this problem by using a much smaller silicon bridge that is just large enough to accommodate all the interdie routing between the two chiplets. Fig. 22 shows a package with an EMIB to connect two die chiplets. Unlike the silicon interposer, the EMIB does not interfere with the power routing to most of the chiplet area, so there is no IR drop penalty from TSVs. However, the introduction of the EMIB does prevent the portion of the die chiplets that falls in its shadow from having a direct power delivery path. The circuitry in this region gets power that is cantilevered from the adjacent bumps and relies on lateral routing on-die as well as the surface layer on the package. Despite these challenges, EMIB has proven to be an attractive alternative to the silicon interposer when the total number of die chiplets is relatively low.

Fig. 23. Foveros architecture.

#### B. 3-D Packaging for Heterogeneous Integration

The 3-D packaging technologies have been used to stack a group of chips that perform an identical function such as a memory stack. However, 3-D packaging for heterogeneous integration is still a somewhat nascent field with new technologies such as Foveros or system on integrated chip (SoIC) [43]. From a packaging and assembly standpoint, the Foveros architecture is very similar to that of a silicon interposer. The key difference with Foveros is that instead of using a passive base die, it uses a base die with active circuits. For example, the circuits that were used on die chiplet 3 in Fig. 21 could be moved to the base die to help reduce the number of die chiplets used on the top. This configuration is shown in Fig. 23. By stacking the circuits in this fashion, it is also possible to reduce the overall size of the package and the die complexity. The power delivery impact of Foveros is similar to that of a silicon interposer in that power has to be routed through the TSVs which results in an increased IR drop. Just as with the silicon interposer, it is possible to add MIM capacitors or DTCs to the base die to improve the high-frequency performance. One power delivery challenge that is unique to Foveros is that we now have to deliver power to both the top die and the bottom die. To accomplish this, the die metal and decoupling resources have to be shared between the power domains on the top and the bottom. Foveros also allows for the possibility of integrating the voltage regulator on the base die directly underneath the load on the top die. Since the base die in a Foveros configuration is typically fabricated using an older process node, it makes for a good solution for integrating power delivery circuits such as a switching regulator, LDO, or power gate. These are analog circuits which perform just as well on a process node that is one or two generations behind.



Fig. 24. Power delivery for emerging heterogeneous integration platforms (Courtesy: JUMP ASCENT).

#### C. Power Delivery for Extreme Heterogeneity

As the current trend of integration continues, we can expect a system with CPU, GPU, accelerator (ACC), and memory dies on an interposer. To reduce power losses, the platform VR needs to be integrated on the interposer in close proximity to the logic dies, as shown in Fig. 24. This could be enabled using high-voltage complementary GaN (CGaN) devices with embedded inductors designed in the package using high-frequency high permeability materials. The decoupling is handled through a combination of discrete surface mount capacitors on the package and silicon capacitors. It is important to ensure that the thermal solution for cooling the microprocessor is also extended to remove the heat from the IVR. Joule heating in the inductors is another source of heat and can reduce the overall efficiency; this heat can be removed through the use of thermal vias on the backside of the interposer.

## VII. SUMMARY

Microprocessor power delivery schemes have steadily grown in complexity since the early 2000s. Modern microprocessors have started to rely on IVRs to decouple the microprocessor power delivery complexity from the platform requirements. As the power levels continue to rise, it becomes important to develop high-voltage IVRs to maximize system efficiency and provide fine-grained power management at the individual core level. Development of building block technologies such as high-voltage switches with good FOM, high-frequency magnetic inductors, and advanced packaging technologies for enabling heterogeneous integration will become necessary as new computing architectures evolve.

## REFERENCES

- [1] G. E. Moore, "Cramming more components onto integrated circuits," *Electronics*, vol. 38, no. 8, pp. 114–117, Apr. 1965.
- [2] R. H. Dennard, F. H. Gaenslen, H. N. Yu, V. L. Rideout, E. Bassous, and A. LeBlanc, "Ion implanted MOSFETs with very short channel lengths," in *IEDM Tech. Dig.*, Dec. 1973, pp. 152–155.
- [3] M. T. Bohr and I. A. Young, "CMOS scaling trends and beyond," *IEEE Micro*, vol. 37, no. 6, pp. 20–29, Nov. 2017.
- [4] K. Rupp, *42 Years of Microprocessor Trend Data*. Accessed: May 18, 2020. [Online]. Available: <https://www.karlrupp.net/2018/02/42-years-of-microprocessor-trend-data/>
- [5] K. Choi, R. Soma, and M. Pedram, "Dynamic voltage and frequency scaling based on workload decomposition," in *Proc. Int. Symp. Low Power Electron. Design - ISLPED*, 2004, pp. 174–179.
- [6] S. Jiang, C. Nan, X. Li, C. Chung, and M. Yazdani, "Switched tank converters," in *Proc. IEEE Appl. Power Electron. Conf. Expo. (APEC)*, Mar. 2018, pp. 81–90.
- [7] B. Yang, F. C. Lee, A. J. Zhang, and G. Huang, "LLC resonant converter for front end DC/DC conversion," in *Proc. APEC. 17th Annu. IEEE Appl. Power Electron. Conf. Expo.*, Mar. 2002, pp. 1108–1112.
- [8] C. Auth *et al.*, "A 22nm high performance and low-power CMOS technology featuring fully-depleted tri-gate transistors, self-aligned contacts and high density MIM capacitors," in *Proc. Symp. VLSI Technol. (VLSIT)*, Jun. 2012, pp. 131–132.
- [9] S. Y. Hou *et al.*, "Integrated deep trench capacitor in Si interposer for CoWoS heterogeneous integration," in *IEDM Tech. Dig.*, Dec. 2019, p. 19.
- [10] J. Charles, P. Jassi, N. S. Ananth, A. Sadat, and A. Fedorova, "Evaluation of the Intel Core i7 turbo boost feature," in *Proc. IEEE Int. Symp. Workload Characterization (IISWC)*, Oct. 2009, pp. 188–197.
- [11] B. K. Bhattacharyya, N. Laskar, S. Debnath, and D. Baral, "Innovative scaling method to minimize cost of integrated circuit packages and devices," *IEEE Trans. Compon., Packag., Manuf. Technol.*, vol. 4, no. 9, pp. 1489–1494, Sep. 2014.
- [12] E. Masanet, A. Shehabi, N. Lei, S. Smith, and J. Koomey, "Recalibrating global data center energy-use estimates," *Science*, vol. 367, no. 6481, pp. 984–986, Feb. 2020.
- [13] E. A. Burton *et al.*, "FIVR-fully integrated voltage regulators on 4th generation Intel Core SoCs," in *Proc. IEEE Appl. Power Electron. Conf. Expo. - APEC*, Mar. 2014, pp. 432–439.
- [14] A. Grenat *et al.*, "4.2 increasing the performance of a 28nm x86-64 microprocessor through system power management," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Jan. 2016, pp. 74–75, doi: [10.1109/ISSCC.2016.7417913](https://doi.org/10.1109/ISSCC.2016.7417913).
- [15] Z. Toprak-Deniz *et al.*, "5.2 distributed system of digitally controlled microregulators enabling per-core DVFS for the POWER8™ microprocessor," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2014, pp. 98–99, doi: [10.1109/ISSCC.2014.6757354](https://doi.org/10.1109/ISSCC.2014.6757354).
- [16] Z. Ye, Y. Lei, and R. C. N. Pilawa-Podgurski, "The cascaded resonant converter: A hybrid switched-capacitor topology with high power density and efficiency," *IEEE Trans. Power Electron.*, vol. 35, no. 5, pp. 4946–4958, May 2020.
- [17] S. Rusu *et al.*, "Power reduction techniques for an 8-core Xeon processor," in *Proc. ESSCIRC*, Sep. 2009, pp. 340–343, doi: [10.1109/ESSCIRC.2009.5326028](https://doi.org/10.1109/ESSCIRC.2009.5326028).
- [18] C. Schaeff, E. Din, and J. T. Stauth, "10.2 a digitally controlled 94.8%-peak-efficiency hybrid switched-capacitor converter for bidirectional balancing and impedance-based diagnostics of lithium-ion battery arrays," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2017, pp. 180–181, doi: [10.1109/ISSCC.2017.7870320](https://doi.org/10.1109/ISSCC.2017.7870320).
- [19] S. Mueller *et al.*, "Design of high efficiency integrated voltage regulators with embedded magnetic core inductors," in *Proc. IEEE 66th Electron. Compon. Technol. Conf. (ECTC)*, May 2016, pp. 566–573.
- [20] M. Huang, Y. Lu, S.-W. Sin, U. Seng-Pan, and R. P. Martins, "A fully integrated digital LDO with coarse–fine-tuning and burst-mode operation," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 63, no. 7, pp. 683–687, Jul. 2016, doi: [10.1109/TCSII.2016.2530094](https://doi.org/10.1109/TCSII.2016.2530094).
- [21] S. B. Nasir *et al.*, "A 65nm, 1.15–0.15 V, 99.99% current-efficient digital low dropout regulator with asynchronous non-linear control for droop mitigation," in *Proc. IEEE Int. Symp. Circuits Syst. (ISCAS)*, May 2018, pp. 1–5, doi: [10.1109/ISCAS.2018.8351633](https://doi.org/10.1109/ISCAS.2018.8351633).
- [22] D. C. Zhang, M. Swaminathan, A. Raychowdhury, and D. Keezer, "Enhancing the bandwidth of low-dropout regulators using power transmission lines for high-speed I/Os," *IEEE Trans. Compon., Packag., Manuf. Technol.*, vol. 7, no. 4, pp. 533–543, Apr. 2017.
- [23] M. H. Ahmed, C. Fei, F. C. Lee, and Q. Li, "48-V voltage regulator module with PCB winding matrix transformer for future data centers," *IEEE Trans. Ind. Electron.*, vol. 64, no. 12, pp. 9302–9310, Dec. 2017.
- [24] P. Yeaman and E. Oliveira, "A high efficiency high density voltage regulator design providing VR 12.0 compliant power to a microprocessor directly from a 48 V input," in *Proc. 28th Annu. IEEE Appl. Power Electron. Conf. Expo. (APEC)*, Mar. 2013, pp. 410–414.
- [25] W. J. Lambert, M. J. Hill, K. Radhakrishnan, L. Wojewoda, and A. E. Augustine, "Package inductors for Intel fully integrated voltage regulators," *IEEE Trans. Compon., Packag., Manuf. Technol.*, vol. 6, no. 1, pp. 3–11, Jan. 2016.
- [26] D. S. Gardner, G. Schrom, F. Paillet, B. Jamieson, T. Karnik, and S. Borkar, "Review of on-chip inductor structures with magnetic films," *IEEE Trans. Magn.*, vol. 45, no. 10, pp. 4760–4766, Oct. 2009, doi: [10.1109/TMAG.2009.2030590](https://doi.org/10.1109/TMAG.2009.2030590).
- [27] N. Sturcken *et al.*, "A 2.5D integrated voltage regulator using coupled magnetic-core inductors on silicon interposer," *IEEE J. Solid-State Circuits*, vol. 48, no. 1, pp. 244–254, Jan. 2013.
- [28] W. J. Lambert, M. J. Hill, K. P. O'Brien, K. Radhakrishnan, and P. Fischer, "Study of thin-film magnetic inductors applied to integrated voltage regulators," *IEEE Trans. Power Electron.*, vol. 35, no. 6, pp. 6208–6220, Jun. 2020, doi: [10.1109/TPEL.2019.2948825](https://doi.org/10.1109/TPEL.2019.2948825).
- [29] D. Won Lee, K.-P. Hwang, and S. X. Wang, "Fabrication and analysis of high-performance integrated solenoid inductor with magnetic core," *IEEE Trans. Magn.*, vol. 44, no. 11, pp. 4089–4095, Nov. 2008.
- [30] C. Alvarez, M. Bellaredj, and M. Swaminathan, "Open and closed loop inductors for high-efficiency system-on-package integrated voltage regulators," in *Proc. IEEE 69th Electron. Compon. Technol. Conf. (ECTC)*, May 2019, pp. 1672–1679.
- [31] M. Sankarasubramanian *et al.*, "Magnetic inductor arrays for Intel fully integrated voltage regulator (FIVR) on 10th generation Intel Core SoCs," in *Proc. IEEE 70th Electron. Compon. Technol. Conf. (ECTC)*, Jun. 2020, pp. 399–404, doi: [10.1109/ECTC32862.2020.00071](https://doi.org/10.1109/ECTC32862.2020.00071).
- [32] R. K. Williams, M. N. Darwish, R. A. Blanchard, R. Siemieniec, P. Rutter, and Y. Kawaguchi, "The trench power MOSFET—Part II: Application specific VDMOS, LDMOS, packaging, and reliability," *IEEE Trans. Electron Devices*, vol. 64, no. 3, pp. 692–712, Mar. 2017, doi: [10.1109/TED.2017.2655149](https://doi.org/10.1109/TED.2017.2655149).
- [33] U. K. Mishra, S. Likun, T. E. Kazior, and Y.-F. Wu, "GaN-based RF power devices and amplifiers," *Proc. IEEE*, vol. 96, no. 2, pp. 287–305, Feb. 2008, doi: [10.1109/JPROC.2007.911060](https://doi.org/10.1109/JPROC.2007.911060).
- [34] H. W. Then *et al.*, "Advances in research on 300mm gallium nitride NMOS transistor and silicon CMOS integration," in *IEDM Tech. Dig.*, San Francisco, CA, USA, 2020, pp. 27.3.1–27.3.4, doi: [10.1109/IEDM13553.2020.9371977](https://doi.org/10.1109/IEDM13553.2020.9371977).
- [35] M. L. F. Bellaredj, A. K. Davis, P. Kohl, and M. Swaminathan, "Magnetic core solenoid power inductors on organic substrate for system-in-package integrated high-frequency voltage regulators," *IEEE J. Emerg. Sel. Topics Power Electron.*, vol. 8, no. 3, pp. 2682–2695, Sep. 2020.
- [36] C. Alvarez *et al.*, "Design and demonstration of single and coupled embedded toroidal inductors for 48V to 1V integrated voltage regulators," in *Proc. Electron. Compon. Technol. Conf. (ECTC)*, Jun. 2020, pp. 405–413.
- [37] K. Yoshida, H. Saita, and T. Kariya, "Ultra low profile thin film capacitor for high performance electronic packages," in *Proc. IEEE 70th Electron. Compon. Technol. Conf. (ECTC)*, Jun. 2020, pp. 414–418, doi: [10.1109/ECTC32862.2020.00073](https://doi.org/10.1109/ECTC32862.2020.00073).
- [38] Y. Min *et al.*, "Embedded capacitors in the next generation processor," in *Proc. IEEE 63rd Electron. Compon. Technol. Conf.*, May 2013, pp. 1225–1229.
- [39] K. I. Hwu, W. Z. Jiang, and P. Y. Wu, "An expandable two-phase interleaved ultrahigh step-down converter with automatic current balance," *IEEE Trans. Power Electron.*, vol. 32, no. 12, pp. 9223–9237, Dec. 2017.
- [40] J. U. Knickerbocker *et al.*, "3D silicon integration," in *Proc. 58th Electron. Compon. Technol. Conf.*, May 2008, pp. 538–543, doi: [10.1109/ECTC.2008.4550025](https://doi.org/10.1109/ECTC.2008.4550025).
- [41] R. Mahajan *et al.*, "Embedded multi-die interconnect bridge (EMIB)—A high density, high bandwidth packaging interconnect," in *Proc. IEEE 66th Electron. Compon. Technol. Conf. (ECTC)*, May 2016, pp. 557–565, doi: [10.1109/ECTC.2016.201](https://doi.org/10.1109/ECTC.2016.201).
- [42] D. B. Ingerly *et al.*, "Foveros: 3D integration and the use of face-to-face chip stacking for logic devices," in *IEDM Tech. Dig.*, Dec. 2019, p. 19, doi: [10.1109/IEDM19573.2019.8993637](https://doi.org/10.1109/IEDM19573.2019.8993637).
- [43] M.-F. Chen, F.-C. Chen, W.-C. Chiou, and D. C. H. Yu, "System on integrated chips (SoIC(TM)) for 3D heterogeneous integration," in *Proc. IEEE 69th Electron. Compon. Technol. Conf. (ECTC)*, May 2019, pp. 594–599, doi: [10.1109/ECTC.2019.00095](https://doi.org/10.1109/ECTC.2019.00095).



**Kaladhar Radhakrishnan** (Senior Member, IEEE) received the B.Tech. degree from the Coimbatore Institute of Technology, Coimbatore, India, the M.S. degree from Iowa State University, Ames, IA, USA, and the Ph.D. degree in electrical engineering from the University of Illinois at Urbana-Champaign, Champaign, IL, USA, in 1993, 1995, and 1999, respectively.

He has been with Intel Corporation, Chandler, AZ, USA, since 2000, where he is currently a Fellow with the Technology Development Group.

His primary research interests include microprocessor power delivery and computational electromagnetics, which was the focus of his dissertation work. More recently, his areas of focus have been integrated voltage regulation and magnetic inductors.



**Madhavan Swaminathan** (Fellow, IEEE) received the M.S. and Ph.D. degrees in electrical engineering from Syracuse University, Syracuse, NY, USA, in 1989 and 1991, respectively.

He was with IBM, East Fishkill, NY, USA, with a focus on packaging for supercomputers. He held the positions of the Founding Director of the Center for Co-Design of Chip, Package, System (C3PS), a Joseph M. Pettit Professor of Electronics with ECE, and the Deputy Director of the Packaging Research Center (NSF ERC), Georgia Tech (GT), Atlanta, GA, USA. He is the John Pippin Chair of microsystems packaging and electromagnetics with the School of Electrical and Computer Engineering (ECE), a Professor of ECE with a joint appointment with the School of Materials Science and Engineering (MSE), and the Director of the 3D Systems Packaging Research Center (PRC), GT. He is currently the Site Director of the NSF Center for Advanced Electronics through Machine Learning (CAEML), Urbana, IL, USA, and the Theme Leader of the Heterogeneous Integration, SRC JUMP ASCENT Center, Durham, NC, USA. He has authored more than 500 refereed technical publications and holds 31 patents. He is the primary author and a co-editor of three books and five book chapters.

Dr. Swaminathan is the Founder and a Co-Founder of two startup companies, and Founder of the IEEE Conference on Electrical Design of Advanced Packaging and Systems (EDAPS), a Premier Conference sponsored by the IEEE Electronics Packaging Society (EPS). He has served as the Distinguished Lecturer for the IEEE Electromagnetic Compatibility (EMC) Society.



**Bidyut K. Bhattacharyya** (Fellow, IEEE) received the B.Sc. degree (Hons.) from the Presidency College, Kolkata, India, the M.Sc. degree in physics from IIT Kanpur, Kanpur, India, and the Ph.D. degree in physics from The State University of New York, Buffalo, NY, USA, in 1975, 1978, and 1983, respectively.

He is currently with the Packaging Research Center, Georgia Institute of Technology, Atlanta, GA, USA.