

## ABSTRACT

MOORE, DANIEL ROSS. General Purpose Intra-Operation Dynamic Voltage Scaling  
(Under the direction of Dr. Alexander Dean).

Embedded peripheral devices are often specified with a range of performance characteristics that are determined by their supply voltage. Intra-Operation Dynamic Voltage Scaling (IODVS) reduces the energy consumption of peripheral devices by modulating the peripheral supply voltage at critical states occurring during operation of the peripheral device. IODVS is designed to have minimal impact on CPU utilization through the use of a lookup table that designates an ideal voltage on a per-state basis. IODVS is unique in that during high-performance states such as data-transmission, peripherals can have the high supply voltage necessary to reduce overall energy-delay product. Likewise, during low-performance states, such as mandatory delays, the system decreases peripheral domain voltage thus reducing energy consumption without adversely affecting performance or correctness. The method is demonstrated on various peripherals common to wireless sensor nodes and total energy savings of up to 40% are observed.

In most cases, a microcontroller and the peripheral devices to which it is connected must use a common supply voltage in order to ensure reliable communication. IODVS breaks this paradigm by exploiting the voltage-independent states of peripheral operations. With communications broken during the voltage-independent states, the host microcontroller must use some heuristic to determine when the operation is ultimately completed. Peripheral Activity Completion, Estimation and Recognition (PACER) is introduced as a variety of algorithms that can be employed to detect completed peripheral operations in real-time. This method was tested in combination with IODVS on multiple common peripheral devices. For the peripheral devices under test, the test fixture confirmed decreases in energy expenditures of up to 62% and latency reductions of up to 67%.

© Copyright 2016 Daniel Ross Moore  
All Rights Reserved

General Purpose Intra-Operation Dynamic Voltage Scaling

by  
Daniel Ross Moore

A dissertation submitted to the Graduate Faculty of  
North Carolina State University  
in partial fulfillment of the  
requirements for the degree of  
DOCTOR OF PHILOSOPHY

Computer Engineering

Raleigh, North Carolina

2016

APPROVED BY:

---

Dr. Alexander Dean  
Chair of Advisory Committee

---

Dr. James Tuck

---

Dr. Eric Rotenberg

---

Dr. Vincent Freeh

## **DEDICATION**

This work is dedicated to my family, past, present and future. My career would not have been possible without the endless support and encouragement of my mother Barbara Moore and my father William Moore. I am grateful to my grandfather Dr. James McGrath for inspiring my interest in science.

## BIOGRAPHY

Daniel Ross Moore grew up in Virginia. From an early age he was fascinated with computers and soon became preoccupied with optimizing system performance. This hobby would ultimately drive him toward pursuing a degree in electrical engineering at Virginia Tech in Blacksburg, Virginia.

Upon graduating from Virginia Tech, Daniel began working as a Development Engineer at a local startup, ADMMicro (now GridPoint), dedicated to decreasing the energy consumption of commercial and industrial facilities. Daniel developed the mechanical, electrical and software design for multiple products responsible for sensing and controlling both energy consumption and indoor air quality. Daniel decided to further his education by pursuing a Master of Science degree at NC State University and was often pleased at how immediately applicable the courses were to the professional problems at hand.

After earning the degree, Daniel began employment as a Firmware Engineer, developing firmware for use in networked electricity meters at Elster Solutions (now Honeywell). He simultaneously continued his education by beginning the PhD program at NCSU.

Daniel is currently a Principal Embedded Engineer at Valencell Inc. in Raleigh, North Carolina, producing fitness-oriented wearable biometric sensors.

## ACKNOWLEDGMENTS

My work is primarily inspired by problems that I have encountered throughout my professional career. I would like to acknowledge my first professional coworkers Danny Dyess and Sam Taylor for their generous instruction and their demands for accuracy and excellence.

I would like to thank all of my Elster coworkers, particularly Andy Borleske, Adrian Howell, Chet Helms, Chris Kachur, Marc Fisher and Su Li. I am very fortunate to have met such a highly qualified group of people who have assisted and challenged me throughout the course of my work.

I am grateful to my advisor Dr. Alexander Dean for encouraging the practical aspects of my research and providing much-needed guidance throughout the academic process. Without the confidence and support of Dr. Dean, this work would not have been possible.

Thanks to my dear friends John Coggin, Kyle Held, Peter Kerstetter, Shana Muhammad and Ryan Hodges for their support throughout the years.

Finally, I would like to recognize the patience, dedication and encouragement of Jen Pettit. This work benefitted greatly from the substantial amount of collaboration and support that she provided.

## TABLE OF CONTENTS

- LIST OF TABLES
- LIST OF FIGURES
- Chapter 1: Introduction
  - 1.1 Voltage Dependent States
  - 1.2 Voltage Independent States
  - 1.3 Intra-Operation Dynamic Voltage Scaling
- Chapter 2: Background
  - 2.1 Power Supplies
    - 2.1.1 Linear / Low-Dropout Regulator (LDO)
    - 2.1.2 Charge Pump
    - 2.1.3 Switched Mode Power Supply (SMPS)
  - 2.2 Energy Management Techniques
    - 2.2.1 Dynamic Power Management
    - 2.2.2 Dynamic Voltage (and Frequency) Scaling
    - 2.2.3 Wireless Sensor Networks
    - 2.2.4 Component Aware Dynamic Voltage Scaling
  - 2.3 Embedded Peripherals
- Chapter 3: IODVS
  - 3.1 Introduction
  - 3.2 Assumptions
  - 3.3 Methods and Materials
  - 3.4 Results
    - 3.4.1 Microchip MCP25AA512 EEPROM
    - 3.4.2 Numonyx M25PX16 Serial Flash
    - 3.4.3 Micro-SD Memory Card
    - 3.4.4 Honeywell HIH6130 Temperature / Humidity Sensor
  - 3.5 Conclusion
- Chapter 4: PRIME
  - 4.1 Introduction
  - 4.2 Adjustable Step-Down Module (ASDM-300F)
  - 4.3 Peripheral Power Switch (PPS-330D)
  - 4.4 Programmable Load Regulator (PLR-5010D)
  - 4.5 Discovery Expansion Board (DEB429A)
    - 4.5.1 System Architecture
    - 4.5.2 Analog Design
    - 4.5.3 Digital Design
    - 4.5.4 Results
- Chapter 5: PACER
  - 5.1 Introduction
  - 5.2 Related Work
    - 5.2.1 Timing Heuristic
    - 5.2.2 Energy Heuristic
    - 5.2.3 Current Heuristic
  - 5.3 Methods and Materials
    - 5.3.1 Development Platform
    - 5.3.2 PACER-T
    - 5.3.3 PACER-E
    - 5.3.4 PACER-C
  - 5.4 Results
    - 5.4.1 Microchip MCP25AA512 EEPROM
    - 5.4.2 Numonyx M25PX16 NOR Serial Flash
    - 5.4.3 Microchip SST26VB NAND Serial Flash
    - 5.4.4 MicroSD Memory Card
    - 5.4.5 Honeywell HIH-6130 Temperature/Humidity Sensor
  - 5.5 Conclusion
- Chapter 6: Conclusions
  - 6.1 Conclusions
  - 6.2 Future Work: PRIME Enhancements
  - 6.3 Future Work: Supervised IODVS
  - 6.4 Future Work: PACER Missed Prediction Analysis
- References
- APPENDIX
  - Appendix A: PEGMA Schematic
    - 7.1 Microcontroller pinout and SRAM connection
    - 7.2 Renewable input boost circuitry, measurement and modulation
    - 7.3 Energy storage and peripheral boost circuitry
    - 7.4 Stepdown power supplies (peripheral domains)
    - 7.5 Energy storage and peripheral boost circuitry
    - 7.6 Peripheral domain current measurement
    - 7.7 Communications peripherals
    - 7.8 Analog domain
  - Appendix B: ASDM-300F Schematic
  - Appendix C: PPS-330D Schematic
  - Appendix D: PLR-5010D (Rev0) Schematic
  - Appendix E: PLR-5010D (Rev1) Schematic
  - Appendix F: DEB429A Schematic

## LIST OF TABLES

- Table 1: PPP as Derived from State Diagram
- Table 2: Estimated State Voltage Current and Duration Pairs
- Table 3: Typical External Peripherals
- Table 4: MCP25AA512 Peripheral Power Profile
- Table 5: MCP25AA512 Energy Consumption
- Table 6: MCP25AA512 Energy Consumption and Duty Cycle
- Table 7: M25PX16 Peripheral Power Profile
- Table 8: M25PX16 Energy Consumption
- Table 9: Generic Micro-SD Memory Card Peripheral Power Profile
- Table 10: Sandisk Micro-SD Card Energy Consumption
- Table 11: Lexar Micro-SD Card Energy Consumption
- Table 12: Micro-SD Card Energy Consumption
- Table 13: Kingston Micro-SD Card Energy Consumption
- Table 14: HIH-6130 Peripheral Power Profile
- Table 15: HIH-6130 Energy Consumption
- Table 16: SMPS and LDO Output Voltages for Various Feedback Inputs
- Table 17: Analog Signals Provided by the DEB429A
- Table 18: EEPROM Operation Energy
- Table 19: EEPROM Operation Latency
- Table 20: NOR Serial Flash Operation Energy
- Table 21: NOR Serial Flash Operation Latency
- Table 22: NAND Serial Flash Operation Energy
- Table 23: NAND Serial Flash Operation Latency
- Table 24: Sandisk microSD Card Algorithm / Energy Summary (128 Samples Each)
- Table 25: Lexar microSD Card Algorithm / Energy Summary (128 Samples Each)
- Table 26: Swissbit microSD Card Algorithm / Energy Summary (128 Samples Each)
- Table 27: Lexar microSD Card Algorithm / Energy Summary (128 Samples Each)
- Table 28: HIH-6130 Operation Energy
- Table 29: HIH-6130 Operation Latency

## LIST OF FIGURES

- Figure 1: Aperture, Setup and Hold Times
- Figure 2: Effects of Slew Rate on Theoretical Maximum Communications Speed
- Figure 3: Voltage Dependent / Independent Device States
- Figure 4: Impact of Voltage on Energy and Delay
- Figure 5: IODVS Peripheral Device Operation
- Figure 6: A Linear Regulator / LDO Circuit
- Figure 7: A Typical Charge Pump Circuit and Efficiencies [2]
- Figure 8: A Simple SMPS in Non-Synchronous Buck Configuration [3]
- Figure 9: An IODVS Enabled System
- Figure 10: A SPI EEPROM Write / Verify Cycle (Not to Scale)
- Figure 11: EEPROM Write Operation State Diagram and Corresponding Voltage / Time Relation
- Figure 12: Peripheral Generation Measurement and Allocation (PEGMA) Circuit Board
- Figure 13: Peripheral Domain SMPS, Control and Current Sense Circuitry
- Figure 14: EEPROM Write State Transition Diagram
- Figure 15: EEPROM IODVS Test
- Figure 16: Serial Flash Write State Transition Diagram
- Figure 17: Serial Flash IODVS Test
- Figure 18: Typical Micro-SD Memory Card Test
- Figure 19: Micro-SD Memory Card Write State Transition Diagram
- Figure 20: Sandisk Micro-SD Card IODVS Test
- Figure 21: Lexar Micro-SD Card IODVS Test
- Figure 22: SwissBit Micro-SD Card IODVS Test
- Figure 23: Kingston Micro-SD Card IODVS Test
- Figure 24: HIH-6130 State Transition Diagram
- Figure 25: HIH-6130 Temperature / Humidity Sensor IODVS Test
- Figure 26: TPS62240 Reference Circuit [22]
- Figure 27: MIC94325 Reference Circuit [32]
- Figure 28: ASDM-300F
- Figure 29: Gain-Bandwidth Characteristics of the MAX4377HAUA+
- Figure 30: ASDM-300F Output Voltage and Feedback Voltage Testing
- Figure 31: PPS-330D
- Figure 32: PLR-5010D Rev0 Assembly as Designed
- Figure 33: PLR-5010D Rev0 Assembly with Rev1 Test Modifications
- Figure 34: PLR-5010D Current Output Linearization
- Figure 35: Three PLR-5010D Units Installed on the DEB-429A
- Figure 36: Current Output Sweep of the PLR-5010D as Measured by ASDM-300F
- Figure 37: STM32F429 Discovery Front
- Figure 38: STM32F429 Discovery Back with Peripheral Modules on Breadboard
- Figure 39: Finalized Pinout of the STM32F429 on the STMicroelectronics DISCO Board
- Figure 40: USB 5V to 3.3V Translation on the DISCO Board [34]
- Figure 41: DISCO 3V3 Voltage and Current Sense Circuit
- Figure 42: ASDM-300F Implementations on the DEB429A
- Figure 43: ASDM-300F Modulation Circuitry
- Figure 44: A PPS-330D Controlling Power to a Peripheral Device
- Figure 45: I/O Expansion Enabling PPS-330D Selection
- Figure 46: The UM232H Hi-Speed USB 2.0 Module [35]
- Figure 47: The UM232H Module as Connected to the STM32F429 via the DEB429A
- Figure 48: Si1141 Typical Application Circuit
- Figure 49: ESP-12E Module with RF Shield Removed
- Figure 50: The SBT263C1A Bluetooth Module
- Figure 51: DEB429A Final Assembly and Power-on Self-Test Firmware
- Figure 52: EEPROM Write Current Profile
- Figure 53: IODVS Result Reproduction via PRIME
- Figure 54: EEPROM Write with PACER-T + IODVS
- Figure 55: EEPROM Write with PACER-C + IODVS
- Figure 56: NOR Serial Flash IODVS Write
- Figure 57: NOR Serial Flash IODVS + PACER-T Write
- Figure 58: NAND Serial Flash IODVS Write
- Figure 59: NAND Serial Flash IODVS + PACER-C Write
- Figure 60: A Single Standard Write to the Lexar microSD Memory Card
- Figure 61: A Single IODVS Write to the Sandisk microSD Card
- Figure 62: Sandisk microSD Card IODVS Write with Cache Hit Detected by PACER-C
- Figure 63: Timing Distribution of Standard Writes to the Sandisk microSD Memory Card
- Figure 64: Timing Distribution of IODVS Writes to the Sandisk microSD Card
- Figure 65: A Single Write to the Sandisk microSD Memory Card
- Figure 66: Timing Distribution of Standard Writes to the Lexar microSD Card
- Figure 67: Timing Distribution of IODVS Writes to the Lexar microSD Memory Card
- Figure 68: A Single Write to the Swissbit microSD Memory Card
- Figure 69: Timing Distribution of Standard Writes to the Swissbit microSD Memory Card
- Figure 70: Timing Distribution of IODVS Writes to the Swissbit microSD Memory Card
- Figure 71: A Single Write to the Kingston microSD Memory Card
- Figure 72: Timing Distribution of Standard Writes to the Kingston microSD Memory Card
- Figure 73: Timing Distribution of IODVS Writes to the Kingston microSD Memory Card
- Figure 74: HIH-6130 IODVS Measurement
- Figure 75: HIH-6130 IODVS + PACER-C Measurement

# CHAPTER 1: INTRODUCTION

Embedded systems exchange dedicated functionality for efficiency and precision. These systems are designed to perform a specific set of functions typified by sensing, control and communications. Each of these tasks requires the expenditure of energy outside the predictable computational energy budget of the embedded processor. System-wide loading characteristics dictate both the capacity and the strength requirements of the energy supply. These requirements impact the physical system in aspects of size, weight and cost as well as performance characteristics such as lifetime and thermal performance.

The impact of energy consumption is felt throughout the embedded system and it is therefore important to minimize energy consumption wherever possible. Extensive research has been performed focusing on minimizing energy consumption of the main processor by means of matching power consumption to performance demands. This research expands the search for energy savings outside of the processor domain and toward the peripherals to which it is connected. Primary attention is focused on performing voltage scaling of peripheral devices during both voltage-dependent and voltage-independent states.

## 1.1 Voltage Dependent States

A situation in which performance is correlated with applied voltage is a voltage-dependent state. Intra-system communication is one example: some energy expenditure is necessary to communicate data from a microcontroller (MCU) to an in-system peripheral device. The MCU may communicate through a number of internal peripherals, such as the Serial Peripheral Interface (SPI), the Inter-Integrated Circuit bus (I2C) or even General Purpose Input / Output (GPIO) pins. The energy expenditure of MCU-to-device communication can be defined as:

$$E = (P_{MCU} + P_{Loss} + P_{DeviceRx}) * T_{Comms}$$

$P_{MCU}$  is the power consumed by the processor with the transmitter enabled.  $P_{Loss}$  is the power exhausted due to  $I^2R$  losses across the communications bus and  $P_{DeviceRx}$  is the power used by the external device to receive the transmission. Of course the total energy expenditure is the aggregate of power consumption throughout the communications interval.
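As a rough illustration of the expression above, the following sketch evaluates the communication energy for one transfer. The power and timing figures are hypothetical placeholders chosen only to show the arithmetic; they are not measurements from this work.

```python
def comms_energy(p_mcu_w, p_loss_w, p_device_rx_w, t_comms_s):
    """Return E = (P_MCU + P_Loss + P_DeviceRx) * T_Comms in joules."""
    return (p_mcu_w + p_loss_w + p_device_rx_w) * t_comms_s


# Hypothetical example: 10 mW MCU, 0.5 mW bus loss, 3 mW receiver,
# and a 100 microsecond transfer.
energy_j = comms_energy(10e-3, 0.5e-3, 3e-3, 100e-6)
print(f"Energy per transfer: {energy_j * 1e6:.2f} uJ")
```

Because the three power terms are summed before multiplying by the interval, any reduction in $T_{Comms}$ reduces all three contributions at once.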

The communications period $T_{Comms}$ is fundamentally limited by the aggregate aperture time necessary to transmit a packet of data ($T_{aperture} * N_{bits}$). The aperture time is the summation of data setup and hold times as shown in Figure 1. The setup and hold times can be affected by a number of factors, but primarily they are minimized by decreasing the transition time across the threshold voltage $V_t$ of the receiving circuit. The result directly affects the slope of the clock line in Figure 1 and extends or contracts the setup and hold times accordingly.



**Figure 1: Aperture, Setup and Hold Times**

The slope of these transitions is commonly known as the slew rate, and the communications interval can therefore be minimized by increasing the slew rate of the transmission. The goal is to decrease the amount of time required to reach $V_{IH}$, the voltage necessary to register a high level or “one”, on positive edges. Likewise, it is desirable to decrease the amount of time required to reach $V_{IL}$, the voltage necessary to register a low level or “zero”, on negative edges.

It is possible to increase the slew rate in two ways. The current sourcing capability of the transmitter can be increased, or the signaling voltage can be increased. Increasing the signaling voltage increases slew rate due to the capacitive nature of the physical communications link governed by:

$$V = V_{sig} * \left(1 - e^{\frac{-t}{RC}}\right)$$

Figure 2 shows the effect of slew rate on a theoretical maximum communications speed. Initially we consider a device with a low signaling voltage and a low current sourcing capability (high equivalent series resistance). This device (Low V, Low I) takes the longest amount of time to reach the minimum $V_{IH}$. Increasing the source capability is effective, yet often impractical because doing so would necessitate larger semiconductors. Increasing the source capability also tends to increase leakage currents and noise while eliminating the intrinsic short-circuit protection afforded by current-limited outputs. Doubling the source capacity nearly doubles the slew rate: the (Low V, High I) trace reaches the minimum voltage at 10ns.

A more practical approach is demonstrated with the (High V, Low I) trace. Many inputs tolerate a wide voltage range, and a minimum $V_{IH}$ of 1.6 volts is typical of a device powered by a 3.3 volt supply [1]. As the example of Figure 2 shows, increasing the voltage can be a very effective way of increasing slew rate. The (High V, Low I) trace shows how the receiver can achieve the minimum input voltage at 11ns rather than 19ns. In fact, this method is widely followed and a large number of peripheral devices specify their communications speed as a function of their operating voltage.



**Figure 2: Effects of Slew Rate on Theoretical Maximum Communications Speed**
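The RC charging relation given earlier can be solved for the time at which the line crosses $V_{IH}$: $t = -RC \ln(1 - V_{IH}/V_{sig})$. The sketch below evaluates that expression for three cases named after the traces in Figure 2. The resistance, capacitance and signaling voltages are assumptions made for illustration, and the receiver threshold is assumed to stay fixed at 1.6 V, so the printed times are not expected to match the values read from the figure.

```python
import math


def time_to_threshold(v_sig, v_ih, r_ohm, c_farad):
    """Solve V_IH = V_sig * (1 - exp(-t / (R * C))) for t, in seconds."""
    if v_ih >= v_sig:
        raise ValueError("the line never reaches V_IH at this signaling voltage")
    return -r_ohm * c_farad * math.log(1.0 - v_ih / v_sig)


V_IH = 1.6       # volts; threshold assumed fixed for every trace
C_BUS = 50e-12   # farads; hypothetical bus capacitance

# (signaling voltage in volts, source resistance in ohms), all hypothetical
traces = {
    "Low V, Low I": (1.8, 200.0),
    "Low V, High I": (1.8, 100.0),   # doubled source capability: half the resistance
    "High V, Low I": (3.3, 200.0),
}

times_ns = {
    name: time_to_threshold(v_sig, V_IH, r_ohm, C_BUS) * 1e9
    for name, (v_sig, r_ohm) in traces.items()
}

for name, t_ns in times_ns.items():
    print(f"{name}: {t_ns:.1f} ns to reach V_IH")
```

In this idealized model, halving the source resistance exactly halves the crossing time, while raising the signaling voltage shortens it by moving the threshold to an earlier, steeper part of the exponential.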

Minimizing the time spent communicating is of the utmost importance because both the MCU and target device must remain in an active state throughout the transaction. After communication completes, both the MCU and target can typically return to sleep mode where power consumption is drastically reduced. Sleep functionality is common in most embedded devices and therefore the incremental increase in  $P_{Loss}$  by increasing signaling voltage is more than offset by the decreased duration of the total power consumption.

## 1.2 Voltage Independent States

While it is established that overall communications speed is dependent on voltage, many devices have voltage-independent states where performance is not dependent on the supply voltage. For example, a device may have varying communications performance throughout the range of 1.8V – 5.5V, but performs specific functions (sensing, controls, memory) identically throughout the voltage range. In fact this arrangement is common throughout thousands of commercially available peripherals.



**Figure 3: Voltage Dependent / Independent Device States**

Devices may exhibit voltage-independent behavior due to the presence of an onboard internal regulator, as in the case of various EEPROM and flash peripherals. Voltage-independence may also exist due to physical characteristics of a peripheral sensor, such as the time required to accumulate a measurement (photons, gas, electrons, etc.). In any case, this is an opportune time to exploit voltage-independence and decrease power consumption throughout the duration of the operation.

Devices such as those shown in Figure 3 exhibit the energy and delay characteristics shown in Figure 4. Increasing the signaling/supply voltage to the device initially results in a sharp decrease in communication delay as described in the previous section. Diminishing marginal returns occur as the peripheral is bounded by internal limitations. However, increasing the signaling voltage also results in unnecessary energy consumption during the voltage-independent state. The ultimate result is that an unnecessarily high energy penalty is paid for marginal increases in performance.

The effect of increased voltage throughout the operation of the peripheral is shown in Figure 4. The overall energy consumption increases exponentially while decreases in response time are marginal. This occurs because only the voltage-dependent states are eligible for the performance increase while the power usage penalty is paid throughout all states.



**Figure 4: Impact of Voltage on Energy and Delay**

## 1.3 Intra-Operation Dynamic Voltage Scaling

Performance of a peripheral device is maximized by operation at its maximum voltage and frequency during voltage-dependent states. Energy consumption of a device is minimized by operation at its lowest possible voltage during voltage-independent states. A system that does both transforms the operation shown in Figure 3 into the operation shown in Figure 5.



**Figure 5: IODVS Peripheral Device Operation**

The supply voltage of the device is manipulated so as to minimize the duration of voltage dependent states and to minimize the power draw of voltage independent states. These transitions occur as peripheral devices are carrying out operations such as memory accesses or environmental measurements and thus the voltage scaling occurs intra-operation.
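The per-state voltage selection described here can be sketched as a small lookup table that maps each state of an operation to a supply voltage. The state names, currents, durations and voltages below are hypothetical, loosely shaped like a serial memory write. The sketch also assumes each state draws a constant current and ignores the time and energy spent in the voltage transitions themselves.

```python
V_HIGH = 3.3  # volts, used during voltage-dependent states (assumed)
V_LOW = 1.8   # volts, used during voltage-independent states (assumed)

# Hypothetical operation: (state name, current in amperes, duration in seconds)
OPERATION = [
    ("transmit", 2e-3, 50e-6),
    ("write_delay", 1e-3, 5e-3),
    ("verify", 2e-3, 50e-6),
]

# Lookup table designating a supply voltage for each state.
VOLTAGE_TABLE = {
    "transmit": V_HIGH,
    "write_delay": V_LOW,
    "verify": V_HIGH,
}


def operation_energy(states, voltage_for_state):
    """Sum V * I * t over every state of the operation, in joules."""
    return sum(
        voltage_for_state(name) * current_a * duration_s
        for name, current_a, duration_s in states
    )


fixed_j = operation_energy(OPERATION, lambda name: V_HIGH)
iodvs_j = operation_energy(OPERATION, lambda name: VOLTAGE_TABLE[name])
savings = 1.0 - iodvs_j / fixed_j

print(f"Fixed {V_HIGH} V supply: {fixed_j * 1e6:.2f} uJ")
print(f"Per-state voltages:  {iodvs_j * 1e6:.2f} uJ")
print(f"Savings: {savings:.0%}")
```

The communication states keep the high voltage in both cases, so their duration is unchanged; the difference comes entirely from the long voltage-independent delay, which is why the savings depend on the ratio of the two kinds of state.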

Implementing IODVS and measuring its effect shows that further timing optimizations can be made to the operation of peripheral devices. By observing current consumption during a peripheral operation, the moment at which the device completes the operation can be estimated more accurately. This information is used to build both timing and energy consumption heuristics that determine completion and allow the system to resume operation earlier than the manufacturer specifies.
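One simple way to picture current-based completion detection is a threshold test with debouncing: the operation is treated as finished once the measured current stays below a threshold for several consecutive samples. The sketch below is only an illustration of that idea under assumed sample values; the function name, threshold and sample trace are hypothetical and are not the heuristics developed in this work.

```python
def detect_completion(samples_a, threshold_a, consecutive=3):
    """Return the index where the current first stays below threshold_a
    for `consecutive` samples in a row, or None if that never happens."""
    run = 0
    for index, sample in enumerate(samples_a):
        if sample < threshold_a:
            run += 1
            if run == consecutive:
                return index - consecutive + 1
        else:
            run = 0  # a single high sample restarts the count
    return None


# Hypothetical current trace in amperes: an active phase with one brief dip,
# followed by the idle current after the operation completes.
trace = [3e-3] * 5 + [0.2e-3] + [3e-3] * 2 + [0.1e-3] * 4
completed_at = detect_completion(trace, threshold_a=1e-3, consecutive=3)
print(f"Completion detected at sample {completed_at}")
```

Requiring several consecutive low samples keeps a momentary dip in current from being mistaken for completion, at the cost of a short detection delay.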

Achieving the optimal power profile for one device may result in affecting the operation of other devices that are powered from the same domain. To combat this, a supervisor is implemented and tested in order to prevent conflicts from varying device voltage requirements. The supervisor uses multiple methods of varying complexity to determine eligibility for voltage transition.
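A minimal sketch of the conflict check, assuming each peripheral on a shared domain reports the voltage range it can tolerate in its current state: a requested transition is eligible only if the target voltage lies inside the intersection of all reported ranges. The device names and ranges are hypothetical, and this is one simple eligibility rule rather than the supervisor implemented in this work.

```python
def allowed_range(requirements):
    """Intersect the (v_min, v_max) ranges of all peripherals on a domain.

    Returns (low, high) in volts, or None when the ranges do not overlap.
    """
    low = max(v_min for v_min, _ in requirements.values())
    high = min(v_max for _, v_max in requirements.values())
    return (low, high) if low <= high else None


def transition_eligible(requirements, target_v):
    """Return True if every peripheral on the domain tolerates target_v."""
    limits = allowed_range(requirements)
    return limits is not None and limits[0] <= target_v <= limits[1]


# Hypothetical shared domain: an idle EEPROM and a flash device mid-transfer.
domain = {
    "eeprom": (1.8, 5.5),
    "flash": (2.7, 3.6),
}

print(allowed_range(domain))             # (2.7, 3.6)
print(transition_eligible(domain, 1.8))  # False: the flash device needs 2.7 V
print(transition_eligible(domain, 3.3))  # True
```

In this example the EEPROM alone could drop to 1.8 V, but the shared domain must stay at or above 2.7 V until the flash device leaves its current state.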

The remainder of this research investigates the benefits of voltage scaling at such a fine granularity. Energy and delay requirements are considered system-wide in order to exploit the potential savings offered by IODVS. The system is defined as a combination of power supplies, a governing microcontroller and a variety of peripheral devices. Specific consideration is given to maximizing the overall efficiency with which peripheral device operations are performed. This work focuses on extracting those efficiency gains by performing the following investigations:

1. System definition:
  - a. The application microcontroller (MCU), with attention to dynamic voltage and frequency scaling (DVFS) and dynamic power management (DPM) capabilities.
  - b. The characteristics of peripherals attached to the MCU. Fine-grained dynamic loading characteristics are of particular interest.
2. Intra-Operation Dynamic Voltage Scaling (IODVS)
  - a. Fine-grained modulation of peripheral power supplies enables significant energy savings with no effect on peripheral performance.
3. Precise Real-time In-circuit Micro Energy management system (PRIME)
  - a. Acquiring precise and accurate real-time current measurements enables the supervisor to determine the state of ongoing operations. Specifically, the supervisor can detect early completion, thus enabling future optimization.
4. Peripheral Activity Completion Estimation and Recognition (PACER)
  - a. Complete set of peripheral operations performed while metering current consumption and operation duration across multiple devices, voltages and temperatures.
  - b. Time and current based adaptive heuristic for early-completion estimation.
  - c. Integrated energy based heuristics for early completion estimation.
5. Conclusions and Future Work
  - a. Supervised IODVS
    - i. Reduce IODVS domain interference by identifying interfering voltage ranges.
    - ii. A peripheral voltage supervisor addition to the uC/OS-III kernel that uses DPM heuristics to balance IODVS voltage changes against predicted break-even times.

IODVS is shown to reduce energy consumption in many common peripheral devices by 10 – 40% depending on the ratio of voltage-dependent to voltage-independent states. In-system metering of peripheral devices allows the MCU to detect operation completions and thus decrease the duration of voltage-independent states. Finally, the procedure is generalized for easy implementation in most embedded systems through the development of a supervisor which arbitrates the voltage demands of peripherals that share a voltage domain. PACER algorithms further

# CHAPTER 2: BACKGROUND

Energy management is performed by investigating the efficiency and capability of the power supplies as well as the loading characteristics of the energy consumers. This information enables traditional power management algorithms such as Dynamic Voltage and Frequency Scaling (DVFS), or Dynamic Power Management (DPM) to make real-time adjustments in pursuit of efficiency. Comprehensive system information enables IODVS to operate at extremely fine granularity.

## 2.1 Power Supplies

Voltage translation is a basic necessity for most embedded systems. The majority of embedded systems are supplied a voltage that is higher than is required to operate. Often this is due to legacy requirements as embedded systems trend toward lower operating voltages. On a more practical level, the higher supply voltage also provides margin that allows for a certain amount of voltage droop to be tolerated by end-devices. There are three common methods of DC-DC conversion in order to accomplish the step-down.

### 2.1.1 Linear / Low-Dropout Regulator (LDO)

The least complex circuit for step-down applications is the linear regulator shown in Figure 6. The circuit requires that the input voltage be maintained at some level higher than the output voltage. This margin is known as the dropout voltage or  $V_{\text{dropout}}$ . Modern versions of the circuit have focused on decreasing this margin and are known as LDOs (Low Dropout Regulators).



**Figure 6: A Linear Regulator / LDO Circuit**

This application requires the fewest external components, thus minimizing cost and PCB area. It also produces the least amount of noise on the load side of the circuit. However, it is the least efficient at DC-DC conversion. The power consumed by the linear regulator is modeled by considering both the converted and quiescent loads:  $P_{Reg} = (V_{in} - V_{out})I_{out} + V_{in}I_Q$ . The quiescent current is usually so low as to be ignored and thus the efficiency of the converter can be approximated as  $\eta = \frac{V_{out}}{V_{in}}$ . Thus the regulator is unsuitable for translating large voltage differentials. Linear regulators tend to dissipate large amounts of heat due to their inefficiency, and thermal limitations often limit their applicability.
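The regulator loss model above can be checked numerically. The following is a minimal sketch; the function names and the operating point (5.0V in, 3.3V out, 10mA load, 50uA quiescent current) are illustrative assumptions and are not taken from this work.

```python
def ldo_power_loss(v_in, v_out, i_out, i_q=0.0):
    """Power dissipated in the regulator: (Vin - Vout) * Iout + Vin * Iq."""
    return (v_in - v_out) * i_out + v_in * i_q


def ldo_efficiency(v_in, v_out, i_out, i_q=0.0):
    """Exact efficiency: delivered power over delivered power plus loss."""
    p_out = v_out * i_out
    return p_out / (p_out + ldo_power_loss(v_in, v_out, i_out, i_q))


# Hypothetical operating point, chosen only for illustration.
exact = ldo_efficiency(5.0, 3.3, 10e-3, i_q=50e-6)
approx = 3.3 / 5.0  # eta ~ Vout / Vin when the quiescent current is ignored

print(f"exact efficiency:  {exact:.4f}")
print(f"approx efficiency: {approx:.4f}")
```

With a quiescent current this small the approximation differs from the exact value by well under one percentage point, which is why the quiescent term is usually dropped.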

### 2.1.2 Charge Pump

Another method of DC-DC conversion can be accomplished via the charge pump. This circuit has the benefit of not requiring a physically large inductor and provides more efficient translation of large voltage differentials than the LDO. Figure 7 shows a typical application; charge pumps generally require only a few external capacitors in order to function. Additionally, they are capable of generating DC voltages below the ground level of the input. They are commonly found in TTL  $\rightarrow$  RS232 converters because RS232 signaling has a very wide voltage range (typically  $\pm 13V$  on modern implementations).

The current driving capacity of the charge pump is limited by both the size of the external capacitors and by the switching frequency of the device. The efficiency of the device is dictated by many factors, but the typical charge pump will be  $\sim 15\%$  less efficient than a buck switched mode power supply across the current output range. Taking all of these factors into account, they are best suited for translating a wide input voltage range into a potentially wide output voltage range and at low current.



**Figure 7: A Typical Charge Pump Circuit and Efficiencies [2]**

### 2.1.3 Switched Mode Power Supply (SMPS)

For systems requiring DC-DC conversion (contrasted with voltage translation by increased current capability), the SMPS is the most common application. For the majority of embedded systems a circuit is required to step down voltage from a high level to a lower level via the buck configuration as shown in Figure 8:



**Figure 8: A Simple SMPS in Non-Synchronous Buck Configuration [3]**

The SMPS has a number of advantages over charge pumps and linear regulators. The SMPS can translate large voltage differentials with high efficiency. The efficiency of the converter is generally not related to the input to output voltage differential. Rather, the efficiency is dominated by a combination of conduction losses and switching losses. Conduction losses are the  $I^2R$  losses incurred due to current flow through the inductor and transistor. Switching losses are incurred by charging and discharging the gate of the MOSFET. Thus, high switching frequencies will cause high switching losses.

Switched mode power supplies can be configured in the step-down buck configuration, the step-up boost configuration, or a combination buck-boost configuration which operates independent of input voltage. In any case, the increased current sourcing capabilities come at the cost of having the most PCB area of the options considered. Also, in all cases, the SMPS produces a ripple voltage at the output which designers strive to minimize through filter circuitry thus increasing total bill of materials cost. Many sensing devices are sensitive to disturbances caused by ripple and thus it is a variable that must be minimized or eliminated through the application of an additional LDO.

## 2.2 Energy Management Techniques

Embedded energy management research to date is split into two distinct fields: Dynamic Power Management (DPM) and Dynamic Voltage and Frequency Scaling (DVFS). DPM policies tend to focus on strict power-state relationships [4], while DVS policies tend to incorporate a linear power-performance relationship [5].

In fact, DVFS is so useful that hardware is designed specifically to take advantage of it [6]. Most DPM implementations focus on optimal scheduling techniques such that peripherals emerge from low-power states just in time for access by tasks requiring their functionality. Generally, the approaches to date can be categorized as a combination of either online [7] or offline [8] and deterministic [9] or probabilistic [10]. Dual-output circuits have been developed primarily targeting systems with a SoC [11]. The same circuitry could be reused to implement IODVS.

With respect to peripheral power management, peripherals can be operated under the same linear, DVFS based constraints [14] as well as step-wise DPM based [15] constraints. Approaches have been explored with respect to optimally scheduling devices with multiple power saving states and with systems where multiple tasks share a common resource (inter-task DPM) [10]. Similar resource availability problems are encountered with IODVS and similar heuristics are applied to address them.

### 2.2.1 Dynamic Power Management

DPM techniques exploit power switching capabilities (clock-gating for example) in order to disable sections of the system while they are unused. Of course, disabling the section entirely results in a wake-up time for that section and therefore, significant research has gone into determining the optimal time to wake up the disabled section so as to minimize the increase in latency.

The break-even time is the figure of merit for DPM as it pertains to energy savings [12]. It is defined as the duration that a device must remain asleep, and therefore inaccessible, in order to offset the energy spent waking the device. If a device requires a long wake-up time (a time in which it is incapable of being used), then it is incumbent on the system to determine how best to schedule disabling it.
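The text defines break-even time in words only. One common simplified form, assumed here and not taken from [12], divides the energy overhead of a sleep/wake transition by the power saved while asleep. The function name and all numeric values below are hypothetical.

```python
def break_even_time(e_transition_j, p_on_w, p_sleep_w):
    """Sleep duration (s) needed for the energy saved to equal the transition cost.

    Simplified model (an assumption): the transition overhead is a lump of
    energy and the device draws constant power in each state.
    """
    if p_on_w <= p_sleep_w:
        raise ValueError("sleeping must draw less power than staying on")
    return e_transition_j / (p_on_w - p_sleep_w)


# Hypothetical device: 50 uJ to wake, 3 mW when left on, 0.5 mW asleep.
t_be = break_even_time(50e-6, 3e-3, 0.5e-3)
print(f"break-even time: {t_be * 1e3:.1f} ms")
```

Under this model, putting the device to sleep for any idle period shorter than the computed time costs more energy than leaving it powered.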

Offline analysis can aid in the implementation of DPM by analyzing the control flow graph of an individual task or task set to determine when a peripheral is likely to be accessed [8]. Similar data can be realized online by profiling tasks and determining which paths lead to a peripheral access [13]. Both methods enhance the accuracy of predictions regarding the optimal peripheral wakeup time. All methods evaluate the cost/benefit of peripheral deactivation with respect to the energy savings gleaned versus the time spent reactivating the device when next needed.

### 2.2.2 Dynamic Voltage (and Frequency) Scaling

Microcontrollers use power at the rate  $P_{MCU} = P_{dynamic} + P_{static}$ . Static power dissipation  $P_{static}$  is due mostly to leakage currents throughout the MOSFETs of the MCU. Dynamic power dissipation is  $P_{dynamic} = fCV_{dd}^2$ , where  $f$  is the switching frequency of the circuit and  $C$  is the MOSFET gate capacitance of active circuits. The switching voltage  $V_{dd}$  is ripe for optimization because power consumption is proportional to its square.

In addition to the substantial power savings afforded by decreasing  $V_{dd}$ , microcontrollers also have a linear relationship between maximum possible switching frequency and the switching voltage. Thus, reductions in voltage result in decreasing the maximum possible frequency of the microcontroller.

In situations where peak performance is not required of the system, the frequency is adjusted downwards simply to minimize the linear part of the power consumption equation. Thus, if possible, it is also desirable to adjust the switching voltage to match the switching frequency. Optimal systems are operated at a clock frequency exactly sufficient to complete all tasks by their deadlines. Likewise, a sufficient supply voltage must be applied in order to achieve that clock frequency.
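The linear and quadratic terms of the dynamic power equation can be compared with a short calculation. This is a minimal sketch; the capacitance, frequencies and voltages are hypothetical operating points chosen only for illustration.

```python
def dynamic_power(f_hz, c_farads, v_dd):
    """Dynamic power P = f * C * Vdd^2, in watts."""
    return f_hz * c_farads * v_dd ** 2


# Hypothetical values, not taken from any datasheet.
C_EFF = 1e-9  # effective switched capacitance, farads

p_full = dynamic_power(120e6, C_EFF, 3.3)
# Halving only the frequency gives a linear reduction.
p_half_f = dynamic_power(60e6, C_EFF, 3.3)
# Lowering the voltage to match the lower frequency adds the quadratic term.
p_half_f_low_v = dynamic_power(60e6, C_EFF, 1.8)

ratio_f_only = p_half_f / p_full
ratio_f_and_v = p_half_f_low_v / p_full
print(f"frequency only:        {ratio_f_only:.3f} of full power")
print(f"frequency and voltage: {ratio_f_and_v:.3f} of full power")
```

In this example frequency scaling alone halves dynamic power, while scaling the voltage as well brings it to roughly 15% of the original value.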

### 2.2.3 Wireless Sensor Networks

Minimalistic embedded systems such as wireless sensor networks (WSNs) are tasked with sensing their environment and communicating their readings to a more capable host for processing. Their power requirements are low because processing is typically offloaded to more capable nodes with more reliable power supplies. This is usually accomplished via a mesh network that grows as new nodes find and establish local communication with one another [16].

They are often powered from renewable sources or long term batteries, in some cases lasting over 10 years [17]. The responsibilities of a node on the network are minimized so as to achieve such a long lifespan. Dynamic voltage scaling techniques have been employed to decrease the energy consumption of these devices [18]. Due to the step-wise nature of their task sets, WSNs respond better to DPM schemes as the energy management technique, with DVFS employed during the active period. These systems are an excellent example of where IODVS would be ideal because of their typically short duty cycles.

### 2.2.4 Component Aware Dynamic Voltage Scaling

IODVS is most similar to the Component Aware DVS technique [16] [17] developed for use in the nodes of a wireless sensor network [19]. An adjustable regulator is operated by the MCU in an embedded system such that it is operating at its minimum voltage requirements. CADVS operates at the task-level and therefore results in a power / performance relationship typical of DVS. IODVS differs in that it extends the technique into intra-operation and therefore intra-task granularity.

## 2.3 Embedded Peripherals

Most research to date regarding the energy optimization of embedded peripherals makes use of DPM. This is natural because most embedded peripherals include some form of standby mode that allows the system to drastically decrease the static power consumption of peripheral devices. Thus, a significant amount of research has gone into determining the optimal breakeven time of embedded peripherals. DPM techniques inherently interfere with the operation of the device and impose a lag in response time.

IODVS, while maintaining compatibility with DPM, instead exploits the acceptable operating voltages of the device and does so with no effect on response time. IODVS is primarily beneficial to devices with the same responsibilities and characteristics as those of a node on a WSN. Therefore the peripherals under consideration are likewise responsible for sensing, storing and communicating. Each type of peripheral has both voltage-dependent and voltage-independent states due to the characteristics of the device.

Peripheral storage devices such as EEPROM and flash tend to incorporate onboard voltage regulators so as to ensure a reliable supply during read and write operations. Embedded sensors often incur a voltage-independent state during a sensing operation because the sensor requires the medium to accumulate for a period of time before an accurate measurement can be made. Communicating peripherals such as wireless transmitters also incorporate voltage regulators and buffers. This is necessary because the output voltage of a wireless transmitter must be maintained within strict parameters and the output message must be transmitted within strict temporal limits.

# CHAPTER 3: IODVS

## Intra-Operation Dynamic Voltage Scaling [21]

## 3.1 Introduction

Consider an embedded system where the supply voltage to an application MCU is decoupled from the supply voltage of its peripherals as shown in Figure 9. This design is becoming more common as modern MCU applications take advantage of Dynamic Voltage and Frequency Scaling (DVFS) and, in effect, IODVS is a natural extension of DVFS to the peripheral domain. It is demonstrated that energy can be saved by lowering the peripheral domain voltage during voltage independent states such as mandatory wait periods and where the application of traditional DVS or DPM techniques would adversely affect operation of either the device or the system.



**Figure 9: An IODVS Enabled System**

For example, EEPROM is a typical peripheral device that is used to provide non-volatile data storage. The devices are usually specified for use in systems that require a quick data access time and have low storage capacity requirements. The chips are often specified to operate at multiple voltage levels to achieve compatibility with systems using voltages from 1.8V to 5.0V.

A write operation to the SPI device (and optional verification stage) is typified by the timing diagram shown in Figure 10. Maximum communication speed scales with slew rate and therefore scales with voltage. It follows that communication between the MCU and peripheral domains should occur at coordinated voltages, thereby maximizing data transfer, minimizing energy delay product (EDP) and eliminating the need for voltage level translation.



**Figure 10: A SPI EEPROM Write / Verify Cycle (Not to Scale)**

The most distinct benefit of IODVS can be realized during the longest portion of the typical transaction described in Figure 10: the delay. IODVS decreases the supply voltage to the chip during this voltage-independent period and it is demonstrated that the total energy cost of the transaction is significantly decreased.



**Figure 11: EEPROM Write Operation State Diagram and Corresponding Voltage / Time Relation**

IODVS is implemented by creating a state transition diagram for each operation that a device may perform and noting the voltage-dependent (VD) and voltage-independent (VI) states. Figure 11 shows the example state transition diagram for a write operation to EEPROM.  $V_{min}$  is specified by the datasheet as the voltage at which the device will cease to operate predictably.  $V_{max}$  is the voltage capable of providing maximum performance throughout the state. It is bounded by the lower of either the voltage at which the device ceases to increase in performance or the voltage at which the MCU is unable to communicate with the peripheral.

In order to write to memory, the device transitions from the VI idle state into the VD writing state, where the MCU is sending data to the device. The transaction is voltage-dependent because the communications performance scales with voltage. From the writing state the device transitions into the VI waiting state. The waiting state is specified by the datasheet to be 5ms regardless of applied voltage and is therefore, by definition, voltage-independent. The MCU then reads back the data to ensure that it was committed properly. This ‘verifying’ state is voltage-dependent because the MCU and peripheral are communicating throughout.

The operational state diagrams can be used to create a Peripheral Power Profile (PPP) for each device. This PPP forms a lookup table of state-voltage pairs. Combining the operational state transition table of Figure 11 with the Vmin and Vmax determined by the datasheet and MCU (summarized under MCP25AA512 in Table 3) the PPP for the device is formed in Table 1:

**Table 1: PPP as Derived from State Diagram**

| State     | Voltage |
|-----------|---------|
| Idle      | 1.8V    |
| Writing   | 3.3V    |
| Waiting   | 1.8V    |
| Verifying | 3.3V    |
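The PPP of Table 1 maps directly onto a small lookup structure. The sketch below is illustrative only: `State`, `EEPROM_PPP`, `enter_state` and `set_domain_voltage` are hypothetical names, and the supply write is replaced by a log so the example can run without hardware.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    WRITING = "Writing"
    WAITING = "Waiting"
    VERIFYING = "Verifying"


# Peripheral Power Profile from Table 1: state -> domain voltage in volts.
EEPROM_PPP = {
    State.IDLE: 1.8,
    State.WRITING: 3.3,
    State.WAITING: 1.8,
    State.VERIFYING: 3.3,
}


def set_domain_voltage(volts, log):
    """Hypothetical stand-in for the hardware call that retargets the supply."""
    log.append(volts)


def enter_state(state, ppp, current_volts, log):
    """Look up the target voltage and change the supply only if it differs."""
    target = ppp[state]
    if target != current_volts:
        set_domain_voltage(target, log)
    return target


# Walk through one write/verify operation and record each voltage change.
voltage_log = []
volts = None
for step in (State.IDLE, State.WRITING, State.WAITING, State.VERIFYING, State.IDLE):
    volts = enter_state(step, EEPROM_PPP, volts, voltage_log)

print(voltage_log)
```

Keeping the profile as plain data means a device driver only has to announce its state; the per-state cost is a single table lookup, in keeping with the goal of minimal impact on CPU utilization.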

It can be beneficial to estimate the energy savings of implementing IODVS at design time against the cost in both design effort and bill of materials. This is a remarkably difficult estimate to create, as the current consumption dynamics of the Results section illustrate. A reasonable estimate can be made by comparing the current consumption of the device at Vmin against the current consumption at Vmax and accumulating power consumption throughout the duration that the device spends in each state as described in (1).

$$E = \sum_{s=0}^{S-1} V_s I_s T_s \quad (1)$$

Returning to the example of a typical EEPROM, the data in Table 2 are found within the device specification or, in some cases, either extrapolated or interpolated:

**Table 2: Estimated State Voltage Current and Duration Pairs**

| State     | Current<br>@3.3V | Duration<br>@3.3V | Current<br>@1.8V | Duration<br>@1.8V (Est.) |
|-----------|------------------|-------------------|------------------|--------------------------|
| Idle      | Not Provided     | Steady State      | Not Provided     | Steady State             |
| Writing   | 6.0mA            | ~200us            | 4.5mA            | ~1000us                  |
| Waiting   | Not Provided     | 5ms               | Not Provided     | 5ms                      |
| Verifying | 7.5mA            | ~200us            | 3.0mA            | ~1000us                  |

*Note that current and duration are average estimates from the device datasheet.*

Because idle current consumption was not provided for varying voltages, the idle state is removed from the estimate. Thus, the estimate will reflect a device that is operating 100% of the time (never returning to idle). Likewise, the mandatory wait period current consumption was specified as an average occurring throughout the writing state. For the purposes of estimation, the wait state is combined with the writing state. Noting the latency increase due to the decrease in signaling voltage, these estimates result in:

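$$E_{3.3V} = 3.3\,\text{V} \times 6\,\text{mA} \times 0.2\,\text{ms} + 3.3\,\text{V} \times 7.5\,\text{mA} \times 5.2\,\text{ms} = 132.66\,\mu\text{J} \quad (2)$$

$$E_{1.8V} = 1.8\,\text{V} \times 4.5\,\text{mA} \times 1\,\text{ms} + 1.8\,\text{V} \times 3.0\,\text{mA} \times 6\,\text{ms} = 40.5\,\mu\text{J} \quad (3)$$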

The energy result of (3) is the lower bound on the energy that a peripheral operation can consume without IODVS while the 5.4ms duration of (2) is the lower bound on operation latency. Without IODVS, the latency decrease of 1.6ms is paid for through the energy increase of 92.16uJ. Because the manufacturer specified only an average current throughout the write operation and knowing that IODVS will lower the voltage to Vmin throughout the course of the delay portion of the write operation, we can estimate between the two bounds as:

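$$E_{IODVS} = 3.3\,\text{V} \times 6\,\text{mA} \times 0.2\,\text{ms} + 1.8\,\text{V} \times 3.0\,\text{mA} \times 5.0\,\text{ms} + 3.3\,\text{V} \times 7.5\,\text{mA} \times 0.2\,\text{ms} = 35.91\,\mu\text{J} \quad (4)$$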

At constant latency, energy consumption decreases by 73% from (2) to (4). For cases where latency is irrelevant, comparing (3) to (4) yields energy savings of 11.3% due to the decrease in time spent in voltage-dependent states. The estimate of (4) is likely to contain inaccuracies due to both the current and voltage dynamics of devices on the domain. Specifically, it assumes instantaneous voltage changes between states, which may or may not be accurate depending on the capacitance of the domain and the current consumption of the domain at the switching time. System-level losses such as pull-up resistors and trace impedances will also affect the actual results.
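The arithmetic of (2) through (4) and the quoted savings can be reproduced with a few lines. This is a minimal sketch using only the voltage, current and duration values already present in those equations; the helper name `energy_uj` is hypothetical.

```python
def energy_uj(segments):
    """Sum V * I * T over (volts, milliamps, milliseconds) segments.

    V * mA * ms has units of microjoules, so no scaling is needed.
    """
    return sum(v * i_ma * t_ms for v, i_ma, t_ms in segments)


# Segments copied from equations (2), (3) and (4).
e_33 = energy_uj([(3.3, 6.0, 0.2), (3.3, 7.5, 5.2)])
e_18 = energy_uj([(1.8, 4.5, 1.0), (1.8, 3.0, 6.0)])
e_iodvs = energy_uj([(3.3, 6.0, 0.2), (1.8, 3.0, 5.0), (3.3, 7.5, 0.2)])

saving_vs_33 = 1 - e_iodvs / e_33  # same latency as (2)
saving_vs_18 = 1 - e_iodvs / e_18  # latency not considered

print(f"E_3.3V  = {e_33:.2f} uJ")
print(f"E_1.8V  = {e_18:.2f} uJ")
print(f"E_IODVS = {e_iodvs:.2f} uJ")
print(f"saving vs 3.3V: {saving_vs_33:.1%}, vs 1.8V: {saving_vs_18:.1%}")
```

The same helper can be reused for other devices by substituting their datasheet values, subject to the caveats about voltage transition dynamics noted above.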

The IODVS technique is applicable to any peripheral that has a voltage/performance dependence and particularly applicable to those with a wait-state. The investigation considered the peripherals listed in Table 3 as a representative sample. The device descriptions and voltage requirements are listed next to their physical location on the test fixture.

Enabling IODVS requires only an adjustable power supply and a means of modulating the output voltage. A switched mode power supply (SMPS) is preferable because it is an efficient means of translating voltage levels. An adjustable linear regulator could be used, but only the benefits of decreased current consumption would be realized.

**Table 3: Typical External Peripherals**



| Device                                                                                                     | Supply Voltage Limits                                                 |
|------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------|
| Honeywell HIH-6130 I<sup>2</sup>C<br>Temperature / Humidity<br>Sensor                                      | Vmax: 5.5V<br>Vmin: 2.3V                                              |
| Microchip MCP25AA512<br>512Kbit (64KB) SPI<br>EEPROM                                                       | Vmax: 5.5V<br>Vmin: 1.8V                                              |
| Numonyx M25PX16<br>16Mbit (2MB) SPI Serial<br>Flash                                                        | Vmax: 3.6V<br>Vmin: 2.3V                                              |
| SPI Mode SD Cards:<br>Lexar SDSC 1.0GB<br>Sandisk SDSC 1.0GB<br>SwissBit SDSC 512MB<br>Kingston SDSC 2.0GB | Vmax: 3.6V<br>Vmin: 2.7V<br>(Operating)<br>Vmin: 2.0V<br>(Idle/Ready) |

IODVS was tested on each of the seven sample peripherals listed in Table 3. Each device was characterized by a peripheral power profile (PPP). The state voltages were derived from the specifications of the device datasheet. A common sequence of operations was performed repeatedly and random input parameters were used on each iteration. The output was analyzed, and no failures or unexpected behavior occurred. Idle state energy decreases of up to 66% and intra-operational state energy decreases of up to 40% were observed.

## 3.2 Assumptions

The MCU and peripheral voltage domains are decoupled and the peripheral domain is adjustable using an MCU-controlled DAC. All digital signaling between the MCU and the peripheral domain occurs at the same voltage. The increased cost and decreased performance of level translation or isolation are too great to warrant implementation [20] for this purpose in most embedded systems. Above all, the lowest communication energy-delay product is found at matched voltages and frequencies.

The current measurements are taken at the output of the peripheral power supply. Thus, the data will indicate the effect of IODVS on the set of peripherals on the domain and not on any one peripheral in particular.

The PPP state-voltage lookup table of each device is constructed solely from the acceptable usage specifications contained within the device datasheet. It was discovered experimentally that many of the devices that were tested operated well below their specified minimum voltage requirements. Although minimizing energy consumption by means of minimizing voltage is the primary goal of this work, it is necessary to ensure functionality of the device is maintained across all environments that may degrade performance.

For instance, the EEPROM under test is specified to operate in the range of -40°C to +80°C and with a minimum endurance of 1,000,000 write cycles. As the device nears the edge of its acceptable operating temperature or approaches its lifetime write-cycle limit, the minimum necessary voltage to guarantee completion of a write operation is likely to be that specified by the designers along with an acceptable factor of safety.



**Figure 12: Peripheral Generation Measurement and Allocation (PEGMA) Circuit Board**

## 3.3 Methods and Materials

The adjustable TPS62240 [21] SMPS was selected to power the peripheral domain because of its high efficiency at light loads, output capacity and adjustability. Peripheral domain voltage modulation is accomplished using a DAC output on an STM32F205 MCU signaling into the resistive feedback circuit on the SMPS. To measure the results of IODVS, the domain is outfitted with current sense circuitry [22] on both the input to the SMPS and the output to the domain.



**Figure 13: Peripheral Domain SMPS, Control and Current Sense Circuitry**

As shown in Figure 13 and expanded upon in Appendix A, the adjustable peripheral power supply is outfitted with current sense resistors and amplifiers on both the input to the supply and the output to the peripheral domain. These signals, along with the input voltage to the supply and the output voltage to the domain are fed into the ADC of the STM32F205 microcontroller and sampled at up to 1MSPS. The MCU has 3 simultaneously sampling ADCs which allows for simultaneous measurement of the output voltage, input current and output current.

Peripheral operations are broken up into states per an intrinsic state transition diagram. For example, to perform a write to EEPROM, the MCU must issue the write command, write the data, wait for a specified delay period and then read the data back in order to verify a correct write. Therefore, the states are delineated as Idle, Writing, Waiting and Verifying.

Each peripheral operation is associated with a specific voltage. For instance, per the assumptions, data transfers must occur at equal voltages between the domain and MCU. The voltages of the writing and verifying states must then equal that of the MCU (3.3V). This leaves the idle and wait states available for voltage scaling.

For each device, the pairs of states to voltages form a lookup table (PPP). Each test designates a power profile to use. Peripheral memory was tested with random data and across random memory addresses. Tests were run 2048 times, and the results were averaged. While operating IODVS within the specifications of the device datasheet, no operations failed. All test results were measured entirely in-system using the three 12-bit simultaneously sampling ADCs onboard the MCU. The converters are triggered from a timer overflow using a reload value that allows for a complete buffer fill roughly corresponding to the expected length of the test. For example, the EEPROM test lasted approximately 10ms, which with a buffer size of 10240 samples yielded 976.6ns per sample (or a rate of 1.024MHz). Upon a trigger, the state of the peripheral is stored synchronously with the sample. Each test data set was retrieved upon completion and is composed of:

- Time Scale
- 10240 12-bit ADC Samples per channel
  - Output Voltage
  - Input and Output Current
- 10240 Device State Samples (reading / writing / etc.)
- Bit Resolution (ADC value → Current or Voltage)

The energy consumed throughout a test is calculated using the fundamental relationship shown in (5). The results were calculated offline and digitally integrated via (6) and (7), where  $s$  is the state of the device,  $N$  is the number of samples recorded in that state, and  $T_s$  is the sampling period.

$$P = VI = \frac{E}{t} \quad (5)$$

$$E_s = \sum_{n=0}^{N-1} V_n I_n T_s \quad (6)$$

$$E_{total} = \sum_{s=0}^{S-1} E_s \quad (7)$$
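The digital integration of (6) and (7) can be sketched as follows. This is an illustrative reconstruction, not the analysis script used in this work: the function name, the string state labels and the four-sample trace are hypothetical, and the samples are assumed to have already been converted from ADC counts to volts and amperes.

```python
from collections import defaultdict


def energy_by_state(volts, amps, states, t_sample):
    """Accumulate V[n] * I[n] * T_s into the state recorded with sample n."""
    if not (len(volts) == len(amps) == len(states)):
        raise ValueError("all sample buffers must have the same length")
    energy = defaultdict(float)
    for v, i, s in zip(volts, amps, states):
        energy[s] += v * i * t_sample
    return dict(energy)


# Sampling period from the EEPROM example: a 10 ms test over 10240 samples.
t_sample = 10e-3 / 10240

# Tiny hypothetical trace (volts, amperes, state label per sample).
volts = [1.8, 3.3, 3.3, 1.8]
amps = [0.001, 0.006, 0.006, 0.003]
states = ["idle", "writing", "writing", "waiting"]

per_state = energy_by_state(volts, amps, states, t_sample)  # equation (6)
e_total = sum(per_state.values())                           # equation (7)

print(f"sampling period: {t_sample * 1e9:.1f} ns")
print(per_state)
print(f"total energy: {e_total:.3e} J")
```

Because the device state is stored synchronously with every sample, the per-state split requires no timing reconstruction after the fact.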

Separating the energy consumption by state is important because it allows us to consider the effect of duty cycle on the results of IODVS. Each device has an idle state where the voltage applied to the device is the minimum allowed by specification.

Likewise, IODVS is applicable to an exploitable sequence of active operations, resulting in decreased energy consumption without the performance impacts of DVFS. Energy consumption can be separated into two intervals as shown in (8).

$$E_{total} = E_{idle} + E_{active} \quad (8)$$

$$E_{total} = P_{idle} * T_{idle} + P_{active} * T_{active} \quad (9)$$

A duty cycle of 0% will be dominated by  $T_{idle}$  and energy consumption will converge on that of the idle state. On the other hand, a duty cycle of 100% will be dominated by  $T_{active}$ , in which case energy consumption converges on the weighted average of the set of states comprising the active period. In any case, the actual energy decrease due to IODVS will lie between these two extremes.
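The dependence of the savings on duty cycle follows directly from (9). The sketch below uses hypothetical average powers for the standard and IODVS profiles; they are placeholders for illustration and are not measurements from this work.

```python
def total_energy(p_idle, p_active, period, duty):
    """Equation (9): E = P_idle * T_idle + P_active * T_active."""
    t_active = duty * period
    t_idle = period - t_active
    return p_idle * t_idle + p_active * t_active


def fractional_saving(standard, iodvs, duty, period=1.0):
    """Energy saved by IODVS relative to the standard profile at one duty cycle.

    Each profile is a (p_idle, p_active) pair of average powers in watts.
    """
    e_std = total_energy(standard[0], standard[1], period, duty)
    e_new = total_energy(iodvs[0], iodvs[1], period, duty)
    return 1 - e_new / e_std


# Hypothetical average powers (idle, active) in watts.
STANDARD = (3e-3, 20e-3)
IODVS = (1e-3, 15e-3)

for duty in (0.0, 0.5, 1.0):
    saving = fractional_saving(STANDARD, IODVS, duty)
    print(f"duty cycle {duty:4.0%}: saving {saving:.1%}")
```

At a duty cycle of 0% the saving equals that of the idle state alone, at 100% it equals that of the active period alone, and intermediate duty cycles fall between the two.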

## 3.4 Results

### 3.4.1 Microchip MCP25AA512 EEPROM

IODVS uses peripheral power profiles to correlate peripheral voltages with internal state. The PPP specified for the EEPROM under test is derived from the specifications of its datasheet [23]. The EEPROM can communicate at 10MHz at 3.3V, while only 1.8V is required for basic operation. However, the length of the mandatory page-write delay is voltage independent and exploitable by IODVS.

The standard PPP is considered a control group and mandates that all states (writing/waiting/verifying/etc.) should have 3.3V applied to the peripheral. The 1.8VIW (1.8V Idle/Wait) profile mandates that the EEPROM should have 1.8V applied during the idle and waiting states and 3.3V applied on all others. Figure 15 provides a comparison of both the standard PPP and the 1.8VIW profiles enabled by IODVS.

The state transition diagram of Figure 14 is known a priori and is followed throughout the tests illustrated in Figure 15. The black line indicates device state and is sampled synchronously with the voltage and current measurements.

The test begins with the EEPROM powered up and in the idle state. The WREN (write enable) command is transmitted to the peripheral, along with the write command and address which is followed by 128 bytes of random data (1 page-size). The peripheral and device driver then transition into the page-write delay state and the peripheral voltage is decreased to 1.8V. After the delay, the device driver increases the voltage to the 3.3V necessary for communication and then reads the data back from the device to verify that it was committed properly. The verification stage is optional, but is standard practice among devices where data integrity is critical.

The effects of IODVS are most distinct during the Idle and Wait states. Energy consumption during these states decreased by 66.7% and 48.7%, respectively. Energy consumption during the Write state increased by approximately 29%, primarily because of the energy required to charge the domain to the 3.3V needed to complete the transaction.

Note that although the current measurement appears to exceed the graph in Figure 15, the current spike was indeed measured to be approximately 15mA and the data were integrated accordingly. In fact, charging the domain voltage is responsible for the 29% and 37% increases in the write and verify states respectively.

Two of the SPI lines on the test fixture are multiplexed for I2C communication. This causes the 1mA current swings during the communication phases of the test. The current consumption of the device indicates the behavior of the operation within. Two distinct periods of increased power demand are noticeable; these are likely an internal erase operation followed by a write.

The idle time of the test lasted 1ms out of a total test time of 9.475ms. The duty cycle of this test was 89.45%, and energy consumption was reduced by 26.67%. Removing the idle time from the total would yield an energy decrease of 22.36% at a duty cycle of 100%. Realistically, this type of device is used much less frequently owing to its finite number of usable write cycles. At a duty cycle of 0%, the savings would converge on 66.7%.



Figure 14: EEPROM Write State Transition Diagram

Table 4: MCP25AA512 Peripheral Power Profile

| State     | Voltage<br>(Control) | Voltage<br>(IODVS) | Duration     |
|-----------|----------------------|--------------------|--------------|
| Idle      | 3.3v                 | 1.8v               | Steady State |
| Writing   | 3.3v                 | 3.3v               | ~500us       |
| Waiting   | 3.3v                 | 1.8v               | 5ms          |
| Verifying | 3.3v                 | 3.3v               | ~1ms         |
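The lookup-table character of a peripheral power profile can be sketched as follows. This is an illustrative encoding of Table 4, not the driver used in the experiments; the dictionary names, the function `voltage_trace`, and the lowercase state labels are hypothetical.

```python
# Peripheral power profiles for the EEPROM, transcribed from Table 4 (volts).
CONTROL_PPP = {"idle": 3.3, "writing": 3.3, "waiting": 3.3, "verifying": 3.3}
IODVS_PPP = {"idle": 1.8, "writing": 3.3, "waiting": 1.8, "verifying": 3.3}


def voltage_trace(ppp, state_sequence):
    """Return the domain voltage a driver would request on entering each state.

    A single dictionary lookup per state change is the only work required,
    which is why the CPU overhead of following a profile is small.
    """
    return [ppp[state] for state in state_sequence]


# State order of the page-write test described in the text.
WRITE_TEST = ["idle", "writing", "waiting", "verifying"]
iodvs_trace = voltage_trace(IODVS_PPP, WRITE_TEST)
control_trace = voltage_trace(CONTROL_PPP, WRITE_TEST)
```

In a real driver the returned voltage would be passed to whatever mechanism sets the domain supply; that interface is outside the scope of this sketch.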

Table 5: MCP25AA512 Energy Consumption

| State             | Static (uJ)   | IODVS (uJ)   | Delta          |
|-------------------|---------------|--------------|----------------|
| Idle              | 9.84          | 3.28         | -66.70%        |
| Write             | 13.28         | 17.08        | 28.61%         |
| Wait              | 62.03         | 31.83        | -48.69%        |
| Verify            | 16.12         | 22.08        | 36.96%         |
| <b>Test Total</b> | <b>101.27</b> | <b>74.26</b> | <b>-26.67%</b> |

**Table 6: MCP25AA512 Energy Consumption and Duty Cycle**

| <b>Duty Cycle</b> | <b>Static avg. (uJ)</b> | <b>IODVS avg. (uJ)</b> | <b>Delta</b> |
|-------------------|-------------------------|------------------------|--------------|
| Duty: 0%          | 9.84                    | 3.28                   | -66.70%      |
| Duty: 25%         | 30.24                   | 20.20                  | -33.19%      |
| Duty: 50%         | 50.63                   | 37.13                  | -26.67%      |
| Duty: 75%         | 71.03                   | 54.05                  | -23.90%      |
| Duty: 100%        | 91.42                   | 70.98                  | -22.36%      |
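The intermediate rows of Table 6 appear to be a linear blend of the 0% and 100% rows, weighted by duty cycle. That weighting is inferred from the tabulated values and is not stated explicitly, so the sketch below should be read as a consistency check under that assumption. The function names are hypothetical.

```python
def blended_energy(e_idle, e_active, duty):
    """Linearly weight idle and active energy by duty cycle (0.0 to 1.0)."""
    return (1.0 - duty) * e_idle + duty * e_active


def percent_delta(static, iodvs):
    """Relative change of the IODVS figure against the static figure, in %."""
    return 100.0 * (iodvs - static) / static


# (idle, active) energies in uJ, taken from the 0% and 100% rows of Table 6.
STATIC = (9.84, 91.42)
IODVS = (3.28, 70.98)

rows = []
for duty in (0.0, 0.25, 0.5, 0.75, 1.0):
    s = blended_energy(*STATIC, duty)
    v = blended_energy(*IODVS, duty)
    rows.append((duty, s, v, percent_delta(s, v)))
```

Differences in the last printed digit relative to Table 6 are expected, because the inputs here are already rounded to two decimal places.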

The EEPROM does implement an optional sleep state that incurs a 100us wake penalty. Systems that are optimized for response time (such as NVRAM) may not be capable of waiting 100us, and therefore IODVS is attractive for its 66.7% reduction in idle energy consumption without incurring the wake penalty. For systems that are capable of withstanding the wake penalty, the PPP would be modified to include a 1.8V sleep state, which would further reduce the energy consumption of the device.



Figure 15: EEPROM IODVS Test

### 3.4.2 Numonyx M25PX16 Serial Flash

Serial flash modules have a somewhat more complicated state transition diagram than EEPROM. Serial flash chips can only program zeroes to their memory locations. At a simplistic level, this requirement necessitates a complete erase of a subsector before modifying the memory within it. The M25PX16 [24] supports a minimum of 4KB (subsector) erase and a maximum of 256B (page) sequential writes. To perform a read-modify-write operation, the transition diagram shown in Figure 16 is followed.

As with all of the devices under test, the control PPP was standardized at 3.3V throughout, while the 23VIW (2.3V idle/wait) PPP was constructed from the parameters listed within the datasheet. The device has a minimum operational voltage of 2.3V which is used for the idle and wait states. The subsector erase is specified to take a maximum of 150ms, while the page-write completes with a maximum delay of 5ms. Cross-subsector writes were not evaluated because that would simply require two test sequences to occur sequentially.

The device can reach the idle state either 10ms after power up or approximately 30us after the execution of a wake command. As the test begins, the chip is in the idle state; it does not require an initialization routine to execute before entering a functional state. A random sub-sector of memory is read into cache and is modified with the 128 bytes of random data to be committed. The sub-sector erase operation is executed, and IODVS adjusts the peripheral voltage to the wait state (2.3V in this PPP).

Upon completion of the erase cycle, the modified cached data are written back to the flash module one page at a time, resulting in 16 total page-writes. The writes cause a series of alternating “write-wait” states, and the corresponding voltage/state changes are evident in Figure 17. After the final page-write delay is complete, the data are read back and verified with the cached copy to ensure data integrity.

Table 8 summarizes the energy decrease per state yielded through the use of IODVS. As expected, the most significant savings are found in the idle and wait states, while an increase is seen in the active states.

Because the test was in the idle state for 1ms out of the total length of 257ms, the test represents a duty cycle of 99.6%, which is effectively the worst case. As the duty cycle decreases, the idle energy savings begin to dominate and push the average toward a limit of 48.66%.

It is noteworthy that this erase-write sequence is common to all flash memory, and so IODVS is applicable to flash memory in general. In high performance parallel NOR and NAND devices, writes complete on the order of microseconds. However, erase operations complete on the order of seconds and are easily exploitable by IODVS.



**Figure 16: Serial Flash Write State Transition Diagram**

**Table 7: M25PX16 Peripheral Power Profile**

| <b>State</b>       | <b>Voltage<br/>(Control)</b> | <b>Voltage<br/>(IODVS)</b> | <b>Duration</b>               |
|--------------------|------------------------------|----------------------------|-------------------------------|
| Idle               | 3.3v                         | 2.3v                       | Steady State                  |
| Reading            | 3.3v                         | 3.3v                       | ~10ms                         |
| Erase<br>(Command) | 3.3v                         | 3.3v                       | ~10us                         |
| Waiting            | 3.3v                         | 2.3v                       | ~150ms Erase<br>~5ms per Page |
| Writing            | 3.3v                         | 3.3v                       | ~1ms per Page                 |
| Verifying          | 3.3v                         | 3.3v                       | ~10ms                         |

**Table 8: M25PX16 Energy Consumption**

| <b>State</b>      | <b>Static (uJ)</b> | <b>IODVS (uJ)</b> | <b>Delta</b>   |
|-------------------|--------------------|-------------------|----------------|
| Idle              | 10.27              | 5.27              | -48.66%        |
| Reading           | 89.85              | 90.86             | 1.13%          |
| Write*            | 80.35              | 89.20             | 11.02%         |
| Wait*             | 551.18             | 344.92            | -37.42%        |
| Verify            | 57.52              | 72.45             | 25.96%         |
| <b>Test Total</b> | <b>2517.04</b>     | <b>1530.27</b>    | <b>-39.20%</b> |

\*sequential write and wait states combined



Figure 17: Serial Flash IODVS Test



### 3.4.3 Micro-SD Memory Card

Figure 18: Typical Micro-SD Memory Card Test



Figure 19: Micro-SD Memory Card Write State Transition Diagram

Micro-SD Cards follow a standard outlined by the SD Association [25]. The standard is a minimum set of electrical and communication specifications that must be met, and some vendors exceed those specifications [26] [27] [28]. A few of the variable parameters include clock speed, slew rate, initialization time, block length, read/write timing and power consumption. Additionally, the devices use an MMU which causes access timing to vary. The cumulative effect of these variations results in the SD Card protocol relying heavily on device polling. IODVS assumes matched voltages during MCU to device communication periods. Polling during write operations can be avoided by predicting the write completion time.

Figure 18 shows how timing variations affected the testing. The top figure shows one write/verify operation with constant polling for write-complete. The bottom figure shows the average of 2049 operations with polling beginning at 165ms. The non-monotonicity of the device state between the 165ms and 180ms marks indicates that some portion of the writes completed before 165ms had elapsed and advanced to the verify state immediately after polling. The shape also indicates that the majority of writes completed and advanced around the 170ms mark. Tuning the optimal polling time is a topic addressed in the following chapter.

An SD Card must be initialized after power up as shown in Figure 19. The MCU communicates with the SD Card via SPI and the initialization process typically takes 250ms. Not all SD Cards support power down modes. IODVS enables the device to transition to the 2.7V “Initialized” state, rather than undergoing a complete power-cycle and incurring the 250ms penalty as would be typical with DPM.

From the initialized state, the device was sent a write command to a random address with random data. The device driver then waits a predetermined amount of time (the prediction) before beginning to poll the device for write complete which can take up to 250ms. After the write finishes, the device driver reads the data back in order to verify that it committed properly before returning to the idle state.
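The wait-then-poll behavior described above can be sketched as a small simulation. This is a hedged illustration, not the device driver used in the tests: the function `predict_then_poll`, its parameters, and the 1ms polling interval in the usage lines are hypothetical, and times are integer milliseconds for simplicity.

```python
def predict_then_poll(completion_ms, prediction_ms, poll_interval_ms):
    """Simulate a driver that waits, then polls until the write has completed.

    The driver leaves the peripheral alone (and at the reduced wait-state
    voltage) for prediction_ms, then polls every poll_interval_ms.
    Returns (detect_ms, polls, low_voltage_ms):
      detect_ms      - time at which completion is observed
      polls          - number of status polls issued
      low_voltage_ms - time spent before the first poll
    """
    if poll_interval_ms <= 0:
        raise ValueError("poll_interval_ms must be positive")
    t = prediction_ms
    polls = 1  # the first poll happens when the prediction expires
    while t < completion_ms:
        t += poll_interval_ms
        polls += 1
    return t, polls, prediction_ms


# A write that finished before the prediction is detected on the first poll.
early = predict_then_poll(completion_ms=150, prediction_ms=165, poll_interval_ms=1)
# A write that finishes after the prediction needs repeated polling.
late = predict_then_poll(completion_ms=172, prediction_ms=165, poll_interval_ms=1)
```

The trade-off is visible in the two cases: a prediction that is too late adds latency (15ms in the first case), while one that is too early costs additional polls at the communication voltage.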

The SD Card has the highest current consumption of the devices tested and therefore requires a bulk capacitor at the load in order to ensure sufficient supply at the device. The point at which domain capacitance is detrimental to IODVS is dependent on the demands of the loads. Larger loads allow the domain to transition to lower voltages faster, while larger capacitances cause the domain to transition more slowly.

**Table 9: Generic Micro-SD Memory Card Peripheral Power Profile**

| State              | Voltage (Control) | Voltage (IODVS)  | Duration      |
|--------------------|-------------------|------------------|---------------|
| Idle               | 3.3v              | 2.7v             | Steady State  |
| Write Cmd (Polled) | 3.3v              | 3.3v             | ~10ms         |
| Write Data         | 3.3v              | 3.3v             | ~10us         |
| Waiting            | 3.3v              | 2.7v             | ~(10 – 150ms) |
| Write Complete?    | 3.3v              | 3.3v             | ~10us         |
| Verifying          | 3.3v              | 3.3v             | ~10ms         |

#### *3.4.3.1 Sandisk SDSC 1.0GB Micro-SD Memory Card*

Initial experiments with the Sandisk Micro-SD indicated that the majority of write operations completed approximately 150-170ms after they began. Based on this data and as shown in Figure 20, the card was not polled until the test reached the 180ms mark (which is approximately 165ms after the write command completed successfully). Once write-complete polling began, it was found that all of the writes had already completed and were eligible to transition into the verification stage.

Idle energy consumption dropped by 11.5% and the idle duration accounted for 10ms of the 184.1ms test, yielding a duty cycle of 94.6%. A duty cycle of 100% (constant write/verify) would yield an energy decrease of 27.54%. The write and verify stages of the test were relatively unchanged, though this could be due to insufficient resolution. Based on previous tests with higher resolution, charging the domain took between 5-10uJ and therefore is negligible compared to the total decrease of 3893uJ.

**Table 10: Sandisk Micro-SD Card Energy Consumption**

| <b>State</b>      | <b>Static (uJ)</b> | <b>IODVS (uJ)</b> | <b>Delta</b>   |
|-------------------|--------------------|-------------------|----------------|
| Idle              | 157.07             | 138.95            | -11.54%        |
| Write             | 26.48              | 26.23             | -0.93%         |
| Wait              | 14021.97           | 10126.95          | -27.78%        |
| Verify            | 89.86              | 91.88             | 2.25%          |
| <b>Test Total</b> | <b>14295.38</b>    | <b>10384.02</b>   | <b>-27.36%</b> |

#### *3.4.3.2 Lexar SDSC 1.0GB Micro-SD Memory Card*

The Lexar Micro-SD card had a higher average power draw and a different write-completion characteristic than the Sandisk Micro-SD Card. The majority of writes completed between 140-180ms after the test began. This result can also be inferred from the drop in current consumption in Figure 21 beginning at the 140ms mark. Polling for the completion did not begin until 160ms after the test began.

Despite the higher current draw, the system still benefited from a decrease in wait state energy consumption by 4049 uJ. The duty cycle was the same as the Sandisk card at 94.6% yielding an energy decrease of 24.12%.

Both the Sandisk and Lexar cards are older technology (manufactured in 2007 and 2009 respectively) and when compared with other cards, show higher energy consumption and slower performance. Newer implementations are better in both categories.

**Table 11: Lexar Micro-SD Card Energy Consumption**

| <b>State</b>      | <b>Static (uJ)</b> | <b>IODVS (uJ)</b> | <b>Delta</b>   |
|-------------------|--------------------|-------------------|----------------|
| Idle              | 124.09             | 102.41            | -17.47%        |
| Write             | 34.52              | 34.42             | -0.28%         |
| Wait              | 16608.43           | 12558.83          | -24.38%        |
| Verify            | 39.45              | 56.87             | 44.17%         |
| <b>Test Total</b> | <b>16806.48</b>    | <b>12752.54</b>   | <b>-24.12%</b> |



Figure 20: Sandisk Micro-SD Card IODVS Test



Figure 21: Lexar Micro-SD Card IODVS Test

#### *3.4.3.3 Swissbit S-200U 512MB Micro-SD Memory Card*

The SwissBit Micro-SD Card is unique in that it uses 4x 4KB buffers to cache reads and writes to the memory card in order to speed up transaction times. The method is effective in that the worst case test time for the SwissBit card is less than half the best case test time for the previous two cards.

The card is equipped with power-fail circuitry that flushes the buffers to non-volatile memory once a voltage threshold has been reached. This functionality is seen at the moment just before the 70ms mark where the peripheral voltage reaches approximately 2.5V coinciding with a current spike of approximately 9mA.

The write-completion time varies much more significantly than the other cards. Current consumption of the device shown in Figure 22 indicates that writes begin completing at approximately the 35ms mark.

**Table 12: SwissBit Micro-SD Card Energy Consumption**

| <b>State</b>      | <b>Static (uJ)</b> | <b>IODVS (uJ)</b> | <b>Delta</b>   |
|-------------------|--------------------|-------------------|----------------|
| Idle              | 66.25              | 43.53             | -34.30%        |
| Write             | 25.01              | 25.72             | 2.85%          |
| Wait              | 3726.20            | 2839.78           | -23.79%        |
| Verify            | 36.31              | 31.68             | -12.74%        |
| <b>Test Total</b> | <b>3853.76</b>     | <b>2940.71</b>    | <b>-23.69%</b> |

#### *3.4.3.4 Kingston SDHC 2.0GB Micro-SD Memory Card*

The Kingston Micro-SD Card was manufactured in 2014. Initial experiments with the device indicated that writes completed nearly 20x faster than the models previously tested. Furthermore, the maximum wait state duration appeared to be slightly over 1ms with a very high current consumption throughout the state. The test used a 2us sample time.

The write operation appears as a staircase between the 4ms and 6ms marks, indicating that the device was ready for the write to a random address immediately in most cases, but after a 1ms delay in others.

Despite the fast characteristics of the device, IODVS was able to decrease the idle energy consumption by 31.4% and the energy consumption of the wait state by 20.46%. The device was idle for 4ms out of the total test time of 12ms, yielding a duty cycle of 67%. The energy costs of the write, wait and verify states are relatively close. If the duty cycle were increased to 100%, the energy decrease would converge on 7.45%.

**Table 13: Kingston Micro-SD Card Energy Consumption**

| State             | Static (uJ)   | IODVS (uJ)    | Delta         |
|-------------------|---------------|---------------|---------------|
| Idle              | 24.63         | 16.89         | -31.40%       |
| Write             | 89.74         | 91.45         | 1.90%         |
| Wait              | 122.44        | 97.39         | -20.46%       |
| Verify            | 54.00         | 57.53         | 6.53%         |
| <b>Test Total</b> | <b>290.81</b> | <b>263.26</b> | <b>-9.47%</b> |



Figure 22: SwissBit Micro-SD Card IODVS Test



Figure 23: Kingston Micro-SD Card IODVS Test

### 3.4.4 Honeywell HIH6130 Temperature / Humidity Sensor

The MCU communicates with the temperature and humidity sensor [29] via I<sup>2</sup>C. The interface communicates in an open-drain fashion and therefore logic-high levels are accomplished simply by changing the MCU pin direction from output-low to input. The I<sup>2</sup>C bus was pulled to match the voltage level of the domain and therefore, when the MCU is sending data to the peripheral, it is not necessary to match the voltage of the MCU and peripheral domain. However, when the MCU is retrieving data from the peripheral, the voltages must be matched in order to ensure that input logic-level requirements are satisfied on the MCU.

The primary benefit of IODVS in the case of this peripheral is that the rate of I<sup>2</sup>C communication is highly dependent on the magnitude of the pull-up resistors enabling it and the signaling voltage. By allowing the voltage to increase to 3.3V during the read, larger pull-up resistors can be used, thus decreasing static power dissipation while maintaining the same communication frequency.

Because the device operates using open-collector signaling, the PPP is slightly different. Again, the control PPP is 3.3V in all states, but the IODVS PPP is 2.5VIRyTW (idle / ready / transmitting / waiting). Transmitting is denoted as seen from the MCU perspective. So the profile effectively mandates that only when the MCU is receiving data from the peripheral should it raise the device voltage to an MCU compatible level.

The test begins in the Idle state as shown in Figure 24 and the MCU issues a “Measure” command to the sensor. The peripheral takes up to 4ms to wake from sleep [30] and then transitions to the temperature measurement and humidity measurement states in sequence. There is a noticeable drop in current in Figure 25 upon the completion of the measurement and the MCU begins to read the data soon afterward.

This peripheral automatically enters an internal sleep mode, described in its data sheet, which drops the current consumption when a measurement has completed but has not yet been read. IODVS functions separately and provides additional energy savings. The first state has slightly higher power consumption because the device does not have a known measurement available.

IODVS consistently yields energy savings of approximately 38% that are nearly independent of duty cycle. The duration of the reading state was unaffected by the optimization because voltages are equal.



**Figure 24: HIH-6130 State Transition Diagram**



**Figure 25: HIH-6130 Temperature / Humidity Sensor IODVS Test**

**Table 14: HIH-6130 Peripheral Power Profile**

| State       | Voltage (Control) | Voltage (IODVS)  | Duration     |
|-------------|-------------------|------------------|--------------|
| Idle        | 3.3v              | 2.5v             | Steady State |
| Measure Cmd | 3.3v              | 3.3v             | ~100us       |
| Waiting     | 3.3v              | 2.5v             | ~45ms        |
| Reading     | 3.3v              | 3.3v             | ~1ms         |

**Table 15: HIH-6130 Energy Consumption**

| State             | Static (uJ)   | IODVS (uJ)    | Delta          |
|-------------------|---------------|---------------|----------------|
| Idle              | 10.28         | 6.28          | -38.87%        |
| Command           | 1.68          | 1.05          | -37.60%        |
| Waiting           | 399.07        | 245.89        | -38.38%        |
| Reading           | 4.30          | 4.42          | 2.62%          |
| <b>Test Total</b> | <b>415.33</b> | <b>257.64</b> | <b>-37.97%</b> |

## 3.5 Conclusion

IODVS has been shown to decrease energy consumption on a typical group of external peripherals by 10-40% with no decreases in either performance or accuracy. The efficacy of the technique tends to increase at low duty cycles, which are typical of external non-volatile memory. The CPU overhead of performing IODVS is negligible through the use of pre-defined peripheral power profiles.

Minimal additional circuitry is required to implement IODVS. In many cases the decrease in power budget or increase in performance may offset the additional cost. These experiments made use of a DAC because of the flexibility it offered in voltage modulation for a wide variety of devices. Simpler implementations would benefit from switching SMPS feedback resistors into and out of the circuit.

The technique would be most effective if it were used in a system with minimal domain capacitance. This would allow for faster changes in domain voltage, which would reduce the response time of the SMPS and reduce the inrush current when charging a domain. Ideal domain capacitance is a balancing act between IODVS dynamics and the peripheral load dynamics.

All of the devices tested provide a mechanism for testing operation-complete. We used the timing specifications contained within the device datasheet for all devices except the Micro-SD Cards (where polling was mandatory). Based on the current profiles of the devices, it can be inferred that most operations completed earlier than the datasheets specified. It would be worthwhile to pursue further research combining operation-completion prediction heuristics with IODVS to further minimize energy consumption.

Manipulating the voltage across a domain of devices is bound to impact some devices more than others. For instance, if the domain voltage drops below 2.7V, some SD Cards may revert from an initialized state to the idle state. Therefore, before adjusting a particular device on the domain, IODVS should determine if that would cause an overall benefit or detriment to devices on the domain. This could be determined by majority vote of devices on the domain and DPM-inspired analysis could be used to aid in decision making. If indeed the voltage is manipulated out of bounds for a particular device, the driver for each device on the domain needs to be notified of the voltage change so that re-initialization can take place if necessary.
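The notification requirement in the last sentence can be sketched as a simple check performed before a domain voltage change. This is only an illustration of the bookkeeping, not a proposed decision policy; the function `devices_below_minimum` and the dictionary layout are hypothetical, and the minimum voltages are the figures quoted for each device earlier in this chapter, placed on one shared domain purely for the sake of the example.

```python
def devices_below_minimum(min_voltages, new_voltage):
    """List devices whose minimum voltage is above the proposed domain voltage.

    min_voltages maps a device name to the lowest voltage (in volts) at which
    that device keeps its state. The drivers of the returned devices would
    need to be notified so that re-initialization can occur if necessary.
    """
    return [name for name, v_min in min_voltages.items() if new_voltage < v_min]


# Minimum voltages taken from the text: EEPROM 1.8V, serial flash 2.3V,
# SD Card 2.7V (below which some cards leave the initialized state).
DOMAIN = {"eeprom": 1.8, "serial_flash": 2.3, "sd_card": 2.7}

affected_at_2v3 = devices_below_minimum(DOMAIN, 2.3)
affected_at_1v8 = devices_below_minimum(DOMAIN, 1.8)
```

A majority-vote or DPM-inspired analysis, as suggested above, would sit on top of such a check and decide whether the change is worthwhile given the devices it disturbs.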

# CHAPTER 4: PRIME

## Precise Real-time In-Circuit Energy-Management-System

## 4.1 Introduction

The PEGMA system described in Chapter 3 was sufficient to explore a typical IODVS implementation. In order to further evaluate the benefits of IODVS, it was necessary to develop a system that could provide the following features:

- Simultaneous voltage, input current and output current measurements
  - By measuring these three values, the efficiency of the SMPS can be calculated.
- Actionable analog measurements
  - Current and voltage measurements need to have a high signal to noise ratio such that they can be used to trigger state changes with minimal digital signal processing (DSP).
- Peripheral device isolation
  - The previous results were collected by measuring the total current consumption of the domain. It would be beneficial to evaluate the benefits of IODVS on a per-device basis.
- Programmable load banks
  - Programmable load banks can allow a supervisor to create an efficiency model for the SMPS in-system.
- Increased measurement memory
  - Previous tests required a decrease in sample rate in order to accommodate a longer test length. It is important for the accuracy of digital integration that the sample rate be maximized.
- Higher communication bandwidth
  - By increasing the sample rate and test memory available, the test fixture would be considerably limited by the previously used 492Kbps baud rate.
- Additional peripheral devices under test
  - Any device with a voltage-independent state can be optimized with IODVS. Sensors, memory and communications peripherals are eligible.
- Design Modularity
  - While investigating the application of IODVS, it is often beneficial to quickly remove variables from the system. Power supplies and peripheral devices should be easy to remove from consideration.

The knowledge gained from experience with the PEGMA (Peripheral Energy Generation, Measurement and Allocation) board heavily influenced the requirements and purpose of its successor, PRIME, the Precise Real-Time, In-Circuit Energy-Management-System. The development of PRIME begins by developing the most essential modules in isolation and then integrating all of the components into a host that satisfies all design requirements.

## 4.2 Adjustable Step-Down Module (ASDM-300F)

Previous implementations used an adjustable SMPS to provide power to peripheral devices. While this device is very efficient, it also produces a significant amount of noise. The noise does not affect operation of the device, but could lead to a misinterpretation of the analog measurements. This problem is addressed through the development of a module that incorporates an adjustable SMPS, the current measurement circuitry and an onboard LDO for noise reduction. The “MIC94325 Ripple Blocker” features noise rejection of >50dB throughout the 10Hz to 5MHz frequency band. The majority of the peripheral devices under consideration produce current dynamics within this range and therefore a clean voltage signal is of the utmost importance.

The complete schematic for the ASDM-300F can be found in Appendix B. The device borrows heavily from the previous circuit that was tested on the PEGMA board. The primary new feature is simultaneous adjustment of both the SMPS and LDO regulators. This simultaneous adjustment is achieved via the following equations:

$$TPS62240: V_{ref} = 0.6V ; V_{InMin} = 2V ; V_{InMax} = 6V ; V_{outMin} = 0.6V$$

$$V_o = V_{ref} * \left(1 + \frac{R_1}{R_2}\right)$$



Figure 26: TPS62240 Reference Circuit [22]

$$MIC94325: V_{ref} = 1.1V ; V_{InMin} = 1.8V ; V_{InMax} = 3.6V ; V_{OutMin} = 0.6V$$

$$V_o = V_{ref} * \left( 1 + \frac{R_1}{R_2} \right)$$



Figure 27: MIC94325 Reference Circuit [32]

It is desirable to adjust the voltage of both devices from the same source, and it is also desirable to modulate the MIC94325 such that its input voltage (the output of the TPS62240) exceeds its output voltage by the dropout voltage plus the ripple, that is,  $V_{in} - V_{out} = V_{DropOut} + V_{ripple}$ . In order to achieve both requirements, a feedback resistor  $R_f$  is added to the node between  $R_1$  and  $R_2$  on both devices. The purpose of this resistor is to synchronize the voltage modulation between both devices while minimizing the power dissipated in the LDO and providing the ability to modulate  $V_{out}$  throughout the range of 1.8V – 3.3V.

Based on the experience in Chapter 3, the TPS62240 will typically yield 20mV of peak-to-peak switching noise. Therefore, the MIC94325 circuit is designed to adjust at the same slope as the TPS62240 circuit, but offset by  $V_{DropOut} + V_{ripple} = 10mV + 20mV$ . In order to achieve an output voltage swinging from 1.8V to 3.3V as modulated by an analog input voltage in the range of 0 – 3.3V, an ideal voltage slope for both devices is 0.5909 $\frac{V_f}{V_o}$ , where  $V_f$  is the feedback voltage applied to each  $R_f$ . The circuit was also designed for convenience such that grounding the feedback input (driving  $V_f$  to 0V) yields the commonly used 3.3V at the output. The resistor configuration yielding a close approximation of this slope and offset is shown:

TPS62240: R1 = 330k, R2 = 82k, Rf = 560k

MIC94325: R1 = 130k, R2 = 91k, Rf = 220k

**Table 16: SMPS and LDO Output Voltages for Various Feedback Inputs**

| $V_f$ | $V_{TPS62240-Out}$ | $V_{MIC94325-Out}$ |
|-------|--------------------|--------------------|
| 0     | 3.368206           | 3.3214286          |
| 0.2   | 3.250348           | 3.2032468          |
| 0.4   | 3.132491           | 3.0850649          |
| 0.6   | 3.014634           | 2.9668831          |
| 0.8   | 2.896777           | 2.8487013          |
| 1     | 2.77892            | 2.7305195          |
| 1.2   | 2.661063           | 2.6123377          |
| 1.4   | 2.543206           | 2.4941558          |
| 1.6   | 2.425348           | 2.375974           |
| 1.8   | 2.307491           | 2.2577922          |
| 2     | 2.189634           | 2.1396104          |
| 2.2   | 2.071777           | 2.0214286          |
| 2.4   | 1.95392            | 1.9032468          |
| 2.6   | 1.836063           | 1.7850649          |
| 2.8   | 1.718206           | 1.6668831          |
| 3     | 1.600348           | 1.5487013          |
| 3.2   | 1.482491           | 1.4305195          |
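The entries of Table 16 can be reproduced from the listed resistor values by nodal analysis at the feedback node. The sketch below assumes that each regulator holds the node between $R_1$ and $R_2$ at $V_{ref}$ and that the feedback pin draws no current; the closed-form expression and the function name `regulator_output` are derived here for illustration and are not quoted from the original.

```python
def regulator_output(v_ref, r1, r2, r_f, v_f):
    """Output voltage of an adjustable regulator with an extra feedback resistor.

    The feedback node is held at v_ref. Current v_ref/r2 flows to ground and
    (v_ref - v_f)/r_f flows into the feedback source; both are supplied
    through r1 from the output, giving:
        v_o = v_ref * (1 + r1/r2 + r1/r_f) - v_f * r1/r_f
    """
    return v_ref * (1.0 + r1 / r2 + r1 / r_f) - v_f * r1 / r_f


# Resistor values and reference voltages as listed in the text.
TPS62240 = dict(v_ref=0.6, r1=330e3, r2=82e3, r_f=560e3)
MIC94325 = dict(v_ref=1.1, r1=130e3, r2=91e3, r_f=220e3)

table = []
for step in range(17):
    v_f = 0.2 * step  # 0V to 3.2V in 0.2V steps, as in Table 16
    table.append((v_f,
                  regulator_output(v_f=v_f, **TPS62240),
                  regulator_output(v_f=v_f, **MIC94325)))
```

Under these assumptions the magnitude of the slope of each output with respect to $V_f$ is $R_1 / R_f$, which evaluates to about 0.5893 for the TPS62240 values and 0.5909 for the MIC94325 values.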

It is important to note that because the feedback circuit is purely resistive,  $V_f$  must be driven from a low impedance output such as an op-amp. The module exposes 8 pins which are compatible with a standard DIP-8 package configuration, which allows it to be used easily in both breadboard and socketed applications.



**Figure 28: ASDM-300F**

The ASDM-300F is also equipped with a dual output current sense amplifier, the MAX4377 [23]. This is the same amplifier that was used on the PEGMA board; however, the dual-circuit version is used on the ASDM-300F so that both input and output current can be sampled simultaneously. Specifically, the MAX4377HAUA+ was chosen because it provides the highest current sense gain at a gain-bandwidth product suitable for measuring the switching of the TPS62240.

The amplifier provides a gain of 100 at a frequency of 1.2MHz, while the TPS62240 switches at 2.0MHz – 2.5MHz in PWM (Pulse Width Modulation) mode and at a variable frequency in PFM (Pulse Frequency Modulation) mode. The TPS62240 switches between the modes automatically depending on the output load.



**Figure 29: Gain-Bandwidth Characteristics of the MAX4377HAUA+**

It should be noted that the output gain falls from 40dB (G=100) to 36dB (G=63.1) along the transition from 1.2MHz to 2MHz, which is the switching frequency of the TPS62240. Therefore, as designed, the input and output currents cannot be directly compared against each other in order to determine the efficiency of the SMPS. To accomplish efficiency measurements and compensate for the gain-bandwidth drop, one can use an active filter that approximately counteracts the decrease in gain. Alternatively, the MAX4377T device, which provides a gain of 20, could be used; however, it experiences the gain-bandwidth drop as well. At 2MHz, the effective gain of that part would be about G=15 and an active digital or analog filter would still be required.

The device is designed to provide a maximum output current of 300mA. With a gain of G=100, the sense resistors were chosen to be $0.1\Omega$. An output current of 300mA therefore yields a voltage drop across the sense resistor of:

$$V_{sense} = IR = 0.3 \times 0.1 = 30\,\text{mV}$$

The maximum power dissipated by the sense resistor is therefore:

$$P = I^2R = 0.3^2 \times 0.1 = 9\,\text{mW}$$

Applying the gain of the MAX4377 yields:

$$V_{out} = G \times V_{sense} = 100 \times 0.03 = 3\,\text{V}$$

Therefore, at maximum output current, the output of the ASDM-300F current circuitry should be observable by a standard analog-to-digital converter (ADC) operating at 3.3V. The intended application of the ASDM-300F is to be sampled by a 12-bit ADC (maximum code 4095) operating at a reference voltage of 3.3V. In that configuration, each ADC code corresponds to approximately $80.566\mu\text{A}$ of sensed current ($3.3\text{V}/4096$ per code, divided by the sense gain of 10V/A).
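
The current sense chain arithmetic can be checked with a few lines of code. This is a minimal sketch using the component values quoted in the text (0.1Ω sense resistor, gain of 100, 3.3V reference). The choice of 4096 steps for the 12-bit converter and all names are assumptions made for illustration.

```python
R_SENSE_OHMS = 0.1   # sense resistor value from the text
AMP_GAIN = 100.0     # nominal current sense amplifier gain from the text
ADC_VREF = 3.3       # ADC reference voltage from the text
ADC_CODES = 4096     # assumption: 12-bit converter with 2**12 steps


def db_to_gain(db):
    """Convert a voltage gain in dB to a linear ratio."""
    return 10.0 ** (db / 20.0)


def sense_voltage(current_a):
    """Voltage drop across the sense resistor (V = I * R)."""
    return current_a * R_SENSE_OHMS


def sense_power(current_a):
    """Power dissipated in the sense resistor (P = I^2 * R)."""
    return current_a ** 2 * R_SENSE_OHMS


def amplifier_output(current_a, gain=AMP_GAIN):
    """Amplifier output voltage presented to the ADC."""
    return gain * sense_voltage(current_a)


def amps_per_adc_code(gain=AMP_GAIN):
    """Load current represented by one ADC code."""
    return (ADC_VREF / ADC_CODES) / (gain * R_SENSE_OHMS)


print(f"sense drop at 300 mA : {sense_voltage(0.3) * 1e3:.1f} mV")
print(f"sense power at 300 mA: {sense_power(0.3) * 1e3:.1f} mW")
print(f"amplifier output     : {amplifier_output(0.3):.2f} V")
print(f"current per ADC code : {amps_per_adc_code() * 1e6:.3f} uA")
print(f"36 dB as linear gain : {db_to_gain(36.0):.1f}")
```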

Initial measurements with the ASDM-300F yielded higher-than-expected noise on the current sense outputs. According to [33], this problem is common and can be related to the type of resistor chosen for the current sense circuits. Indeed, the problem was mitigated by changing the sense resistors on the input and output current circuits from thick-film to metal-foil types. The high intrinsic inductance of the thick-film resistors was causing the circuit to report high levels of noise that did not in fact exist.



Figure 30: ASDM-300F Output Voltage and Feedback Voltage Testing

Testing the ASDM-300F yielded mostly successful results. The output ripple is imperceptible on most oscilloscopes ( $<20\text{mV}$  pk-pk). The current measurements are accurate as checked against a Fluke 7-digit benchtop ammeter. One caveat was encountered: the MIC94325 is very effective at mitigating ripple voltage, and unfortunately this also applies to intentional voltage transitions. IODVS tests using the ASDM-300F were, in some cases, unable to achieve the slew rate necessary to quickly transition from voltage-independent to voltage-dependent states. For these tests, described later in Chapter 5, the MIC94325 was removed and the output of the SMPS was used instead.

## 4.3 Peripheral Power Switch (PPS-330D)

A limitation of previous experiments is that current measurements were taken at the supply of the voltage domain. While conducting experiments on one device, the measured effect of a voltage change therefore reflected every device on the domain rather than the device under test alone. This issue is addressed through the development of the Peripheral Power Switch (3x Poles, 3.0Amp Max, with Disconnect) – PPS-330D.

The PPS-330D is 8-pin DIP package compatible and provides inputs for 3 voltages and ground. A peripheral device or group of peripheral devices is connected to the output voltage and the output ground. The remaining two pins select which of the three domains the device is connected to, or whether it is disconnected. Selection is done using a 2 $\rightarrow$ 4 decoder and the selected domain is indicated via LEDs on the module.

In addition to load isolation, the device provides another option for load management. Given that a peripheral device may operate via IODVS in the range of 1.8 $\rightarrow$ 3.3V, some domains may already be at the requested voltage. Therefore, rather than adjusting a voltage domain, the device could simply be switched to a more compatible domain for its current mode of operation.

The schematic for the PPS-330D is provided in Appendix C. One requirement of the PPS-330D is that V0 must be connected to the domain with the highest voltage at all times. As such, it is suitable to be connected to a non-modulating rail, typically at 5V or 3.3V. The SN74LVC1G139 decoder is capable of operating from 1.65V to 5.5V while providing a low propagation delay of only 4.9ns. This enables the PPS-330D to quickly move peripherals among the available voltage domains.

Three MOSFET packages containing a total of 5 PMOS and 1 NMOS transistors are used to route the voltage domains to peripheral devices. Because V0 is always the highest voltage, the device assures each transistor is operating completely in either the cutoff or saturation modes. By selecting 0/0 on the decoder, Y0 falls low, Y3 remains high, U2P and U2N are activated (note that the decoder is active low), thus routing current from V0 to Vout.

Likewise, when selecting 0/1 or 1/0, either Y1 or Y2 falls low, thus activating U3 or U4 while U2N remains active due to Y3 remaining high. Therefore, the selected voltage domain is routed to Vout.

Finally, there are cases where it may be advantageous to virtually remove a device completely from the circuit. This is achieved by selecting Y3, which drops the gate voltage of U2N, thus turning off the low-side ground transistor and therefore disconnecting the device.
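
The selection behavior described above can be summarized as a small truth-table model. This is a behavioral sketch only: it assumes select code n drives decoder output Yn low, and the text does not state which of the 0/1 and 1/0 input patterns corresponds to Y1 or Y2. The function names are hypothetical.

```python
def decode_2to4(select):
    """Active-low 2-to-4 decoder: only the selected output Yn is low (0)."""
    if select not in (0, 1, 2, 3):
        raise ValueError("select must be 0..3")
    return tuple(0 if n == select else 1 for n in range(4))


def routed_domain(select):
    """Return which domain reaches Vout for a given select code."""
    y0, y1, y2, y3 = decode_2to4(select)
    # The low-side NMOS (U2N) conducts only while Y3 remains high.
    if y3 == 0:
        return "disconnected"
    # Each high-side PMOS path conducts when its decoder output is low.
    if y0 == 0:
        return "V0"
    if y1 == 0:
        return "V1"
    return "V2"


for code in range(4):
    print(code, decode_2to4(code), routed_domain(code))
```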



**Figure 31: PPS-330D**

Tests on the PPS-330D yielded favorable results. The voltage domains are quickly routed to Vout upon selection. In fact, this transition occurs so quickly that it is suitable for transitioning peripherals mid-operation. The limitation on this functionality would be for extremely high load-currents with low-capacitance domains. This was not observed to be a problem in testing.

One aspect of the device limits the utility of virtually disconnecting a peripheral device. Although power is disconnected from the peripheral device, the communication lines remain intact. The problem was observed on the MCP25AA512 EEPROM where, despite the lack of a power ground, four communication lines remain connected: MOSI, MISO, MCLK and CS (Chip Select). Chip select is active low and MCLK is low when not transmitting (to other devices on the bus). It was observed that the device accumulated enough charge to intermittently power up through the power/ground path formed by the communication lines.

These events caused a significant amount of interference when attempting to communicate on the SPI bus. Therefore, when not in use, it is recommended to place peripheral devices on the default domain and leave them in a powered-on state. This issue could also be solved through the application of voltage translators, but a key aspect of the IODVS research is eliminating the need for such devices.

## 4.4 Programmable Load Regulator (PLR-5010D)

Through IODVS, the current consumption requirements of various operations on multiple different devices were characterized. The PLR-5010D (Programmable Load Regulator, 5V, 1.0A, Dual Output) allows the operator to place loads of varying amplitudes and durations with a high degree of accuracy. For example, it is possible to recreate the exact current profile of an EEPROM write cycle as shown in the MCP25AA512 experiments, or of a temperature/humidity measurement cycle as shown in the HIH613X experiments.

The device was originally designed to sink up to 1A at 5V by modulating the feedback circuit of an LT3080 LDO. The circuit board was designed to achieve high thermal conductivity and the schematic is provided in Appendix D. Through further development, the device was refined to instead modulate the base current of a large BJT which sources current into an external resistor. This configuration provides much finer-grained control of output currents after linearization. The modifications to the Rev0 board are shown in Figure 33. This test fixture confirmed the theoretical operation of the device.



Figure 32: PLR-5010D Rev0 Assembly as Designed



Figure 33: PLR-5010D Rev0 Assembly with Rev1 Test Modifications

Ultimately, the device is realized in Rev1 by using a high-accuracy, dual-channel, 16-bit DAC. The outputs of the DAC are each attached to the base of one FZT849 bipolar junction transistor. The schematic of the PLR-5010D Rev1 is provided in Appendix E. The transistors are configured to operate in as linear a fashion as possible. Specifically, the  $27\text{k}\Omega$  bias resistors and the  $3.9\text{k}\Omega$  base resistors, when combined with the 3.3Ω load resistors from the DEB-429A, create a nearly linear voltage-to-current output function.

The driver software uses the linearization curve shown in Figure 34 as a ‘best-guess’ for where to begin when acting upon a request for a change in load current. After applying this estimate, the software adjusts the bias current by means of a binary search algorithm. The algorithm stops adjusting once the output current is within a specified margin of error.
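
A minimal sketch of this estimate-then-search procedure is given below. It is not the driver's actual code: the DAC code range, the `measure` callback and the simulated load are assumptions made for illustration, and the search assumes the load current does not decrease as the DAC code increases.

```python
def settle_load_current(target_a, first_guess_code, measure, tolerance_a, max_code=0xFFFF):
    """Binary search over DAC codes, probing the linearized estimate first.

    measure(code) is a hypothetical callback that applies a DAC code and
    returns the resulting load current in amps.
    """
    lo, hi = 0, max_code
    code = min(max(int(first_guess_code), lo), hi)
    while True:
        current = measure(code)
        if abs(current - target_a) <= tolerance_a:
            return code, current          # within the margin of error
        if current < target_a:
            lo = code + 1                 # need more base drive
        else:
            hi = code - 1                 # need less base drive
        if lo > hi:
            return code, current          # interval exhausted, return last probe
        code = (lo + hi) // 2


def simulated_load(code, v_ref=3.3, max_code=0xFFFF):
    """Hypothetical stand-in for the hardware: roughly linear current vs. base voltage."""
    v_b = v_ref * code / max_code
    return max(0.0, 0.0475 * v_b - 0.0078)


code, current = settle_load_current(0.050, 20000, simulated_load, 1e-5)
print(f"settled at DAC code {code}, load current {current * 1e3:.3f} mA")
```

Starting from the estimate rather than the midpoint does not change the worst-case number of probes, but a good estimate allows the loop to exit on the first measurement.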



Figure 34: PLR-5010D Current Output Linearization

The PLR-5010D circuit was simulated against varying input voltages and output currents to yield Figure 34. A linear best-fit line was determined for each case and the equations are shown in the figure. However, these best-fit lines were created at discrete input voltages of 3.3V, 2.8V, 2.3V and 1.8V. Each input voltage change vertically offsets the curve to some degree, and that factor needs to be made continuous. Therefore, the set of linear equations was further adjusted in order to take input voltage into account. Linearization of channel voltage ultimately allows for a linear collector-current to base-voltage relationship as follows:

$$V_{chan} = 3.3:\quad I_c = 0.047507\,V_b - 0.00781$$

$$V_{chan} = 1.8:\quad I_c = 0.044948\,V_b - 0.01508$$

$$\therefore I_c = [0.001733\,(V_{chan} - 1.8) + 0.0449]\,V_b - [0.0151 - 0.004867\,(V_{chan} - 1.8)]$$

$$\therefore V_b = \frac{I_c + [0.0151 - 0.004867\,(V_{chan} - 1.8)]}{0.001733\,(V_{chan} - 1.8) + 0.0449}$$
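
The interpolated model can be sketched in code as follows. The sketch assumes the slope term $0.001733\,(V_{chan} - 1.8) + 0.0449$ and the offset term $0.0151 - 0.004867\,(V_{chan} - 1.8)$, which are obtained by interpolating linearly between the rounded fits at 1.8V and 3.3V. Units of volts and amps, and the function names, are also assumptions.

```python
def _slope(v_chan):
    """Interpolated slope (A/V) of the collector-current line."""
    return 0.001733 * (v_chan - 1.8) + 0.0449


def _offset(v_chan):
    """Interpolated offset (A) of the collector-current line."""
    return 0.0151 - 0.004867 * (v_chan - 1.8)


def collector_current(v_b, v_chan):
    """Estimated collector current for a base drive voltage and channel voltage."""
    return _slope(v_chan) * v_b - _offset(v_chan)


def base_voltage(i_c, v_chan):
    """Inverse model: base drive voltage expected to yield a target collector current."""
    return (i_c + _offset(v_chan)) / _slope(v_chan)


for v_chan in (1.8, 2.3, 2.8, 3.3):
    v_b = base_voltage(0.020, v_chan)
    print(f"Vchan={v_chan:.1f}V  Vb for 20 mA = {v_b:.4f}V")
```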

These equations provide a fairly accurate best-guess for collector currents greater than 5mA, and a binary search is employed in order to accurately achieve lower load settings. With these equations and modifications in mind, Rev1 of the PLR-5010D was designed. The new device incorporates two primary changes. The first is DIP-8 compatibility so that the device can be easily socketed. The second is the use of an external resistor in order to dissipate heat outside of the device. The final embodiment of the device is shown in Figure 35.



**Figure 35: Three PLR-5010D Units Installed on the DEB-429A**

Tests of the PLR-5010D were successful after a few design modifications that are reflected in the schematic. The Texas Instruments DAC8562 16-bit DAC was chosen because it provides a high-precision, low-noise ‘buffered’ output. Unfortunately, the buffer is digital and the output drive capability was measured to be fairly low. Therefore, the  $3.9\text{k}\Omega$  base resistors were used, rather than the originally intended  $390\Omega$  resistors. The linearization algorithm was optimized to operate despite known noise.



Figure 36: Current Output Sweep of the PLR-5010D as Measured by ASDM-300F

## 4.5 Discovery Expansion Board (DEB429A)

The DEB429A is used in combination with the previously discussed modules to achieve the goals set forth at the beginning of the chapter. It integrates two ASDM-300F modules, eight PPS-330D modules (1 per peripheral device) and 3 PLR-5010D modules (1 per voltage domain). Peripherals from the PEGMA board are included, as well as four new ones in order to test and broaden the scope of IODVS.

The primary function of the DEB429A is to serve as a host to the STM32F429 Discovery board (ST-DISCO). This board is developed and manufactured by STMicroelectronics and provides many features that are useful for IODVS research. A few of these features are:

- STM32F429 microcontroller (MCU) with 2MB FLASH, 256KB RAM at 180MHz
- On board 64Mbit (8MB) SDRAM
- On board ST-Link in-circuit debugger
- High-accuracy external oscillator
- On board push buttons, LEDs and LCD with touch-screen
- Access to every MCU pin via two 100mil-pitch dual-row headers

Before deciding on the ST-DISCO board to host further IODVS research, it was prudent to test its capabilities. It was expected that the device should be capable of simultaneously sampling 3 ADCs as well as 1 memory location and writing these results to SDRAM at 1µs intervals. Because SDRAM uses a more complex memory protocol than a purely random-access device such as SRAM, tests were undertaken to ensure that it could handle the memory pressure.



**Figure 37: STM32F429 Discovery Front**



**Figure 38: STM32F429 Discovery Back with Peripheral Modules on Breadboard**

Following successful feasibility test results, development began on the DEB429A. The final schematic with post-development annotations for use in Rev1 (future board named the DEB429B) is included in Appendix F. The DEB429A is responsible for integrating all of the previously discussed modules and external peripherals in order to achieve the goals outlined at the beginning of the chapter.

### 4.5.1 System Architecture

The STMicroelectronics DISCO board is outfitted with a variety of sensors and onboard peripherals. In order to design a host board such as the DEB429A, it was necessary to work around or in conjunction with existing circuitry. The DISCO board schematic [34] was used in conjunction with the STM32CubeMX software to define the function of each pin on the MCU. Most importantly, it was necessary to account for 6 analog inputs, 1 analog output, the onboard SDRAM, USB 2.0 Hi-Speed (SRAM interface) and the ILI9341 LCD controller.

Note from Figure 39 that because pin PA4 is used for VSYNC on the ILI9341 LCD controller, it cannot be used as a DAC output. Therefore, a combination of GPIO outputs on VADJ1\_0 and VADJ1\_1, on pins PB4 and PB7 respectively, allows for selection of four discrete voltage levels as described in the next section.



**Figure 39: Finalized Pinout of the STM32F429 on the STMicroelectronics DISCO Board**

### 4.5.2 Analog Design

There are four voltage domains provided by the combination of the DISCO board and the DEB429A. A 5V domain is provided directly from the USB bus that powers the DISCO board. It is important to note that the USB voltage provided to the DISCO board from the USB host is usually not 5V. In most cases, the voltage was observed to be about 4.5V via powered USB hubs. Among unpowered USB hubs, the 5V domain was observed to be as low as 3.6V in some cases.

The DISCO board provides a 3.3V domain as the result of an onboard LDO regulator. This 3.3V domain powers all of the devices on the DISCO board, such as the MCU, SDRAM, LCD, etc. Note that, as described on the DEB429A schematic, D3 was removed from the DISCO board. At typical loading for this domain, removing the voltage drop across D3 results in the domain yielding 3.3V rather than approximately 3.0V.



Figure 40: USB 5V to 3.3V Translation on the DISCO Board [34]

Because the MCU on the DISCO board is tasked with communicating with peripherals on the DEB429A, the onboard 3.3V domain is brought onto the DEB429A board. It is metered via a MAX4376 current sense amplifier, as is used on the ASDM-300F, and a voltage divider as shown in Figure 41.



Figure 41: DISCO 3V3 Voltage and Current Sense Circuit

Eight analog input signals are also delivered to the STM32F429 for ADC sampling. These analog signals provide the information necessary to determine the efficiency of each ASDM-300F module. Input current, output current and output voltage are provided, but the input voltage must be known a priori. Fortunately, it is routed directly from the USB bus and is typically a stable 4.5 volts when drawn from a powered USB hub. Figures 41 and 42 show where each signal is derived.
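
As an illustration of how these signals combine, the sketch below converts raw ADC codes into an efficiency figure, $\eta = (V_{out} I_{out}) / (V_{in} I_{in})$. It assumes a 12-bit ADC with a 3.3V reference, a sense gain of 100 across 0.1Ω, a divide-by-two voltage divider and a fixed 4.5V input. It deliberately ignores the amplifier gain roll-off at the SMPS switching frequency, so it is a starting point rather than a calibrated measurement. All names are hypothetical.

```python
ADC_VREF = 3.3      # assumed ADC reference voltage
ADC_CODES = 4096    # assumed 12-bit converter


def adc_to_volts(code):
    """Convert a raw ADC code to the voltage at the ADC pin."""
    return code * ADC_VREF / ADC_CODES


def code_to_amps(code, gain=100.0, r_sense=0.1):
    """Convert a current-sense ADC code to amps through the sense resistor."""
    return adc_to_volts(code) / (gain * r_sense)


def smps_efficiency(i_in_code, i_out_code, v_out_code, v_in=4.5, divider=2.0):
    """Output power divided by input power for one regulator module."""
    i_in = code_to_amps(i_in_code)
    i_out = code_to_amps(i_out_code)
    v_out = adc_to_volts(v_out_code) * divider   # undo the divide-by-two
    if i_in <= 0.0:
        raise ValueError("input current must be positive")
    return (v_out * i_out) / (v_in * i_in)


print(f"efficiency: {smps_efficiency(1000, 1200, 2048):.3f}")
```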



**Figure 42: ASDM-300F Implementations on the DEB429A**

All of the voltage measurements are taken through a voltage divider so that an ADC can properly evaluate high voltages without exceeding Vref. The output voltages of the ASDM-300F modules may change rapidly, and they are therefore buffered through a voltage-follower op-amp to ensure that a receiving ADC is capable of taking accurate samples at a minimum of 1MSPS. More on this subject is discussed in the following chapter. On the other hand, the 3V\_DISCO signal should not change rapidly because it is provided by an LDO without an adjustable output voltage. Therefore, it does not need a buffer in order to supply a downstream ADC. The Maxim current sense amplifiers on the ASDM-300F units are sufficiently low-impedance that they also do not need a buffer.

**Table 17: Analog Signals Provided by the DEB429A**

| Signal    | Description                           |
|-----------|---------------------------------------|
| ADC1_PII1 | Peripheral Input Current on Domain 1  |
| ADC1_PII2 | Peripheral Input Current on Domain 2  |
| ADC2_PV0  | Voltage Domain 0 (divided by two)     |
| ADC2_PV1  | Voltage Domain 1 (divided by two)     |
| ADC2_PV2  | Voltage Domain 2 (divided by two)     |
| ADC3_PI00 | Peripheral Output Current on Domain 0 |
| ADC3_PI01 | Peripheral Output Current on Domain 1 |
| ADC3_PI02 | Peripheral Output Current on Domain 2 |

Two ASDM-300F units are designed into the DEB429A and provide two additional voltage domains to which peripheral devices can be switched. The output voltage of each ASDM-300F module is set via an analog input. As previously noted, only one analog output is available from the STM32F429; therefore, a combination of GPIO outputs was used to allow for selection of discrete voltage levels at approximately 3.3V, 2.8V, 2.3V and 1.8V. The analog feedback circuitry is shown in Figure 43.



**Figure 43: ASDM-300F Modulation Circuitry**

The DAC output of the STM32 is simply buffered via a voltage-follower op-amp configuration. This ensures that the ASDM-300F feedback circuitry is driven by a sufficiently low-impedance source. This circuit ( $\text{DAC\_OUT2} \rightarrow \text{VADJ2}$ ) supplies the feedback signal for the second voltage domain (V2). The combination of the  $\text{VADJ1\_0}$  and  $\text{VADJ1\_1}$  GPIO outputs results in a low-impedance signal emitted from  $\text{VADJ1}$ . The resistor sizes were calculated using simple circuit analysis such that the input to the op-amp corresponds to the values noted in the figure. This circuit drives the first voltage domain (V1).

Each of the peripheral devices is attached to a PPS-330D unit which allows the controller to switch the voltage domain that is applied to the peripheral, or disconnect the peripheral entirely.



**Figure 44: A PPS-330D Controlling Power to a Peripheral Device**

There are four voltage domain options to select for each peripheral (V0, V1, V2 and disconnected) that are selected by two control pins. There are a total of 8 peripherals on the DEB429A and thus, controlling the PPS-330D units requires 16 GPIO pins. The STM32F429 on the DISCO board has very few unused pins and therefore it was necessary to design an I/O expansion circuit by making use of SN74HC259 [35] addressable latches.

As shown in Figure 45, the two addressable latches require only 6 GPIO pins in order to control the 16 pins necessary for all PPS-330D units. In exchange for the pin reduction, the ability to modulate individual units simultaneously is sacrificed. Instead, it is necessary to latch in the selection for each PPS-330D on a unit by unit basis. One important exception to this rule is that the PV\_CLR pin can be activated which will return all PPS-330D devices to the disconnected state. Bias resistors are installed in order to make this functionality the default state upon power application.



Figure 45: I/O Expansion Enabling PPS-330D Selection

### 4.5.3 Digital Design

The original PEGMA design was limited to the UART for external communication. With modern UART→USB converters, the system was able to communicate at approximately 492kbps. While that was sufficient for previous needs, PRIME seeks to increase both sample rate and test duration which results in a very significant increase in payload. For example, a 4 channel test running for 1 second at 1 MSPS yields 8MB of information that must be downloaded upon completion of each test. This would take approximately 130 seconds to download using previous methods and PRIME seeks to minimize test duration.

The need was satisfied via the UM232H USB2.0 Hi-Speed module [35] that was sourced from FTDI. This is a multi-protocol module (UART / SPI / I2C / MPSSE / FIFO); however, the maximum speeds (480Mbps) are achieved via the FIFO interface. The FIFO interface is similar to a 1-bit addressable, 8-bit word SRAM device and is therefore compatible with the STM32F429 onboard flexible memory controller (FMC). The device coexists on the same controller as the onboard 64Mbit (8MB) SDRAM.



**Figure 46: The UM232H Hi-Speed USB 2.0 Module [35]**



**Figure 47: The UM232H Module as Connected to the STM32F429 via the DEB429A**

The module was programmed using proprietary software from FTDI in order to set the FIFO mode and commission a vendor identifier and a device identifier. Raw bandwidth testing of the device using standard Microsoft Windows virtual comm-port (VCP) drivers yielded approximately 112Mbps. In order to achieve the higher 480Mbps speeds, it would be necessary to write custom drivers that do not result in a virtual comm-port and would thus be incompatible with existing software. Regardless, 112Mbps is capable of transmitting (as per the previous example) 8MB in approximately 500ms which is acceptable per the system requirements.
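
The payload and download-time figures quoted in this section follow from simple arithmetic, sketched below. The sketch assumes 2 bytes per sample (a 12-bit conversion stored in a 16-bit word) and ignores any protocol overhead; both are assumptions, as are the function names.

```python
def payload_bytes(channels, sample_rate_hz, duration_s, bytes_per_sample=2):
    """Total bytes captured by a test run."""
    return int(channels * sample_rate_hz * duration_s * bytes_per_sample)


def transfer_seconds(n_bytes, link_bits_per_s):
    """Time to move n_bytes over a link, ignoring framing overhead."""
    return n_bytes * 8 / link_bits_per_s


size = payload_bytes(channels=4, sample_rate_hz=1_000_000, duration_s=1.0)
print(f"payload           : {size / 1e6:.0f} MB")
print(f"UART at 492 kbps  : {transfer_seconds(size, 492e3):.0f} s")
print(f"FIFO at 112 Mbps  : {transfer_seconds(size, 112e6) * 1e3:.0f} ms")
```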



**Figure 48: UM232H Location on the DEB429A**

All of the peripheral devices that were tested on the PEGMA board are also included on the DEB429A. Refer to Chapter 3 for further details of each device. These include:

- Microchip MCP25AA512 512K EEPROM
- Numonyx M25PX NOR Serial Flash
- Lexar microSD Card
- Sandisk microSD Card
- Swissbit microSD Card
- Honeywell HIH-6131 Temperature / Humidity Sensor

The DEB429A adds the following additional peripheral devices:

- Microchip SST26VF NAND Serial Flash
- SiLabs Si1143 Optical Proximity Sensor
- Adafruit ESP12 WiFi module
- STMicroelectronics SBT263C1A Bluetooth module
- PLR-5010D (3x)

#### 4.5.3.1 Microchip SST26VF Serial Flash

It was desirable to test a NAND serial flash in order to complement the previously tested NOR serial flash. The two implementations have topological differences; however, the internal memory controllers determine the timing characteristics of operations on the device. For instance, NAND does not provide random-access reads while NOR flash does.



Figure 49: SST26VF064B Control Block Diagram [37]

The SST26VF064B [37] is one of very few available serial flash modules and it was chosen because the datasheet describes several voltage-independent states and voltage-dependent states. Specifically, the device is capable of performing reads and writes at 104MHz from 2.7-3.6V, while the maximum frequency is reduced to 80MHz for the 2.3-3.6V range. Thus, through IODVS, one can perform reads and writes at the maximum frequency of 104MHz, while also realizing reduced power consumption during the actual operation by dropping the peripheral voltage to 2.3V.

#### 4.5.3.2 Silicon Labs Si1143 Optical Proximity Sensor [35]

The Si114x series of sensors are optical proximity / ambient light detectors with onboard LED driving circuitry. These parts operate over an I2C interface and have a wide supply voltage range of 1.71 to 3.6V. Because the device operates via I2C, IODVS can take advantage of the voltage-independent states while writing to the device in addition to while the device is taking measurements.



**Figure 48: Si1141 Typical Application Circuit**

These sensors are typically used for proximity detection in restrooms and more recently are being fitted for use in wearable heart-rate monitors. Both use-cases are battery powered and therefore minimized power consumption is a primary figure of merit. Note that external LEDs may be driven from any voltage source necessary to overcome the forward voltage drop resulting from the desired current setting. LED voltage does not affect the effectiveness of IODVS.

Proximity detection can operate in autonomous mode. In this mode, the device remains asleep during normal operation, but wakes up periodically in order to take optical measurements. Sleep mode draws approximately  $2\mu\text{A}$  and, with autonomous sampling, the device spends the vast majority of its time in this mode. With this in mind, for the purposes of IODVS estimates, one can assume a very low duty cycle.

#### 4.5.3.3 Adafruit ESP-12E WiFi Module

The ESP-8266 is the first WiFi SoC to date that achieves a price point below \$3. Because of the extremely low-cost, the device is eligible for applications in millions of IoT devices. These devices are typically sensors that sample their environment and then communicate their findings to a server via the internet.



**Figure 49: ESP-12E Module with RF Shield Removed**

The ESP-12E is a good candidate for IODVS because the voltage can be reduced from 3.3V to at least 3.0V during the voltage-independent states. Also, due to the nature of the internet, the voltage-independent states may be extremely long in duration compared to the length of the voltage-dependent states. The length of each operation is non-deterministic and will therefore require an online detection for operation completion.

#### 4.5.3.4 STMicroelectronics SBT263C1A Bluetooth Module

The SBT263C1A Bluetooth module can be powered from a supply voltage as low as 2.0V. The device communicates to the host via UART and provides a Bluetooth v3.0 stack. The serial port profile is included for UART pass through at up to 560Kbps.



**Figure 50: The SBT263C1A Bluetooth Module**

#### 4.5.3.5 PLR-5010D Programmable Load Regulator

Three PLR-5010D modules are designed into the DEB429A circuit board. One is connected to each power supply. This enables experiments to test the effects of dynamic efficiency-triggered domain switching.

### 4.5.4 Results

The DEB429A was manufactured and fitted with the peripheral devices described previously in this chapter, as well as with an STM32F429 DISCO board. The final assembly is shown below in Figure 51.



Figure 51: DEB429A Final Assembly and Power-on Self-Test Firmware

Test results of the final design were positive after addressing some issues that required rework. As can be deduced from the figure, the PLR-5010D units were intended to fit onto the DEB429A with the long edge facing north-south. Unfortunately, there was a pinout error between the PLR-5010D and DEB429A schematics. This issue was addressed by relocating the inner pins and rotating the devices 90 degrees.

# CHAPTER 5: PACER

## Peripheral Activity Completion Estimation and Recognition

## 5.1 Introduction

Intra-Operation Dynamic Voltage Scaling (IODVS) has been shown to significantly reduce the energy consumption of embedded peripherals during their voltage-independent states. These states typically occur during mandatory delay periods as the device completes a specified operation. Peripheral Activity Completion Estimation and Recognition (PACER) seeks to further reduce system-wide energy consumption and decrease peripheral latency by recognizing the completion of the voltage-independent state and thus completing the overall operation early.

Peripheral operations are specified for a worst-case duration by the manufacturer that may depend on a number of factors including age and temperature. Most peripheral devices provide a mechanism for signaling that operations completed earlier than the maximum. PACER develops adaptive timing, current usage and charge consumption heuristics for estimating early completion of peripheral operations.

The estimate is verified upon returning from the voltage-independent state and the heuristic is updated with the results. In this fashion, the algorithms are resistant to variations in behavior that may occur across the lifecycle of the device. PACER is measured against a variety of embedded peripherals and is shown to further decrease peripheral energy consumption and decrease peripheral latency with minimal computational overhead.

For example, when writing a page of EEPROM, a voltage-independent wait state is encountered that is specified to a maximum duration of 5ms. However, that specification is for the worst case and is more suitable as a timeout value. The current consumption profile of an EEPROM write operation at varying voltages is shown in Figure 52.



**Figure 52: EEPROM Write Current Profile**

As the device transitions through the Idle → Write → Wait → Verify states, it can be inferred from the current profile that the operation completed by the 5ms mark and that it was not necessary to delay until approximately 6.5ms per the specification. In the case of EEPROM and most peripheral devices, a register can be polled to determine when the write has completed. Polling this register requires the MCU to communicate with the peripheral and thus forces a transition to a voltage-dependent state. Accurate estimates can therefore decrease latency and energy consumption, but inaccurate estimates can cause an early transition to a voltage-dependent state and thus increase energy consumption.
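
To make the idea of an adaptive timing estimate concrete, the sketch below tracks the observed completion time with an exponential moving average and adds a safety margin, never exceeding the worst-case specification. This is an illustrative stand-in and is not claimed to be the heuristic used by PACER; the smoothing factor, the margin and the class name are all assumptions.

```python
class TimingEstimator:
    """Adaptive wait-time estimate bounded by the datasheet worst case."""

    def __init__(self, worst_case_s, alpha=0.25, margin=1.1):
        self.worst_case_s = worst_case_s
        self.alpha = alpha                  # assumed smoothing factor
        self.margin = margin                # assumed safety margin
        self.estimate_s = worst_case_s      # start from the conservative value

    def wait_time(self):
        """Delay to apply before returning to the voltage-dependent state."""
        return min(self.estimate_s * self.margin, self.worst_case_s)

    def update(self, observed_s):
        """Fold one verified completion time back into the estimate."""
        observed_s = min(observed_s, self.worst_case_s)
        self.estimate_s += self.alpha * (observed_s - self.estimate_s)


estimator = TimingEstimator(worst_case_s=5e-3)
for _ in range(20):
    estimator.update(3.5e-3)    # hypothetical verified completion time
print(f"next wait: {estimator.wait_time() * 1e3:.2f} ms")
```

A larger margin reduces the chance of polling too early at the cost of a longer stay in the low-voltage state, which mirrors the trade-off described above.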

There are a wide variety of peripheral devices with a correspondingly wide variety of completion determinism and current profiles. Devices with highly deterministic timing respond best to the timing heuristic while those with variable timing respond best to current or charge heuristics.

PACER seeks to estimate and detect early completion of operations in peripheral devices by applying timing and current usage heuristics. Through early completion detection, PACER is able to decrease both latency and system-wide energy consumption. PACER is particularly advantageous to systems implementing IODVS by decreasing the effective duration of voltage-independent states.

## 5.2 Related Work

### 5.2.1 Timing Heuristic

Peripheral operations can vary in their latency or completion times due to a number of factors. Temperature can significantly affect the completion time for peripherals with fairly deterministic timing requirements such as DRAM [32]. Device aging can also affect timing due to a number of issues resulting from fundamental semiconductor physics [33]. Furthermore, some devices simply have non-deterministic completion times due to features such as MMUs and caches that are implemented in various data storage devices like Micro-SD cards, or due to age and wear as they affect FLASH storage timing.

Because the latency can vary significantly between operations, it is necessary to develop a timing heuristic that can adapt to slowly changing effects like age and temperature as well as rapidly changing factors like cache hits and misses. Adaptive delay estimation is not a new problem [34] and research continues to compensate for non-deterministic delay with different approaches for wireless communications, control systems and mass storage latency [35].

### 5.2.2 Energy Heuristic

For devices with highly variable timing and dynamic current consumption characteristics, integrating the current consumption of the device throughout an operation can allow for better detection of completion. Some operations can be characterized by the amount of charge necessary to complete them. This technique is referred to as “coulomb counting” and is a common technique used to determine the state of charge in rechargeable batteries [38].

### 5.2.3 Current Heuristic

The completion of some peripheral operations is easily detectable from the current consumption profile. These devices have a distinct and deterministic current profile that can be characterized and used to estimate the moment when an operation completes.

Simple and differential power analysis (SPA and DPA) attacks are performed by monitoring device current consumption in very fine-grained detail. These attacks seek to undermine encryption techniques by monitoring the current consumption of the processor and detecting the moment at which the processor executes a branch operation [36]. The attacks have been performed on an ARM Cortex MCU using AES and required an extensive measurement setup to accomplish [37]. PACER is inspired by this previous work using fine-grained in-circuit current measurement and fortunately benefits from much more lenient sampling requirements.

## 5.3 Methods and Materials

### 5.3.1 Development Platform

PACER and IODVS run on an STM32F429 MCU on the STMicroelectronics DISCO board, which is hosted by the PRIME assembly. The board provides 64MB of SDRAM which allows for simultaneous sampling throughout the test suite at very high speed. All experiments were sampled at 1MSPS and the SDRAM allowed any individual experiment to last up to 1 full second. All of the analog conversions as well as the device state sampling were performed via DMA. Therefore, the test fixture is expected to have had no impact on the operation under test.

Each of the peripheral devices under test has some method of verifying whether or not an operation completed successfully. For the memory devices, a simple read-back verification is sufficient to determine correctness. The temperature and humidity sensor provides a status bit indicating if an operation is in progress, thus indicating that a requested operation has not yet completed.

Recall that when implementing IODVS, the host MCU and peripheral devices are placed on different voltage domains for the duration of the voltage-independent state. Because of this, it is not possible for the MCU to poll the peripheral device for operation completion. Polling is also shown to be a rather costly operation in and of itself. Without the ability to communicate with the peripheral device, PACER uses other methods to best judge operation completeness.

### 5.3.2 PACER-T

The PACER-T algorithm uses a successive binary approximation algorithm to determine the optimal delay latency for an operation. The algorithm begins by executing an operation with the amount of delay specified in the device datasheet. After each iteration, if the operation was successful, then the amount of delay is halved. Otherwise, the operation resulted in an error and the next delay is increased by half the distance to the last previously successful operation.

The algorithm is executed online and provides the tightest possible timing. In fact, the timing is so precise that it should be considered marginally stable. To account for extremely small variations in timing, for instance due to clock jitter or internal peripheral asynchronous operation, the minimum delay found by PACER-T is increased by 5% in the following tests. This margin was not optimized and a much smaller value may suffice. It would likely be beneficial for a system using this algorithm to re-characterize the peripheral device periodically in order to account for temperature variations.
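The search described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the tests: the function names, the `try_operation` callback, the iteration count, and the simulated 3.5 ms completion time are all assumptions introduced here. Only the halving rule, the "half the distance to the last success" rule, and the 5% margin come from the text.

```python
def pacer_t_search(try_operation, datasheet_delay, iterations=32, margin=0.05):
    """Search for the shortest delay that still yields a correct operation.

    try_operation(delay) is a hypothetical callback that runs the peripheral
    operation with the given wait time and returns True if it verified correctly.
    """
    delay = datasheet_delay          # start from the datasheet-specified delay
    last_good = None                 # shortest delay known to succeed so far
    for _ in range(iterations):
        if try_operation(delay):
            last_good = delay
            delay = delay / 2.0      # success: halve the delay
        else:
            if last_good is None:
                raise RuntimeError("operation failed at the datasheet delay")
            # failure: move half the distance back toward the last success
            delay = delay + (last_good - delay) / 2.0
    if last_good is None:
        raise RuntimeError("no successful operation was observed")
    return last_good * (1.0 + margin)  # add the safety margin


def simulated_write(delay_ms, true_completion_ms=3.5):
    """Stand-in for a real peripheral: succeeds only if we waited long enough."""
    return delay_ms >= true_completion_ms


# Example: datasheet says 6.5 ms, the simulated device really needs 3.5 ms.
tuned = pacer_t_search(simulated_write, datasheet_delay=6.5)
```

In this sketch only successful trials update `last_good`, so the returned delay never falls below a value that was actually observed to work.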

### 5.3.3 PACER-E

The energy based heuristic was performed in much the same way as the timing heuristic. The system aggregates all output current samples from the power supply consumed by the peripheral device. When the digital integration has reached the test value, the operation is ‘complete’ and checked for correctness.

This algorithm is intended for use in devices that consume a constant amount of energy per operation. It compensates for devices that are energy bounded rather than time-bounded.

The algorithm uses a successive binary approximation in the same fashion as PACER-T in order to determine the exact amount of energy required to perform an operation. PACER-E is somewhat less precise than the timing based algorithm due to the time required to both sample and perform the digital integration necessary for threshold checking.
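As a rough illustration of the threshold check, the sketch below accumulates current samples until a target value is reached. It is an assumption-laden example rather than the tested firmware: the function name, the rectangular integration, and the sample values are hypothetical, and the accumulated quantity is charge (multiplying by the domain voltage would give energy). The threshold itself would be found with the same successive approximation used for PACER-T.

```python
def pacer_e_wait(current_samples_a, sample_period_s, charge_threshold_c):
    """Return the index of the sample at which the integrated current first
    reaches the threshold, or None if the threshold is never reached."""
    accumulated = 0.0
    for index, current in enumerate(current_samples_a):
        accumulated += current * sample_period_s   # rectangular integration
        if accumulated >= charge_threshold_c:
            return index                           # declare the operation 'complete'
    return None


# Example: a constant 2 mA draw sampled at 1 MSPS, with a hypothetical 9 nC threshold.
samples = [0.002] * 10
done_at = pacer_e_wait(samples, sample_period_s=1e-6, charge_threshold_c=9e-9)
```

After the threshold is crossed, the host would check the operation for correctness as described above.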

### 5.3.4 PACER-C

The current algorithm is also performed online and makes use of the current profile in order to determine if an operation has completed. The algorithm begins by taking a sample of the power supply output current. Next, the operation is executed and is not considered complete until the output current returns to within some percentage of its previous value.

For instance, if the output current were measured to be 1mA before the operation began, and assuming that the operation will result in some increase in current, it is logical to wait until the current is once again at 1mA before polling the peripheral device for operation completion.

PACER-C is the most basic method to determine in real time whether an operation has completed, and it may be prone to false positives in some cases. More advanced detectors, such as a multi-layer perceptron neural network, could also suit the purpose. It is notable, however, that keeping the detector simple is very important so that the algorithm can keep pace with incoming samples. Naturally, more complex algorithms could be accommodated by a more powerful host microcontroller.
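The baseline-return check can be sketched as follows. This is a simplified illustration under stated assumptions: the function name, the 10% tolerance, and the sample trace are hypothetical, and the requirement that the current first rises above the baseline reflects the assumption in the text that the operation increases current draw.

```python
def pacer_c_detect(current_samples_a, baseline_a, tolerance=0.10):
    """Return the index of the first sample at which the current has risen
    above the baseline band and then returned to it, or None otherwise."""
    threshold = baseline_a * (1.0 + tolerance)   # upper edge of the baseline band
    seen_rise = False
    for index, current in enumerate(current_samples_a):
        if current > threshold:
            seen_rise = True                     # operation is drawing extra current
        elif seen_rise:
            return index                         # current is back near the baseline
    return None


# Example: 1 mA baseline, a burst of activity, then a return to roughly 1 mA.
trace = [0.001, 0.001, 0.005, 0.006, 0.005, 0.00105, 0.001]
completed_at = pacer_c_detect(trace, baseline_a=0.001)
```

A single-pass comparison like this keeps the per-sample cost low, at the price of the false positives mentioned above, for example when the current dips briefly in the middle of an operation.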

## 5.4 Results

Initial IODVS results were repeated so as to establish a baseline with which to compare the results of PACER. Previous experiments required the results to be averaged many times over. The PRIME assembly provides a high enough signal-to-noise ratio that averaging multiple test results is used only to maximize accuracy.

Note that the “Active Total” items in the following tables encompass the test results ignoring the idle state contributions to both time and energy. The idle state is a byproduct of the test and in actual usage could be any arbitrary value. The value would be incorporated into the duty cycle discussion that was investigated in the results of Chapter 3.
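To make the "Active Total" and "Delta" rows concrete, the following sketch recomputes them for the Control and PACER-T columns of the EEPROM latency table (Table 19). The helper names are assumptions introduced for illustration; the per-state values are taken from that table.

```python
def active_total(states):
    """Sum every state except the idle states that bracket the test."""
    return sum(value for name, value in states if name != "Idle")


def delta_percent(control_total, candidate_total):
    """Relative change versus the control; negative values are reductions."""
    return (candidate_total - control_total) / control_total * 100.0


# Per-state latencies from Table 19 (Control and PACER-T columns).
control = [("Idle", 1.025), ("Writing", 0.429), ("Waiting", 5.045),
           ("Read", 0.506), ("Idle", 0.994)]
pacer_t = [("Idle", 1.026), ("Writing", 0.429), ("Waiting", 3.508),
           ("Read", 0.506), ("Idle", 2.53)]

control_active = active_total(control)
pacer_t_active = active_total(pacer_t)
pacer_t_delta = delta_percent(control_active, pacer_t_active)
```

Note that the trailing idle time grows when an operation finishes early, which is why it is excluded from the comparison.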

### 5.4.1 Microchip MCP25AA512 EEPROM



Figure 53: IODVS Result Reproduction via PRIME



**Figure 54: EEPROM Write with PACER-T + IODVS**



**Figure 55: EEPROM Write with PACER-C + IODVS**

As shown in Figure 53, the capacitive effect of the domain is much more pronounced in PRIME than on the PEGMA board used for initial IODVS research. This effect reduces the overall effectiveness of IODVS.

**Table 18: EEPROM Operation Energy**

| State               | Control      | IODVS          | PACER-T        | PACER-T + IODVS | PACER-E        | PACER-E + IODVS | PACER-C        | PACER-C + IODVS |
|---------------------|--------------|----------------|----------------|-----------------|----------------|-----------------|----------------|-----------------|
| Idle                | 6.03         | 2.47           | 6.04           | 2.52            | 6.06           | 2.47            | 6.13           | 2.48            |
| Writing             | 2.71         | 2.55           | 2.73           | 2.51            | 2.69           | 2.50            | 2.80           | 2.58            |
| Waiting             | 46.84        | 33.85          | 37.89          | 27.85           | 33.59          | 25.06           | 35.47          | 26.24           |
| Read                | 3.50         | 3.48           | 3.29           | 2.04            | 3.51           | 3.49            | 3.55           | 3.21            |
| Idle                | 5.89         | 5.76           | 15.30          | 13.42           | 14.97          | 14.16           | 14.50          | 14.03           |
| <i>Active Total</i> | <b>53.05</b> | <b>39.88</b>   | <b>43.91</b>   | <b>32.40</b>    | <b>39.79</b>   | <b>31.05</b>    | <b>41.82</b>   | <b>32.03</b>    |
| <b>Delta</b>        | <b>0.00%</b> | <b>-24.83%</b> | <b>-17.24%</b> | <b>-38.93%</b>  | <b>-25.00%</b> | <b>-41.46%</b>  | <b>-21.17%</b> | <b>-39.62%</b>  |

**Table 19: EEPROM Operation Latency**

| State               | Control      | IODVS        | PACER-T        | PACER-T + IODVS | PACER-E        | PACER-E + IODVS | PACER-C        | PACER-C + IODVS |
|---------------------|--------------|--------------|----------------|-----------------|----------------|-----------------|----------------|-----------------|
| Idle                | 1.025        | 1.025        | 1.026          | 1.025           | 1.026          | 1.025           | 1.038          | 1.038           |
| Writing             | 0.429        | 0.43         | 0.429          | 0.429           | 0.429          | 0.43            | 0.429          | 0.429           |
| Waiting             | 5.045        | 5.044        | 3.508          | 3.507           | 3.54           | 3.508           | 3.603          | 3.653           |
| Read                | 0.506        | 0.508        | 0.506          | 0.507           | 0.506          | 0.506           | 0.507          | 0.508           |
| Idle                | 0.994        | 0.992        | 2.53           | 2.531           | 2.498          | 2.53            | 2.422          | 2.371           |
| <i>Active Total</i> | <b>5.98</b>  | <b>5.982</b> | <b>4.443</b>   | <b>4.443</b>    | <b>4.475</b>   | <b>4.444</b>    | <b>4.539</b>   | <b>4.59</b>     |
| <b>Delta</b>        | <b>0.00%</b> | <b>0.03%</b> | <b>-25.70%</b> | <b>-25.70%</b>  | <b>-25.17%</b> | <b>-25.69%</b>  | <b>-24.10%</b> | <b>-23.24%</b>  |

Although the IODVS portion of the test was slightly less effective than shown in the previous chapter, every PACER algorithm performed very well. The results were further enhanced by combining PACER with IODVS. The PACER algorithms by themselves reduced the overall energy consumption by 17–25%. When combined with IODVS, the results are all very close to each other and improve further to a total reduction of about 40%.

The PACER algorithms also significantly reduce the latency of each operation. The timing and energy heuristics decrease latency by the largest amount and their results are very close to each other. The current-based heuristic lags slightly due to the amount of overhead necessary to analyze the current and determine if the operation has completed.

### 5.4.2 Numonyx M25PX16 NOR Serial Flash



Figure 56: NOR Serial Flash IODVS Write



Figure 57: NOR Serial Flash IODVS + PACER-T Write

**Table 20: NOR Serial Flash Operation Energy**

| State        | Control      | IODVS         | PACER-T       | PACER-T + IODVS | PACER-E       | PACER-E + IODVS | PACER-C       | PACER-C + IODVS |
|--------------|--------------|---------------|---------------|-----------------|---------------|-----------------|---------------|-----------------|
| Idle         | 6.04         | 4.66          | 6.05          | 4.61            | 6.07          | 4.65            | 6.15          | 4.63            |
| Reading      | 49.99        | 50.08         | 58.08         | 50.21           | 50.13         | 49.89           | 49.87         | 50.14           |
| Erase        | 0.95         | 0.96          | 0.91          | 0.94            | 0.96          | 0.97            | 0.99          | 0.95            |
| Total Write  | 39.17        | 36.31         | 63.05         | 41.26           | 37.68         | 20.40           | 37.36         | 17.81           |
| Total Wait   | 2138.32      | 1713.89       | 1211.99       | 1029.52         | 1501.73       | 1178.32         | 1319.34       | 1040.74         |
| Reading      | 48.60        | 48.78         | 51.91         | 31.72           | 48.59         | 33.76           | 48.47         | 42.02           |
| Idle         | 37.05        | 33.13         | 929.24        | 711.59          | 689.63        | 576.83          | 859.97        | 657.46          |
| Active Total | 2277.02      | 1854.68       | 1391.98       | 1158.26         | 1645.17       | 1287.98         | 1462.18       | 1156.28         |
| <b>Delta</b> | <b>0.00%</b> | <b>-18.55%</b> | <b>-38.87%</b> | <b>-49.13%</b>  | <b>-27.75%</b> | <b>-43.44%</b>  | <b>-35.79%</b> | <b>-49.22%</b>  |

**Table 21: NOR Serial Flash Operation Latency**

| State        | Control      | IODVS         | PACER-T       | PACER-T + IODVS | PACER-E       | PACER-E + IODVS | PACER-C       | PACER-C + IODVS |
|--------------|--------------|---------------|---------------|-----------------|---------------|-----------------|---------------|-----------------|
| Idle         | 1.04         | 1.04          | 1.04          | 1.04            | 1.04          | 1.04            | 1.05          | 1.05            |
| Reading      | 4.27         | 4.27          | 4.27          | 4.27            | 4.27          | 4.27            | 4.27          | 4.27            |
| Erase        | 0.08         | 0.08          | 0.08          | 0.08            | 0.08          | 0.08            | 0.08          | 0.08            |
| Total Write  | 3.31         | 3.32          | 3.31          | 3.32            | 3.31          | 3.32            | 3.31          | 3.32            |
| Total Wait   | 231.57       | 231.57        | 69.47         | 66.92           | 104.81        | 120.17          | 80.15         | 79.06           |
| Reading      | 4.64         | 4.64          | 4.27          | 4.63            | 4.64          | 4.73            | 4.64          | 4.64            |
| Idle         | 7.09         | 7.08          | 169.55        | 171.74          | 133.86        | 118.38          | 158.50        | 159.57          |
| Active Total | 243.87       | 244.92        | 82.45         | 80.26           | 118.14        | 133.62          | 93.50         | 92.43           |
| <b>Delta</b> | <b>0.00%</b> | <b>0.43%</b>  | <b>-66.19%</b> | <b>-67.09%</b>  | <b>-51.55%</b> | <b>-45.21%</b>  | <b>-61.66%</b> | <b>-62.10%</b>  |

The control tests of the Numonyx serial flash showed a promising current profile for PACER optimization. Both the erase and write operations appeared to contain an excessive amount of idle wait time (as specified by the datasheet). Indeed, PACER-T was the most effective optimization algorithm: on its own it decreased the overall latency by 66.19%, and by 67.09% when combined with IODVS. The latter is equivalent to speeding up the write cycle by about 204%, a factor of roughly 3.04.
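The relationship between a latency reduction and the equivalent speedup can be checked with a short calculation. The sketch below uses the Active Total latencies from Table 21; the function names are assumptions introduced for illustration. With these inputs, the 204% figure corresponds to the PACER-T + IODVS column, while PACER-T alone gives a somewhat smaller speedup.

```python
def latency_reduction_percent(control_latency, optimized_latency):
    """Fraction of the control latency that was eliminated, in percent."""
    return (1.0 - optimized_latency / control_latency) * 100.0


def speedup_percent(control_latency, optimized_latency):
    """How much faster the optimized operation is, in percent."""
    return (control_latency / optimized_latency - 1.0) * 100.0


# Active Total latencies from Table 21.
control = 243.87
pacer_t = 82.45
pacer_t_iodvs = 80.26

reduction_t = latency_reduction_percent(control, pacer_t)
reduction_t_iodvs = latency_reduction_percent(control, pacer_t_iodvs)
speedup_t = speedup_percent(control, pacer_t)
speedup_t_iodvs = speedup_percent(control, pacer_t_iodvs)
```

Because the speedup is the reciprocal of the remaining latency fraction, small differences in the reduction near two thirds translate into noticeably different speedup figures.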

By speeding up the write operation so significantly, the peripheral energy expenditure was also reduced dramatically. When combined with IODVS, the time- and current-based heuristics achieved nearly 50% savings. The energy-based heuristic was slightly less effective, likely due to the cumulative effects of noise in the current measurement over the course of such a long test.

### 5.4.3 Microchip SST26VB NAND Serial Flash



Figure 58: NAND Serial Flash IODVS Write



Figure 59: NAND Serial Flash IODVS + PACER-C Write

**Table 22: NAND Serial Flash Operation Energy**

| State               | Control        | IODVS          | PACER-T        | PACER-T + IODVS | PACER-E        | PACER-E + IODVS | PACER-C        | PACER-C + IODVS |
|---------------------|----------------|----------------|----------------|-----------------|----------------|-----------------|----------------|-----------------|
| Idle                | 5.62           | 4.12           | 5.54           | 4.07            | 5.58           | 4.08            | 5.64           | 4.14            |
| Reading             | 71.49          | 71.57          | 72.58          | 71.52           | 71.45          | 72.10           | 71.54          | 71.39           |
| Erase               | 1.58           | 1.59           | 1.53           | 1.58            | 1.55           | 1.49            | 1.55           | 1.52            |
| Total Write         | 51.88          | 48.14          | 75.40          | 70.70           | 73.01          | 70.19           | 51.63          | 44.66           |
| Total Wait          | 1052.98        | 806.15         | 802.63         | 584.87          | 817.59         | 596.84          | 887.28         | 670.32          |
| Reading             | 69.97          | 69.81          | 73.11          | 73.27           | 72.90          | 73.11           | 72.75          | 72.85           |
| Idle                | 52.31          | 44.66          | 249.37         | 187.25          | 234.28         | 158.54          | 149.37         | 133.95          |
| <i>Active Total</i> | <i>1247.90</i> | <i>997.26</i>  | <i>1025.25</i> | <i>801.95</i>   | <i>1036.50</i> | <i>813.72</i>   | <i>1084.76</i> | <i>860.73</i>   |
| <b>Delta</b>        | <b>0.00%</b>   | <b>-20.08%</b> | <b>-17.84%</b> | <b>-35.74%</b>  | <b>-16.94%</b> | <b>-34.79%</b>  | <b>-13.07%</b> | <b>-31.03%</b>  |

**Table 23: NAND Serial Flash Operation Latency**

| State               | Control      | IODVS        | PACER-T        | PACER-T + IODVS | PACER-E        | PACER-E + IODVS | PACER-C        | PACER-C + IODVS |
|---------------------|--------------|--------------|----------------|-----------------|----------------|-----------------|----------------|-----------------|
| Idle                | 1.04         | 1.04         | 1.04           | 1.04            | 1.04           | 1.04            | 1.05           | 1.05            |
| Reading             | 5.00         | 5.00         | 5.00           | 5.00            | 5.00           | 5.00            | 5.00           | 5.00            |
| Erase               | 0.08         | 0.08         | 0.08           | 0.08            | 0.08           | 0.08            | 0.08           | 0.08            |
| Total Write         | 3.31         | 3.32         | 3.31           | 3.31            | 3.31           | 3.32            | 3.31           | 3.31            |
| Total Wait          | 57.61        | 57.62        | 19.26          | 19.27           | 25.92          | 25.92           | 36.19          | 36.20           |
| Reading             | 5.28         | 5.28         | 5.29           | 5.29            | 5.28           | 5.28            | 5.28           | 5.29            |
| Idle                | 9.68         | 9.67         | 48.02          | 48.01           | 41.37          | 41.36           | 31.09          | 31.07           |
| <i>Active Total</i> | <i>71.28</i> | <i>71.29</i> | <i>32.94</i>   | <i>32.95</i>    | <i>39.59</i>   | <i>39.60</i>    | <i>49.86</i>   | <i>49.87</i>    |
| <b>Delta</b>        | <b>0.00%</b> | <b>0.01%</b> | <b>-53.79%</b> | <b>-53.78%</b>  | <b>-44.46%</b> | <b>-44.45%</b>  | <b>-30.05%</b> | <b>-30.03%</b>  |

Without any optimization, the Microchip serial flash operates nearly 3.5x faster than the NOR-based Numonyx device. Despite the faster dynamics, PACER was able to reduce the operation latency by nearly 54% using the timing heuristic. The energy and current heuristics were less effective. It was observed that the device would sometimes incur a large current spike on the first page write after the erase operation. This is a byproduct of the memory controller. Because the operation has a non-deterministic energy expenditure and fast dynamics, neither PACER-E nor PACER-C was ideal.

Combined with IODVS, the PACER-T algorithm achieved a 36% reduction in energy expenditure and a 54% reduction in latency.

### 5.4.4 MicroSD Memory Card

As was shown in the initial IODVS research, microSD memory cards exhibit non-deterministic write timing. This is due in large part to the presence of caches and memory management units onboard the memory cards, as well as in small part due to the SDCard protocol itself. Because the SDCard protocol is polling based, it was prudent to ensure that the write-wait period is indeed voltage-independent. This was accomplished by constantly polling for write complete (a wait time of 0us).

The results are shown for each SDCard and indeed, the write-wait state is voltage independent. Furthermore, because of the non-deterministic nature of both time and energy for these operations, only the PACER-C algorithm was considered for optimization.

Energy consumption analysis for these devices is particularly difficult due to the non-deterministic nature of the write-wait time. The last operation with a fairly deterministic completion time is the write. The write-wait, readback and return to idle states all occur at different times during each test. It is also important to trigger the caching effect of each device and so more than one test must be run.

Therefore, the analysis is performed such that all phases after the write state are combined. This results in some amount of idle time being accumulated into each analysis. Because of this accumulation, the stated savings are actually lower than what could be achieved through further analysis.

#### 5.4.4.1 Sandisk SDSC 1.0GB Micro-SD Memory Card



Figure 60: A Single Standard Write to the Sandisk microSD Memory Card



Figure 61: A Single IODVS Write to the Sandisk microSD Card



Figure 62: Sandisk microSD Card IODVS Write with Cache Hit Detected by PACER-C



Figure 63: Timing Distribution of Standard Writes to the Sandisk microSD Memory Card



**Figure 64: Timing Distribution of IODVS Writes to the Sandisk microSD Card**

The Sandisk microSD card caching statistics are provided in Figure 63 and Figure 64. It is evident that IODVS has no effect on the cache hit rate and therefore the operation can be considered voltage independent. Figure 60 demonstrates a single, typical write to the card, while Figure 61 shows an IODVS enabled write. Figure 62 demonstrates a write to the card with a cache hit and its detection by PACER-C.

**Table 24: Sandisk microSD Card Algorithm / Energy Summary (128 Samples Each)**

| Algorithm       | Energy Consumption (uJ) | Delta   |
|-----------------|-------------------------|---------|
| Control         | 17066.19                |         |
| IODVS           | 12554.76                | -26.43% |
| PACER-C         | 15198.41                | -10.94% |
| IODVS + PACER-C | 11848.78                | -30.57% |

#### 5.4.4.2 Lexar SDSC 1.0GB Micro-SD Memory Card



Figure 65: A Single Write to the Lexar microSD Memory Card



Figure 66: Timing Distribution of Standard Writes to the Lexar microSD Card



**Figure 67: Timing Distribution of IODVS Writes to the Lexar microSD Memory Card**

The Lexar microSD card has a similar caching mechanism to the Sandisk card. Notably, both the hit and miss lobes are larger which results in more variability in completion. It is again demonstrated that IODVS has no effect on the hit/miss rate.

**Table 25: Lexar microSD Card Algorithm / Energy Summary (128 Samples Each)**

| Algorithm       | Energy Consumption (uJ) | Delta   |
|-----------------|-------------------------|---------|
| Control         | 22707.43                |         |
| IODVS           | 18244.40                | -19.65% |
| PACER-C         | 21427.71                | -5.64%  |
| IODVS + PACER-C | 16976.56                | -25.24% |

#### 5.4.4.3 Swissbit S-200U 512MB Micro-SD Memory Card



Figure 68: A Single Write to the Swissbit microSD Memory Card



Figure 69: Timing Distribution of Standard Writes to the Swissbit microSD Memory Card



**Figure 70: Timing Distribution of IODVS Writes to the Swissbit microSD Memory Card**

Completion detection is especially important for the Swissbit microSD card because of the wide range of completion times. The unit does appear to have some voltage dependence in caching. However, it is counterintuitive that decreasing the voltage seems to have increased the cache-hit rate. The device has the most complex MMU (perhaps fully associative) of the devices tested and appears to flush cache to FLASH on a voltage drop.

**Table 26: Swissbit microSD Card Algorithm / Energy Summary (128 Samples Each)**

| Algorithm       | Energy Consumption (uJ) | Delta   |
|-----------------|-------------------------|---------|
| Control         | 2762.57                 |         |
| IODVS           | 2334.43                 | -15.50% |
| PACER-C         | 913.68                  | -66.93% |
| IODVS + PACER-C | 553.48                  | -79.97% |

#### 5.4.4.4 Kingston SDHC 2.0GB Micro-SD Memory Card



**Figure 71: A Single Write to the Kingston microSD Memory Card**



**Figure 72: Timing Distribution of Standard Writes to the Kingston microSD Memory Card**



**Figure 73: Timing Distribution of IODVS Writes to the Kingston microSD Memory Card**

The Kingston microSD card did not appear to benefit from PACER because the write operations have such low latency and are highly clustered about a central value. The algorithm takes time to process all of the analog samples as they arrive and it appears to catch up at about the same time as the write completes.

**Table 27: Kingston microSD Card Algorithm / Energy Summary (128 Samples Each)**

| Algorithm       | Energy Consumption (uJ) | Delta  |
|-----------------|-------------------------|--------|
| Control         | 942.16                  |        |
| IODVS           | 900.86                  | -4.38% |
| PACER-C         | 933.83                  | -0.88% |
| IODVS + PACER-C | 896.58                  | -4.84% |

### 5.4.5 Honeywell HIH-6130 Temperature/Humidity Sensor



Figure 74: HIH-6130 IODVS Measurement



Figure 75: HIH-6130 IODVS + PACER-C Measurement

**Table 28: HIH-6130 Operation Energy**

| State               | Control       | IODVS          | PACER-T        | PACER-T + IODVS | PACER-E        | PACER-E + IODVS | PACER-C        | PACER-C + IODVS |
|---------------------|---------------|----------------|----------------|-----------------|----------------|-----------------|----------------|-----------------|
| Idle                | 5.19          | 3.60           | 5.28           | 3.64            | 5.24           | 3.62            | 5.33           | 3.67            |
| Writing             | 1.75          | 0.98           | 1.76           | 0.93            | 1.73           | 0.98            | 1.76           | 1.00            |
| Waiting             | 325.95        | 231.17         | 254.14         | 120.39          | 240.29         | 169.62          | 223.65         | 159.00          |
| Idle                | 0.08          | 0.05           | 0.09           | 0.05            | 0.12           | 0.10            | 0.09           | 0.05            |
| Reading             | 2.80          | 2.64           | 2.98           | 3.10            | 3.37           | 3.30            | 2.95           | 2.92            |
| Idle                | 28.04         | 26.18          | 105.49         | 83.50           | 106.34         | 84.62           | 105.31         | 83.60           |
| <i>Active Total</i> | <i>330.50</i> | <i>234.79</i>  | <i>258.88</i>  | <i>124.41</i>   | <i>245.39</i>  | <i>173.89</i>   | <i>228.36</i>  | <i>162.91</i>   |
| <b>Delta</b>        | <b>0.00%</b>  | <b>-28.96%</b> | <b>-21.67%</b> | <b>-62.36%</b>  | <b>-25.75%</b> | <b>-47.39%</b>  | <b>-30.91%</b> | <b>-50.71%</b>  |

**Table 29: HIH-6130 Operation Latency**

| State               | Control      | IODVS        | PACER-T        | PACER-T + IODVS | PACER-E        | PACER-E + IODVS | PACER-C        | PACER-C + IODVS |
|---------------------|--------------|--------------|----------------|-----------------|----------------|-----------------|----------------|-----------------|
| Idle                | 1.01         | 1.01         | 1.01           | 1.01            | 1.01           | 1.01            | 1.03           | 1.03            |
| Writing             | 0.23         | 0.44         | 0.23           | 0.44            | 0.23           | 0.44            | 0.23           | 0.44            |
| Waiting             | 45.27        | 45.27        | 31.66          | 31.44           | 31.45          | 31.41           | 31.70          | 31.60           |
| Idle                | 0.01         | 0.01         | 0.01           | 0.01            | 0.01           | 0.01            | 0.01           | 0.01            |
| Reading             | 0.49         | 0.95         | 0.49           | 0.95            | 0.49           | 0.95            | 0.49           | 0.95            |
| Idle                | 4.98         | 4.32         | 18.59          | 18.15           | 18.80          | 18.18           | 18.54          | 17.98           |
| <i>Active Total</i> | <i>45.99</i> | <i>46.65</i> | <i>32.38</i>   | <i>32.82</i>    | <i>32.17</i>   | <i>32.80</i>    | <i>32.42</i>   | <i>32.98</i>    |
| <b>Delta</b>        | <b>0.00%</b> | <b>1.44%</b> | <b>-29.59%</b> | <b>-28.63%</b>  | <b>-30.05%</b> | <b>-28.69%</b>  | <b>-29.50%</b> | <b>-28.28%</b>  |

## 5.5 Conclusion

The PACER algorithms successfully achieved not only significant speedups in most of the test cases, but also achieved significant reductions in the energy required to perform operations. For the cases with deterministic timing, the PACER-T algorithm is superior to the other options because it has the lowest computational overhead and is thus able to respond most quickly.

None of the devices tested use an external pin that could trigger an interrupt which would signal operation completion. Such devices do exist, such as the SiLabs Si114x series of optical detectors. Bonding an additional pin out of the device package increases package cost and size. Additionally, design choices such as open-collector or push-pull and active-high or active-low must be made in hardware which limits the utility of such a pin. While potentially less deterministic, PACER presents an alternative means to achieve similar functionality.

The devices exhibiting non-deterministic timing were more challenging. For these devices, the PACER-C algorithm was able to decrease energy consumption significantly. None of the devices benefited from the PACER-E algorithm because none of the operations were energy-bound and analog noise results in less optimal wake-up times.

The tests generally saw higher power usage than the original IODVS numbers because the domain capacitance is higher on PRIME than on the original PEGMA board. The domain capacitance is likely higher than it needs to be for the load dynamics seen in the tests, so further energy reductions are likely possible.

# CHAPTER 6: CONCLUSIONS

## 6.1 Conclusions

Embedded systems are naturally evolving to incorporate multiple voltage domains in order to take advantage of Dynamic Voltage Scaling on microcontrollers. GPIODVS is able to make use of the peripheral voltage domain in order to achieve reduced latency and energy consumption by exploiting the slack in peripheral voltage and timing specifications.

This work presented two specialized embedded systems that were designed to validate two novel approaches to peripheral energy management. Each approach accomplished significant energy savings on its own, and even greater energy savings were achieved when the techniques were combined.

Ultimately the energy savings found through the combination of IODVS and PACER result in either increased system performance or decreased cost. For the case of a battery-backed system where energy consumption is dominated by EEPROM writes, the algorithms reduced energy consumption by 40%. These savings could translate directly into longer battery life: the same battery would last roughly 1/(1 − 0.40) ≈ 1.67 times as long, depending on battery chemistry. Alternatively, the designer could take advantage of the power reductions and include a battery with 40% less capacity. At current prices, the Illinois Capacitor 500mAh vs 300mAh lithium coin batteries are priced at \$27.29 vs \$12.15. Thus, the energy savings could yield a price decrease of over \$15 per unit by decreasing the price of the battery by 55.47%.
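The capacity and price figures quoted above can be reproduced with a short calculation. This is only an arithmetic sketch: the variable and function names are assumptions, and the inputs are the capacities and prices stated in the text.

```python
def percent_decrease(original, reduced):
    """Relative decrease from original to reduced, in percent."""
    return (original - reduced) / original * 100.0


# Figures quoted in the text for the two lithium coin batteries.
capacity_large_mah, capacity_small_mah = 500.0, 300.0
price_large_usd, price_small_usd = 27.29, 12.15

capacity_cut = percent_decrease(capacity_large_mah, capacity_small_mah)
unit_saving_usd = round(price_large_usd - price_small_usd, 2)
price_cut = percent_decrease(price_large_usd, price_small_usd)
```

The capacity reduction works out to 40%, matching the energy savings, and the price reduction to about 55.5%, which agrees with the quoted figure to within rounding.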

## 6.2 Future Work: PRIME Enhancements

Combining the ASDM-300F with the DEB429A appears to have resulted in a higher-than-ideal domain capacitance. Further testing should be done to determine the minimum capacitance necessary for the ASDM-300F to supply the needs of peripherals on the DEB429A. It is worth noting that if the capacitance on the ASDM-300F is too low, the power supply will need more time to charge peripheral devices as they are switched onto the domain under test.
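To give a feel for why domain capacitance matters, the following sketch computes the change in energy stored on the domain when it swings between two voltages, using the standard capacitor relation E = ½·C·(V_high² − V_low²). The capacitance values and the 3.3 V upper rail are hypothetical placeholders, not measurements from PRIME; 1.8 V is used as the lower level only because it appears elsewhere in this chapter.

```python
def swing_energy_joules(capacitance_f, v_high, v_low):
    """Change in stored energy when a capacitance moves between two voltages."""
    return 0.5 * capacitance_f * (v_high ** 2 - v_low ** 2)


if __name__ == "__main__":
    # Hypothetical domain capacitances in microfarads (assumed values).
    for c_uf in (10.0, 47.0, 100.0):
        energy = swing_energy_joules(c_uf * 1e-6, v_high=3.3, v_low=1.8)
        print(f"{c_uf:6.1f} uF: {energy * 1e6:7.2f} uJ per voltage swing")
```

The stored-energy change scales linearly with capacitance. How much of it is actually lost per transition depends on the regulator topology and on whether the load does useful work while the domain discharges, so this is only a rough indication of what is at stake.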

The ASDM-300F uses a current sense amplifier with a falloff frequency of 1.2 MHz. Unfortunately, the SMPS designed into the ASDM-300F switches at approximately 2 MHz. It was calculated that the input current to the device may be significantly higher than measured, which limits the ability to glean actionable information from the current measurements. No integrated device currently exists that can fulfill this need, so it would be useful to design a new circuit without this limitation.
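A rough estimate of the measurement error can be sketched as follows, assuming that the 1.2 MHz falloff frequency is the −3 dB corner of the amplifier and that the roll-off is first order (single pole). Both are assumptions; the actual amplifier response may differ.

```python
import math


def first_order_gain(f_hz, f_corner_hz):
    """Magnitude response of a single-pole low-pass filter at f_hz."""
    return 1.0 / math.sqrt(1.0 + (f_hz / f_corner_hz) ** 2)


if __name__ == "__main__":
    gain = first_order_gain(2.0e6, 1.2e6)  # SMPS switching vs. amplifier corner
    print(f"gain at 2 MHz: {gain:.3f} ({20 * math.log10(gain):.1f} dB)")
```

Under these assumptions, the switching-frequency component of the input current is reported at roughly half of its true amplitude, so peak currents can be noticeably higher than the captured trace suggests.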

## 6.3 Future Work: Supervised IODVS

IODVS has been shown to considerably reduce energy consumption in embedded peripherals. In the course of previous work, however, the problem of interfering voltage changes was encountered: two or more peripherals coexist on the same voltage domain, and one peripheral is more tolerant of voltage changes than the others.

For example, the EEPROM in previous experiments was capable of operating at 1.8V while the SDCard on the same domain would undergo a reset condition if the domain voltage were switched temporarily to 1.8V. Not only would the SDCard need to undergo a lengthy reset procedure, but the device driver expects the SDCard to be in an operational state when the next device access is issued. It is unlikely that the device driver would be able to handle the condition. Designing the device driver to handle random state changes outside of its control would result in a very inefficient driver.

A number of options are available to address the problem of domain voltage interference. The trivial solution would be to put each device on its own individual voltage domain. Of course, implementing a voltage domain for each peripheral in an embedded system would be both cost and size prohibitive. Additionally, this method inevitably operates an SMPS in a very inefficient voltage translation region (the very lightly loaded region).

A voltage supervisor implemented at the driver or OS level is a natural fit for this type of problem. A simple mitigating option is to notify the drivers of all devices on a domain about voltage changes that are taking place. If two devices are on the same voltage domain and an IODVS voltage change is requested, then the other device driver is notified of the change. In this way, at least the device and driver can maintain consistency and the potential for devastating faults is reduced.

In fact, the supervisor opens up a number of options for mitigating inter-device interference. By registering the peripheral power profile with the supervisor, individual drivers can provide feedback to the supervisor as to how a voltage change would affect the device. If a device on the voltage domain would be affected by the requested change, then that change would be vetoed by the supervisor. Furthermore, rather than making a simplistic binary decision, the supervisor could evaluate a temporal cost against each voltage change, for instance the time required to reinitialize an SDCard after a voltage change that causes a reset.
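The supervisor idea can be sketched in a few lines. This is only an illustration of the proposal, not an implementation from this work: the class names, the `budget_s` threshold, the SDCard minimum voltage, and the reinitialization time are all hypothetical. Each driver registers a profile, a requested voltage is charged the reinitialization time of every other device it would reset, and the request is vetoed when that temporal cost exceeds the budget; otherwise the other drivers are notified.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class PowerProfile:
    """Hypothetical per-device profile registered by a driver."""
    name: str
    v_min: float          # lowest voltage tolerated without a reset
    reinit_cost_s: float  # time to recover if the device is reset
    on_change: Optional[Callable[[float], None]] = None  # notification hook


@dataclass
class VoltageSupervisor:
    budget_s: float = 0.0  # largest temporal cost that is still granted
    devices: Dict[str, PowerProfile] = field(default_factory=dict)

    def register(self, profile):
        self.devices[profile.name] = profile

    def temporal_cost(self, requester, volts):
        """Total reinit time of the other devices that would reset at `volts`."""
        return sum(p.reinit_cost_s for n, p in self.devices.items()
                   if n != requester and volts < p.v_min)

    def request(self, requester, volts):
        """Veto the change if too costly; otherwise notify the other drivers."""
        if self.temporal_cost(requester, volts) > self.budget_s:
            return False
        for n, p in self.devices.items():
            if n != requester and p.on_change is not None:
                p.on_change(volts)
        return True


if __name__ == "__main__":
    sup = VoltageSupervisor(budget_s=0.0)
    sup.register(PowerProfile("eeprom", v_min=1.8, reinit_cost_s=0.0))
    # Assumed values: SDCard resets below 2.7 V and needs 0.25 s to recover.
    sup.register(PowerProfile("sdcard", v_min=2.7, reinit_cost_s=0.25,
                              on_change=lambda v: print(f"sdcard told: {v} V")))
    print(sup.request("eeprom", 1.8))  # would reset the SDCard
    print(sup.request("eeprom", 3.0))  # harmless to the SDCard
```

Setting `budget_s` to zero reproduces the binary veto; a non-zero budget lets the requester accept a bounded recovery delay in exchange for the voltage change.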

Ultimately, usage statistics like those used in DPM implementations are to be investigated in order to both optimize energy consumption and minimize response time. These usage statistics are contrasted against the temporal cost of voltage changes on the domain.

## 6.4 Future Work: PACER Missed Prediction Analysis

The PACER algorithms detect timing slack in peripheral operations. The detection is used to perform follow-up operations such as verification earlier than is specified by the datasheet. At best, predictions will always be slightly early or slightly late due to minuscule timing variations on the MCU or analog noise in the case of PACER-E or PACER-C.

It would be useful to investigate the energy costs of early vs late missed predictions. In the case of early predictions, the peripheral and MCU usually transition into a voltage-dependent communicating state. The operation can still usually complete correctly by polling onboard operation-complete status bits. For late predictions, the peripheral causes unnecessary (typically idle) energy consumption and increases overall latency. Prediction analysis would be particularly interesting as peripherals are analyzed across the temperature curve, where the operation timing changes. A per-device cost model could be created in which the costs of early and late predictions are analyzed, further optimizing the energy and latency reductions offered by PACER.
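One possible shape for such a per-device cost model is sketched below. It is a hypothetical illustration, not a result of this work: it assumes that an early wake-up costs polling power for the time remaining until completion, that a late wake-up costs idle power for the time elapsed since completion, and that both costs are linear in time. The sample completion times and power figures are invented for the example.

```python
def expected_cost(wake_us, completions_us, early_mw, late_mw):
    """Mean energy penalty (nJ) of waking at wake_us over observed completions.

    Early wake-ups pay polling power until the operation completes;
    late wake-ups pay idle power for the time already wasted.
    """
    total = 0.0
    for done_us in completions_us:
        if wake_us < done_us:
            total += early_mw * (done_us - wake_us)  # mW * us = nJ
        else:
            total += late_mw * (wake_us - done_us)
    return total / len(completions_us)


def best_wake_time(candidates_us, completions_us, early_mw, late_mw):
    """Candidate wake-up time with the lowest expected penalty."""
    return min(candidates_us,
               key=lambda t: expected_cost(t, completions_us, early_mw, late_mw))


if __name__ == "__main__":
    observed = [3000, 3200, 3400]  # hypothetical completion times in us
    choice = best_wake_time(observed, observed, early_mw=10.0, late_mw=2.0)
    print(f"wake at {choice} us")
```

With polling assumed to be five times as expensive as idling, the model prefers to wake late; reversing the two costs pushes the choice toward the earliest observed completion. Repeating the fit at different temperatures would show how the preferred wake-up time moves as operation timing drifts.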

## References

- [1] C. Spurlin, "Voltage Translation Between 3.3-V, 2.5-V, 1.8-V, and 1.5-V Logic Standards With the TI AVCA164245 and AVCB164245 Dual-Supply Bus-Translating Transceivers," July 2004. [Online]. Available: <http://www.ti.com/lit/an/scea030a/scea030a.pdf>. [Accessed 26 September 2015].
- [2] M. I. S. Inc., "MAX1595," October 2011. [Online]. Available: <http://www.maximintegrated.com/en/products/power/charge-pumps/MAX1595.html>. [Accessed April 2015].
- [3] P. LeFevre, "Digital Power control delivers efficiency in high-power applications," *EE-Times*, p. [http://www.eetimes.com/document.asp?doc\\_id=1273277](http://www.eetimes.com/document.asp?doc_id=1273277), 29 October 2008.
- [4] B. Brock and K. Rajamani, "Dynamic power management for embedded systems [SOC design]," in *SOC Conference, 2003. Proceedings. IEEE International [Systems-on-Chip]*, 2003.
- [5] R. Jejurikar and R. Gupta, "Dynamic Voltage Scaling for Systemwide Energy Minimization in Real-Time Embedded Systems," in *Proceedings of the 2004 International Symposium on Low Power Electronics and Design, ISLPED*, 2004.
- [6] A. Kahng, R. Kumar, S. Kang and J. Sartori, "Enhancing the Efficiency of Energy-Constrained DVFS Designs," *IEEE Transactions on VLSI Systems*, vol. 21, no. 10, pp. 1769 - 1782, 2013.
- [7] H. Cheng and S. Goddard, "Online energy-aware I/O device scheduling for hard real-time systems," in *Design, Automation and Test in Europe, 2006. DATE '06. Proceedings*, 2006.
- [8] C. Kumar, M. Sindhwan and T. Srikanthan, "Profile-based technique for Dynamic Power Management in embedded systems," in *Electronic Design, 2008. ICED 2008. International Conference on*, 2008.
- [9] V. Swaminathan and K. Chakrabarty, "Energy-conscious, deterministic I/O device scheduling in hard real-time systems," in *Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on*, 2003.

- [10] S. Irani, S. Shukla and R. Gupta, "Competitive analysis of dynamic power management strategies for systems with multiple power saving states," in *Design, Automation and Test in Europe Conference and Exhibition, 2002. Proceedings*, 2002.
- [11] W.-C. Chen, S.-Y. Ping, T.-C. Huang, Y.-H. Lee, K.-H. Chen and C.-L. Wey, "A Switchable Digital-Analog Low-Dropout Regulator for Analog Dynamic Voltage Scaling Technique," *IEEE Journal of Solid State Circuits*, vol. 49, no. 3, pp. 740-750, 2014.
- [12] S. Lee and J. Kim, "Using Dynamic Voltage Scaling for Energy-Efficient Flash-based," in *ISOCC*, Incheon, 2010.
- [13] D. Li, P. Chou and N. Bagherzadeh, "Mode selection and mode-dependency modeling for power-aware embedded systems," *Design Automation Conference, 2002. Proceedings of ASP-DAC 2002. 7th Asia and South Pacific and the 15th International Conference on VLSI Design. Proceedings*, pp. 697-704, 2002.
- [14] T.-Y. H. C.-H. T. J.-J. C. T.-W. K. Edward T.-H. Chu, "A DVS-assisted hard real-time I/O device scheduling algorithm," *Real-Time Systems*, vol. 41, pp. 222-255, February 2009.
- [15] Y.-S. Hwang, S.-K. Ku and K.-S. Chung, "A predictive dynamic power management technique for embedded mobile devices," in *Consumer Electronics, IEEE Transactions on*, 2010.
- [16] W. Dargie, "Dynamic Power Management in Wireless Sensor Networks: State-of-the-Art," *IEEE Sensors Journal*, vol. 12, no. 5, pp. 1518 - 1528, 2012.
- [17] Coronis, "Waveflow Wireless Smart Meter Transceiver and Data Logger," 2009. [Online]. Available: [http://www.elster.com/assets/products/products\\_elster\\_files/CS-COMM-SPRD-WFL2-E01.pdf](http://www.elster.com/assets/products/products_elster_files/CS-COMM-SPRD-WFL2-E01.pdf). [Accessed 25 October 2015].
- [18] U. Kulau, F. Busching and L. Wolf, "A Node's Life: Increasing WSN Lifetime by Dynamic Voltage Scaling," in *IEEE International Conference on Distributed Computing in Sensor Systems*, Cambridge, 2013.
- [19] L. Hormann, P. Glatz, C. Steger and R. Weiss, "Evaluation of component-aware dynamic voltage scaling for mobile devices and wireless sensor networks," in *World of*

*Wireless, Mobile and Multimedia Networks (WoWMoM), 2011 IEEE International Symposium*, 2011.

- [20] L. Hormann, P. Glatz, C. Steger and R. Weiss, "Energy Efficient Supply of WSN Nodes using Component-Aware Dynamic Voltage Scaling," in *European Wireless*, Vienna, 2011.
- [21] T. Instruments, "Selecting the Right Level-Translation Solution," June 2004. [Online]. Available: <http://www.ti.com/lit/an/scea035a/scea035a.pdf>.
- [22] Texas Instruments Incorporated, "Texas Instruments Website," September 2007. [Online]. Available: <http://www.ti.com/product/tps62240>.
- [23] Maxim Integrated Solutions, "Maxim Integrated Solution Website," 10 2012. [Online]. Available: <http://datasheets.maximintegrated.com/en/ds/MAX4376-MAX4378.pdf>.
- [24] Microchip Technology Inc., May 2010. [Online]. Available: <http://www.microchip.com/wwwproducts/Devices.aspx?dDocName=en530926>.
- [25] Micron Technology Inc., 2012. [Online]. Available: <http://www.micron.com/partsnor-flash/serial-nor-flash/m25px16-VMN6P>.
- [26] S. Association, January 2013. [Online]. Available: [https://www.sdcard.org/downloads/pls/simplified\\_specs/part1\\_410.pdf](https://www.sdcard.org/downloads/pls/simplified_specs/part1_410.pdf).
- [27] N. Semiconductors, April 2013. [Online]. Available: [http://www.nxp.com/documents/application\\_note/AN10911.pdf](http://www.nxp.com/documents/application_note/AN10911.pdf).
- [28] S. Corporation, May 2011. [Online]. Available: <http://www.farnell.com/datasheets/1633579.pdf>.
- [29] S. Corporation, April 2012. [Online]. Available: [http://www.supertalent.com/datasheets/5\\_112.pdf](http://www.supertalent.com/datasheets/5_112.pdf).
- [30] Honeywell International Inc., 2013. [Online]. Available: [http://sensing.honeywell.com/product-page?pr\\_id=142040](http://sensing.honeywell.com/product-page?pr_id=142040).
- [31] Honeywell International Inc., "Technical Note: Honeywell Sensing and Control," 7 June 2012. [Online]. Available: [http://sensing.honeywell.com/i2c%20comms%20humidicon%20tn\\_009061-2-en\\_final\\_07jun12.pdf](http://sensing.honeywell.com/i2c%20comms%20humidicon%20tn_009061-2-en_final_07jun12.pdf).

- [32] Micrel Inc., "Micrel Incorporated," 2012. [Online]. Available: <http://ww1.microchip.com/downloads/en/DeviceDoc/MIC94325.pdf>. [Accessed 27 October 2016].
- [33] Resistor Guide, "Resistor Noise," Resistor Guide, [Online]. Available: <http://www.resistorguide.com/resistor-noise/>. [Accessed 27 October 2016].
- [34] STMicroelectronics, "STMicroelectronics Inc.," [Online]. Available: [http://www.st.com/resource/en/schematic\\_pack/stm32f429i-disco\\_sch.zip](http://www.st.com/resource/en/schematic_pack/stm32f429i-disco_sch.zip). [Accessed 26 October 2016].
- [35] Texas Instruments, "Texas Instruments," 2003. [Online]. Available: <http://www.ti.com/lit/ds/symlink/sn54hc259.pdf>. [Accessed 28 October 2016].
- [36] Future Technology Devices International Ltd., "UM232H Datasheet," 2012. [Online]. Available: <http://www.ftdi.com>.
- [37] Microchip Technology Inc., "SST26VF064B," 2015. [Online]. Available: <http://ww1.microchip.com/downloads/en/DeviceDoc/20005119G.pdf>. [Accessed 28 October 2016].
- [38] Silicon Laboratories, "Si114x Ultraviolet (UV) Index, Gesture, Proximity and Ambient Light Sensor ICs," [Online]. Available: <http://www.silabs.com/products/sensors/infraredsensors/Pages/si114x.aspx>. [Accessed 29 October 2016].
- [39] D. Lee, Y. Kim, G. Pekhimenko, S. Khan, V. Seshadri, K. Chang and O. Mutlu, "Adaptive-latency DRAM: Optimizing DRAM timing for the common-case," in *IEEE 21st International Symposium on High Performance Computer Architecture (HPCA)*, 2015.
- [40] S. Sadeghi-Kohan, M. Kamal, J. McNeil, P. Prinetto and Z. Navabi, "Online self adjusting progressive age monitoring of timing variations," in *10th International Conference on Design & Technology of Integrated Systems in Nanoscale Era (DTIS)*, 2015.
- [41] D. S. S. Etter, "Adaptive Estimation of Time Delays in Sampled Data Systems," in *IEEE Transactions on Acoustics Speech and Signal Processing*, 1981.

- [42] S. G. P. A. Z. E. Tarasov V, "Efficient I/O Scheduling with Accurately Estimated Disk Drive Latencies," in *The Proceedings of OSPERT 2012*, 2012.
- [43] H. Macicior, M. Oyarbide, O. Miguel, I. Cantero, J. Canales and A. Etxeberria, "Iterative capacity estimation of LiFePO4 cell over the lifecycle based on SoC estimation correction," in *Electric Vehicle Symposium and Exhibition (EVS27)*, 2013.
- [44] H. Mahanta, A. Azad and A. Khan, "Power analysis attack: A vulnerability to smart card security," in *International Conference on Signal Processing And Communication Engineering Systems (SPACES)*, 2015.
- [45] M. Petrvalsky, M. Drutarovsky and M. Varchola, "Differential power analysis attack on ARM based AES implementation without explicit synchronization," in *Radioelektronika 2014 24th International Conference*, 2014.

## **APPENDIX**

## APPENDIX A: PEGMA SCHEMATIC

1. Microcontroller pinout and SRAM connection
2. Renewable input boost circuitry, measurement and modulation
3. Energy storage and peripheral boost circuitry
4. Stepdown power supplies (peripheral domains)
5. Peripheral domain current measurement
6. Peripherals under test
7. Communications peripherals
8. Analog domain

## 7.1 Microcontroller pinout and SRAM connection



## 7.2 Renewable input boost circuitry, measurement and modulation



## 7.3 Energy storage and peripheral boost circuitry



## 7.4 Stepdown power supplies (peripheral domains)



## 7.5 Energy storage and peripheral boost circuitry



## 7.6 Peripheral domain current measurement



## 7.7 Communications peripherals



## 7.8 Analog domain



## APPENDIX B: ASDM-300F SCHEMATIC



## APPENDIX C: PPS-330D SCHEMATIC



## APPENDIX D: PLR-5010D (REV0) SCHEMATIC



## APPENDIX E: PLR-5010D (REV1) SCHEMATIC



## APPENDIX F: DEB429A SCHEMATIC





