

# Auto-Tuning Aging Sensor Validated Under Burn-In, Temperature, and Voltage Variations

Lucas Nogueira\*, Mirailton F.\*, Danilo Alencar\*, Alisson J.\*, Jardel Silveira\*, Jarbas Silveira\*, Fabian Vargas†

\*Universidade Federal do Ceará, Fortaleza, Brazil; †IHP – Leibniz Institute for High Performance Microelectronics, Frankfurt (Oder), Germany

lucas.nogueira@lesc.ufc.br, mirailtonfo@gmail.com, danilo.alencar@lesc.ufc.br, alissonjsb4@gmail.com, jardel@ufc.br, jarbas@lesc.ufc.br, vargas@ihp-microelectronics.com

**Abstract**—The increasing integration of electronic systems in critical applications has made the prediction and mitigation of circuit aging effects a key challenge for ensuring long-term reliability. Traditional solutions such as guard-banding compromise performance, power, area, and ultimately cost, highlighting the need for better solutions. This work proposes a self-tuning on-chip aging sensor capable of accurately tracking the in-mission erosion of critical-path time slack, without the need for manual recalibration. The proposed solution enhances a baseline sensor architecture by incorporating a dynamic phase shift mechanism for autonomous slack detection during operation. The sensor was implemented in an FPGA and validated using a comprehensive commercial experimental platform, enabling controlled variation of supply voltage, core temperature, and accelerated aging through burn-in stress. Experimental results showed that the sensor tracks the path delay increase under voltage reduction and temperature rise, while continuously monitoring aging effects. Functional failures were observed shortly after the sensor indicated a time slack approaching zero, confirming its predictive capability. This approach allows in-mission aging monitoring, which is an essential condition for Silicon Lifecycle Management (SLM) strategies.

**Index Terms**—On-Chip Sensor, Reliability Evaluation, Silicon Lifecycle Management (SLM), FPGA Aging

## I. INTRODUCTION

With the high demand for performance and efficiency, device reliability has become a critical concern, particularly in dependable or safety-critical systems. The constant exposure of electronic systems to harsh environments results in significant hardware degradation, increasing the likelihood of failure [1], reducing the reliability of long-life products, and degrading the performance of these systems.

One of the main causes of this degradation is the aging of electronic devices, driven by Bias Temperature Instability (BTI) and Hot Carrier Injection (HCI) effects [2]. These effects increase the threshold voltage of the transistor, which degrades the drive current and increases the propagation delay; together they determine the aging rate of the device [3].

With the downscaling of transistors, these aging effects become progressively more problematic. The increase in power densities and reduced voltage margins contribute to the accelerated degradation through BTI and HCI. Moreover, this is followed by an increase in sensitivity to process, supply voltage, and temperature (PVT) variations [4], [5].

It is worth noting that in space applications, total-ionizing-dose (TID) radiation is a primary concern for IC designers, as it accelerates aging and thus increases circuit response time (due to threshold-voltage shifts and leakage-current rises). In advanced aging states, desynchronization may occur, leading first to transient faults and, ultimately, permanent failures at end-of-life [6]–[8].

To mitigate these effects, the industry traditionally relies heavily on guard-banding, in an attempt to prolong the useful life of the device [9]. Although this technique may make the device more robust, the margin comes at a cost: reducing the clock frequency also reduces the performance of the system [10], while a higher operating voltage directly increases the aging rate through BTI degradation [9], [10].

As an alternative, a more sophisticated approach is the prediction of failures through aging sensors [11]. Instead of compromising system performance, these sensors allow transient and permanent failures to be anticipated by monitoring the degradation of circuit response time, thus enabling proactive actions before a malfunction occurs.

The works presented in [2], [4], [5], [12], [13] address aging detection techniques through the monitoring of critical path delays, generally employing flip-flops, phase-shifted clocks, delay elements, and XOR gates. In these approaches, the sensors require prior calibration specifically tailored to the monitored critical path in order to detect when it surpasses a predefined aging threshold. Consequently, these sensors operate only as indicators that the circuit has reached a fixed degradation point, effectively functioning as a *checkpoint*.

Within this context, the empirical validation of aging sensors and aging studies through FPGA-based implementations emerges as a cost-effective and versatile approach, due to the reconfigurability and accessibility of these platforms.

Several studies have investigated aging effects in FPGAs, including [5], [14]–[16]. The aging induction methods employed are primarily based on burn-in procedures, involving elevated temperatures typically ranging from 85°C to 125°C, often complemented by concurrent voltage stress, with supply voltages increased by 10% to 50% above the nominal $V_{DD}$. Standard aging procedures based on the MIL-STD-883 and JEDEC JESD22-A108D specifications [17]–[19] are frequently used as procedural references.

### A. Key Contributions of This Work

Despite the advances observed in these works, there is still a lack of practical, runtime-capable solutions that provide self-calibration mechanisms and empirical validation under real aging stress conditions. This work addresses these gaps by proposing a sensor architecture capable of autonomous tuning during operation, validated through voltage, temperature, and accelerated aging stress experiments.

The key contributions of this work are as follows:

- 1) A self-calibration mechanism for the aging sensor, enabling precise mission-mode on-chip monitoring to support Silicon Lifecycle Management (SLM).
- 2) Empirical validation of the sensor under real FPGA implementation conditions, including testing across voltage and temperature variations, as well as accelerated aging experiments.

## II. METHODOLOGY

### A. Sensor Architecture

The sensor architecture employed in this work is based on the structure proposed in [4], and is illustrated in Figure 1.



Fig. 1. Logical structure of the aging sensor, composed of flip-flops, XOR gate, and phase-shifted clocks.

The functional goal of the sensor is to estimate the *slack* of the monitored critical path, defined as the difference between the clock period and the path propagation delay. To determine this slack, the sensor undergoes a calibration process. Initially, *psclk* is aligned with *sys\_clk* and then progressively shifted to the left (i.e., advanced). This shifting continues until the alarm is triggered. The instant at which the alarm is activated indicates that *psclk* has reached the boundary of the available slack, meaning that the applied phase shift corresponds to the slack value.
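The definition and the outcome of calibration can be written compactly. The symbols below are introduced here for illustration only and do not appear in the original text: $T_{clk}$ is the period of *sys\_clk*, $t_{pd}$ is the propagation delay of the monitored critical path, and $\Delta t_{ps}$ is the advance applied to *psclk* at the moment the alarm first fires.

$$
t_{slack} = T_{clk} - t_{pd}, \qquad t_{slack} \approx \Delta t_{ps}
$$

The second relation is an approximation: the estimate can only be as fine as one phase-shift step, and any skew between the two capture flip-flops adds an offset.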

Figures 2 and 3 illustrate the timing behavior of the signals involved in the sensor operation. Figure 2 presents a scenario in which the sensor is calibrated and the circuit has not yet undergone aging, so the alarm is not triggered. Conversely, Figure 3 depicts a situation in which the circuit has aged, increasing the combinational delay of the critical path and leading to alarm activation under the same calibration conditions.

It is important to note that the proposed sensor detects any increase in critical-path delay regardless of its origin, whether (1) a rise in operating-environment temperature; (2) a localized hotspot or techniques such as overclocking or over-volting (overVDD); or (3) time-dependent wear-out mechanisms such as NBTI/PBTI, HCI, or electromigration.

Moreover, power-gating the sensor during idle periods virtually eliminates any residual aging exposure and should therefore be employed whenever possible to prevent the sensor from aging itself.



Fig. 2. Timing diagram of the sensor operation when the circuit is not aged, showing the combinational delay and the available slack. Alarm not triggered.



Fig. 3. (1) The input flip-flops capture distinct values, resulting in (2) the activation of the alarm, which remains active until (3) the reset signal is asserted.

### B. Dynamic Phase Shift and Sensor Auto-Tuning Mechanism

In the conventional approach, sensor calibration is performed statically by defining the phase of *psclk* during the FPGA configuration stage. In this scenario, once the alarm is triggered, it is only possible to conclude that the circuit has aged and that the available *slack* has decreased, without the ability to recalibrate the sensor unless the FPGA is reprogrammed. Based on this limitation, this work proposes an adaptation and enhancement of the original sensor architecture.

By employing the dynamic phase shift mechanism available in the MMCMs (Mixed-Mode Clock Managers) of Xilinx 7 Series FPGAs and newer, it becomes possible to perform highly precise runtime adjustments to the phase of a clock relative to a reference. The configuration procedure and the calculation of phase resolution per increment are detailed in [20]. In this work, each increment was configured to correspond to a phase shift of 19.85 ps. The greater the number of increments applied, the larger the resulting phase shift.
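As a rough plausibility check on the step size, the sketch below computes the MMCM fine phase-shift resolution, which the clocking user guide [20] specifies as 1/56 of the VCO period. The VCO frequency of 900 MHz is an assumption made here for illustration; the text does not state the value actually used, and the function name is hypothetical.

```python
def mmcm_fine_phase_step_ps(f_vco_hz: float) -> float:
    """Fine phase-shift step in picoseconds: 1/56 of the VCO period."""
    vco_period_ps = 1e12 / f_vco_hz
    return vco_period_ps / 56.0


# Assumed VCO frequency (hypothetical; not given in the text).
ASSUMED_F_VCO_HZ = 900e6

step_ps = mmcm_fine_phase_step_ps(ASSUMED_F_VCO_HZ)
print(f"{step_ps:.2f} ps per increment")
```

Under this assumption the step comes out at about 19.84 ps, close to the 19.85 ps reported above, which suggests a VCO setting in the neighborhood of 900 MHz.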

Applying this mechanism to *psclk*, using *sys\_clk* as a reference, enables the use of internal control signals to automatically estimate the *slack*. Moreover, it allows the sensor to be recalibrated at runtime whenever a variation in the critical path delay is detected.

The control flow of the sensor's self-calibration mechanism is illustrated in Figure 4. Initially, *psclk* is aligned with *sys\_clk*. The controller continuously monitors the sensor's alarm signal. While the alarm remains inactive (logic LOW), the controller issues commands to the MMCM to increment the phase of *psclk*, gradually shifting it to the left. This process is repeated until the alarm is triggered. At that point, the controller records the number of increments applied. By multiplying this number by the phase resolution per increment, the slack value is calculated in picoseconds.

Once the calibration process is complete, the sensor is considered calibrated, and the controller remains in an idle state until a *reset* signal is received. This signal reinitializes the state machine, realigns the clocks, and restarts the calibration procedure. The frequency of the *reset* signal should be defined according to the specific requirements of the monitored system.
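The control flow described above can be sketched in software as a simple loop. This is a behavioral illustration under stated assumptions, not the hardware controller itself: `alarm_after_shift` is a hypothetical stand-in for "apply n increments to *psclk* and read the alarm", the `max_increments` guard is an addition made here for safety, and the idealized sensor model ignores jitter and skew.

```python
from typing import Callable

PHASE_STEP_PS = 19.85  # phase shift per MMCM increment, as configured in this work


def calibrate(alarm_after_shift: Callable[[int], bool], max_increments: int) -> int:
    """Return the number of increments applied when the alarm first fires."""
    increments = 0  # psclk starts aligned with sys_clk
    while not alarm_after_shift(increments):
        if increments >= max_increments:
            raise RuntimeError("alarm never fired within the allowed shift range")
        increments += 1  # one more phase-shift command to the MMCM
    return increments


def slack_ps(increments: int) -> float:
    """Slack estimate: increments multiplied by the phase resolution."""
    return increments * PHASE_STEP_PS


def make_ideal_sensor(true_slack_ps: float) -> Callable[[int], bool]:
    """Idealized sensor: the alarm fires once the applied shift reaches the slack."""
    return lambda n: n * PHASE_STEP_PS >= true_slack_ps


# Arbitrary illustrative slack of 750 ps (not a measured value).
n = calibrate(make_ideal_sensor(750.0), max_increments=1000)
print(n, slack_ps(n))
```

In this idealized model the estimate overshoots the true slack by less than one step, which is the quantization error inherent to counting discrete increments.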



Fig. 4. State machine diagram of the sensor's self-calibration controller.

### C. Considerations for FPGA Implementation of the Sensor

It is important to highlight the design considerations and necessary adaptations for the physical implementation of the sensor on FPGAs. First, it is essential that the sensor is placed as close as possible to the output of the monitored critical path. The greater the distance, the higher the interconnection delay will be. Ideally, the total delay observed should be dominated by the logical delay of the critical path itself. Increased routing delay reduces the accuracy of the *slack* estimation.

Additionally, the sensor logic requires one LUT2 to implement the XOR gate and three flip-flops, each operating under a distinct clock domain. Since each FPGA *slice* can only be driven by a single clock, the flip-flops must be placed in separate *slices*. Therefore, at least three distinct *slices* are required to implement the sensor.

Finally, it is expected that the routing of the signal from the critical path to flip-flops *FF\_sys\_clock* and *FF\_psclk* will differ, potentially introducing a natural skew in the signal arrival times, typically in the range of 200 to 300 ps.

### D. Experimental Setup: On-Chip Sensor Testing Platform

To conduct the physical evaluation of the sensor, exposing it to temperature variations, voltage variations, and burn-in procedures, a test environment was set up as illustrated in Figure 5. The objective of the experiments was to verify whether the sensor's behavior remained within the expected results under different stress conditions.

The test environment comprised a digital temperature controller (1), used to precisely regulate the temperature of a thermal aging oven (5); a programmable power supply (2), employed to adjust the core voltage of the FPGA; a computer (3), responsible for collecting experimental data via serial JTAG communication, including readings of temperature, core voltage, and the number of phase shift increments of the sensor; and the FPGA under test (4).

Figure 6 presents the general architecture of the experimental setup implemented within the FPGA. The aging sensor was connected to the critical path, modeled, for simplicity, as a chain of 50 inverter gates in series. The alarm signal generated by the sensor is sent to the Self-Calibration and Control Block (SCC Block), which dynamically adjusts the sensor phase by issuing commands to the MMCM. Additionally, the temperature and voltage data obtained from the integrated Xilinx XADC [21] are transmitted to the test host computer, together with the number of phase shift increments reported by the SCC Block.

The FPGA used for the experimental tests is a device from the Artix-7 family, mounted on the Nexys 4 DDR development board. It is manufactured using a 28 nm CMOS process and operates at a nominal $V_{DD}$ of 1 V. The recommended operating ranges are provided in the manufacturer's documentation [22], and these intervals were used as references to define the corners evaluated in the aging study of the device.



Fig. 5. Physical testing setup for the aging sensor.



Fig. 6. General architecture of the experimental setup implemented on the FPGA.

## III. RESULTS AND DISCUSSION

### A. Voltage and Temperature Sweep Tests

As demonstrated by the semi-empirical model proposed in [23], validated through electrical simulations in 65 nm, 45 nm, and 32 nm CMOS technologies, supply voltage and temperature exert opposite effects on gate propagation delay.

A reduction in VDD leads to an increase in delay, whereas a reduction in temperature results in a decrease in delay; equivalently, a rise in temperature increases delay, primarily due to the non-linear temperature dependence of carrier mobility.
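As a first-order illustration only, the widely used alpha-power-law approximation (not the semi-empirical model of [23]) relates the gate delay $t_d$ to supply voltage and temperature $T$. The symbols are introduced here for illustration: $C_L$ is the load capacitance, $\mu(T)$ the carrier mobility, $V_{th}(T)$ the threshold voltage, and $\alpha$ a velocity-saturation index between 1 and 2.

$$
t_d \propto \frac{C_L \, V_{DD}}{\mu(T)\,\bigl(V_{DD} - V_{th}(T)\bigr)^{\alpha}}
$$

Lowering $V_{DD}$ shrinks the overdrive term $V_{DD} - V_{th}$ and raises $t_d$. Temperature acts through two competing terms: mobility falls as $T$ rises, which increases delay, while the threshold voltage also falls, which decreases it. Which term dominates depends on the operating voltage.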

Figure 7.A presents the results obtained by the sensor when maintaining a constant supply voltage while varying the FPGA core temperature. As expected, the increase in temperature results in an increase in critical path delay, indicating slower operation. Similarly, Figure 7.B shows the results obtained by maintaining a constant temperature while varying the core supply voltage. As anticipated, reducing the supply voltage leads to an increase in critical path delay.



Fig. 7. Critical path slack variation: (A) with temperature increase at constant supply voltage, and (B) with supply voltage reduction at constant temperature.

### B. Accelerated Aging through Burn-in Testing

The accelerated aging procedure was conducted as follows. Initially, baseline measurements (Week 0) of the critical path performance, as observed by the aging sensor, were collected for each of the six defined operating corners.

Subsequently, the FPGA was subjected to a temperature of 115 °C and a +30% increase in core supply voltage (VDD) for one week. At the end of this first stress period (Week 1), the delay measurements for each corner were recorded again.

In order to induce a more significant level of aging, the stress conditions were intensified: the FPGA was exposed to a temperature of 125 °C and a +40% VDD increase for an additional week. At the end of this second burn-in period (Week 2), a final set of measurements was performed.

The slack evolution for each corner, along with the associated measurement conditions, is summarized in Table I. It is important to highlight that the aging conditions applied during the burn-in phases were different from the conditions under which each corner was measured.

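The results obtained are largely consistent with theoretical expectations: corners combining low supply voltage and high temperature exhibited low slack values (indicating slower paths), while corners with higher supply voltage and lower temperature presented the highest slack values (indicating faster paths). Supply voltage is the dominant factor. At 1.000 V and 1.050 V the 85 °C corners show less slack than the 33 °C corners in every week, whereas at 0.950 V the temperature ordering is not consistent: Corner4 (33 °C) shows less slack than Corner1 (85 °C) at Week 0 and Week 2, but more at Week 1.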
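To make the slack evolution easier to compare across corners, the short sketch below computes the absolute and relative slack loss between Week 0 and Week 2 from the values reported in Table I. The dictionary layout and the helper name `slack_loss` are illustrative choices, not part of the experimental tooling.

```python
# Slack values in ps copied from Table I: (Week 0, Week 1, Week 2).
TABLE_I = {
    "Corner1": (744, 613, 357),
    "Corner2": (1518, 1355, 1113),
    "Corner3": (1975, 1897, 1681),
    "Corner4": (704, 649, 234),
    "Corner5": (1607, 1488, 1208),
    "Corner6": (2233, 2139, 1887),
}


def slack_loss(week0_ps: int, later_ps: int) -> tuple[int, float]:
    """Return (loss in ps, loss as a percentage of the Week 0 slack)."""
    loss_ps = week0_ps - later_ps
    return loss_ps, 100.0 * loss_ps / week0_ps


for corner, (w0, _w1, w2) in TABLE_I.items():
    loss_ps, loss_pct = slack_loss(w0, w2)
    print(f"{corner}: {loss_ps} ps lost by Week 2 ({loss_pct:.1f}% of Week 0 slack)")
```

For instance, Corner4 loses 470 ps (about 67% of its Week 0 slack), while Corner3 loses 294 ps (about 15%). Because the two burn-in weeks used different stress conditions, these figures describe the cumulative effect of the procedure and should not be read as a constant aging rate.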

### C. Functional Failure Point Characterization

A critical-path failure occurs when there is a timing violation, that is, when the propagation time of the combinational logic exceeds the clock period. This is crucial for validating our sensor: if a timing violation is detected only after the sensor indicates a slack of 0 ps, the slack estimate is confirmed to be meaningful.

TABLE I  
SLACK IN PICOSECONDS (ps) FOR EACH CORNER, MEASURED AT WEEK 0, WEEK 1, AND WEEK 2.

| Corner  | Temp. (°C) | Voltage (V) | Week 0 (ps) | Week 1 (ps) | Week 2 (ps) |
|---------|------------|-------------|-------------|-------------|-------------|
| Corner1 | 85         | 0.950       | 744         | 613         | 357         |
| Corner2 | 85         | 1.000       | 1518        | 1355        | 1113        |
| Corner3 | 85         | 1.050       | 1975        | 1897        | 1681        |
| Corner4 | 33         | 0.950       | 704         | 649         | 234         |
| Corner5 | 33         | 1.000       | 1607        | 1488        | 1208        |
| Corner6 | 33         | 1.050       | 2233        | 2139        | 1887        |

Figure 8 illustrates the signal states at the start and the end of the critical path, just before the final flip-flop. A D flip-flop, acting as the *failure catcher*, is employed for detecting the failure point. This element captures the XOR result between the Q outputs at the start and end of the critical path. The *failure catcher* triggers on the positive edge of the Q output at the critical path's end. If the end goes HIGH, the *failure catcher* will only register HIGH if the starting flip-flop is LOW, due to the XOR gate.



Fig. 8. Illustration of the signal behavior at the start and end of the critical path. A mismatch detected by the failure catcher indicates a timing violation due to insufficient slack.

As demonstrated, the signals differ only in the event of a timing violation, where the signal has not fully propagated by the time the positive clock edge occurs.
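The capture rule of the *failure catcher* can be illustrated with a small behavioral model. This is a sketch under assumptions: the trace format (time-ordered pairs of logic levels) and the function name are hypothetical, and the two example traces are constructed by hand rather than taken from measurements.

```python
from typing import Iterable, List, Tuple


def failure_catcher(samples: Iterable[Tuple[int, int]]) -> List[int]:
    """Behavioral model of the failure catcher.

    samples: time-ordered (q_start, q_end) logic levels.
    On each positive edge of q_end, the flip-flop captures q_start XOR q_end.
    Returns the list of captured values, one per positive edge.
    """
    captured = []
    prev_end = 0
    for q_start, q_end in samples:
        if prev_end == 0 and q_end == 1:  # positive edge at the end of the path
            captured.append(q_start ^ q_end)
        prev_end = q_end
    return captured


# Hand-built trace: the end of the path goes HIGH while the start is still HIGH.
on_time_trace = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
# Hand-built trace: the start has already returned LOW when the end goes HIGH.
late_trace = [(0, 0), (1, 0), (0, 0), (0, 1)]

print(failure_catcher(on_time_trace), failure_catcher(late_trace))
```

A captured value of 1 corresponds to the mismatch case described above, where the end of the path goes HIGH while the starting flip-flop is LOW.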

This setup was validated experimentally by inducing a timing failure via voltage reduction, and the failure point was correctly detected by our sensor, with a small margin of error due to clock jitter and flip-flop routing skew.

## IV. CONCLUSIONS AND FUTURE WORK

The performance and behavior of the proposed sensor were consistent with expectations based on the validation tests and procedures discussed in this work. These results provide strong support for the use of the sensor as an important tool for Silicon Lifecycle Management (SLM) applications.

Within this context, the information regarding the evolution of circuit aging could be leveraged by applying predictive techniques similar to those employed by [24], but adapted to forecast the aging rate and estimate the time to failure based on the sensor's data.

Furthermore, these results open the possibility for future integration of predictive models aimed at degradation mitigation and lifetime forecasting, following strategies explored in works such as [25], where on-chip sensors (e.g., temperature, soft-error rate, and aging monitors) are used to guide system reconfiguration and activity management to reduce aging effects. Compared to such approaches, our proposed sensor provides more accurate slack information and enables direct estimation of the aging rate, offering enhanced capabilities for predictive maintenance.

## REFERENCES

- [1] L. Lanzieri, G. Martino, G. Fey, H. Schlarb, T. C. Schmidt, and M. Wählisch, "A review of techniques for ageing detection and monitoring on embedded systems," *ACM Comput. Surv.*, vol. 57, no. 1, Oct. 2024. [Online]. Available: <https://doi.org/10.1145/3695247>
- [2] Z. Ghaderi, M. Ebrahimi, Z. Navabi, E. Bozorgzadeh, and N. Bagherzadeh, "SENSIBLE: A highly scalable SENsor DESign for path-based age monitoring in FPGAs," *IEEE Transactions on Computers*, vol. 66, no. 5, pp. 919–926, May 2017, doi: 10.1109/TC.2016.2622688.
- [3] A. Amouri and M. Tahoori, "High-level aging estimation for FPGA-mapped designs," in *22nd International Conference on Field Programmable Logic and Applications (FPL)*, 2012, pp. 284–291, doi: 10.1109/FPL.2012.6339194.
- [4] F. Vargas, V. Galstyan, G. Harutyunyan, and Y. Zorian, "On-chip sensor to monitor aging evolution in FinFET-based memories," in *2024 IEEE 30th International Symposium on On-Line Testing and Robust System Design (IOLTS)*. Rennes, France: IEEE, 2024, pp. 1–6, doi: 10.1109/IOLTS60994.2024.10616091.
- [5] M. D. Valdes-Peña, J. Fernández Freijedo, M. J. Moure Rodríguez, J. J. Rodríguez-Andina, J. Semião, I. M. C. Teixeira, J. P. C. Teixeira, and F. Vargas, "Design and validation of configurable online aging sensors in nanometer-scale FPGAs," *IEEE Transactions on Nanotechnology*, vol. 12, no. 4, pp. 508–517, 2013, doi: 10.1109/TNANO.2013.2253795.
- [6] J. Benfica et al., "Analysis of SRAM-based FPGA SEU sensitivity to combined EMI and TID-imprinted effects," *IEEE Transactions on Nuclear Science*, vol. 63, no. 2, pp. 1294–1300, Apr. 2016, doi: 10.1109/TNS.2016.2523458.
- [7] T. Calin, F. L. Vargas, and M. Nicolaidis, "Upset-tolerant CMOS SRAM using current monitoring: Prototype and test experiments," in *Proceedings of the IEEE International Test Conference (ITC)*, Washington, DC, USA, 1995, pp. 45–53, doi: 10.1109/TEST.1995.529816.
- [8] M. Nicolaidis, F. Vargas, and B. Courtois, "Design of built-in current sensors for concurrent checking in radiation environments," *IEEE Transactions on Nuclear Science*, vol. 40, no. 6, pp. 1584–1590, Dec. 1993, doi: 10.1109/23.273553.
- [9] L. Zhang and R. P. Dick, "Scheduled voltage scaling for increasing lifetime in the presence of NBTI," in *2009 Asia and South Pacific Design Automation Conference (ASP-DAC)*, Yokohama, Japan, 2009, pp. 492–497, doi: 10.1109/ASPDAC.2009.4796528.
- [10] S. Sadeghi-Kohan, M. Kamal, and Z. Navabi, "Self-adjusting monitor for measuring aging rate and advancement," *IEEE Transactions on Emerging Topics in Computing*, vol. 8, no. 3, pp. 627–638, 2020, doi: 10.1109/TETC.2017.2771441.
- [11] M. Agarwal, B. C. Paul, M. Zhang, and S. Mitra, "Circuit failure prediction and its application to transistor aging," in *25th IEEE VLSI Test Symposium (VTS'07)*, Berkeley, CA, USA, 2007, pp. 277–286, doi: 10.1109/VTS.2007.22.
- [12] F. Vargas and A. Balakrishnan, "On-chip aging sensor core for silicon lifecycle management," in *2025 IEEE 26th Latin American Test Symposium (LATS)*, 2025, pp. 1–6, doi: 10.1109/LATS65346.2025.10963953.
- [13] D. Ernst, J. Henkel, R. Gupta, L. Benini, G. De Micheli, R. Ernst, and V. Kumar, "Razor: circuit-level correction of timing errors for low-power operation," *IEEE Micro*, vol. 24, no. 6, pp. 10–20, Nov. 2004, doi: 10.1109/MM.2004.85.
- [14] A. Amouri, F. Bruguier, S. Kiamehr, P. Benoit, L. Torres, and M. Tahoori, "Aging effects in FPGAs: an experimental analysis," in *2014 24th International Conference on Field Programmable Logic and Applications (FPL)*, 2014, pp. 1–4, doi: 10.1109/FPL.2014.6927390.
- [15] F. Ahmed, M. Shintani, and M. Inoue, "Accurate recycled FPGA detection using an exhaustive-fingerprinting technique assisted by WID process variation modeling," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, vol. 40, no. 8, pp. 1626–1639, 2021, doi: 10.1109/TCAD.2020.3023684.
- [16] T. Gaskin, H. Cook, W. Stirk, R. Lucas, J. Goeders, and B. Hutchings, "Using novel configuration techniques for accelerated FPGA aging," in *2020 30th International Conference on Field-Programmable Logic and Applications (FPL)*, 2020, pp. 169–175, doi: 10.1109/FPL50879.2020.00037.
- [17] U.S. Department of Defense, "MIL-STD-883, Method 1015.50: Burn-In Test," <https://quicksearch.dla.mil/>, 2016, accessed: 2025-04-27.
- [18] JEDEC Solid State Technology Association, "JEDEC Standard No. 22-A108D: Temperature, bias, and operating life," <https://www.jedec.org/standards-documents/docs/jesd22-a108>, 2022, accessed: 2025-04-25.
- [19] U.S. Department of Defense, "MIL-STD-883, Method 1005.9: Steady-state life," <https://quicksearch.dla.mil/>, 2010, accessed: 2025-04-25.
- [20] AMD Xilinx, *7 Series FPGAs Clocking Resources User Guide*, UG472 (v1.16), October 30, 2023. [Online]. Available: <https://docs.amd.com/v/u/en-US/ug472_7Series_Clocking>
- [21] AMD Xilinx, *7 Series FPGAs and Zynq-7000 SoC XADC User Guide*, UG480 (v1.14), March 1, 2023. [Online]. Available: <https://docs.amd.com/r/en-US/ug480_7Series_XADC/XADC-Overview>
- [22] AMD Xilinx, *Artix-7 FPGA Data Sheet: DC and AC Switching Characteristics*, DS181 (v1.30), March 29, 2023. [Online]. Available: <https://docs.amd.com/v/u/en-US/ds181_Artix_7_Data_Sheet>
- [23] J. F. Freijedo, J. Semião, J. J. Rodriguez-Andina, F. Vargas, I. C. Teixeira, and J. P. Teixeira, "Modeling the effect of process, power-supply voltage and temperature variations on the timing response of nanometer digital circuits," *Journal of Electronic Testing*, vol. 28, no. 4, pp. 421–434, August 2012, doi: 10.1007/s10836-012-5297-0. [Online]. Available: <https://doi.org/10.1007/s10836-012-5297-0>
- [24] J. Chen, T. Lange, M. Andjelkovic, A. Simevski, L. Lu, and M. Krstic, "Solar particle event and single event upset prediction from SRAM-based monitor and supervised machine learning," *IEEE Transactions on Emerging Topics in Computing*, vol. 10, no. 2, pp. 564–580, Apr. 2022, doi: 10.1109/TETC.2022.3147376.
- [25] R. T. Syed, F. L. Vargas, M. Andjelkovic, M. Ulbricht, and M. Krstic, "Aging and soft error resilience in reconfigurable CNN accelerators employing a multi-purpose on-chip sensor," in *2024 IEEE 25th Latin American Test Symposium (LATS)*. Maceio, Brazil: IEEE, 2024, pp. 1–6, doi: 10.1109/LATS62223.2024.10534625.