

# A Resistance Drift Compensation Scheme to Reduce MLC PCM Raw BER by Over 100 $\times$ for Storage Class Memory Applications

Win-San Khwa, Meng-Fan Chang, Jau-Yi Wu, Ming-Hsiu Lee, Tzu-Hsiang Su, Keng-Hao Yang, Tien-Fu Chen, Tien-Yen Wang, Hsiang-Pang Li, Matthew Brightsky, Sangbum Kim, Hsiang-Lan Lung, and Chung Lam

**Abstract**—For multilevel cell (MLC) phase change memory (PCM), resistance drift (R-drift) phenomenon causes cell resistance to increase with time, even at room temperature. As a result, the fixed-threshold-retention (FTR) raw-bit-error-rate (RBER) surpasses practical ECC correction ability within hours after being programmed. This study proposes a resistance drift compensation (RDC) scheme to mitigate R-drift issue. The proposed RDC scheme realizes PCM drift compensation and features RDC pulse to suppress ECC decoding failure. The proposed approach was validated using a 90-nm 128M cells PCM chip and an FPGA-based memory controller verification system. The MLC PCM FTR RBER has been suppressed by over 100 $\times$ , thereby bringing it within ECC capability. The effectiveness of the RDC scheme was verified up to 10<sup>6</sup> cycles.

**Index Terms**—MLC, multilevel cell, PCM, PCRAM, resistance drift, write driver.

## I. INTRODUCTION

The ever-growing performance gap between traditional storage and the rest of the memory hierarchy is exacerbating the need for storage class memory (SCM) [1]. Phase change memory (PCM) is a promising candidate for SCM due to its scalability, bit-alterability, non-volatility, and high program speed. PCM is a simple two-terminal device with chalcogenide material sandwiched between two electrodes. The chalcogenide is composed of germanium, antimony, and tellurium ( $Ge_2Sb_2Te_5$ ), commonly abbreviated as GST. The storage mechanism is based on GST's ability to switch between amorphous and crystalline phases when an electrical pulse is applied to the selected cells.

Manuscript received April 21, 2016; revised June 19, 2016; accepted July 24, 2016. Date of publication August 24, 2016; date of current version January 4, 2017. This paper was approved by Guest Editor Atsushi Kawasumi.

W.-S. Khwa is with Macronix International Co., Ltd, Hsinchu, Taiwan and also with National Tsing Hua University, Hsinchu, Taiwan (e-mail: s102061802@m102.nthu.edu.tw).

M.-F. Chang is with National Tsing Hua University, Hsinchu, Taiwan.

J.-Y. Wu, M.-H. Lee, T.-Y. Wang, H.-P. Li, and H.-L. Lung are with Macronix International Co., Ltd, Hsinchu, Taiwan.

T.-H. Su is with Macronix International Co., Ltd, Hsinchu, Taiwan and also with National Chiao Tung University, Hsinchu, Taiwan.

K.-H. Yang and T.-F. Chen are with National Chiao Tung University, Hsinchu, Taiwan.

M. Brightsky, S. Kim, and C. Lam are with IBM T. J. Watson Research Center, Yorktown Heights, NY 10598 USA.

Color versions of one or more of the figures in this paper are available online at <http://ieeexplore.ieee.org>.

Digital Object Identifier 10.1109/JSSC.2016.2597822

Previous studies on PCM emphasized expanding memory capacity, increasing bandwidth, and enabling embedded applications using novel circuits and architectural techniques [2]–[5]. However, for PCM to be a true contender, multilevel cell (MLC) topology is required to increase capacity and reduce cost-per-bit. One critical challenge in realizing this goal is overcoming resistance drift (R-drift), wherein cell resistance ( $R_{CELL}$ ) increases over time, even at room temperature. Our measurement results indicate that the MLC PCM fixed-threshold retention (FTR) raw bit-error rate (RBER) could exceed the ECC correction ability within hours after being programmed. Previous R-drift mitigation approaches based on reference-cell-based resistance tracking (RCRT) [6] and DRAM-like refresh (DR) [7] are feasible, but they compromise distinguishing PCM traits, such as random write, low latency, and low power. This work proposes a resistance drift compensation (RDC) scheme to alleviate the impact of R-drift without such compromises, while minimizing the speed and power consumption penalties. The proposed RDC scheme reduces MLC PCM FTR RBER by more than two orders of magnitude, suppressing it below practical ECC capability limits.

This paper expands the RDC scheme proposed in [8] by addressing three new practical considerations. First, the accumulative and repetitive RDC pulse experiments reveal hints behind its physical mechanism and its impact on cell endurance, respectively. Second, we conducted detailed MLC PCM measurements to provide insights on distribution evolution with and without the proposed RDC scheme. We also verify the effectiveness of the RDC scheme up to 10<sup>6</sup> cycles. Finally, we present a comparison of error patterns for MLC PCM with and without the RDC scheme.

The remainder of this paper is organized as follows. Section II provides the background of R-drift and previous R-drift mitigation approaches. Section III describes the proposed RDC scheme. Section IV presents the distinct features of the proposed RDC scheme as well as macro measurements, error pattern analysis, and simulation results. Conclusions are drawn in Section V.

## II. BACKGROUND

### A. Phase Change Memory and Resistance Drift

One critical challenge in the commercialization of MLC PCM is overcoming the reliability concern caused by the R-drift phenomenon associated with structural relaxation (SR). Several theories for explaining SR have been proposed, including defect annihilation [10]–[13] and volume change [14]–[16]. The former postulates that structural relaxation is caused by atomic rearrangement that minimizes the internal energy by reducing defect density. Defect annihilation results in an increase in hopping distance and activation energy. The latter model is based on the difference in atomic density between crystalline and amorphous GST. The GST material in PCM cells is confined within a space corresponding to the volume of as-deposited crystalline GST. The transformation to a lower-density amorphous phase leads to the development of considerable stress within the cell, and stress relief over time causes a widening of the band gap.

Fig. 1. PCM (a) cell structure and (b) measured resistance drift coefficient.

Fig. 2. Four MLC PCM distributions (a) before and (b) after a program-to-read time ( $T_{P2R}$ ) of five hours at room temperature.

The measured $R_{\text{CELL}}$ as a function of time follows a power-law equation $R_{\text{CELL}}(t) = R_0(t/t_0)^\gamma$ , where $R_0$ and $t_0$ are normalization constants and $\gamma$ is the drift coefficient. Fig. 1 illustrates the PCM cell structure and the measured $\gamma$ as a function of $R_0$ [17]. The chalcogenide material (GST) is sandwiched between the top and bottom electrodes (TE and BE). The BE diameter is 40 nm and the GST thickness is 100 nm. The increasing trend of $\gamma$ with $R_0$ suggests that the dominant source of errors will be the higher resistance states and that the time to first failure could be as short as 1000 seconds.
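The power-law model above can be inverted to estimate when a drifting cell crosses a fixed read threshold. The following Python sketch is illustrative only: the function names and the numerical values of $R_0$, $\gamma$, and the threshold are assumptions chosen for the example, not measured data from this work.

```python
def drifted_resistance(r0, t, gamma, t0=1.0):
    """Power-law drift model: R_CELL(t) = R0 * (t / t0) ** gamma."""
    return r0 * (t / t0) ** gamma


def time_to_threshold(r0, r_threshold, gamma, t0=1.0):
    """Invert the power law to find the time at which R_CELL reaches r_threshold."""
    if gamma <= 0:
        raise ValueError("gamma must be positive for upward drift")
    if r_threshold <= r0:
        raise ValueError("threshold must lie above the initial resistance")
    return t0 * (r_threshold / r0) ** (1.0 / gamma)


# Hypothetical example values (not taken from the measurements in this paper).
r0_ohm = 300e3         # initial resistance at t0
gamma_example = 0.1    # drift coefficient
r_th_ohm = 500e3       # fixed read threshold

t_cross = time_to_threshold(r0_ohm, r_th_ohm, gamma_example)
print(f"Threshold crossed after about {t_cross:.0f} s")
```

Because the crossing time scales as $(R_{th}/R_0)^{1/\gamma}$, a modest increase in $\gamma$ shortens the time to failure sharply, which is consistent with the higher resistance states dominating the error count.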

Making the best use of the available resistance margin requires careful selection of the MLC resistance levels. Fig. 2(a) illustrates the measured MLC PCM distributions from an array of 64k cells. The RESET distribution is nearly a vertical line at approximately 5 M $\Omega$ , because this is the upper boundary of our sensing range. Placing the RESET state out of range means that only three MLC states have to be allocated within the sensing range. Fig. 2(b) illustrates the same MLC PCM distributions measured after a program-to-read time ( $T_{P2R}$ ) of five hours at room temperature, with the overlapping regions highlighted by shaded boxes. The RBER reached $10^{-2}$ after only five hours of $T_{P2R}$ , and the dominant source of error is the MLC2 state merging toward the RESET state. These findings are in agreement with previous research [13], [18]. Reducing this type of error is the primary focus of the proposed RDC scheme.

Fig. 3 compares two conventional R-drift mitigation schemes and highlights the associated challenges. In the RCRT scheme, the resistance of reference cells is used as an adaptive sensing threshold to compensate for R-drift. A group (i.e., a page) of PCM cells must share these reference cells to reduce overhead; however, programming cells at different times (i.e., random write) could induce overlap among the MLC states. As shown in Fig. 3(a), updating only cell B induces read errors. This means that every cell within a given memory management group must have the same time stamp. Therefore, the RCRT scheme is incompatible with random writes and introduces long write latency, because any update inevitably triggers a read and rewrite of the entire group. In contrast, the DR scheme is limited by cell-state dependency. Each state requires a different action, because cells must be reprogrammed to their original states, which necessitates a read before refresh. As a result, periodic refreshes must be executed while the memory cells still hold the correct information. Applying the refresh operation to erroneous cells would only reprogram them back to erroneous states. This potentially shortens the refresh interval ( $T_{\text{Refresh}}$ ) to a few thousand seconds at room temperature. The situation is further exacerbated by small concurrent write size, long MLC program latency (545 ns), and high power consumption (27 pJ/cell) [19].

### B. Previous Resistance Drift Mitigation Schemes

Previous publications approached the R-drift challenge from a variety of perspectives, including GST material, cell structure, and read/write operation. For material engineering, Cheng *et al.* designed a “golden composition” utilizing isoelectronic tie-line along Ge-Sb<sub>2</sub>Te<sub>3</sub> in conjunction with nitrogen doping and variations in Ge/N concentration to achieve fast programming speeds and high crystallization temperature [20], [21]. For cell structure, Kim *et al.* proposed an innovative approach by incorporating a metallic surfactant layer surrounding the memory cell as a resistance drift stabilizer [22]. Several read and write schemes have also been proposed to improve the reliability of MLC PCM. Hwang *et al.* suggested using reference cells [23]. Wu *et al.* approached the R-drift challenge from a different perspective. Exploiting PCM’s R-V non-linearity, they were able to accommodate eight resistance levels (TLC) in three independent 10 $\times$ sensing windows [24]. Lastly, Chien *et al.* exploited the self-convergence property of low current SET operation in the design of a novel MLC programming scheme [19].

Fig. 3. Challenges in conventional resistance drift mitigation approaches: (a) reference cell resistance tracking, and (b) DRAM-like refresh.

Fig. 4. (a) Block diagram of the PCM chip, (b) circuit design relevant to RDC pulse generation, and (c) comparison of conventional SET pulse and two-step RDC pulse.

The fact that a wide range of schemes has been proposed with no method being universally accepted suggests that solving the problem of R-drift may require interdisciplinary engineering efforts. The concept of the RDC scheme does not conflict with most of the previous methods. In fact, the proposed scheme could and should be implemented with other schemes to bring MLC PCM one step closer to commercialization.

## III. PROPOSED RESISTANCE DRIFT COMPENSATION SCHEME

Fig. 4(a) and (b) present the block diagram of the PCM chip and the circuit design relevant to RDC pulse generation, respectively. This 128-Mb PCM chip consists of 64 2-Mb tiles, separated between the top and bottom halves of the chip. Each 2-Mb tile comprises two 1-Mb sub-tiles and a set of shared read/write circuits with digital current mirrors (DCM). DCMs are employed to provide reference currents for write drivers (WD) and sense amplifiers (SA). The DCM modifies its output current ( $I_{DCM}$ ) by changing the number of activated PMOS and NMOS transistors according to a 7-bit trim code. There is one SET WD and one RESET WD within each 1-Mb sub-tile; however, only the former includes an RDC generator. The SET WD, which is controlled by the SET<8:1> signals, provides a slow-quenching current pulse by issuing eight copies of $I_{DCM}$ in a diminishing manner. The purpose of the RDC pulse generator is to enhance flexibility by providing the option to skip some of the slow-quenching steps, and thereby gain the freedom to investigate intermediate pulse shapes between SET and RESET. Fig. 4(c) presents an example comparing a conventional SET pulse with a two-step RDC pulse.

Fig. 5. Measured Shmoo plot illustrating various operating regions on one drifted MLC PCM cell.
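The relationship between the slow-quenching SET pulse and a shortened RDC pulse described above can be sketched in Python as a list of current steps. This is a behavioral illustration under stated assumptions, not the circuit implementation: the function name `staircase_pulse`, the interpretation of each step as an integer number of summed $I_{DCM}$ copies, the uniform step width, and the choice of which two steps the RDC pulse retains are all hypothetical.

```python
def staircase_pulse(i_dcm, active_copies, step_width):
    """Return a list of (current, duration) steps for a diminishing staircase.

    active_copies lists, for each retained step, how many copies of I_DCM
    are summed (assumed here to range from 8 down to 1).
    """
    levels = sorted(set(active_copies), reverse=True)
    if not levels or levels[0] > 8 or levels[-1] < 1:
        raise ValueError("each step must use between 1 and 8 copies of I_DCM")
    return [(k * i_dcm, step_width) for k in levels]


# Conventional SET: all eight steps, from 8 copies down to 1 copy (assumed mapping).
conventional_set = staircase_pulse(1.0, range(1, 9), step_width=1.0)

# Two-step RDC pulse: the remaining steps are skipped. Which two steps are
# kept is an assumption made only for this example.
two_step_rdc = staircase_pulse(1.0, [2, 1], step_width=1.0)

print(conventional_set)
print(two_step_rdc)
```

Representing the pulse as a step list makes it straightforward to sweep both the step current and the step width when exploring the region between SET and RESET.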

The measured RDC characteristics are represented using a Shmoo plot in Fig. 5. In the experiment, we first programmed a PCM cell to an intermediate resistance level and waited for a $T_{P2R}$ of 5000 seconds before sweeping the two-step RDC current (x-axis) and step pulse width (y-axis). The Shmoo plot indicates three important regions. First, in the lower left corner, the current amplitude is small and the step pulse width is short, such that the impact of RDC should theoretically be minimal. However, the color in this region is between dark green and brown, which indicates an increase in cell resistance from its initial level caused by R-drift during $T_{P2R}$ . Second, there are a blue and a red region representing the typical SET and RESET operations, respectively. Finally, there are some regions on the Shmoo plot that are green. Recall that we enforced a $T_{P2R}$ of 5000 seconds, such that all cell resistances before sweeping the RDC pulse should exceed their initial values, as evidenced by the drifted region in the lower left corner. Therefore, all of the green regions on the Shmoo plot provide indications of resistance drift compensation. For robustness against cell variation, the selected RDC pulse conditions should avoid the red and blue regions. Fortunately, this RDC region is sufficiently large for mass production.

## IV. MACRO MEASUREMENTS AND PERFORMANCES

A 90-nm 128M cell MLC PCM chip with mushroom-type doped-GST and RDC on/off function was fabricated and validated using an FPGA-based memory-controller verification system. All of the measurements presented in this section were obtained using this verification platform.

### A. Time and Cell State Independence

Fig. 6 illustrates the measured impact of RDC pulses on drifted and non-drifted MLC PCM cells. The experiment began with the programming of PCM cells to six different resistance levels ( $R_{init}$ [5:0]) using ISPP via RESET. The cells then rested for a $T_{P2R}$ of either <1 second or >5000 seconds. Finally, the cell resistance values were recorded while sweeping the two-step RDC pulse current amplitude ( $I_{RDC}$ ). In the case of $T_{P2R}$ less than one second, the RDC pulses had minimal impact on cells that had not yet drifted significantly. In the case of $T_{P2R}$ exceeding 5000 seconds, considerable R-drift compensation was evident when the $I_{RDC}$ amplitude was between 80 and 120 AU. Under all tested resistance levels, the drifted cells were able to return to $R_{init}$ . This demonstrates that the application of RDC has little impact on cells with a shorter $T_{P2R}$ , while providing effective R-drift compensation (up to within 25% of $R_{init}$ ) for cells with a longer $T_{P2R}$ .

Fig. 7 illustrates the measured RDC time ( $T_{P2R}$ ) and cell-state independence. As mentioned previously, RDC pulses have little impact on cells prior to drift; this preserves random-write capability and eliminates the need for write amplification. The first RDC time-related experiment emphasized the MLC2 state because it is the dominant source of errors. Two groups of cells were inspected: 1) cells updated now and 2) cells updated 3 days ago. The measured distribution shows that both cell groups were at similar resistance levels after RDC treatment. The average $\log_{10} R_{CELL}$ value of the latter group dropped from 3.44 to 3.11. The second experiment was meant to reveal the cell-state independence of RDC at the array level. All of the memory cells were rested at room temperature for five hours prior to the application of RDC pulses. Cells in all four MLC states were able to return to their $R_{init}$ states under the same RDC conditions. Specifically, the average $\log_{10} R_{CELL}$ value of the MLC2 state decreased from 3.26 to 3.09. Furthermore, because RDC is able to perform error correction, the erroneous bits (shaded box) could be corrected without either read-before-refresh or ECC. There is no need to periodically perform RDC while all cells still hold the correct information. Please note that the average $\log_{10} R_{CELL}$ values before the application of RDC pulses differed between the two experiments due to differences in data retention time.

### B. Impact of RDC Pulses on Cell Resistance and Endurance

Accumulative and repetitive RDC experiments were conducted to investigate the nature of RDC pulses and their impact on cell endurance, respectively. Fig. 8 illustrates the setup of the accumulative RDC experiment. The cells were first programmed to their MLC resistance levels and then subjected to regular read operations (60 iterations) to map the R-drift projection (Region A). Read operations with accumulative RDC pulses were then executed (3000 iterations) to amplify the impact of RDC pulses (Region B). The resulting cell resistance values are plotted in Fig. 8. The dotted lines show the extrapolated $R_{CELL}$ from R-drift measurements in Region A, indicating how the cell should drift without RDC. Region B illustrates two transitions under accumulative RDC pulses: (1) steady $R_{CELL}$ and (2) decreased $R_{CELL}$ . Reference [25] reported behavior similar to that of transition (1), suggesting that it may be related to trap dynamics without the involvement of phase transformation. The decreased $R_{CELL}$ observed in transition (2) suggests that this may be related to crystallization. The unique time and cell-state independence outlined previously corresponds to the behavior observed in transition (1). In contrast, the behavior in transition (2) suggests that the RDC scheme should not be used for every read operation or implemented periodically, but rather only upon ECC failure.

Fig. 6. Demonstration of the measured RDC impact on drifted and non-drifted MLC PCM cells.

Fig. 7. Demonstration of the measured RDC's time ( $T_{P2R}$ ) and cell-state independence.

Fig. 8. Experiment flow of accumulative RDC experiment with measured impact on cell resistances.

Fig. 9. Impact of RDC pulses on PCM: (a) cell degradation and (b) array endurance.

Fig. 9 illustrates the impact of RDC pulses on PCM cell degradation and array endurance. A group of 64k cells was first programmed in SLC mode. One million repetitive RDC pulses were then issued before reprogramming and reading them in SLC mode. This number of RDC pulses should be more than sufficient to ensure reliability over a period of ten years, even when adopting the most naïve approach of periodic refresh (i.e., every 1000 seconds). Fig. 9(a) presents the SLC distributions before and after the application of one million RDC pulses on the x-axis and y-axis, respectively. The diagonal dashed line was added as a reference. If the main distributions reside along the diagonal line, then there is no cell degradation. Our result revealed minor degradation in the SET state and no cell failure. The average SET resistance dropped from 31.3 kΩ to 21.5 kΩ. The same decrease in SET resistance has also been reported by other researchers [26], [27] and in our own work [9]. This effect has been attributed to the segregation of GST material. Fig. 9(b) presents the cell failure rate associated with cycling stress in two 16k cell groups: cells that were subjected to one million RDC pulses and cells that were not. The cycling stress consisted of repetitive SET and RESET pulses. The criteria for cell failure were R<sub>CELL</sub> > 100 kΩ for SET operations and R<sub>CELL</sub> < 1 MΩ for RESET operations. These results demonstrate that RDC has only minimal effect on endurance.
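The claim that one million RDC pulses covers ten years of periodic refresh at a 1000-second interval follows from simple arithmetic, sketched below in Python. The helper name `pulses_needed` and the use of 365-day years are assumptions made for the illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumes 365-day years


def pulses_needed(lifetime_years, interval_s):
    """Number of periodic pulses issued over the lifetime at a fixed interval."""
    return (lifetime_years * SECONDS_PER_YEAR) // interval_s


pulses = pulses_needed(lifetime_years=10, interval_s=1000)
print(pulses)              # 315360
print(pulses < 1_000_000)  # True
```

Roughly 3.2 × 10<sup>5</sup> pulses would be needed, so the one-million-pulse stress exceeds the ten-year requirement by about a factor of three.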

### C. RDC With Cycling-Induced Degradation Consideration

Fig. 10 illustrates the distribution of the four MLC states with and without R-drift. To best utilize the available resistance margin, we placed the RESET state outside our sensing range at approximately 5 MΩ. Drift is more pronounced in cells with a higher initial cell resistance (i.e., MLC2 cells undergo drift sooner than do MLC1 cells). Without RDC, the MLC2 cells begin merging with the RESET cells at the first data point, whereupon the number of erroneous cells increases significantly with time.

Fig. 10. Evolution of MLC PCM distributions versus time (a) with RDC and (b) without RDC.

Fig. 10(a) illustrates the impact of the RDC scheme on MLC distribution with R-drift. RDC was issued once every ten data points (i.e., 1, 11, 21, etc.), and is highlighted using shaded boxes. The style of this plot, with an empty space inserted before each RDC trigger point, was selected to clearly represent the impact of RDC. At the RDC trigger points, the same RDC treatment was applied to all four MLC states. The effects of drift compensation are easily observed. Following every RDC treatment, the distributions of MLC2 and MLC1 shifted downward, whereas the SET and RESET states remained unaltered. The SET state remained unchanged because the RDC pulse is essentially a weak SET, such that it has no impact on cells that are already in the SET state. The RESET state is also unaffected by RDC pulses because the change in cell resistance is beyond the sensing range. This feature also makes it possible to eliminate the requirement of read-before-refresh. Another distinguishing characteristic of the RDC scheme is its error correction ability. Although some of the MLC2 cells merged with the RESET state between RDC triggers, these erroneous cells were corrected after RDC treatment. This is a clear indication that the RDC scheme does not need to be executed before cells drift to another state, as in the case of the DR scheme.

The next experiment was meant to elucidate the impact of cycling-induced degradation (CID) on the RDC effectiveness. Fig. 11 illustrates the MLC distributions from four cell groups that had been stressed by 10<sup>0</sup>, 10<sup>2</sup>, 10<sup>4</sup>, and 10<sup>6</sup> cycles. This cycling stress consisted of repetitive SET and RESET pulses. Fig. 11(b) shows that without the RDC scheme, errors begin appearing after the first data point. Fig. 11(a) illustrates the MLC distributions with RDC triggering at every sampling point (every 10<sup>4</sup> seconds). Clearly, the RDC scheme is effective on all cell groups.



Fig. 11. Evolution of MLC PCM distributions versus cycling stress (a) with RDC and (b) without RDC.



Fig. 12. FTR RBER comparison of MLC PCM versus cycling stress and retention time.

We observed three intriguing behaviors. First, there is a gradual descent of the MLC1 state. This is because the RDC pulses were designed specifically to reduce errors generated by MLC2 cells merging with RESET cells, and may be too strong for the MLC1 state. The second notable behavior was an increase in the number of SET state outliers, which was far more obvious with RDC than without RDC. This may be attributed to the failure mechanism proposed in our previous work [9], in which a leftward shift in the resistance-current curve was observed in cells that underwent extended cycling stress. RDC conditions that act as a weak SET operation may act as a weak RESET in these outlier cells. The third behavior was the diminishing RDC impact in the cell group that underwent 10<sup>6</sup> stress cycles. This can be attributed to the CID that originates from GST segregation, which reduces the effectiveness of the SET and RESET operations on the main PCM cell distribution [9], [26], [27]. The same RDC conditions that were intended for the MLC2 state, and were too strong for the MLC1 state, may become less effective for the former and more suitable for the latter after CID. It should be mentioned that in this subsection, the cycling stress was performed in SLC mode to eliminate the discrepancies among the various MLC program approaches.

Fig. 13. Comparison of error pattern breakdown between cases with and without RDC: (a) MLC2-to-RESET, (b) MLC1-to-MLC2, (c) SET-to-MLC1, and (d) RESET-to-MLC2.

Fig. 14. FTR RBER of MLC PCM cells at both room temperature and 85 °C over a period of 48 hours.

Fig. 12 presents a comparison of the FTR RBER in all cell groups with and without RDC scheme. We selected fixed resistance threshold values of 60 kΩ, 500 kΩ, and 4 MΩ. A horizontal dashed line is drawn at RBER = 5 × 10<sup>-3</sup> to indicate the upper bound correction ability of practical ECC. Without the proposed RDC scheme, the FTR RBER exceeds this upper bound at the second data point. Almost no errors were generated before 10<sup>6</sup> stress cycles under RDC.
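A fixed-threshold read with the three threshold values above can be sketched in Python as follows. The thresholds are taken from the text; everything else is an assumption for illustration, including the function names, the Gray-coded 2-bit mapping of states, the counting of errors per bit, and the example resistance values.

```python
from bisect import bisect_right

# Fixed resistance thresholds from the text: 60 kOhm, 500 kOhm, 4 MOhm.
THRESHOLDS_OHM = (60e3, 500e3, 4e6)
STATES = ("SET", "MLC1", "MLC2", "RESET")  # ordered by increasing resistance

# Hypothetical Gray-coded bit assignment: adjacent states differ in one bit.
GRAY_BITS = {"SET": 0b11, "MLC1": 0b10, "MLC2": 0b00, "RESET": 0b01}


def read_state(r_cell_ohm):
    """Classify a cell resistance against the fixed thresholds."""
    return STATES[bisect_right(THRESHOLDS_OHM, r_cell_ohm)]


def ftr_rber(programmed_states, measured_resistances):
    """Fraction of erroneous bits under a fixed-threshold read (2 bits per cell)."""
    bit_errors = 0
    for state, r_cell in zip(programmed_states, measured_resistances):
        diff = GRAY_BITS[state] ^ GRAY_BITS[read_state(r_cell)]
        bit_errors += bin(diff).count("1")
    return bit_errors / (2 * len(programmed_states))


# Made-up example: the last MLC2 cell has drifted above the 4 MOhm threshold.
programmed = ["SET", "MLC1", "MLC2", "MLC2"]
measured = [30e3, 200e3, 1e6, 4.5e6]
print(ftr_rber(programmed, measured))  # 0.125
```

Because the thresholds never move, any upward drift past a boundary is read as the next-higher state, which is why the uncompensated RBER grows with retention time.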

Fig. 15. (a) FTR RBER of MLC PCM cells at 150 °C over a period of 6 hours and MLC PCM distributions at (b) 4 hours and (c) 6 hours.

Fig. 13 presents an error pattern breakdown comparison with and without the RDC scheme. There are a total of six possible error patterns, and two of these rarely occur: *MLC1-to-SET* and *MLC2-to-MLC1*. This is because the direction of R-drift is upward, such that downward error types are unlikely. The error trend in the case without RDC is expected, and could be explained using Fig. 1(b). The increase in $\gamma$ with cell resistance means that *MLC2-to-RESET* and *MLC1-to-MLC2* are the most common sources of error. In contrast, the error pattern under the RDC scheme is very different, wherein the only significant type of error is *SET-to-MLC1*.
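The six error patterns discussed above are the transitions between neighboring states, three upward and three downward. The Python sketch below enumerates them and tallies observed patterns; the function names and the example data are hypothetical and serve only to illustrate the bookkeeping.

```python
from collections import Counter

STATES = ("SET", "MLC1", "MLC2", "RESET")  # ordered by increasing resistance


def adjacent_error_patterns():
    """List the six neighboring-state error patterns with their direction."""
    patterns = []
    for lower, higher in zip(STATES, STATES[1:]):
        patterns.append((f"{lower}-to-{higher}", "upward"))
        patterns.append((f"{higher}-to-{lower}", "downward"))
    return patterns


def count_patterns(programmed, read_back):
    """Tally error patterns as 'programmed-to-read' labels, skipping correct reads."""
    counts = Counter()
    for p, r in zip(programmed, read_back):
        if p != r:
            counts[f"{p}-to-{r}"] += 1
    return counts


patterns = adjacent_error_patterns()
print(patterns)

# Made-up example: two MLC2 cells read as RESET, one MLC1 cell read as MLC2.
example = count_patterns(
    ["MLC2", "MLC2", "MLC1", "SET"],
    ["RESET", "RESET", "MLC2", "SET"],
)
print(example)
```

Since R-drift moves resistance upward, the upward patterns are expected to dominate in the uncompensated case.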

### D. RDC With Retention Time Consideration

Fig. 14 presents a comparison of FTR RBER under extended duration and elevated temperature. Two arrays of 256k cells were programmed to four MLC states, one with RDC and the other without RDC. Before every measurement, an RDC pulse was applied to each cell in the RDC cell group. The measured FTR RBER revealed an improvement of over two orders of magnitude at room temperature and at 85 °C. Specifically, the FTR RBER in the cells with RDC remained below  $5 \times 10^{-3}$  (practical ECC capability limit) after 48 hours of baking at 85 °C.

Under extended retention time (i.e., months or years at room temperature), PCM cells are likely to enter the resistance decay (R-decay) region, in which PCM R<sub>CELL</sub> decreases due to the crystallization of amorphous GST at grain boundaries [28]. To investigate the impact of R-decay on reliability, we measured the FTR RBER at 150 °C. We selected this temperature to enable the observation of R-drift as well as R-decay within a reasonable period of time. Initially, the RDC scheme was able to reduce FTR RBER by approximately two orders of magnitude. The effectiveness of RDC began to drop after four hours, whereupon the error rate (even with RDC) began increasing rapidly. Fig. 15(a) presents a comparison of FTR RBER at 150 °C over a period of six hours. Because RDC pulses tend to lower R<sub>CELL</sub> slightly toward the SET state, they are ineffective in the R-decay region dominated by crystallization. Nonetheless, without the RDC scheme, the error rate would greatly exceed the correctable range far earlier. Fig. 15(b) and (c) present the MLC distributions at four hours and six hours. In the case of four hours, the MLC2 distribution with RDC is noticeably lower than that without RDC, due to the repeated application of RDC pulses at each measurement point. Under longer retention time, the two cell groups present similar distributions and error rates.

Fig. 16. (a) Overheads of RDC and DR schemes, (b) simulated RDC trigger rate, and (c) simulated overhead analysis of RDC and DRAM-like refresh schemes on measured FTR RBER.

### E. Simulation Results for Latency and Power Overhead

Fig. 16 compares the system-level performance between RDC and DR schemes for MLC PCM based on the measured FTR RBER shown in Fig. 14. To minimize the effects on speed and power overhead, the RDC scheme should only be triggered when necessary, as in the case of ECC decoding failure, rather than before every read operation. In contrast, the DR scheme requires a refresh after every refresh interval ( $T_{\text{Refresh}}$ ). The equations used to calculate overhead using these two schemes are presented in Fig. 16(a). The overhead for the RDC scheme includes the regular read-ECC plus RDC pulse and read-ECC again upon ECC decoding failure, as follows: $RDC = (\text{Read} + \text{ECC}) + (RDC + \text{Read} + \text{ECC})(R_{\text{RDC}})$ . The overhead for the DR scheme includes the regular read-ECC plus the number of MLC refreshes within the program-to-read time, as follows: $DR = (\text{Read} + \text{ECC}) + (\text{Read} + \text{ECC} + \text{MLCPGM})(T_{\text{P2R}}/T_{\text{Refresh}})$ . The blue shaded regions indicate the overhead associated with maintaining data integrity using the two schemes. Note that when using the RDC scheme, the data integrity overhead remains nearly constant, whereas it scales up with $T_{\text{P2R}}$ when using the DR scheme.

Fig. 17. Proposed MLC PCM die, chip, verification system, and summary table.
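The two overhead expressions for the RDC and DR schemes can be written directly as Python functions. In this sketch the function and parameter names are assumptions, and the unit costs in the example are arbitrary placeholders rather than measured latencies or energies.

```python
def rdc_overhead(read, ecc, rdc_pulse, r_rdc):
    """(Read + ECC) + (RDC + Read + ECC) * R_RDC, where r_rdc is the trigger rate."""
    return (read + ecc) + (rdc_pulse + read + ecc) * r_rdc


def dr_overhead(read, ecc, mlc_pgm, t_p2r, t_refresh):
    """(Read + ECC) + (Read + ECC + MLCPGM) * (T_P2R / T_Refresh)."""
    return (read + ecc) + (read + ecc + mlc_pgm) * (t_p2r / t_refresh)


# Placeholder unit costs in arbitrary units (not values from the measurements).
rdc_cost = rdc_overhead(read=1.0, ecc=1.0, rdc_pulse=1.0, r_rdc=0.01)
dr_cost = dr_overhead(read=1.0, ecc=1.0, mlc_pgm=10.0, t_p2r=1e6, t_refresh=1e3)
print(rdc_cost, dr_cost)
```

The second term of the RDC expression is bounded because the trigger rate is a probability, whereas the second term of the DR expression grows linearly with $T_{\text{P2R}}$ for a fixed $T_{\text{Refresh}}$.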

The RDC trigger rate ( $R_{\text{RDC}}$ ) is the probability of ECC decoding failure as a function of $T_{\text{P2R}}$ , which can be estimated using $R_{\text{RDC}} = 1 - \sum_{\alpha=0}^{t} \binom{n}{\alpha} (\text{RBER})^{\alpha} (1 - \text{RBER})^{n-\alpha}$ , where $n$ is the code length, $t$ is the number of correctable errors, $\alpha$ is the number of errors, and $\text{RBER}$ is the raw bit-error rate as a function of $T_{\text{P2R}}$ in Fig. 14.
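The trigger-rate expression above is the upper tail of a binomial distribution and can be evaluated with the Python standard library. The function name and the example RBER value are assumptions; independent bit errors are assumed, as implied by the binomial form.

```python
from math import comb


def ecc_failure_probability(n, t, rber):
    """Probability that more than t of n bits are in error (independent errors)."""
    p_correctable = sum(
        comb(n, a) * rber**a * (1.0 - rber) ** (n - a) for a in range(t + 1)
    )
    return 1.0 - p_correctable


# Example with an illustrative RBER of 1e-3 (not a measured data point).
r_t4 = ecc_failure_probability(n=511, t=4, rber=1e-3)
r_t6 = ecc_failure_probability(n=511, t=6, rber=1e-3)
print(r_t4, r_t6)
```

For very small failure probabilities, the subtraction from one loses floating-point precision, so summing the tail terms for $\alpha > t$ directly would be numerically safer.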

The simulated $R_{\text{RDC}}$ is presented in Fig. 16(b). Given an acceptable ECC decoding failure rate (such as $10^{-12}$ ), the required $T_{\text{Refresh}}$ is estimated to be 450 and 1100 seconds for BCH(511,475,4) and BCH(511,457,6), respectively. Fig. 16(c) illustrates the average latency and energy overhead associated with the two BCH configurations with and without the proposed RDC scheme. The simulation assumes a Gaussian $T_{\text{P2R}}$ workload with varying mean $T_{\text{P2R}}$ values on the x-axis. These results demonstrate that the RDC scheme is far superior to the DR scheme with regard to latency as well as energy overhead when the mean $T_{\text{P2R}}$ is above $10^4$ seconds. Specifically, under a mean $T_{\text{P2R}}$ of $10^6$ seconds, energy overhead was reduced by over three orders of magnitude. This is to be expected, because the RDC scheme incurs nearly fixed overhead associated with each read operation, whereas the overhead of the DR scheme scales up with increasing $T_{\text{P2R}}$ . To fully appreciate the advantages of the RDC scheme, one must understand the significance of $10^6$ seconds (or 11.6 days) in real-life applications. Although no SCM workload is available yet, we can speculate using NAND flash memory workloads. Using the simulation traces of 14 actual workloads on a NAND flash memory device over a period of 7 days, Cai *et al.* showed that most of the retention ages across all NAND pages exceed 6 days [29]. Under such a scenario, the RDC scheme would outperform the conventional DR scheme. Fig. 17 presents a die photo, the chip, the verification system, and the summary table of the MLC PCM chip.

## V. CONCLUSION

This paper proposes a resistance drift compensation (RDC) scheme to mitigate the issue of resistance drift (R-drift). The RDC scheme provides PCM drift compensation using RDC pulses to suppress ECC decoding failure. The proposed method was validated using a 90-nm 128M cells PCM chip and an FPGA-based verification system. The proposed system reduced the MLC PCM fixed-threshold retention (FTR) raw bit-error rate (RBER) by more than 100 $\times$  to within ECC capability. The RDC scheme has also been verified to be effective up to  $10^6$  cycles.

## REFERENCES

- [1] R. F. Freitas and W. W. Wilcke, "Storage-class memory: The next storage system technology," *IBM J. Res. Develop.*, vol. 52, nos. 4–5, pp. 439–447, 2008.
- [2] Y. Choi *et al.*, "A 20 nm 1.8 V 8 Gb PRAM with 40 MB/s program bandwidth," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2012, pp. 46–48.
- [3] H. Chung *et al.*, "A 58 nm 1.8 V 1 Gb PRAM with 6.4 MB/s program BW," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2011, pp. 500–502.
- [4] G. De Sandre *et al.*, "A 90 nm 4 Mb embedded phase-change memory with 1.2 V 12 ns read access time and 1 MB/s write throughput," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2010, pp. 268–269.
- [5] C. Villa, D. Mills, G. Barkley, H. Giduturi, S. Schippers, and D. Vimercati, "A 45 nm 1 Gb 1.8 V phase-change memory," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2010, pp. 270–271.
- [6] Y. N. Hwang *et al.*, "MLC PRAM with SLC write-speed and robust read scheme," in *Symp. VLSI Technol. Dig. Tech. Papers*, Jun. 2010, pp. 201–202.
- [7] G. W. Burr *et al.*, "Phase change memory technology," *J. Vac. Sci. Technol. B*, vol. 28, no. 2, pp. 223–262, 2010.
- [8] W.-S. Khwa *et al.*, "A resistance-drift compensation scheme to reduce MLC PCM raw BER by over 100 $\times$  for storage-class memory applications," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, 2016, pp. 134–135.
- [9] W. S. Khwa *et al.*, "A novel inspection and annealing procedure to rejuvenate phase change memory from cycling-induced degradations for storage class memory applications," in *IEEE Int. Electron Devices Meeting (IEDM) Dig. Tech. Papers*, Dec. 2014, pp. 29.8.1–29.8.4.
- [10] M. Boniardi, D. Ielmini, S. Lavizzari, A. L. Lacaita, A. Redaelli, and A. Pirovano, "Statistics of resistance drift due to structural relaxation in phase-change memory arrays," *IEEE Trans. Electron Devices*, vol. 57, no. 10, pp. 2690–2696, Oct. 2010.
- [11] A. Pirovano, A. L. Lacaita, F. Pellizzer, S. A. Kostylev, A. Benvenuti, and R. Bez, "Low-field amorphous state resistance and threshold voltage drift in chalcogenide materials," *IEEE Trans. Electron Devices*, vol. 51, no. 5, pp. 714–719, May 2004.

- [12] D. Ielmini, D. Sharma, S. Lavizzari, and A. L. Lacaita, "Physical mechanism and temperature acceleration of relaxation effects in phase-change memory cells," in *Int. Rel. Phys. Symp. Tech. Dig.*, 2008, pp. 597–603.
- [13] D. Ielmini, S. Lavizzari, D. Sharma, and A. L. Lacaita, "Temperature acceleration of structural relaxation in amorphous Ge<sub>2</sub>Sb<sub>2</sub>Te<sub>5</sub>," *Appl. Phys. Lett.*, vol. 94, no. 19, 193511, 2008.
- [14] W. K. Njoroge, H.-W. Wöltgens, and M. Wuttig, "Density changes upon crystallization of Ge<sub>2</sub>Sb<sub>2.04</sub>Te<sub>4.74</sub> films," *J. Vac. Sci. Technol. A*, vol. 20, no. 1, 230, 2002.
- [15] I. V. Karpov, M. Mitra, D. Kau, G. Spadini, Y. A. Kryukov, and V. G. Karpov, "Fundamental drift of parameters in chalcogenide phase change memory," *J. Appl. Phys.*, vol. 102, no. 12, 124503, 2007.
- [16] J. Im, E. Cho, D. Kim, H. Horii, J. Ihm, and S. Han, "A microscopic model for resistance drift in amorphous Ge<sub>2</sub>Sb<sub>2</sub>Te<sub>5</sub>," *Current Appl. Phys.*, vol. 11, no. 2, pp. e82–e84, 2011.
- [17] W. S. Khwa *et al.*, "A procedure to reduce cell variation in phase change memory for improving multi-level-cell performances," in *Proc. IEEE Int. Memory Workshop (IMW)*, May 2015, pp. 1–4.
- [18] H. Pozidis *et al.*, "A framework for reliability assessment in multilevel phase-change memory," in *Proc. IEEE Int. Memory Workshop (IMW)*, May 2012, pp. 1–4.
- [19] W. C. Chien *et al.*, "A novel self-converging write scheme for 2-bits/cell phase change memory for storage class memory (SCM) application," in *Symp. VLSI Technol. Dig. Tech. Papers*, 2015, pp. T100–T101.
- [20] H. Y. Cheng *et al.*, "A high performance phase change memory with fast switching speed and high temperature retention by engineering the Ge<sub>x</sub>Sb<sub>y</sub>Te<sub>z</sub> phase change material," in *IEEE Int. Electron Devices Meeting (IEDM) Dig. Tech. Papers*, Dec. 2011, pp. 3.4.1–3.4.4.
- [21] H. Y. Cheng *et al.*, "A thermally robust phase change memory by engineering the Ge/N concentration in (Ge,N)<sub>x</sub>Sb<sub>y</sub>Te<sub>z</sub> phase change material," in *IEEE Int. Electron Devices Meeting (IEDM) Dig. Tech. Papers*, Dec. 2012, pp. 31.1.1–31.1.4.
- [22] S. Kim *et al.*, "A phase change memory cell with metallic surfactant layer as a resistance drift stabilizer," in *IEEE Int. Electron Devices Meeting (IEDM) Dig. Tech. Papers*, Dec. 2013, pp. 30.7.1–30.7.4.
- [23] Y. N. Hwang *et al.*, "MLC PRAM with SLC write-speed and robust read scheme," in *Symp. VLSI Technol. Dig. Tech. Papers*, 2010, pp. 201–202.
- [24] J. Y. Wu *et al.*, "Greater than 2-bits/cell MLC storage for ultra high density phase change memory using a novel sensing scheme," in *Symp. VLSI Technol. Dig. Tech. Papers*, 2015, pp. T94–T95.
- [25] A. Pirovano, A. L. Lacaita, F. Pellizzer, S. A. Kostylev, A. Benvenuti, and R. Bez, "Low-field amorphous state resistance and threshold voltage drift in chalcogenide materials," *IEEE Trans. Electron Devices*, vol. 51, no. 5, pp. 714–719, May 2004.
- [26] M. Boniardi, A. Redaelli, A. Ghetti, and A. L. Lacaita, "Study of cycling-induced parameter variations in phase change memory cells," *IEEE Electron Device Lett.*, vol. 34, no. 7, pp. 882–884, Jul. 2013.
- [27] C. Kim *et al.*, "Direct evidence of phase separation in Ge<sub>2</sub>Sb<sub>2</sub>Te<sub>5</sub> in phase change memory devices," *Appl. Phys. Lett.*, vol. 94, no. 19, 193504, 2009.
- [28] N. Ciocchini, E. Palumbo, M. Borghi, P. Zuliani, R. Annunziata, and D. Ielmini, "Modeling resistance instabilities of set and reset states in phase change memory with Ge-rich GeSbTe," *IEEE Trans. Electron Devices*, vol. 61, no. 6, pp. 2136–2144, Jun. 2014.
- [29] Y. Cai, Y. Luo, E. F. Haratsch, K. Mai, and O. Mutlu, "Data retention in MLC NAND flash memory: Characterization, optimization, and recovery," in *Proc. IEEE Int. Symp. High Perform. Comput. Archit. (HPCA)*, Feb. 2015, pp. 551–563.



**Win-San Khwa** received the B.S. and M.S. degrees from the University of California, Los Angeles, CA, USA, and the University of Michigan, Ann Arbor, MI, USA, in 2007 and 2010, respectively. He joined Macronix International (MXIC) in 2012 and worked on the IBM/Macronix Phase Change Memory Joint Project. He is currently pursuing the Ph.D. degree in electrical engineering at National Tsing Hua University, Hsinchu, Taiwan, under the guidance of Prof. Meng-Fan Chang. His research interests include characterizations, circuit-device interactions, and circuit designs of emerging memories.



**Meng-Fan Chang** (M'05–SM'14) received the M.S. degree from The Pennsylvania State University, State College, PA, USA, and the Ph.D. degree from National Chiao Tung University, Hsinchu, Taiwan.

Currently, he is a Full Professor at National Tsing Hua University (NTHU), Taiwan. Before 2006, he worked in industry for over 10 years. From 1996 to 1997, he designed memory compilers at Mentor Graphics, New Jersey, USA. From 1997 to 2001, he designed embedded SRAMs and Flash in the Design Service Division (DSD) at TSMC, Hsinchu, Taiwan. During 2001–2006, he was a co-founder and Director at IPLib Company, Taiwan, where he developed embedded SRAM and ROM compilers, Flash macros, and flat-cell ROM products. His research interests include circuit designs for volatile and nonvolatile memory, ultra-low-voltage systems, 3D-memory, circuit-device interactions, and memristor logics for neuromorphic computing.

Dr. Chang is the corresponding author of numerous ISSCC and VLSI Symposia papers. He is an associate editor for IEEE TVLSI, IEEE TCAD, and IEICE Electronics. He has served on the technical program committees for ISSCC, IEDM, A-SSCC, ISCAS, VLSI-DAT, and numerous other international conferences. He has been serving as the Associate Executive Director for Taiwan's National Program of Intelligent Electronics (NPIE) since 2011. He received the Academia Sinica (Taiwan) Junior Research Investigators Award in 2012, and the Ta-You Wu Memorial Award of the National Science Council (NSC-Taiwan) in 2011. He has also received numerous awards from Taiwan's National Chip Implementation Center (CIC), NTHU, MXIC Golden Silicon Awards, and ITRI.



**Jau-Yi Wu** received the B.S. degree from the Department of Electrical Engineering, Feng Chia University, Taichung, Taiwan, in 1995, and the M.S. and Ph.D. degrees from the Department of Electrical Engineering and Institute of Microelectronic Engineering, National Cheng Kung University, Tainan, Taiwan, in 1998 and 2002, respectively.

He joined the Nano-Device R&D Department, Macronix International, Hsinchu, Taiwan, in 2002. His current research areas include high-density memory development, nitride-trapping memory devices, and advanced non-volatile memory technologies.



**Ming-Hsiu Lee** received the B.S. and M.S. degrees in electrophysics from National Chiao-Tung University, Hsinchu, Taiwan, in 1991 and 1993, respectively.

He has been with Macronix International (MXIC) since 1995, with work covering process integration, device characterization, product engineering, and emerging memory device R&D. His major research interests include floating gate memories, SONOS devices, 3-D memories, phase change memory, and various resistive memories.



**Tzu-Hsiang Su** received the B.S. degree in computer engineering from the University of California, Irvine, CA, USA, in 2002, and the M.S. degree in computer science from National Chiao Tung University (NCTU), Hsinchu, Taiwan, in 2011. He is currently pursuing the Ph.D. degree in computer science at NCTU and serving as the project manager at Macronix's emerging system lab. His current research interests include computer architectures, memory systems, non-volatile memories, and embedded systems.



**Keng-Hao Yang** received the B.S. degree in computer science from National Chung Cheng University, Chiayi, Taiwan, in 2011. He is currently pursuing the Ph.D. degree in computer science with National Chiao Tung University, Hsinchu, Taiwan. His current research interests include computer architectures, memory system, nonvolatile memories, and embedded systems.



**Tien-Fu Chen** received the B.S. degree in computer science from National Taiwan University, Taipei, Taiwan, in 1983, and the M.S. and Ph.D. degrees in computer science and engineering from the University of Washington, Seattle, WA, USA, in 1991 and 1993, respectively.

He is currently a Professor with the Department of Computer Science, National Chiao Tung University, Hsinchu, Taiwan. He has authored several widely cited papers, and was recognized for contributions on memory prefetching and optimizations of embedded systems. His recent research results include multithreading/multicore media processors, on-chip networks, low-power architecture techniques and related software support tools, and system-on-chip design environment. His current research interests include computer architectures, system-on-chip design, and embedded systems.



**Tien-Yen Wang** received the B.S. degree from the Department of Electrical Engineering, National Taiwan Ocean University, Keelung, Taiwan, in 2001, and the M.S. degree from the Department of Engineering and System Science, National Tsing Hua University, Hsinchu, Taiwan, in 2003.

He joined the Product Development Department, Macronix International, Hsinchu, Taiwan, in 2004. He has experience with product development in nitride-trapping, floating gate, phase change, and resistive memories.



**Hsiang-Pang Li** received the B.S. and M.S. degrees from Chung-Yuan Christian University, Taiwan, in 1994 and 1996, respectively.

He has been with Macronix International Company, Taiwan, since 1998. He has experience in designing nonvolatile memory, such as EPROM, mask-ROM, and flash memory. He is currently in charge of the Macronix Emerging System Lab. His research interests include memory system architectures, embedded systems, and NVM applications.



**Matthew Brightsky** received the B.S. degree in physics, mathematics, and astrophysics from the University of Wisconsin at Madison, Madison, WI, USA, in 1994, and the Ph.D. degree in physics from Iowa State University, Ames, IA, USA, in 1999.

He is a Research Staff Member at the IBM T. J. Watson Research Center, Yorktown Heights, NY, USA. He was with IBM at the IBM Microelectronics Center, Essex Junction, VT, USA, where he worked on developing low-standby-power SRAM, and on dc and RF compact models for CMOSFETs and passive devices. In 2005, he joined the exploratory memory group at the T. J. Watson Research Center and has since worked on integration schemes for phase change memory devices. He is an author or coauthor of 14 patents, 31 technical papers, and a book chapter.



**Sangbum Kim** (S'05–M'11) received the B.S. degree from Seoul National University, Seoul, Korea, in 2001, and the M.S. and Ph.D. degrees from Stanford University, Stanford, CA, USA, in 2005 and 2010, respectively, all in electrical engineering.

He is currently a Research Staff Member with the IBM T. J. Watson Research Center, New York, USA. His current research interests are characterization and modeling of phase change memory devices for various memory applications such as storage-class memory, embedded memory, and brain-inspired neuromorphic computing.



**Hsiang-Lan Lung** (M'01–SM'07) received the Ph.D. degree from National Tsing Hua University, Hsinchu, Taiwan.

He has more than 20 years of experience in semiconductor memory technologies. More than 10 years of his experience was dedicated to phase change memory (PCM) technology. He has been working at IBM T. J. Watson Research Center, New York, USA, as a joint project manager of phase change memory since 2004. In the last 12 years, he has been deeply involved in the R&D of materials, devices, process integration, chip design, and system applications of PCM. Currently, he is a Deputy Director of Emerging Central Lab of Macronix. He started his career at Macronix, Hsinchu, Taiwan, in 1996 as a process integration engineer in charge of the 0.6/0.5 $\mu\text{m}$ logic foundry processes. After that, he was involved in the production and R&D of SRAM, embedded Flash, embedded MROM, SONOS Flash, embedded SONOS Flash, FeRAM, and PCM technologies. He has authored or co-authored 38 IEDM/VLSI/ISSCC papers and served as an IEDM memory technology committee member (2012–2013).

Dr. Lung is a Master Inventor who has been granted 230 US patents. He has given talks, tutorials, and short courses at VLSI, IEDM, ITRS, SEMITEC, IMW, MRS, ICMTD, ICSSICT, and CSTIC.



**Chung Lam** received the B.Sc. degree in electrical engineering from the Polytechnic University of New York, NY, USA, in 1978, and the M.Sc. and Ph.D. degrees, both in electrical engineering, from Rensselaer Polytechnic Institute, Troy, NY, USA, in 1987 and 1988, respectively.

Since joining IBM, Yorktown Heights, NY, USA, in 1978, he has taken responsibilities in various disciplines of semiconductor research, development, and manufacturing including circuit and device designs as well as process integrations for memory and logic applications in IBM's Microelectronic Division. In 2003, he transferred to the IBM Research Division at the T. J. Watson Research Center. In 2007, he was named an IBM Distinguished Engineer. Currently, he manages Phase Change Memory Research Joint Projects. He has more than 200 granted U.S. patents and has published more than 100 technical papers.