

Everything is confidential  
except when explicitly  
agreed upon

# Intro regarding Read sensing schemes

**IMEC CORE CMOS**



**Stefan Cosemans**

# Outline

- Introduction
- Cell resistance distribution
  - Memory element + access transistor = “Rcell”
- Array organization
- Read schemes
- Reference scheme
- Notes

# Introduction

- Forget all you know about amplifiers
  - sense amplifiers used in memory are event-triggered devices
  - → transient analysis, not AC analysis!
  - Typically consist of an analog first stage followed by a strobed comparator (latch, “voltage SA”)



Notation:  
 $N(\mu, \sigma)$

# Read operation: Variations



Mainly worried about local variations.  
Global variations can be accommodated to some extent



Convolution with SA distribution

Assessment indicates that a read scheme designed for  $4.5\sigma$  margin when reading a  $4.5\sigma$  WC cell is about right if limited column redundancy is assumed
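As a rough illustration of what the convolution with the SA distribution implies, the sketch below assumes that the cell signal and the SA offset are independent Gaussians (in the $N(\mu, \sigma)$ notation above) and that the design rule means the mean signal equals $4.5\sigma_{cell} + 4.5\sigma_{SA}$. Both are assumptions made for this example, and the function name is hypothetical.

```python
import math

def read_fail_probability(sigma_cell, sigma_sa, k_cell=4.5, k_sa=4.5):
    """P(signal - offset < 0) for independent Gaussian signal and SA offset.

    Assumption: the mean signal is sized so that a k_cell-sigma worst-case
    cell still leaves k_sa-sigma of SA margin.
    """
    mean_signal = k_cell * sigma_cell + k_sa * sigma_sa
    # Convolution of two independent Gaussians: variances add.
    sigma_total = math.hypot(sigma_cell, sigma_sa)
    z = mean_signal / sigma_total
    # One-sided Gaussian tail probability.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Equal cell and SA spread: the combined margin is 9/sqrt(2) ~ 6.4 sigma.
p_equal = read_fail_probability(1.0, 1.0)
# Ideal SA (no offset): the margin falls back to 4.5 sigma.
p_cell_only = read_fail_probability(1.0, 0.0)
print(f"equal spreads: {p_equal:.2e}, ideal SA: {p_cell_only:.2e}")
```

Under these assumptions the joint failure rate per read is far below the single 4.5σ tail, which is one way to see why the rule can be adequate when a few failing columns can be repaired.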



# Outline

- Introduction
- Cell resistance distribution
- Array organization
  - Local mux
  - SL // BL ; SL // WL ; SL plane [partial or full]
- Read schemes : some options
  - Sensing options
  - Critical transistor sizing
  - Sensing challenges
  - Sense scheme topologies
- Reference scheme
- Notes

# Matrix organization

**SRAM**



Half-selected BLs must be precharged for cell stability

→ High energy consumption

Not compatible with WL boost write / 8T cell



**STT-MRAM**



Cells on WL with  $BL=SL$  are de-selected



Can mux columns

Can use better SA  
Shield wires block crosstalk

# 1T-1MTJ read



- Read bit line is precharged (high)
- Depending on cell state
  - RBL is discharged by Iread
  - RBL stays at initial value
- Strobed SA senses voltage difference



*Similar to 8T-SRAM read*

*[better options exist, but this is good enough for current analysis]*



[Figure labels: strobe, $V_{reference}$]

Effective signal = $I_{LRS} - I_{HRS}$

Must set reference correctly.

- IHRS not 0, but still small
  - BL voltage signal is actually RC discharge instead of current source
  - Timing becomes more
- Circuit designers are concerned about $I_{LRS,min}$ and $I_{HRS,max}$, not the mean values.
- Min and max are the $\sim 10^{-9}$ percentiles for SRAM
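To make the percentile statement concrete, the sketch below converts a tail probability into a number of sigmas and applies it to the two current distributions. It assumes Gaussian current distributions, and the mean and sigma values in the example are hypothetical placeholders, not measured data.

```python
from statistics import NormalDist

def worst_case_currents(mu_lrs, sigma_lrs, mu_hrs, sigma_hrs, tail=1e-9):
    """Return (z, I_LRS,min, I_HRS,max) for a given one-sided tail probability.

    Assumption: both read currents are Gaussian.
    """
    # Number of sigmas that leaves `tail` probability in one tail (~6 for 1e-9).
    z = -NormalDist().inv_cdf(tail)
    i_lrs_min = mu_lrs - z * sigma_lrs  # weakest LRS cell
    i_hrs_max = mu_hrs + z * sigma_hrs  # leakiest HRS cell
    return z, i_lrs_min, i_hrs_max

# Hypothetical example values (amperes).
z, i_lrs_min, i_hrs_max = worst_case_currents(20e-6, 0.5e-6, 10e-6, 0.3e-6)
print(f"z = {z:.2f} sigma")
print(f"worst-case signal = {(i_lrs_min - i_hrs_max) * 1e6:.2f} uA")
```

The worst-case signal is noticeably smaller than the difference of the means, which is why the reference must be placed with the tails in mind.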

# Array organizations (1/2) [prior art]

- Large cell size [Wire Backplane needed → 2x area]
- Each SL and BL carry current of 1 selected cell
  - → No I\*R voltage drop and electro-migration issues
- Can de-select WL-half-selected-cells by setting BL and SL to same potential
- Single phase write
- Either one SL or one BL switches per bit



- Small cell size [only one wire track //BL]
- **The SL carries current of all selected cells**
  - → I\*R voltage drop and electro-migration make this organization impossible except for ultra-low current cells
- All cells on selected WL consume write energy
- Dual phase write
- One SL switches for all bits combined ; one BL switches per write-0



# Array organizations (2/2) [new?]

## Full SL plane [ $\sim 18F^2$ ]

- SL//WL, but connect all SLs together once every so many cells [haven't seen this yet]
- **Two-phase write required**
  - Can be acceptable in some cases
    - Not time-critical
    - One operation much faster than other (e.g. reset versus set for RRAM)

## Partial SL plane [ $\sim 18F^2$ ]

- SL is drawn // WL, but every N cells they are combined into a SL//BL
- only one cell out of these N is accessed at a time
  - [for write, this requires that non-selected BLs are driven to the same potential as SL]
- Small cell size [ only one wire track //BL] + shared //BL track
- The SL carries current of one selected cell
  - (first small piece//WL, then //BL)
- All cells on selected WL consume write energy
  - $E = C_{BL} \cdot V_{write} \cdot V_{dd}$
- Single phase write
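The write-energy expression for a BL that has to track the SL can be evaluated with a few lines. This is a minimal sketch: the 75 aF/cell figure is the one used in the read latency example of this deck, while the number of cells per BL and the voltages are hypothetical, and periphery capacitance is ignored.

```python
def tracking_bl_energy(n_cells_per_bl, c_per_cell, v_write, v_dd):
    """Energy E = C_BL * V_write * V_dd to swing one BL by V_write from V_dd."""
    c_bl = n_cells_per_bl * c_per_cell  # wire + cell capacitance only
    return c_bl * v_write * v_dd

# Hypothetical operating point: 512 cells/BL, 0.8 V write and supply voltage.
energy = tracking_bl_energy(512, 75e-18, 0.8, 0.8)
print(f"{energy * 1e15:.1f} fJ per tracking BL")
```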



# Operation - full SL plane



- All SLs connected together
- Can choose whether all cells on selected WL are activated, or only subset (e.g. only 0, 2, 4, 6)

**Write (two phases)**

| Cell       | First phase<br>Write 0 ; SL=Low     | Second phase<br>Write 1 ; SL=High    |
|------------|-------------------------------------|--------------------------------------|
| Write 0    | BL=High [write]                     | BL=High [deselect]                   |
| Write 1    | BL=Low [deselect]                   | BL=Low [write]                       |
| Keep state | BL=Low [deselect]<br>[BL tracks SL] | BL=High [deselect]<br>[BL tracks SL] |

**Read** (SL=0)

| Column           | BL bias             |
|------------------|---------------------|
| Column to read   | BL=Vread            |
| Column to ignore | BL=0 or<br>BL=Vread |

# Operation - Partial SL plane



- SLs connected together in sets of N
- Can access one cell out of N for write. Can write 0s and 1s at the same time.

**Write (single phase)**

| Cell       | Bias                                                                                           |
|------------|------------------------------------------------------------------------------------------------|
| Write 0    | SL=Low ; BL=High [write 0]                                                                     |
| Write 1    | SL=High ; BL=Low [write 1]                                                                     |
| Keep state | SL is fixed by SL of cell to be written on same partial SL plane; $V(BL)=V(SL)$ [BL tracks SL] |

**Read** (SL=0)

| Column           | BL bias          |
|------------------|------------------|
| Column to read   | BL=Vread         |
| Column to ignore | BL=0 or BL=Vread |

# Sensing options

- BL load
  - High impedance [voltage sensing]
    - Open (precharge, and discharge by cell)
    - Current source
  - Intermediate Rload
  - Low impedance [current sensing]
    - Small load resistor
    - Current conveyor ( $1/gm$ )
- When to sample?
  - During transient [difficult to create reference signal]
  - After the analog signal has settled down
    - Must wait  $\sim 2\tau$ ,  $\tau = RBL \cdot CBL$ 
      - RBL is combination of cell resistance and BL load resistance
      - CBL is cell capacitances, wire capacitances and periphery capacitance
- Reference generation/selection scheme
  - Mimic operation
    - Not all mimicking schemes are equally good  $\Rightarrow$  must average mismatch, track vdd,T,Rpath
  - Reference not obtained from mimicked operation
    - With or without adjustment to temperature (and operation vdd)

## Select rload

- As large as possible
  - Limited by sensing latency  $\tau = RBL \cdot CBL$
  - Need low-mismatch load
- Combine low CBL with SA shared over many cells
- BL mux, global BLs

# Read latency

- Steps
  - Decoder
  - Precharge BL, activate BL load
  - Activate cell WL
  - BL settles to either $V_{BL,HRS}$ or $V_{BL,LRS}$
    - $\tau = R_{BL} C_{BL}$
    - $R_{BL}$ includes cell resistance and BL load resistance
    - $2\tau = 86\%$ ,  $3\tau = 95\%$  of signal developed
    - Beware of cross-talk and signal feed-through during first stage  $\Rightarrow$  best to not live on the edge
  - SA with positive feedback is triggered

[Figure: current injection read, example point A]
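The settling percentages quoted for $2\tau$ and $3\tau$ follow from a first-order RC response, $1 - e^{-t/\tau}$. The sketch below reproduces them and inverts the relation to get a wait time; the single-pole model is an assumption, and the resistance and capacitance in the example are hypothetical.

```python
import math

def settled_fraction(n_tau):
    """Fraction of the final BL signal developed after n_tau time constants."""
    return 1.0 - math.exp(-n_tau)

def settle_time(r_bl, c_bl, fraction):
    """Time needed to develop `fraction` of the final signal, tau = R_BL * C_BL."""
    return -r_bl * c_bl * math.log(1.0 - fraction)

for n in (1, 2, 3):
    print(f"{n} tau -> {settled_fraction(n) * 100:.0f}% of signal")

# Hypothetical BL: 25 kOhm effective resistance, 1024 cells x 75 aF.
t_90 = settle_time(25e3, 1024 * 75e-18, 0.90)
print(f"90% settled after {t_90 * 1e9:.2f} ns")
```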



# Critical transistor sizing for voltage SA cross-coupled pair

## Regular SA



| $V_{BL}$ (HRS-LRS) [mV] | required $\sigma V_{offset}$ [mV] | #fins |
|-------------------------|-----------------------------------|-------|
| 50                      | 5.6                               | 156   |
| 25                      | 2.8                               | 623   |
| 12.5                    | 1.4                               | 2492  |

$\pm 4.5\sigma$  design

For differential pair: need 2x

- 25mV probably OK if we can share one SA over >8K cells
- impacts max parallelism for given memory size
- Beware of large peak currents and data-dependent effects
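The table entries can be approximately reproduced with two assumptions that are not stated on the slide: the reference sits midway between the HRS and LRS levels, so half the BL signal is available as margin, and offset mismatch follows Pelgrom-type scaling, $\sigma V_{offset} \propto 1/\sqrt{\text{\#fins}}$. The function names are hypothetical, and the first table row is used as the calibration point.

```python
def required_sigma_offset_mv(delta_v_bl_mv, n_sigma=4.5):
    """Allowed SA offset sigma for a +/- n_sigma design.

    Assumption: reference midway between HRS and LRS, so margin = signal / 2.
    """
    return (delta_v_bl_mv / 2.0) / n_sigma

def required_fins(sigma_mv, ref_sigma_mv=5.6, ref_fins=156):
    """Fin count scaled from a reference point, assuming sigma ~ 1/sqrt(#fins)."""
    return ref_fins * (ref_sigma_mv / sigma_mv) ** 2

for dv, sigma in ((50.0, 5.6), (25.0, 2.8), (12.5, 1.4)):
    print(f"{dv:5.1f} mV -> sigma <= {required_sigma_offset_mv(dv):.2f} mV, "
          f"~{required_fins(sigma):.0f} fins")
```

Halving the BL signal quadruples the fin count under this scaling, which matches the table to within rounding.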

## Calibration can help

Example:  
 select optimal voltage out of N options



| N | Theoretical reduction in required BL signal | SA size reduction |
|---|---------------------------------------------|-------------------|
| 2 | 2x                                          | 4x                |
| 4 | 4x                                          | 16x               |
| 8 | 8x                                          | 64x               |

### Challenges:

- Need accurate control of  $V_{ref}$
- Difficult to track  $T, V_{dd}$  variations
- Beware of BTI, RTN and thermal noise

### Alternatives:

- Add skew transistors, ...
- Skew mimic-generated reference
  - Using MTJs for this would be nice

# Sensing Challenges

## Transistor imperfections

| Imperfection | Mitigation |
|--------------|------------|
| Process variations | Use differential circuits<br>Calibration |
| Time-0 mismatch | Upsize critical transistors<br>Calibration |
| Degradation, e.g. NBTI/PBTI (negative/positive Bias and Temperature Instability)<br><b>Work ongoing at imec for SRAM (Ben Kaczer's team)</b> | Avoid voltage stress (e.g. disconnect SA during write)<br>Make voltage stress common-mode:<br>SA random toggle input/vref terminal<br>Select most robust transistor type<br>Calibration in-the-field (requires error detection)<br>Repair in-the-field ; ECC |
| RTN | Upsize critical transistors<br>Screen weak SA/cells up-front<br>Error detection + retry |

Cannot solve degradation by upsizing!!

## Environment factors

| Factor                 | Mitigation                                                                                                                   |
|------------------------|------------------------------------------------------------------------------------------------------------------------------|
| Temperature variations | Use a reference scheme that tracks vdd/temperature in the same way as the actual signal. Be careful with strong calibration |
| Voltage variations     | Can have complicated transient behavior                                                                                      |
| Cross-talk             | Sense after settling. Control SA activation carefully                                                                        |

Thermal noise → don't do stupid things...

**Preamplification with switches+capacitance**  
would help for mismatch, degradation, RTN in sense amplifier transistors  
[JP00519010A, NEC, 1983]

### Abstract:

(JP00519010)  
PURPOSE: To increase a minute input voltage, and to obtain a comparator having a high sensitivity by switching and controlling plural capacitors from parallel to series.  
CONSTITUTION: A sampled input voltage V is applied to a terminal 1, the outputs $S_1$ and $S_2$ of a controlling circuit 23 become "1" and "0", respectively, by a clock signal C applied to the terminal by synchronizing with said voltage, and switches SW9-17 are switched to the connection side. As a result, capacitors C4-8 are connected in parallel and charged to the voltage V. Subsequently, when a prescribed time passes, the output $S_1$ of the circuit 23 becomes "0", and the C4-8 are detached from the parallel state and hold the voltage V, respectively. Next, the output $S_2$ becomes "1", the SW18-21 are switched to the connection side, the C4-8 are connected in series, and Vc=5V is applied to the positive terminal of a comparator amplifier circuit 22. By repeating this step by a sample period, the sampled input voltage is brought to five times, respectively, applied to the input side of the circuit 22, amplified and outputted from a comparison output terminal 2.
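The parallel-charge, series-stack idea in the abstract can be sketched numerically. In the ideal case N capacitors charged to V and then stacked give N·V. The optional parasitic load capacitance at the comparator input is an assumption added here to show how charge sharing erodes the gain; it is not part of the abstract, and the function name and component values are hypothetical.

```python
def stacked_output_voltage(v_in, n_caps, c_unit, c_parasitic=0.0):
    """Voltage seen by the comparator after stacking n_caps capacitors in series.

    Each capacitor (c_unit) is first charged to v_in in parallel.
    Stacked in series they behave like a source of n_caps * v_in behind
    a capacitance c_unit / n_caps, which divides against c_parasitic
    (assumed initially discharged).
    """
    c_series = c_unit / n_caps
    return n_caps * v_in * c_series / (c_series + c_parasitic)

# Five capacitors as in the abstract, 10 mV sampled input.
ideal = stacked_output_voltage(0.010, 5, 100e-15)
loaded = stacked_output_voltage(0.010, 5, 100e-15, c_parasitic=5e-15)
print(f"ideal: {ideal * 1e3:.1f} mV, with 5 fF parasitic: {loaded * 1e3:.1f} mV")
```

With these hypothetical values the gain drops from 5x to 4x, so the input capacitance of the strobed comparator has to stay small compared with the stacked capacitance.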



# Sense scheme topologies



Read current polarity?

- $BL \rightarrow SL$  or  $SL \rightarrow BL$
- (set or reset polarity)

A MOSFET is a voltage-controlled component. Even if we say we sense current, we do not...

# Read latency - voltage precharge + discharge only

| Example        |           |                     |                                                     |
|----------------|-----------|---------------------|-----------------------------------------------------|
| LRS state      | 12.5kΩ    | 20µA @ 0.25V        | 100% TMR                                            |
| HRS state      | 25kΩ      | 10µA @ 0.25V        |                                                     |
| CBL per cell   | 75aF/cell |                     |                                                     |
| SA sensitivity | 50mV      | ~64x upsized for 6σ | Affordable with BL mux<br>Better solutions possible |



Note: only $R_{MTJ} \cdot C$ discharge. Must add $R_{MOSFET}$ to both $R_{LRS}$ and $R_{HRS}$
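The example numbers can be turned into a quick estimate of the signal available in a precharge + discharge read. The sketch assumes a BL precharged to 0.25 V that discharges to ground through the MTJ resistance only (as in the note above, the access transistor is ignored) and a hypothetical 1024 cells per BL. The function names are illustrative.

```python
import math

def discharge_signal(t, v0, r_lrs, r_hrs, c_bl):
    """Difference between the HRS and LRS BL voltages at time t (RC discharge)."""
    v_lrs = v0 * math.exp(-t / (r_lrs * c_bl))
    v_hrs = v0 * math.exp(-t / (r_hrs * c_bl))
    return v_hrs - v_lrs

def peak_signal_time(r_lrs, r_hrs, c_bl):
    """Time at which the HRS-LRS difference is largest (derivative set to zero)."""
    return math.log(r_hrs / r_lrs) * r_lrs * r_hrs * c_bl / (r_hrs - r_lrs)

R_LRS, R_HRS = 12.5e3, 25e3   # from the example table
C_BL = 1024 * 75e-18          # hypothetical 1024 cells per BL
V0 = 0.25                     # assumed precharge level

t_peak = peak_signal_time(R_LRS, R_HRS, C_BL)
dv_peak = discharge_signal(t_peak, V0, R_LRS, R_HRS, C_BL)
print(f"peak signal {dv_peak * 1e3:.1f} mV at {t_peak * 1e9:.2f} ns")
```

Under these assumptions the peak difference is a quarter of the precharge voltage, independent of $C_{BL}$, which leaves little headroom above a 50 mV SA sensitivity and makes the strobe timing important.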

# Reference scheme

- Crucial!
  - Must track
    - process corners [transistors and resistors]
    - temperature variations
    - Vdd/Vss fluctuations
    - gradual variations in ME (if applicable)
  - Should avoid doubling the cell mismatch variance, i.e. a $\sqrt{2}$ increase in $\sigma$
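The $\sqrt{2}$ penalty arises when the reference is a single cell with the same mismatch as the cell being read. A minimal sketch, assuming independent Gaussian mismatch and a reference formed by averaging `n_ref_cells` nominally identical cells (the averaging scheme and the function name are assumptions for illustration):

```python
import math

def sigma_cell_minus_reference(sigma_cell, n_ref_cells):
    """Sigma of (cell - reference) when the reference averages n_ref_cells cells.

    Variances of independent Gaussians add; averaging n cells divides
    the reference variance by n.
    """
    return sigma_cell * math.sqrt(1.0 + 1.0 / n_ref_cells)

for n in (1, 2, 4, 16):
    print(f"{n:2d} reference cells -> {sigma_cell_minus_reference(1.0, n):.3f} x sigma_cell")
```

A single reference cell gives the full $\sqrt{2}$ factor, while averaging a modest number of cells brings the total close to the cell mismatch alone.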

# Some notes

- Things you could look into
  - Different kinds of read schemes
  - Impact of WC ME RLRS,RHRS
  - Impact of array organization
    - also #cells / BL ,
  - Different access transistors [e.g. up to 4 fins]
  - Different read direction
  - Maximal allowed ME read current (or voltage) to avoid disturbs
  - Different reference schemes
  - ....