



Workshop on Innovative Memory Technologies June 29, 2011

## What made NAND Flash work?



It takes developments at all levels to achieve a commercially viable product. This presentation concentrates on the design aspects of RRAM.



#### Outline

- CMOx technology
- Array construction
- WL length
- BL length
- Disturb
- Power and Speed
- Results




#### Unity CMOx<sup>™</sup>: Movement of Ions



#### © Unity Semiconductor Corporation


CMOx cell to date:

- IV curve with program/erase
- Cycling to over 100,000 cycles
- 6 months data retention at 110 °C
- 1 year at 70 °C spec








#### Where is the Select Device?

Cross-point array dimensions are severely limited by stray currents unless some sort of select device is introduced.



# Unity CMOx Cross-point Memory™



Four layers are fabricated in approximately the same number of masking steps as NAND.



#### Outline

- CMOx technology
- Array construction
- WL length
- BL length
- Sensing
- Disturb


#### **Cross Point Array Half-Voltage Selection**



Half-selected cells do not program, but they still draw current.
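The half-voltage idea can be illustrated with a small sketch. It assumes the textbook V/2 scheme (selected word line at the full voltage, selected bit line at 0 V, every other line at half the voltage); the function name `cell_voltages` and the 4 x 4 array size are hypothetical, and the 2.5 V value is borrowed from the Program slide. The actual line voltages used on the chip are in the figures and may differ.

```python
def cell_voltages(n_wl, n_bl, sel_wl, sel_bl, v_prog):
    """Voltage across each cell (word line minus bit line) under a V/2 bias.

    Assumed scheme: selected WL at v_prog, selected BL at 0 V,
    all unselected lines at v_prog / 2.
    """
    half = v_prog / 2.0
    wl = [v_prog if i == sel_wl else half for i in range(n_wl)]
    bl = [0.0 if j == sel_bl else half for j in range(n_bl)]
    return [[wl[i] - bl[j] for j in range(n_bl)] for i in range(n_wl)]


# Hypothetical 4 x 4 array, cell (WL 1, BL 2) selected.
v = cell_voltages(n_wl=4, n_bl=4, sel_wl=1, sel_bl=2, v_prog=2.5)
for row in v:
    print(row)
```

Only the selected cell sees the full voltage. Cells sharing its word line or bit line see half of it, which is why they draw current without programming, and all remaining cells see 0 V.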




#### **Read Selection**



- Low current cell
- Longer latency






## **Cross Point Array Limitations**

#### Array Lines:

- Total current (electromigration limit)
- IR drop during program/erase
- Disturb of unselected cells
- Loss of sensing margin (on bit lines)

- Maximum array size = 4096 x 128 x 4 layers
  - Row length (4096) limited by IR drop and electromigration (EM) during program and erase
  - Column length (128 x 2 layers) limited by cumulative stray currents through non-selected cells during read



#### **Cross Point Array with Local Bit Lines**




## **Memory Physical Organization**




## New Array Advantages

#### Short Bit Lines:

- Controlled read leakage
- No IR drop issues on BL
- No electromigration issues on BL
- Reduced program disturb
- Some floating bit lines
- Y decoders are laid out under the memory arrays

#### Long Word Lines:

- IR drop will be cancelled out
- Limited by electromigration
- Limited by program disturb
- Small total X-decoder area


#### Read




#### Program



- Selected cells at 2.5 V
- Unselected cells at 1.5 V (BL) or 1 V (WL)




#### Erase





## Array with Local Bitlines – Layout



## Unity 512Gb/1Tb Storage Chip





#### Die size estimates

- 20nm cell, 45nm CMOS
  - 512Gb: 178 mm²
- 15nm cell, 32nm CMOS
  - 1Tb: 179 mm²
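As a quick check on what these estimates imply, the sketch below converts them to bit density. It assumes 1Tb = 1024 Gb; the function name `gbit_per_mm2` is hypothetical.

```python
def gbit_per_mm2(capacity_gbit, die_area_mm2):
    """Average bit density over the whole die, periphery included."""
    return capacity_gbit / die_area_mm2


# Figures from the die size estimates; 1Tb taken as 1024 Gb (assumption).
estimates = {
    "20nm cell, 45nm CMOS": (512, 178.0),
    "15nm cell, 32nm CMOS": (1024, 179.0),
}

for node, (capacity, area) in estimates.items():
    print(f"{node}: {gbit_per_mm2(capacity, area):.2f} Gb/mm^2")
```

Under that assumption the density roughly doubles between the two nodes while the die area stays almost constant.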


## Does it really work?

- Comments received after the ISSCC 2010 presentation of the 1Mb arrays:
  - "I don't think you can program a cell at the end of a line without disturbing cells at the beginning of the line".
  - "The cell signal will be lost in the leakage current from other cells".
  - "The total current on the lines will be too high, IR drops will be huge".



#### Outline

- CMOx technology
- Array construction
- WL length
  - IR drop and unselected WL biasing
- BL length
- Sensing
- Disturb
- Results


#### **Row IR Drop**



- Characteristics
  - 4096 cells on row
  - Line resistance: $1\,\Omega/\square \times 8192\,\square = 8.2\,\mathrm{k}\Omega$
  - I<sub>prog</sub> = 1 µA, I<sub>half\_select</sub> = 26 nA @ 1.25 V → IR drop over 500 mV at line end
  - Total current at driver is around 200 µA
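The order of magnitude of the row IR drop can be sketched with a simple resistive-ladder model. This is a rough illustration under stated assumptions, not the simulation behind the slide: the line is driven from one end, its 8.2 kΩ is spread evenly over 4096 segments, every cell is a constant-current sink (no voltage dependence), and the 32 programmed cells sit at the far end of the line. The function name `ir_drop_at_line_end` and the placement of the programmed cells are hypothetical.

```python
def ir_drop_at_line_end(n_cells, r_line_ohm, i_half_select_a,
                        programmed=(), i_prog_a=0.0):
    """Return (voltage drop at the far end, current at the driver).

    Assumptions: single-ended driver, uniform line resistance,
    constant-current cells (their current does not fall as the
    local line voltage sags).
    """
    programmed = set(programmed)
    r_seg = r_line_ohm / n_cells
    currents = [i_prog_a if k in programmed else i_half_select_a
                for k in range(n_cells)]

    drop = 0.0
    downstream = 0.0
    # Walk from the far end back to the driver: each segment carries
    # the current of every cell located beyond it.
    for i_cell in reversed(currents):
        downstream += i_cell
        drop += downstream * r_seg
    return drop, downstream


N_CELLS = 4096
R_LINE = 8192.0    # ohms, from the slide
I_HALF = 26e-9     # A, half-selected cell at 1.25 V
I_PROG = 1e-6      # A, cell being programmed

leak_drop, leak_current = ir_drop_at_line_end(N_CELLS, R_LINE, I_HALF)
prog_drop, prog_current = ir_drop_at_line_end(
    N_CELLS, R_LINE, I_HALF,
    programmed=range(N_CELLS - 32, N_CELLS), i_prog_a=I_PROG)

print(f"leakage only: {leak_drop * 1e3:.0f} mV, {leak_current * 1e6:.0f} uA")
print(f"with 32 cells programmed at far end: {prog_drop * 1e3:.0f} mV, "
      f"{prog_current * 1e6:.0f} uA")
```

With these inputs the half-select leakage alone accounts for roughly 0.44 V at the line end, and programming cells at the far end pushes the drop past 500 mV. The driver current in this single-layer model comes out lower than the slide's figure, so it should be read only as a sanity check on the scale of the problem.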



#### Row IR Drop with ΔV compensation



Added voltage creates higher current and parasitic programming in unselected cells

#### Row IR Drop with ΔV compensation and location adjustment



- Adjust Driver voltage depending on location of cells being programmed
- Added current in cells near driver is still too high

#### Row IR Drop with ΔV and location compensation, plus counter bias

[Figure: array bias diagram; surviving labels: "1.5V", "Floating lines"]

- Adjust Driver voltage depending on location of cells being programmed
- Added bias on unselected Word Lines will bias unselected Bit Lines
- Total Word Line current around 100 uA

#### Simulation on IR drop





#### Outline

- CMOx technology
- Array construction
- WL length
- BL length
  - Unselected cell leakage and data dependency
- Disturb
- Power and Speed
- Results



#### **CMOx Read with No Select Device**












#### **Disturb Times during Program and Read**

- Array: 4096 x 128 x 2 layers
- WL during program: 32 cells programmed at a time (128 write cycles per page)
- 10 µs pulse x 128 write cycles x 2 layers = 2.6 ms disturb at half write voltage minus counter bias
- WL during read: 8 read cycles per page to read all bit lines
- 50 µs x 2 x 8 x 100,000 read cycles = 80 s disturb at a read voltage of 0.2 V
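The disturb-time arithmetic above can be reproduced in a few lines. The variable names are hypothetical, and the factor of 2 in the read calculation is assumed to be the number of layers, as it is in the program calculation.

```python
CELLS_PER_WL = 4096
CELLS_PER_WRITE = 32
LAYERS = 2

# Program: each page needs 4096 / 32 write cycles of 10 us per layer.
write_pulse_s = 10e-6
write_cycles_per_page = CELLS_PER_WL // CELLS_PER_WRITE
program_disturb_s = write_pulse_s * write_cycles_per_page * LAYERS

# Read: 8 sensing cycles of 50 us per page, accumulated over 100,000 reads.
read_cycle_s = 50e-6
read_cycles_per_page = 8
page_reads = 100_000
# The factor LAYERS here is an assumption about the slide's "x 2".
read_disturb_s = read_cycle_s * LAYERS * read_cycles_per_page * page_reads

print(f"write cycles per page: {write_cycles_per_page}")
print(f"program disturb per page: {program_disturb_s * 1e3:.2f} ms")
print(f"cumulative read disturb: {read_disturb_s:.0f} s")
```

The program figure comes out at 2.56 ms, which the slide rounds to 2.6 ms.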



- BL during read: disturb at less than 200 mV is nonexistent.

## Measured effect of disturbs





#### **Power and Speed**

- Memory + non-ohmic select device: 5 V to 7 V operation
- Memory alone: < 3 V operation
  - No positive charge pump
  - Small negative charge pump
  - Result: much lower total current consumption
- Speed is achieved by parallelism
  - Trading off block size for speed


## Single Block Throughput Example

Read 16 rows in 1 block to fill page buffer: $16 \times 50\,\mu\mathrm{s} = 800\,\mu\mathrm{s} \rightarrow 10\,\mathrm{MB/s}$



= 4096b sub-word read from Tile in one 50us sensing cycle

## 16 Block Throughput Example

Read 1 row in 16 blocks to fill page buffer: $1 \times 50\,\mu\mathrm{s} = 50\,\mu\mathrm{s} \rightarrow 160\,\mathrm{MB/s}$



= 4096b sub-word read from Tile in one 50us sensing cycle
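The two throughput examples can be tied together with a small calculation. It assumes the page buffer holds 16 sub-words of 4096 bits (8192 bytes), which is inferred from the quoted figures rather than stated on the slides, and that blocks sense fully in parallel. The function name `throughput_mb_per_s` is hypothetical.

```python
from math import ceil

SUBWORD_BITS = 4096        # bits delivered per sensing cycle
SENSE_CYCLE_S = 50e-6      # one sensing cycle
SUBWORDS_PER_PAGE = 16     # assumption: page buffer = 16 sub-words


def throughput_mb_per_s(n_blocks):
    """Read throughput when n_blocks sense in parallel (1 MB = 1e6 bytes)."""
    page_bytes = SUBWORDS_PER_PAGE * SUBWORD_BITS // 8
    cycles = ceil(SUBWORDS_PER_PAGE / n_blocks)
    return page_bytes / (cycles * SENSE_CYCLE_S) / 1e6


for blocks in (1, 16):
    print(f"{blocks:2d} block(s): {throughput_mb_per_s(blocks):.2f} MB/s")
```

Under these assumptions the result is about 10 MB/s for one block and about 164 MB/s for 16 blocks, in line with the rounded 10 MB/s and 160 MB/s on the slides.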







#### Program Mixed Patterns on 4 Kbit arrays without select devices (total array size = 32 Kbits)





#### CMOx™ 100K-cycle measured Read Disturb



- Read cycle = ~10 µs/bit @ 1 V
- $\Delta I_{\mathrm{median}}$ (erase) @ 100K reads = -8.8%
- $\Delta I_{\mathrm{median}}$ (program) @ 100K reads = +18% (includes influence of retention loss)



#### 2bpc & 3bpc Cell Current Distribution

#### <u>2 bpc</u>





#### <u>3 bpc</u>

Decoded arrays, bitmaps after full array program



## Why CMOx (and not other RRAMs)?

- Embedded non-linearity allows for cell selection with a 1000:1 cell/leakage ratio, allowing for 256-bit-wide arrays
  - Non-linearity in both program and erase levels, unlike other RRAMs
- Low switching current (< 1 µA/cell)
- Progressive, time-dependent programming
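A rough way to see why a 1000:1 cell/leakage ratio supports lines of about 256 cells is to compare the selected-cell current with the summed leakage of the other cells on the same line. The sketch assumes every unselected cell leaks exactly 1/1000 of the selected-cell current and that the leakages add linearly; the function name `signal_to_leakage` is hypothetical.

```python
def signal_to_leakage(n_cells_on_line, cell_to_leak_ratio=1000.0):
    """Selected-cell current divided by total leakage of the other cells.

    Assumes identical, linearly additive leakage from every
    unselected cell on the line.
    """
    unselected = n_cells_on_line - 1
    return cell_to_leak_ratio / unselected


for n in (128, 256, 1024, 4096):
    print(f"{n:5d} cells on line: signal/leakage = {signal_to_leakage(n):.2f}")
```

With 256 cells on the line the signal is still several times larger than the total leakage, while at a few thousand cells the leakage would exceed the signal under the same assumptions.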



#### Conclusions

- An architecture with Local Bit Lines enables CMOx RRAM technology for high density memories.
- The LBL pass gates can be laid out under the arrays with minimal die size increase.
- The effects of IR drops, disturbs and data sensitivities have been taken into account and minimized.
- Increased parallelism due to small arrays enables fast read and write speeds.