Summary: Each DRAM bank is physically subdivided into subarrays of 1,024 × 1,024 cells, each with local sense amplifiers, to keep bitlines and wordlines short. DDR5 uses folded bitline subarrays, where BL and /BL run side-by-side to cancel common-mode noise and enable smaller, more reliable cells.
The problem with full-length lines
A naive implementation of a full bank — 65,536 rows × 8,192 columns — would require lines spanning the entire array:
| Line type | Problem |
|---|---|
| Long bitlines | The bitline wire’s parasitic capacitance is large relative to a single 1T1C cell capacitor (~10–20 fF). The voltage shift caused by one cell connecting to the long bitline falls below ~1 mV — too small for reliable sense amplification. |
| Long wordlines | A wordline must simultaneously switch 8,192 transistor gates. The cumulative capacitive load (proportional to 8,192 × gate capacitance) increases the RC time delay, slowing row activation and increasing tRCD. |
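The bitline problem can be sketched with the standard charge-sharing relation, ΔV = (V_cell − Vcc/2) × C_cell / (C_cell + C_BL). The per-row parasitic capacitance used here (`c_bl_per_row_ff`) is an illustrative assumption, not a measured figure, and the linear scaling of bitline capacitance with row count is a simplification. Only the trend matters: the swing shrinks as the bitline grows.

```python
# Charge-sharing estimate for the bitline voltage swing.
# All capacitance values are illustrative assumptions, not datasheet figures.

def bitline_swing_mv(rows_on_bitline, c_cell_ff=15.0, c_bl_per_row_ff=0.15, vcc=1.1):
    """Voltage shift (in mV) on a bitline precharged to Vcc/2 when one cell
    storing Vcc is connected to it."""
    # Assumed: parasitic capacitance grows linearly with the number of rows.
    c_bitline_ff = rows_on_bitline * c_bl_per_row_ff
    delta_v = (vcc - vcc / 2) * c_cell_ff / (c_cell_ff + c_bitline_ff)
    return delta_v * 1000.0

full_bank_mv = bitline_swing_mv(65_536)  # one bitline spanning every row of the bank
subarray_mv = bitline_swing_mv(1_024)    # one bitline spanning a single subarray

print(f"full-length bitline: {full_bank_mv:.2f} mV")
print(f"1,024-row bitline:   {subarray_mv:.1f} mV")
```

With these assumed values the full-length bitline lands below 1 mV, while the short bitline gives a swing tens of times larger. The absolute numbers depend entirely on the assumed capacitances.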
The subarray solution
Each bank is divided into subarrays of 1,024 rows × 1,024 columns, with a dedicated row of sense amplifiers at the bottom of each subarray.
TODO: Subarray close-up — 1,024 × 1,024 grid of memory cells, row of sense amplifiers along the bottom edge. Source:
TODO: Bank cross-section — multiple subarray strips side by side, with sense amplifier rows between them. Source:
Benefits:
| Benefit | Mechanism |
|---|---|
| Smaller capacitors feasible | Shorter bitlines have lower parasitic capacitance → the cell's charge produces a larger voltage shift, so even a smaller cell capacitor still yields a detectable signal (~150 mV) |
| Faster row activation | Shorter wordlines have lower RC delay → access transistors switch faster, reducing tRCD |
| Local sense amplification | Each subarray’s SA row amplifies locally before signals reach the shared column multiplexer |
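The wordline benefit can be illustrated with a rough Elmore estimate for a distributed RC line, where delay ≈ 0.5 × R_total × C_total. Because both resistance and capacitance grow with the number of cells on the line, delay grows with the square of its length. The per-cell resistance and capacitance below are hypothetical placeholders chosen only to show the scaling.

```python
# Elmore-style delay estimate for a wordline modelled as a distributed RC line.
# r_per_cell_ohm and c_per_cell_ff are assumed values, not process data.

def wordline_delay_ps(cells, r_per_cell_ohm=2.0, c_per_cell_ff=0.1):
    """Approximate wordline delay in picoseconds."""
    r_total_ohm = cells * r_per_cell_ohm
    c_total_f = cells * c_per_cell_ff * 1e-15
    return 0.5 * r_total_ohm * c_total_f * 1e12

long_line = wordline_delay_ps(8_192)   # wordline spanning the full bank width
short_line = wordline_delay_ps(1_024)  # wordline spanning one subarray

print(f"8,192-cell wordline: {long_line:.0f} ps")
print(f"1,024-cell wordline: {short_line:.0f} ps")
print(f"ratio: {long_line / short_line:.0f}x")
```

Under this model an 8× shorter wordline is about 64× faster, regardless of the per-cell values assumed.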
The subdivision is physical only — subarrays still share the bank’s row decoder and column multiplexer. The logical address space is unchanged.
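A minimal sketch of how a logical (row, column) address could resolve to a physical subarray, assuming a simple contiguous mapping. Real devices may scramble or remap addresses, and the function name `locate` is hypothetical.

```python
# Map a logical bank address to a subarray and a position inside it.
# Assumes contiguous mapping; actual DRAM address scrambling is vendor-specific.

ROWS_PER_BANK = 65_536
COLS_PER_BANK = 8_192
SUBARRAY_DIM = 1_024  # rows and columns per subarray

def locate(row, col):
    """Return ((subarray_row, subarray_col), (local_row, local_col))."""
    if not (0 <= row < ROWS_PER_BANK and 0 <= col < COLS_PER_BANK):
        raise ValueError("address outside the bank")
    sub_row, local_row = divmod(row, SUBARRAY_DIM)
    sub_col, local_col = divmod(col, SUBARRAY_DIM)
    return (sub_row, sub_col), (local_row, local_col)

print(locate(0, 0))
print(locate(1_024, 1_023))
print(locate(65_535, 8_191))
```

With these dimensions the bank holds a 64 × 8 grid of subarrays, yet the caller only ever supplies the same 16-bit row and 13-bit column address.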
Open bitline vs. folded bitline
Modern DDR5 subarrays use the folded bitline architecture for its noise-rejection properties:
| Architecture | SA placement | Noise behaviour |
|---|---|---|
| Open bitline | SAs on one side of the subarray only | BL and /BL run through different physical regions → they couple different noise patterns → differential SA cannot cancel it |
| Folded bitline | SAs in the middle, between two half-subarrays | BL and /BL run side-by-side through the same region → nearly identical common-mode noise on both → differential SA cancels most of it |
How folded bitline works:
- Each subarray is split into two halves (upper and lower)
- The SA row sits between the halves
- BL connects to one cell in the upper half; /BL connects to the SA’s reference path in the lower half
- Any coupling from switching wordlines, power supply noise, or nearby cells hits BL and /BL equally
- The cross-coupled SA subtracts them: only the tiny cell signal (~150 mV differential) remains
TODO: Folded bitline diagram — two half-subarrays above and below a central SA row. Show BL (connecting a cell in the upper half) and /BL (connecting to lower half). Annotate common-mode noise coupling equally to both rails, and how the differential SA cancels it while amplifying the genuine cell signal.
Tradeoff: Folded bitline roughly doubles the die area consumed per bitline vs. open bitline, because the SA sits between the two half-subarrays rather than at one edge. The noise immunity and smaller achievable cell size are worth the area cost at modern process nodes.
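The noise-cancellation argument reduces to simple arithmetic: a differential sense amplifier sees only BL − /BL, so any term added equally to both lines drops out. The sketch below uses made-up noise values to contrast matched noise (the folded case) with mismatched noise (the open case).

```python
# Differential sensing: common-mode noise cancels, mismatched noise does not.
# Signal and noise magnitudes are illustrative assumptions.

def sensed_differential(cell_signal_v, noise_bl_v, noise_blbar_v, vcc=1.1):
    """Voltage difference seen by the SA between BL and /BL."""
    precharge_v = vcc / 2
    bl = precharge_v + cell_signal_v + noise_bl_v  # line connected to the cell
    bl_bar = precharge_v + noise_blbar_v           # reference line
    return bl - bl_bar

signal = 0.15  # ~150 mV cell signal, as quoted in the text

folded = sensed_differential(signal, noise_bl_v=0.08, noise_blbar_v=0.08)
open_bl = sensed_differential(signal, noise_bl_v=0.08, noise_blbar_v=-0.05)

print(f"matched noise (folded):  {folded * 1000:.0f} mV")
print(f"mismatched noise (open): {open_bl * 1000:.0f} mV")
```

With matched noise the SA recovers the cell signal unchanged. With mismatched noise the error equals the difference between the two noise terms, which can rival the signal itself.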
Sense amplifier circuit
The SA at the base of each subarray is a regenerative cross-coupled latch:
- M1, M2 (NMOS) and M3, M4 (PMOS): cross-coupled inverter pairs
- M5: power transistor connecting the latch bottom to GND; controlled by SAN (n-enable)
- M6: power transistor connecting the latch top to Vcc; controlled by SAP (p-enable)
- Three precharge transistors (M7, M8, M9): equalise BL and /BL to Vcc/2 before each row open
Activation sequence:
- Precharge (M7–M9 on): BL = /BL = Vcc/2 = 0.55 V
- Precharge off, wordline rises: charge sharing creates ~150 mV differential
- SAN asserts (M5 on) → NMOS side activates, pulls lower node toward GND
- SAP asserts (M6 on) → PMOS side activates, pushes upper node toward Vcc
- Positive feedback drives BL and /BL to opposite rails (0 V and 1.1 V)
This amplification takes on the order of 1–2 ns and determines the bulk of the tRCD constraint (see dram-row-hits-and-latency).
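The activation sequence can be mimicked with a toy model of regenerative feedback. This is not a circuit simulation. The per-step gain, the re-centring of both nodes around Vcc/2, and the unitless step count are simplifying assumptions made only to show how a small differential grows until both bitlines reach opposite rails.

```python
# Toy model of the precharge -> charge sharing -> regeneration sequence.
# gain and the step count are arbitrary; no real timing is implied.

VCC = 1.1

def activate_row(stored_bit, swing_v=0.15, vcc=VCC, gain=1.6, max_steps=50):
    """Return (bl, bl_bar, steps) after the latch has fully resolved."""
    # 1. Precharge: both lines equalised to Vcc/2.
    bl = bl_bar = vcc / 2
    # 2. Wordline on: charge sharing nudges BL up (stored 1) or down (stored 0).
    bl += swing_v if stored_bit else -swing_v
    # 3-5. SAN/SAP enabled: positive feedback widens the gap each step,
    #      clamped to the supply rails.
    for step in range(1, max_steps + 1):
        diff = (bl - bl_bar) * gain
        bl = min(vcc, max(0.0, vcc / 2 + diff / 2))
        bl_bar = min(vcc, max(0.0, vcc / 2 - diff / 2))
        if {bl, bl_bar} == {0.0, vcc}:
            return bl, bl_bar, step
    raise RuntimeError("latch did not resolve")

print(activate_row(1))
print(activate_row(0))
```

A stored 1 drives BL to Vcc and /BL to 0 V, and a stored 0 gives the opposite. Driving BL to the full rail is also what restores the cell's charge after the destructive read.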
Isolation transistors
Where adjacent subarrays share a sense amplifier row, isolation transistors separate each subarray’s bitlines from the SA:
- Active subarray: isolation transistors on (conducting) → bitlines connect to SA
- Inactive subarrays: isolation transistors off (non-conducting) → bitlines disconnected
This allows one SA row to serve two neighbouring subarrays (one above, one below in the folded layout), reducing die area.
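A minimal sketch of the selection logic for a shared SA row. The numbering convention (SA row `n` sits between subarray `n` above and subarray `n + 1` below) and the function name are assumptions made for illustration.

```python
# Which isolation transistors conduct for a given active subarray.
# Assumed convention: SA row n is shared by subarray n (above) and n + 1 (below).

def isolation_gates(active_subarray, sa_row):
    """Return the on/off state of the two isolation groups of one SA row.
    True means conducting (bitlines connected to the SA)."""
    above, below = sa_row, sa_row + 1
    return {
        "above": active_subarray == above,
        "below": active_subarray == below,
    }

print(isolation_gates(active_subarray=3, sa_row=3))  # subarray above is connected
print(isolation_gates(active_subarray=4, sa_row=3))  # subarray below is connected
print(isolation_gates(active_subarray=7, sa_row=3))  # SA row idle, both isolated
```

At most one side conducts at a time, so the two subarrays never load the same sense amplifier simultaneously.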
Folded SA layout (packing optimisation)
A further optimisation, the folded sense amplifier layout, physically interleaves SA columns between subarray columns rather than placing the entire SA row at one edge. This recovers some of the area cost of the folded bitline architecture. The details are specific to each manufacturer and are not covered here.
See also
- dram-memory-cell — 1T1C cell structure; the ~10–20 fF capacitor that sets the bitline signal size
- dram-read-write-refresh — how row open and sense amplifier activation fit into the operation sequence
- dram-row-hits-and-latency — how tRCD is determined by SA settling time
- dram-burst-buffer — the complementary optimisation on the column multiplexer side
Sources
- Branch Education — How Does Computer Memory Work?
- MDPI — Low-Power Single Bitline Load Sense Amplifier for DRAM
- Medium — DRAM: Charge Sharing and Sense Amplifier Operation
- Wikipedia — Sense amplifier

