DRAM Subarrays and Bitline Architecture

Summary: Each DRAM bank is physically subdivided into 1,024 × 1,024 cell subarrays with local sense amplifiers to keep bitlines and wordlines short; DDR5 uses folded bitline subarrays, where BL and /BL run side-by-side to cancel common-mode noise and enable smaller, more reliable cells.

The problem with full-length lines

A naive implementation of a full bank — 65,536 rows × 8,192 columns — would require lines spanning the entire array:

Line type	Problem
Long bitlines	The bitline wire’s parasitic capacitance is large relative to a single 1T1C cell capacitor (~10–20 fF). The voltage shift caused by one cell connecting to the long bitline falls below ~1 mV — too small for reliable sense amplification.
Long wordlines	A wordline must simultaneously switch 8,192 transistor gates. The cumulative capacitive load (proportional to 8,192 × gate capacitance) increases the RC time delay, slowing row activation and increasing tRCD.
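
The bitline row above follows from charge conservation when a cell capacitor is connected to a precharged bitline. A minimal sketch, assuming a 15 fF cell, a 1.1 V supply, and a hypothetical parasitic of 0.04 fF per row that scales linearly with bitline length (a made-up value, not a measured one):

```python
def bitline_swing_mv(rows_on_bitline, cell_cap_ff=15.0,
                     bl_cap_per_row_ff=0.04, vcc=1.1):
    """Voltage shift (mV) on a precharged bitline after one cell storing '1' connects."""
    # Assumption: bitline parasitic capacitance grows linearly with the row count.
    bitline_cap_ff = rows_on_bitline * bl_cap_per_row_ff
    v_precharge = vcc / 2
    # Charge conservation: the cell (at vcc) and the bitline (at vcc/2) settle
    # to a common voltage weighted by their capacitances.
    v_final = (cell_cap_ff * vcc + bitline_cap_ff * v_precharge) / (
        cell_cap_ff + bitline_cap_ff
    )
    return (v_final - v_precharge) * 1000.0


swing_full = bitline_swing_mv(65_536)   # bitline spanning the whole bank
swing_short = bitline_swing_mv(1_024)   # bitline spanning 1,024 rows
print(f"full-length bitline: {swing_full:.1f} mV")
print(f"1,024-row bitline:   {swing_short:.1f} mV")
```

With these illustrative values the short bitline swings by roughly 150 mV while the full-length one manages only a few millivolts. The exact figures depend on the real parasitics, which this sketch does not model.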

The subarray solution

Each bank is divided into subarrays of 1,024 rows × 1,024 columns, each with a dedicated row of sense amplifiers along its bottom edge.

TODO: Subarray close-up — 1,024 × 1,024 grid of memory cells, row of sense amplifiers along the bottom edge. Source:

TODO: Bank cross-section — multiple subarray strips side by side, with sense amplifier rows between them. Source:

Benefits:

Benefit	Mechanism
Smaller capacitors feasible	Shorter bitlines have lower parasitic capacitance → even a small cell capacitor produces a detectable voltage shift (~150 mV)
Faster row activation	Shorter wordlines have lower RC delay → access transistors switch faster, reducing tRCD
Local sense amplification	Each subarray’s SA row amplifies locally before signals reach the shared column multiplexer
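
The wordline benefit can be estimated by treating the wordline as a distributed RC line, where both total resistance and total capacitance grow with the number of gates driven. A rough sketch, assuming hypothetical per-gate values of 5 Ω and 0.1 fF (placeholders, not process data) and the usual 0.38·R·C approximation for the 50% delay of a distributed RC line:

```python
def wordline_delay_ps(gates, r_per_gate_ohm=5.0, c_per_gate_ff=0.1):
    """Approximate 50% delay (ps) of a wordline modelled as a distributed RC line."""
    r_total = gates * r_per_gate_ohm            # ohms, grows with line length
    c_total = gates * c_per_gate_ff * 1e-15     # farads, grows with gate count
    return 0.38 * r_total * c_total * 1e12      # seconds -> picoseconds


delay_full = wordline_delay_ps(8_192)    # wordline spanning the whole bank
delay_short = wordline_delay_ps(1_024)   # wordline spanning one subarray
print(f"full-length wordline: {delay_full:.1f} ps")
print(f"subarray wordline:    {delay_short:.1f} ps")
print(f"ratio: {delay_full / delay_short:.0f}x")
```

Because R and C both scale with length, the delay scales with the square of the gate count: an 8× shorter wordline is about 64× faster in this model, whatever per-gate values are assumed.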

The subdivision is physical only — subarrays still share the bank’s row decoder and column multiplexer. The logical address space is unchanged.
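
To illustrate how an unchanged logical address can land in a physical subarray, here is a minimal sketch. It assumes a simple contiguous mapping, where consecutive rows and columns fill one subarray before moving to the next. Real devices may remap or scramble addresses, so `locate` is a hypothetical helper, not a description of any vendor's layout.

```python
ROWS_PER_SUBARRAY = 1_024
COLS_PER_SUBARRAY = 1_024
BANK_ROWS, BANK_COLS = 65_536, 8_192


def locate(row, col):
    """Map a logical (row, col) bank address to (subarray index, local position)."""
    sub_row, local_row = divmod(row, ROWS_PER_SUBARRAY)
    sub_col, local_col = divmod(col, COLS_PER_SUBARRAY)
    return (sub_row, sub_col), (local_row, local_col)


subarrays_per_bank = (BANK_ROWS // ROWS_PER_SUBARRAY) * (BANK_COLS // COLS_PER_SUBARRAY)
print(subarrays_per_bank)   # 512
print(locate(2050, 1500))   # ((2, 1), (2, 476))
```

The controller still issues the same row and column addresses; only the physical wires that respond are shorter.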

Open bitline vs. folded bitline

Modern DDR5 subarrays use the folded bitline architecture for its noise-rejection properties:

Architecture	SA placement	Noise behaviour
Open bitline	SA row sits between two subarrays; BL comes from one and /BL from the other	BL and /BL run through different physical regions → they pick up different noise → the differential SA cannot cancel it
Folded bitline	BL and /BL enter the SA from the same side; an SA row can still sit between two half-subarrays and serve both	BL and /BL run side-by-side through the same region → nearly identical common-mode noise on both → the differential SA largely cancels it

How folded bitline works:

Each subarray is split into two halves (upper and lower)
The SA row sits between the halves
BL and /BL run as adjacent wires through the same half; the active wordline connects a cell to BL only, while /BL stays at the precharge level and acts as the reference
Coupling from switching wordlines, power supply noise, or nearby cells therefore hits BL and /BL almost equally
The cross-coupled SA responds only to the difference between them, so the common-mode noise drops out and the cell signal (~150 mV differential) remains
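
The effect of common-mode noise on a differential sense amplifier can be shown with a few lines of arithmetic. This is a sketch under stated assumptions: the 0.55 V precharge level and ~150 mV signal come from these notes, while the 80 mV noise amplitudes are hypothetical values picked only to make the contrast visible.

```python
V_PRECHARGE = 0.55   # Vcc/2
CELL_SIGNAL = 0.15   # charge-sharing signal for a stored '1'


def sensed_differential(noise_on_bl, noise_on_blb):
    """Differential voltage (BL minus /BL) that the sense amplifier sees."""
    bl = V_PRECHARGE + CELL_SIGNAL + noise_on_bl   # bitline with the accessed cell
    blb = V_PRECHARGE + noise_on_blb               # reference bitline
    return bl - blb


# Same noise on both wires: it cancels in the difference.
matched = sensed_differential(0.08, 0.08)
# Different noise on each wire: it adds to or subtracts from the signal.
mismatched = sensed_differential(-0.08, 0.08)

print(f"matched noise:    {matched * 1000:+.0f} mV")
print(f"mismatched noise: {mismatched * 1000:+.0f} mV")
```

With matched noise the full cell signal survives. With mismatched noise of this size the differential changes sign, so the amplifier would latch the wrong value.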

TODO: Folded bitline diagram: two half-subarrays above and below a central SA row. Show BL (connected to a cell by the active wordline) and /BL (the reference) running side-by-side through the same half. Annotate common-mode noise coupling equally to both wires, and how the differential SA cancels it while amplifying the genuine cell signal.

Tradeoff: Folded bitline costs more die area per cell than open bitline, because each wordline crossing a BL and /BL pair has a cell on only one of the two wires (a cell footprint of roughly 8F² versus 6F² for open bitline). The noise immunity and the smaller cell capacitor it permits are worth the area cost at modern process nodes.

Sense amplifier circuit

The SA at the base of each subarray is a regenerative cross-coupled latch:

M1, M2 (NMOS) and M3, M4 (PMOS): cross-coupled inverter pairs
M5: power transistor connecting the latch bottom to GND; controlled by SAN (n-enable)
M6: power transistor connecting the latch top to Vcc; controlled by SAP (p-enable)
Three precharge transistors (M7, M8, M9): equalise BL and /BL to Vcc/2 before each row activation

Activation sequence:

Precharge (M7–M9 on): BL = /BL = Vcc/2 = 0.55 V
Precharge off, wordline rises: charge sharing creates a ~150 mV differential
SAN asserts (M5 on) → NMOS pair activates and pulls the lower-voltage bitline toward GND
SAP asserts (M6 on) → PMOS pair activates and pulls the higher-voltage bitline toward Vcc
Positive feedback drives BL and /BL to opposite rails (0 V and 1.1 V)

This amplification takes on the order of 1–2 ns and, together with wordline rise and charge sharing, makes up much of the tRCD constraint (see dram-row-hits-and-latency).
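
The regenerative behaviour of the latch can be illustrated with a toy discrete-time model. This is only a sketch: it lumps the SAN and SAP phases into a single positive-feedback loop, and `gain` and `steps` are hypothetical parameters with no physical calibration.

```python
VCC = 1.1


def sense(bl, blb, steps=200, gain=0.05):
    """Toy cross-coupled latch: widen the BL-/BL gap each step, clipping at the rails."""
    for _ in range(steps):
        diff = bl - blb
        # Positive feedback: the higher node is pushed up, the lower node down.
        bl = min(VCC, max(0.0, bl + gain * diff))
        blb = min(VCC, max(0.0, blb - gain * diff))
    return bl, blb


print(sense(0.70, 0.55))   # stored '1': BL starts 150 mV above /BL
print(sense(0.40, 0.55))   # stored '0': BL starts 150 mV below /BL
```

A small initial difference of either sign grows until BL and /BL sit at opposite rails. With exactly zero initial difference this model never resolves, which mirrors why the charge-sharing signal must exceed the amplifier's noise and offset.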

Isolation transistors

Where adjacent subarrays share a sense amplifier row, isolation transistors separate each subarray’s bitlines from the SA:

Active subarray: isolation transistors on (conducting) → bitlines connect to SA
Inactive subarrays: isolation transistors off → bitlines disconnected from SA

This allows one SA row to serve two neighbouring subarrays (one above, one below in the folded layout), reducing die area.
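
The sharing rule can be written as a small selection function. A minimal sketch, assuming one isolation control per side; `connect` and its argument names are hypothetical and stand in for whatever control signals a real device uses.

```python
def connect(iso_upper_on, iso_lower_on, upper_pair, lower_pair):
    """Return the (BL, /BL) voltages a shared SA sees, given the isolation controls."""
    if iso_upper_on and iso_lower_on:
        # Both subarrays would load the same sense amplifier at once.
        raise ValueError("only one neighbouring subarray may be connected")
    if iso_upper_on:
        return upper_pair
    if iso_lower_on:
        return lower_pair
    return None  # both sides isolated: the SA row is idle


# Upper subarray is active; lower subarray stays precharged and isolated.
print(connect(True, False, (0.70, 0.55), (0.55, 0.55)))   # (0.7, 0.55)
```

Keeping the inactive side disconnected also keeps its bitline capacitance off the SA inputs, so the active side's charge-sharing signal is not diluted.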

Folded SA layout (packing optimisation)

A further optimisation, the folded sense amplifier layout, physically interleaves SA columns between subarray columns rather than placing the entire SA row at one edge. This recovers some of the area cost of the folded bitline architecture. The details are manufacturer-specific and are not covered here.

Sources

Branch Education — How Does Computer Memory Work?
MDPI — Low-Power Single Bitline Load Sense Amplifier for DRAM
Medium — DRAM: Charge Sharing and Sense Amplifier Operation
Wikipedia — Sense amplifier
