Summary: A row hit eliminates the most expensive DRAM phases and reduces access latency from ~118 cycles to ~40 cycles; DDR5’s 32-bank, 8-bank-group architecture exists to maximise row hits, enable bank-group interleaving, and support targeted row hammer mitigations.
RAS and CAS
Every DRAM access decomposes into two address phases:
| Phase | Name | Bits | Carries | Triggers |
|---|---|---|---|---|
| RAS | Row Address Strobe | 5 + 16 | Bank select + row address | PRE (row close) + ACT (row open) |
| CAS | Column Address Strobe | 10 | Column address | RD or WR (column select + data transfer) |
A row, once opened by ACT, stays open until an explicit PRE (precharge) command closes it. Successive CAS commands can target the same open row without re-issuing ACT.
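To make the two-phase split concrete, the sketch below decodes a flat address into row, bank, and column fields using the bit widths from the table. The linear low-to-high bit layout is a simplifying assumption for illustration; real IMCs use hashed, interleaved mappings.

```python
# Illustrative address decode using the bit widths from the table: 10 column
# bits, 5 bank-select bits (3 bank-group + 2 bank), 16 row bits. The linear
# layout is a simplifying assumption, not a real IMC mapping.
COL_BITS, BANK_BITS, ROW_BITS = 10, 5, 16

def decode(addr: int) -> tuple[int, int, int]:
    """Split a flat DRAM address into (row, bank, column) fields."""
    col = addr & ((1 << COL_BITS) - 1)
    addr >>= COL_BITS
    bank = addr & ((1 << BANK_BITS) - 1)
    addr >>= BANK_BITS
    row = addr & ((1 << ROW_BITS) - 1)
    return row, bank, col

row, bank, col = decode(0x1234ABC)
print(f"row={row} bank={bank} col={col}")   # RAS carries row+bank, CAS carries col
```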
Row hit vs. row miss
| Event | Condition | Steps executed | Latency |
|---|---|---|---|
| Row hit | Next request targets same bank + same open row | CAS only (skip PRE + ACT) | tCL ≈ 40 cycles (~17 ns) |
| Row miss | Next request targets a different row in the same bank | PRE + ACT + CAS | tRP + tRCD + tCL ≈ 118 cycles (~49 ns) |
A row miss costs roughly three times as much as a row hit (118 vs. 40 cycles). CPU memory controllers, operating systems (memory allocators), and compilers all optimise access patterns to maximise row hits and avoid row thrashing: alternating between two rows in the same bank, so that every access misses.
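A toy cost model (a sketch; the access streams and single-bank framing are hypothetical) makes the thrashing penalty concrete:

```python
# Toy latency model for a single bank: a hit costs tCL, a miss costs
# tRP + tRCD + tCL (cycle counts from the table above). Streams are made up.
HIT, MISS = 40, 118

def cost(rows: list[int]) -> int:
    """Total cycles for a sequence of row numbers, all in one bank."""
    open_row, total = None, 0
    for r in rows:
        total += HIT if r == open_row else MISS
        open_row = r
    return total

print(cost([7] * 8))      # 1 miss + 7 hits = 398 cycles
print(cost([7, 9] * 4))   # thrashing: every access misses = 944 cycles
```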
Core latency metrics (DDR5-4800 example)
Image: Latency timing diagram with tRP, tRCD, tCL, tRAS, and tRC marked on a shared timeline
| Parameter | Symbol | Cycles | Nanoseconds | Measures |
|---|---|---|---|---|
| Row Precharge | tRP | 39 | 16 ns | Time to close a row and precharge all 8,192 bitlines to Vcc/2 |
| RAS-to-CAS Delay | tRCD | 39 | 16 ns | Time from ACT until the sense amplifiers have fully stabilised |
| CAS Latency | tCL | 40 | 17 ns | Time from column address to first data bit on the bus |
| Active to Precharge | tRAS | 79 | 33 ns | Minimum time a row must stay open (sense + capacitor restore) |
| Row Cycle Time | tRC | 118 | ~49 ns | Minimum time between successive ACT commands to the same bank; tRC = tRAS + tRP |
- Row miss total: tRP + tRCD + tCL = 39 + 39 + 40 = 118 cycles (~49 ns)
- Row hit total: tCL only = 40 cycles (~17 ns)
Note: tRAS (79 cycles / 33 ns) is also the minimum window required during refresh — sense amplifiers need the full tRAS duration to restore all 8,192 capacitors in a row. The memory controller must not precharge a row before tRAS has elapsed.
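The cycle and nanosecond columns are mutually consistent and easy to verify (a quick check, assuming the 2400 MHz DDR5-4800 command clock):

```python
# Sanity-check the table: DDR5-4800 transfers 4800 MT/s on a 2400 MHz command
# clock, so one cycle lasts 1 / 2.4 GHz ~= 0.417 ns.
CLK_NS = 1 / 2.4

for name, cycles in [("tRP", 39), ("tRCD", 39), ("tCL", 40), ("tRAS", 79)]:
    print(f"{name}: {cycles} cycles = {cycles * CLK_NS:.1f} ns")

t_rc = 79 + 39                                          # tRC = tRAS + tRP
print(f"tRC: {t_rc} cycles = {t_rc * CLK_NS:.1f} ns")   # 118 cycles ~= 49 ns
```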
Additional timing constraints
| Parameter | Symbol | Typical DDR5 | Measures |
|---|---|---|---|
| Four-Activate Window | tFAW | 40–50 cycles | At most 4 ACT commands are allowed in any rolling tFAW window across all banks. Limits simultaneous wordline activation to prevent power-supply droop (charging 8,192 transistor gates at once draws a current spike). |
| Row-to-Row Delay (different bank groups) | tRRD_S | 8–12 cycles | Minimum gap between ACT commands to banks in different bank groups |
| Row-to-Row Delay (same bank group) | tRRD_L | 12–16 cycles | Minimum gap between ACT commands to different banks within the same bank group |
| CAS-to-CAS Delay (different bank groups) | tCCD_S | 8–12 cycles | Minimum gap between CAS commands to banks in different bank groups |
| CAS-to-CAS Delay (same bank group) | tCCD_L | 16–20 cycles | Minimum gap between successive CAS commands within the same bank group |
Naming gotcha: the suffixes mean Short and Long, not "same". tRRD_S is the shorter gap and applies across different bank groups; tRRD_L is the longer gap and applies within a single bank group (likewise for tCCD_S/tCCD_L). Read the suffix as a delay length, not a group relationship.
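A minimal sketch of these ACT-issue rules, using midpoint values from the table; the `ActScheduler` class is a hypothetical stand-in for the controller's ACT gate, and a real controller enforces many more constraints:

```python
# Minimal model of ACT-issue rules: tRRD_S across bank groups, tRRD_L within
# one, and at most 4 ACTs in any rolling tFAW window. Values are midpoints
# of the typical-DDR5 ranges above, in clock cycles.
from collections import deque

T_RRD_S, T_RRD_L, T_FAW = 8, 14, 45

class ActScheduler:
    def __init__(self):
        self.last_any = -10**9       # cycle of the most recent ACT (any bank)
        self.last_in_group = {}      # bank group -> cycle of its last ACT
        self.recent = deque()        # ACT timestamps inside the tFAW window

    def can_act(self, now: int, group: int) -> bool:
        if now - self.last_any < T_RRD_S:
            return False                                  # too close to any ACT
        if now - self.last_in_group.get(group, -10**9) < T_RRD_L:
            return False                                  # too close within group
        while self.recent and now - self.recent[0] >= T_FAW:
            self.recent.popleft()                         # slide the window
        return len(self.recent) < 4                       # four-activate limit

    def act(self, now: int, group: int) -> None:
        assert self.can_act(now, group)
        self.last_any = now
        self.last_in_group[group] = now
        self.recent.append(now)

s = ActScheduler()
s.act(0, group=0)
print(s.can_act(5, group=1))   # False: violates tRRD_S
print(s.can_act(8, group=1))   # True: cross-group gap satisfied
```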
Bank group interleaving
TODO: Bank group interleaving timing diagram
- Two ACT + CAS sequences targeting BG0 and BG1 shown on a timeline: BG0-ACT, then BG1-ACT (after tRRD_S), then BG0-CAS, then BG1-CAS (separated by only tCCD_S).
- Shows how different bank groups can be pipelined, hiding row-open latency.
Commands to different bank groups benefit from a shorter tCCD_S gap because each bank group has its own independent I/O circuitry — two bank groups can have their column-select operations in flight simultaneously. This is the primary motivation for bank groups:
- 4 bank groups (DDR4): up to 4 concurrent CAS pipelines
- 8 bank groups (DDR5): up to 8 concurrent CAS pipelines → higher sustained throughput
The IMC exploits this by interleaving RD/WR commands across bank groups, hiding the longer same-group tCCD_L behind overlapping cross-group transfers.
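A back-of-envelope comparison, using midpoint tCCD values from the constraints table (the command count is arbitrary):

```python
# Time between the first and last of 8 CAS commands: all in one bank group
# (gated by tCCD_L) versus round-robin across groups (gated by tCCD_S).
T_CCD_S, T_CCD_L = 8, 18   # midpoints of the typical-DDR5 ranges above

n = 8
same_group = (n - 1) * T_CCD_L    # 126 cycles
interleaved = (n - 1) * T_CCD_S   # 56 cycles
print(same_group, interleaved)    # interleaving more than halves the span
```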
Why 32 banks maximise row hits
Each bank independently holds one open row. More banks = more simultaneously open rows = higher probability that the next memory request hits an already-open row.
- 32 banks × 8,192 cells per open row = 262,144 cells accessible without closing any row
- If the CPU workload accesses data spread across ≤32 distinct rows, all can stay open simultaneously — zero row misses
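A toy open-row tracker illustrates the best case; the one-row-per-bank working set below is constructed, not measured:

```python
# Per-bank open-row table: with 32 banks, a working set of <= 32 distinct
# (bank, row) pairs misses only once per bank, then hits forever.
import random

working_set = [(bank, random.randrange(65_536)) for bank in range(32)]
open_rows: dict[int, int] = {}   # bank -> currently open row
hits = misses = 0

for bank, row in random.choices(working_set, k=10_000):
    if open_rows.get(bank) == row:
        hits += 1                # row already open: CAS only
    else:
        misses += 1              # PRE + ACT, then CAS
        open_rows[bank] = row

print(hits, misses)              # at most 32 misses: one cold miss per bank
```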
Bank group refresh parallelism
During REFsb (per-bank refresh), only 1 bank per bank group needs to be refreshed at a time. With 8 bank groups × 4 banks each:
- 8 banks refreshing simultaneously (1 per group)
- 24 banks (3 per group) remain accessible for read/write
This interleaving substantially reduces the effective performance penalty of the mandatory 64 ms refresh cycle. See Refresh operation for REFab vs REFsb timing values.
Image: Die photo showing the 8 bank groups of a chip undergoing REFsb
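A small enumeration makes the REFsb arithmetic explicit (refreshing bank index 2 is an arbitrary choice):

```python
# REFsb refreshes the same bank index in every bank group simultaneously,
# leaving the other three banks of each group available for read/write.
GROUPS, BANKS_PER_GROUP, TARGET = 8, 4, 2

refreshing = [(g, TARGET) for g in range(GROUPS)]
available = [(g, b) for g in range(GROUPS)
             for b in range(BANKS_PER_GROUP) if b != TARGET]
print(len(refreshing), len(available))   # 8 banks busy, 24 banks accessible
```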
Row Hammer
Row Hammer is a reliability and security vulnerability: repeatedly activating (hammering) the same row causes electrical disturbance that flips bits in physically adjacent rows, even though those rows are never directly accessed.
Mechanism:
- Each ACT command stresses the transistor gates and substrate of physically neighbouring rows
- Thousands of activations within a single 64 ms refresh window can cause charge to leak into adjacent capacitors, flipping a bit in a row the attacker never touched
- A flipped bit in another process's memory can enable arbitrary memory corruption or privilege escalation
TODO: Row Hammer mechanism diagram — two adjacent rows shown as cell grids. The aggressor row is hammered (multiple ACT pulses). Charge disturbance is shown leaking into victim rows above and below. Victim-row bit flips illustrated.
DDR5 mitigations:
| Mechanism | How it works |
|---|---|
| On-die ECC | Corrects single-bit errors before data reaches the CPU. Catches some hammer-induced flips, but cannot correct all (especially double-bit or targeted patterns). |
| RFM (Refresh Management) | The DRAM chip tracks activation counts per bank. When activations exceed a threshold (RAAIMT), it signals the IMC. The IMC issues RFM commands giving the DRAM time to refresh nearby victim rows. |
| DRFM (Directed Refresh Management) | Extension of RFM. The IMC identifies the specific aggressor row and issues DRFM targeting the physically adjacent victim rows for immediate targeted refresh. Two variants: DRFMab (all banks) and DRFMsb (same bank across 8 bank groups). |
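A hedged sketch of the RFM bookkeeping described in the table; `RfmCounter`, the threshold value, and the post-RFM credit are illustrative, not JEDEC figures:

```python
# Per-bank rolling activation counter: cross the RAAIMT threshold and the
# DRAM signals that an RFM command is needed; the RFM then credits the count.
from collections import defaultdict

RAAIMT = 32   # illustrative threshold, not a JEDEC value

class RfmCounter:
    def __init__(self):
        self.raa = defaultdict(int)   # (group, bank) -> activation count

    def on_act(self, group: int, bank: int) -> bool:
        """Record an ACT; return True when an RFM is needed."""
        self.raa[(group, bank)] += 1
        return self.raa[(group, bank)] >= RAAIMT

    def on_rfm(self, group: int, bank: int) -> None:
        """RFM issued: DRAM refreshed victim rows, so credit the counter."""
        self.raa[(group, bank)] = max(0, self.raa[(group, bank)] - RAAIMT)

ctr = RfmCounter()
need_rfm = False
for _ in range(RAAIMT):
    need_rfm = ctr.on_act(0, 0)
print(need_rfm)   # True: 32 ACTs to one bank inside the window
ctr.on_rfm(0, 0)  # counter drops back to 0
```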
IMC scheduling
The Integrated Memory Controller (IMC) on the CPU implements scheduling policies that directly affect effective latency and bandwidth:
- Open-page policy: Leave rows open after access. Maximises row hits for sequential or spatially localised workloads.
- Close-page policy: Precharge rows immediately after access. Maximises bank availability for random-access workloads.
- Gear mode (Intel, 12th gen+): At high DDR5 speeds, the IMC runs at half the memory clock (Gear 2), adding ~10–15 ns effective latency but maintaining signal integrity.
- FCLK sync (AMD, Ryzen 7000+): The Infinity Fabric must track the memory clock 1:1 for minimum latency; desynchronisation at very high DDR5 speeds adds latency.
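A toy comparison of the two page policies, reusing the hit/miss cycle costs from above (the access streams are hypothetical; a real IMC adapts per bank):

```python
# Open-page pays tRP + tRCD + tCL on a row change but only tCL on a hit.
# Close-page precharges in the background, so every access costs tRCD + tCL.
HIT, MISS, ACT_ONLY = 40, 118, 79   # tCL; tRP+tRCD+tCL; tRCD+tCL

def open_page(rows: list[int]) -> int:
    open_row, total = None, 0
    for r in rows:
        total += HIT if r == open_row else MISS
        open_row = r
    return total

def close_page(rows: list[int]) -> int:
    return len(rows) * ACT_ONLY

local = [5] * 16                   # 16 accesses to one row
scattered = list(range(16))        # 16 different rows
print(open_page(local), close_page(local))           # 718 vs 1264: open wins
print(open_page(scattered), close_page(scattered))   # 1888 vs 1264: close wins
```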
See also
- dram-read-write-refresh — full step-by-step breakdown of read, write, and refresh sequences; REFab and REFsb
- dram-memory-cell — sense amplifier physics; why tRCD and tRAS have the values they do
- dram-burst-buffer — CAS-side optimisation; tCCD and bank-group interleaving on the column path
- dram-overview — IMC, sub-channels, and the full DIMM physical hierarchy
Sources
- Branch Education — How Does Computer Memory Work?
- GamersNexus — Memory Timings: CAS Latency, tRCD, tRP, tRAS
- ChipLog — Fundamental guide to DRAM performance and timing parameters
- Wikipedia — DDR5 SDRAM
- arXiv — Securing DRAM at Scale: ARFM-Driven Row Hammer Defense

