Covers neural networks, model architectures, training dynamics, inference, and AI safety/policy.
Foundations
- neural-network — Overview: nodes, edges, weights, activations; links to specific architectures
- multilayer-perceptron — Fully-connected feed-forward network (MLP); the simplest architecture
- backpropagation — Algorithm that computes the cost-function gradient, ∇C, via the chain rule
- computation-graph — DAG of mathematical operations recorded during the forward pass; the structure autograd traverses to compute gradients
- gradient-descent — Iterative optimisation: step in the negative-gradient direction, −∇C, to minimise cost (see the sketch after this list)
- cost-function — Scalar measure of how badly the network performs (e.g. mean squared error)
- activation-function — Nonlinear function (sigmoid, ReLU) applied to a layer’s weighted sum, giving the network its expressive power
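The Foundations entries fit together in a few lines of code. A minimal sketch, assuming a one-hidden-layer MLP with sigmoid activations and MSE cost trained on XOR; all sizes and names here are illustrative, not taken from any note:
```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task: XOR, the classic example a single linear layer cannot solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0.0], [1.0], [1.0], [0.0]])

# One hidden layer: a weight per arc, a bias per neuron.
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

eta = 1.0  # learning rate / step size
for step in range(5000):
    # Forward pass: the computation graph autograd would record.
    a1 = sigmoid(X @ W1 + b1)
    a2 = sigmoid(a1 @ W2 + b2)
    cost = np.mean((a2 - y) ** 2)          # MSE cost function

    # Backward pass: chain rule applied layer by layer (backpropagation).
    dz2 = 2 * (a2 - y) / len(X) * a2 * (1 - a2)
    dW2, db2 = a1.T @ dz2, dz2.sum(axis=0)
    dz1 = (dz2 @ W2.T) * a1 * (1 - a1)
    dW1, db1 = X.T @ dz1, dz1.sum(axis=0)

    # Gradient descent: step opposite the gradient, in the direction of −∇C.
    W2 -= eta * dW2; b2 -= eta * db2
    W1 -= eta * dW1; b1 -= eta * db1

print(f"final cost: {cost:.4f}")           # typically approaches 0
```
The backward pass walks the same computation graph as the forward pass, just in reverse order.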
Architectures
- transformer-architecture — Token embeddings flowing through interleaved attention + MLP blocks; the backbone of modern LLMs
- attention-mechanism — Q/K/V + softmax: how token embeddings share information based on context (sketched after this list)
- multi-head-attention — Many attention heads per block running in parallel, each learning a different pattern
- self-attention-vs-cross-attention — Two variants of an attention head differing in where Q, K, V come from
- architecture-bias-and-weight-conventions — How “weight per arc, bias per neuron” generalises across MLP, CNN, RNN, Transformer
- network-diagram-vs-computation-graph — Two graph-based views of the same neural network at different levels of abstraction
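A minimal sketch of a single self-attention head as described in the attention-mechanism entry, with toy dimensions standing in for real model sizes. In self-attention Q, K, and V all come from the same token sequence; cross-attention would take K and V from a different one:
```python
import numpy as np

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)   # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
n_tokens, d_model, d_head = 5, 16, 4         # illustrative sizes

E = rng.normal(size=(n_tokens, d_model))     # token embeddings
W_Q = rng.normal(size=(d_model, d_head))     # learned projections
W_K = rng.normal(size=(d_model, d_head))
W_V = rng.normal(size=(d_model, d_head))

Q, K, V = E @ W_Q, E @ W_K, E @ W_V          # queries, keys, values

scores = Q @ K.T / np.sqrt(d_head)           # scaled dot products
mask = np.triu(np.ones((n_tokens, n_tokens), dtype=bool), k=1)
scores[mask] = -np.inf                       # causal mask: no attending to later tokens

A = softmax(scores)                          # attention pattern; rows sum to 1
out = A @ V                                  # each token gets a context-weighted mix of values
print(A.round(2))
```
Multi-head attention runs several such heads in parallel on the same input and concatenates their outputs.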
Embedding & I/O
- one-hot-encoding — Binary vector representation of categorical indices; prevents false ordinal relationships and acts as a row-select on the weight matrix
- tokenization — Splitting text into subword tokens before embedding
- word-embedding — Learned vectors per token; directions encode meaning; dot product measures alignment
- unembedding — Final projection from residual stream to per-token scores
- logits — Unnormalised pre-softmax scores over the vocabulary
- softmax — Turns logits into a probability distribution; temperature controls sharpness (sketched after this list)
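A minimal end-to-end sketch of the I/O path above (one-hot row-select, embedding, unembedding, softmax with temperature), with toy sizes and random matrices standing in for learned weights:
```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max()                    # numerical stability
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
vocab_size, d_model = 10, 8            # toy sizes

W_E = rng.normal(size=(vocab_size, d_model))   # embedding matrix
W_U = rng.normal(size=(d_model, vocab_size))   # unembedding matrix

token_id = 3
one_hot = np.eye(vocab_size)[token_id]

# A one-hot vector times W_E simply selects row `token_id` of W_E.
assert np.allclose(one_hot @ W_E, W_E[token_id])

residual = W_E[token_id]               # the embedding enters the residual stream
logits = residual @ W_U                # unembedding: one score per vocabulary token

for T in (0.5, 1.0, 2.0):
    p = softmax(logits, temperature=T)
    print(f"T={T}: top probability {p.max():.2f}")   # lower T -> sharper distribution
```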
Training & Optimisation
- pretraining — Self-supervised next-token prediction over massive text corpora (see the loss sketch after this list)
- rlhf — Reinforcement learning from human feedback: post-pretraining fine-tuning that uses human preferences to bend models toward assistant behaviour
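A sketch of the pretraining objective: average cross-entropy of each position's predicted distribution against the token that actually came next. The logits here are random stand-ins for a model's output:
```python
import numpy as np

def next_token_loss(logits, targets):
    # Average cross-entropy between each position's predicted
    # distribution and the actual next token.
    z = logits - logits.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

rng = np.random.default_rng(0)
vocab_size, seq_len = 50, 6
token_ids = rng.integers(0, vocab_size, size=seq_len + 1)

# Random stand-in for a model's output: one row of logits per position.
logits = rng.normal(size=(seq_len, vocab_size))

# Position t is scored on how well it predicted token t+1.
loss = next_token_loss(logits, token_ids[1:])
print(f"loss: {loss:.3f} (chance level is ln({vocab_size}) = {np.log(vocab_size):.3f})")
```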
Inference & Deployment
Language Models
- large-language-model — Top-level concept: what an LLM is, how sampling works, why scale matters (see the sampling loop after this list)
- gpt-3 — OpenAI’s 175B-parameter transformer; running example for all the parameter counts here
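How sampling works, in miniature: an autoregressive loop that feeds each sampled token back into the model. `fake_model` is a hypothetical stand-in for a real forward pass:
```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size = 10

def fake_model(context):
    # Hypothetical stand-in for an LLM forward pass: anything that maps
    # a token sequence to next-token logits would slot in here.
    return np.roll(np.linspace(2.0, -2.0, vocab_size), sum(context) % vocab_size)

def sample_next(logits, temperature=1.0):
    z = logits / temperature
    p = np.exp(z - z.max())
    p /= p.sum()
    return int(rng.choice(len(p), p=p))

# Autoregressive loop: each sampled token is appended and fed back in.
context = [1]
for _ in range(8):
    context.append(sample_next(fake_model(context), temperature=0.8))
print(context)
```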
Interpretability
- superposition — Why features are spread across many neurons rather than one-per-neuron
- johnson-lindenstrauss-lemma — The math result explaining why high-dimensional spaces can pack many near-orthogonal directions (see the demo after this list)
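A quick numeric demonstration of the geometric intuition behind superposition: random unit vectors in a high-dimensional space end up very nearly orthogonal, so many more feature directions than neurons can coexist with little interference:
```python
import numpy as np

rng = np.random.default_rng(0)

def max_abs_cosine(n_vectors, dim):
    # Largest |cosine similarity| over all pairs of random unit vectors.
    V = rng.normal(size=(n_vectors, dim))
    V /= np.linalg.norm(V, axis=1, keepdims=True)
    G = V @ V.T
    np.fill_diagonal(G, 0.0)
    return np.abs(G).max()

# 1000 random directions stay nearly pairwise orthogonal once the
# dimension is large, even with far more directions than dimensions.
for dim in (10, 100, 1000):
    print(f"dim={dim}: max |cos| = {max_abs_cosine(1000, dim):.3f}")
```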
AI Safety & Alignment
Key Papers & Source Summaries
Neural Networks series (3b1b):
- src-3b1b-neural-networks-ch1 — Structure of a feed-forward neural network (MNIST)
- src-3b1b-neural-networks-ch2 — Learning via gradient descent on a cost function
- src-3b1b-neural-networks-ch3 — What the trained network actually learned (and didn’t)
- src-3b1b-neural-networks-ch4 — Backpropagation intuition: three levers, Hebbian echoes, SGD
- src-3b1b-neural-networks-ch5 — Backpropagation calculus: chain rule, multi-neuron generalisation
LLMs series (3b1b):
- src-3b1b-llms-ch1-llms-briefly — Non-technical overview of LLMs, pretraining, RLHF, transformers
- src-3b1b-llms-ch2-transformers — Transformer pipeline: embeddings, blocks, unembedding, softmax
- src-3b1b-llms-ch3-attention — Single-head attention (Q/K/V, masking) and multi-head attention
- src-3b1b-llms-ch4-mlps-store-facts — How MLP blocks might store facts; superposition
Queries
- backprop-graph-terminology — Why “children”, “upstream”, and “downstream” mean what they do in a backprop computation graph
Key Figures
- 3blue1brown — Grant Sanderson’s math/ML YouTube channel; source of the Neural Networks and LLMs series