Math
5 posts in this domain.
- Why does cosine similarity dominate over Euclidean distance in embeddings? Two vectors can be far apart and still mean the same thing. Cosine similarity asks the only question that turns out to matter: are they pointing the same way? Apr 29, 2026 · intro
- Why is the central limit theorem load-bearing? Almost every confidence interval, A/B test, and gradient-noise argument quietly leans on one fact: averages of independent things look Gaussian, even when the things themselves don't. Apr 29, 2026 · intro
- Why does information entropy use log base 2? Shannon could have picked any base for the logarithm in his entropy formula. He picked 2 — and the choice quietly fixes the unit you measure information in. Apr 29, 2026 · intro
- Why does softmax look like that? Softmax is the function that turns a vector of arbitrary numbers into probabilities. The exponential in the middle isn't decorative — it's what makes the whole machine differentiable, well-behaved, and historically inevitable. Apr 29, 2026 · intermediate
- Why matrix multiplication is the bottleneck of modern ML Modern ML is mostly one operation in a trench coat. Understanding why matmul dominates explains hardware, software, and why GPUs eat the world. Apr 29, 2026 · intermediate
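The cosine-similarity teaser's claim — two vectors can be far apart yet point the same way — fits in a few lines. A minimal sketch (the example vectors are chosen for illustration, not taken from the post):

```python
import math

def cosine_similarity(a, b):
    # Dot product over the product of norms: depends only on direction,
    # not on how long either vector is.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Two "embeddings" with identical direction, one ten times longer.
u = [1.0, 2.0, 3.0]
v = [10.0, 20.0, 30.0]

print(cosine_similarity(u, v))   # 1.0 — same direction, maximal similarity
print(euclidean_distance(u, v))  # ~33.7 — far apart by raw distance
```

Scaling `v` by any positive constant leaves the cosine at 1.0 while the Euclidean distance grows without bound — which is the "only question that matters" point from the teaser.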
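The central-limit-theorem teaser can be checked empirically: average enough uniform draws (which are individually flat, nothing like a bell curve) and the means cluster around 0.5 with the Gaussian-predicted spread σ/√n. A quick simulation, with sample sizes picked arbitrarily for the demo:

```python
import random
import statistics

random.seed(0)

# Each trial averages n Uniform(0, 1) draws; CLT says the resulting means
# should look Gaussian with mean 0.5 and sd (1/sqrt(12)) / sqrt(n).
n, trials = 100, 10_000
means = [statistics.fmean(random.random() for _ in range(n)) for _ in range(trials)]

sample_sd = statistics.stdev(means)
predicted_sd = (1 / 12) ** 0.5 / n ** 0.5  # sd of Uniform(0,1) is 1/sqrt(12)

print(statistics.fmean(means))        # ~0.5
print(sample_sd, predicted_sd)        # the two should nearly agree
```

The individual draws never look Gaussian; only their averages do — which is exactly the "load-bearing" fact the post's confidence intervals and A/B tests lean on.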
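The entropy teaser's point — choosing log base 2 fixes the unit as bits — shows up immediately in small examples. A sketch (distributions chosen for illustration):

```python
import math

def entropy_bits(probs):
    # Shannon entropy H = -sum p log2 p; base 2 means the answer is in bits.
    # Terms with p == 0 contribute nothing (the limit of p log p is 0).
    return sum(-p * math.log2(p) for p in probs if p > 0)

print(entropy_bits([0.5, 0.5]))   # 1.0 — a fair coin flip is exactly one bit
print(entropy_bits([0.25] * 4))   # 2.0 — four equal outcomes need two bits to name
print(entropy_bits([1.0]))        # 0.0 — a certain outcome carries no information
```

Picking base e instead would only rescale every answer by ln 2 (nats instead of bits); the base is a unit choice, not a change in the theory.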
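The softmax teaser's claims — arbitrary numbers in, probabilities out, with the exponential doing the work — can be sketched in a few lines. This version includes the standard max-subtraction trick for numerical stability, an assumption about how the post implements it:

```python
import math

def softmax(logits):
    # Subtract the max before exponentiating: softmax is invariant to
    # shifting all inputs by a constant, and this avoids exp() overflow.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, -1.0])
print(probs)                       # all positive, ordered like the inputs
print(sum(probs))                  # sums to 1 — a valid probability vector

print(softmax([1000.0, 1000.0]))  # [0.5, 0.5] — naive exp(1000) would overflow
```

The exponential guarantees every output is strictly positive however negative the input, and keeps the whole map smooth — the differentiability the teaser calls non-decorative.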