Build a Multilayer Perceptron in pure Python. No libraries.
The thing that surprised me most when building an MLP from scratch wasn't the math. The math is manageable. What tripped me up was how much of PyTorch's "magic" is actually just bookkeeping.
During the forward pass, you have to remember your intermediate values (the pre-activation z and the post-activation a for each layer) because the backward pass needs them. If you throw them away, you have to re-run the forward pass during backpropagation, which defeats the purpose.
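Here's a minimal sketch of that bookkeeping. The names (`forward`, the `(W, b)` layer pairs, `relu`) are my own illustrations, not necessarily the tutorial's actual code, and it applies ReLU everywhere for brevity where the real network uses softmax on the last layer:

```python
def relu(v):
    return [max(0.0, x) for x in v]

def forward(layers, x):
    """Run x through each (W, b) layer, caching z and a for backprop.

    layers: list of (W, b) pairs, W as a list of rows, b as a list.
    Returns the output plus the per-layer caches the backward pass needs.
    """
    activations = [x]      # a[0] is the input itself
    pre_activations = []   # z for each layer
    a = x
    for W, b in layers:
        # weighted sum z = W @ a + b, in pure Python
        z = [sum(w_ij * a_j for w_ij, a_j in zip(row, a)) + b_i
             for row, b_i in zip(W, b)]
        a = relu(z)        # (last layer really uses softmax)
        pre_activations.append(z)
        activations.append(a)
    return a, pre_activations, activations
```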
The second gotcha: backpropagation is the chain rule applied backwards through a list. That's it. But the indexing is fiddly, and off-by-one errors are invisible until your loss doesn't decrease.
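To make the indexing concrete, here's a hedged sketch of that backward walk, matching the `forward` sketch above. The comments mark exactly where the off-by-one hides: `activations` has one more entry than the layer list, because `activations[0]` is the input.

```python
def backward(layers, pre_activations, activations, delta):
    """Walk the layers in reverse, applying the chain rule.

    delta: dLoss/dz for the *last* layer (with softmax + cross-entropy
    this is just `prediction - one_hot_target`).
    Returns per-layer gradients for weights and biases.
    """
    grads = []
    for l in range(len(layers) - 1, -1, -1):
        W, _b = layers[l]
        a_prev = activations[l]   # activations[l], not [l-1]:
                                  # activations[0] is the input
        # gradient w.r.t. this layer's weights and biases
        dW = [[d * a for a in a_prev] for d in delta]
        db = list(delta)
        grads.append((dW, db))
        if l > 0:
            # push delta back through W, then through the previous
            # layer's ReLU via its cached pre-activation z
            delta = [
                sum(W[i][j] * delta[i] for i in range(len(delta)))
                * (1.0 if pre_activations[l - 1][j] > 0 else 0.0)
                for j in range(len(a_prev))
            ]
    grads.reverse()   # restore layer order: grads[l] belongs to layers[l]
    return grads
```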
After implementing the training loop and watching the loss actually decrease, I felt something I didn't expect: relief. Not pride — relief. Because for the first time I was certain that the thing I was building was doing what I thought it was doing. No black box.
Also: a 3,400-parameter network trained for 500 epochs on 26 examples fits in about 14 KB and classifies all 26 letters with 95%+ accuracy. That's smaller than most JPEG images.
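A quick sanity check on those numbers, assuming 32-bit floats:

```python
hidden = 26 * 64 + 64       # input->hidden weights + biases = 1,728
output = 64 * 26 + 26       # hidden->output weights + biases = 1,690
params = hidden + output    # 3,418 parameters total
print(params, params * 4)   # 3418 parameters, 13,672 bytes (~13.4 KiB)
```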
A complete MLP that:
- Takes a 26-dimensional one-hot vector (representing a letter) as input
- Passes it through a hidden layer of 64 ReLU neurons
- Produces a 26-dimensional output
- Is trained with SGD + cross-entropy loss + backpropagation (the loss is sketched after this list)
- Achieves 95%+ accuracy at classifying all 26 letters
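For concreteness, here's a sketch of the input encoding and the loss. The names `one_hot`, `softmax`, and `cross_entropy` are illustrative, not necessarily the tutorial's, and only the stdlib `math` module is used:

```python
import math

def one_hot(letter):
    """Encode 'a'..'z' as a 26-dimensional one-hot vector."""
    v = [0.0] * 26
    v[ord(letter) - ord('a')] = 1.0
    return v

def softmax(logits):
    m = max(logits)                          # subtract max for stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(probs, target):
    """-sum(t * log(p)); target is one-hot, so one term survives."""
    return -sum(t * math.log(p + 1e-12) for t, p in zip(target, probs))

probs = softmax([0.0] * 26)                  # uniform logits -> 1/26 each
print(cross_entropy(probs, one_hot('c')))    # ln(26) ≈ 3.258
```

That last line doubles as a sanity check: an untrained network should start near the uniform loss of ln(26) ≈ 3.26, and anything else means a bug.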
Seven chapters, building one piece at a time.
| Chapter | File | What It Covers |
|---|---|---|
| 1 | `01_math_foundations.py` | Vectors, matrices, dot product: the atoms |
| 2 | `02_single_neuron.py` | One neuron: weighted sum + activation |
| 3 | `03_forward_pass.py` | Layer and Network: many neurons, chained |
| 4 | `04_loss_function.py` | Softmax, cross-entropy, MSE |
| 5 | `05_backpropagation.py` | The chain rule, spelled out |
| 6 | `06_training_loop.py` | SGD: forward → loss → backward → update |
| 7 | `07_final_project.py` | LetterClassifier: the first complete AI |
**Option A: Just read the solutions.** Run each solution file and read the output. Good for building intuition before you implement.

```
python3 tutorials/01-mlp-from-scratch/solution/01_math_foundations.py
```

**Option B: Implement from starter code.** Open `starter_code/01_math_foundations.py`, read the docstrings, implement each function. Run the tests to check.

```
python3 -m pytest tutorials/01-mlp-from-scratch/tests/test_01_math.py -v
```

**Option C: Read lesson.md first.** `lesson.md` has a detailed walkthrough of each chapter with code explained line by line. Read it, then implement.
A suggested path:

- This file (you're reading it)
- `lesson.md`: the narrative
- `starter_code/01_math_foundations.py`: implement it
- Run `tests/test_01_math.py` to check your work
- Repeat for chapters 2-7
- Run `python3 solution/07_final_project.py` to see the whole thing work
The final project is self-contained:
```
python3 tutorials/01-mlp-from-scratch/solution/07_final_project.py
```

This trains the letter classifier for 500 epochs and prints accuracy for all 26 letters. It takes about 5-10 seconds.
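What runs inside is the chapter 6 cycle. A minimal sketch of its shape, reusing the hypothetical `forward`/`backward` helpers sketched earlier; `layers` and `dataset` (a list of 26 one-hot input/target pairs) are assumed to exist, and the real code applies softmax on the last layer rather than ReLU:

```python
def sgd_update(layers, grads, lr=0.1):
    """In-place SGD step: parameter -= lr * gradient, for every weight and bias."""
    for (W, b), (dW, db) in zip(layers, grads):
        for i in range(len(W)):
            for j in range(len(W[i])):
                W[i][j] -= lr * dW[i][j]
            b[i] -= lr * db[i]

# One epoch per outer iteration: forward -> loss -> backward -> update.
for epoch in range(500):
    for x, target in dataset:
        probs, zs, acts = forward(layers, x)
        # with softmax + cross-entropy, dLoss/dz simplifies to p - t
        delta = [p - t for p, t in zip(probs, target)]
        grads = backward(layers, zs, acts, delta)
        sgd_update(layers, grads)
```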
Once you've got the MLP working, Tutorial 02 introduces LSTMs — networks with memory. The motivating question: "Our MLP handles one letter at a time. What if we need to look at a sequence of letters to figure out how the first one sounds?"