
Building makemore Part 2: MLP

9 chapters with key takeaways — read first, then watch
1. Bigram Model Limitations & MLP Introduction · 0:01-1:48 · 1m 47s · Limitation
2. MLP Architecture & Word Embeddings Theory · 1:49-8:59 · 7m 10s · Architecture
3. Dataset Preparation & Embedding Layer Implementation · 9:00-18:34 · 9m 34s · Architecture
4. Hidden & Output Layers, Loss Calculation · 18:35-32:16 · 13m 41s · Architecture
5. Efficient Loss & Basic Training Loop · 32:17-41:20 · 9m 3s · Training
6. Full Dataset Training, Mini-Batching, & Learning Rate Tuning · 41:21-56:19 · 14m 58s · Training
7. Data Splits, Model Underfitting, & 2D Embedding Visualization · 56:20-1:06:55 · 10m 35s · Use Case
8. Scaling Embeddings, Advanced Optimization, & Model Sampling · 1:06:56-1:14:54 · 7m 58s · Training
9. Google Colab for Accessibility · 1:14:55-1:15:40 · 45s · Demo

Video Details & AI Summary

Published Sep 12, 2022
Analyzed Jan 21, 2026

AI Analysis Summary

This video, part two of the 'makemore' series, walks through building a multi-layer perceptron (MLP) for a character-level language model. It addresses the limitations of simpler bigram models by introducing word embeddings and a multi-layer architecture, demonstrating dataset preparation, the PyTorch implementation of the embedding, hidden, and output layers, and efficient loss calculation. The tutorial also covers the essentials of training, including mini-batching, learning rate tuning, data splitting, and model scaling, and culminates in a model that generates more realistic names, with a notebook accessible via Google Colab.
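
For readers who want a concrete picture before watching, below is a minimal PyTorch sketch in the spirit of the pipeline the video builds. The specific sizes (a 3-character context, 10-dimensional embeddings, 200 hidden units), the random stand-in dataset, and the two-step learning-rate schedule are illustrative assumptions, not details confirmed by this summary.

```python
import torch
import torch.nn.functional as F

# Illustrative sizes (assumptions): 27 tokens (a-z plus '.'), a context of
# 3 previous characters, 10-dim embeddings, 200 hidden units.
vocab_size, block_size, emb_dim, n_hidden = 27, 3, 10, 200

g = torch.Generator().manual_seed(2147483647)
C  = torch.randn((vocab_size, emb_dim), generator=g)            # embedding table
W1 = torch.randn((block_size * emb_dim, n_hidden), generator=g) # hidden layer
b1 = torch.randn(n_hidden, generator=g)
W2 = torch.randn((n_hidden, vocab_size), generator=g)           # output layer
b2 = torch.randn(vocab_size, generator=g)
parameters = [C, W1, b1, W2, b2]
for p in parameters:
    p.requires_grad = True

# Stand-in dataset (assumption): random contexts and targets so the sketch
# runs end to end; the video builds X, Y from a file of real names.
X = torch.randint(0, vocab_size, (1000, block_size))
Y = torch.randint(0, vocab_size, (1000,))

# Mini-batch training loop: sample a batch, forward, backward, update.
for step in range(1000):
    ix = torch.randint(0, X.shape[0], (32,))        # random mini-batch
    emb = C[X[ix]]                                  # (32, 3, 10): a pure lookup
    h = torch.tanh(emb.view(-1, block_size * emb_dim) @ W1 + b1)
    logits = h @ W2 + b2                            # (32, 27)
    loss = F.cross_entropy(logits, Y[ix])           # fused, numerically stable
    for p in parameters:
        p.grad = None
    loss.backward()
    lr = 0.1 if step < 500 else 0.01                # crude learning-rate decay
    for p in parameters:
        p.data += -lr * p.grad

# Sampling: slide the context window forward one predicted character at a time.
context = [0] * block_size                          # start from all '.' tokens
out = []
for _ in range(50):                                 # cap the length for safety
    emb = C[torch.tensor([context])]                # (1, 3, 10)
    h = torch.tanh(emb.view(1, -1) @ W1 + b1)
    logits = h @ W2 + b2
    probs = F.softmax(logits, dim=1)
    ix = torch.multinomial(probs, num_samples=1).item()
    context = context[1:] + [ix]
    out.append(ix)
    if ix == 0:                                     # 0 is the '.' end token
        break
```

The F.cross_entropy call is the "efficient loss" idea from chapter 5: compared with materializing the softmax and picking out log-probabilities by hand, it fuses those operations and stays numerically stable for large logits.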

Title Accuracy Score: 9/10 (Excellent)
Processing time: 38.4s
Model: gemini-2.5-flash