Tokenization — Explain It Like I'm 5
You know how you learned to read? You didn’t look at a whole sentence at once. You went letter by letter, then word by word.
AI doesn’t read exactly like that. It reads in chunks, and the rules for how to chunk things up were learned ahead of time from lots of text, not written by a person.
Imagine you’re eating a pizza. You could eat it:
- One tiny nibble at a time
- One slice at a time
- The whole thing at once
AI breaks language into “slices” too. Those slices are called tokens.
Here’s the thing though — the AI’s slices are kind of weird. The word “playing” might become two tokens: “play” and “ing”. The word “cat” is one token. The word “unfortunately” gets chopped into pieces you’d never expect.
Why? Because the AI learned its slices by looking at a massive amount of text and figuring out which chunks show up together the most often. Short common words get their own token. Rare or long words get split up.
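If you like seeing things in code, here is a toy sketch of that learning idea, loosely in the style of byte-pair encoding. It is not the tokenizer any real AI uses: the word list is made up, the function names (`learn_merges`, `merge_pair`) are invented for this example, and real tokenizers train on vastly more text and work on bytes rather than letters. The trick is the same, though: start with single letters, then keep gluing together whichever neighbouring pair shows up most often.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Glue every adjacent occurrence of `pair` in one word into a single chunk."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return merged


def learn_merges(words, num_merges):
    """Toy byte-pair-style learning: repeatedly merge the most common neighbouring pair."""
    corpus = [list(word) for word in words]  # start from single letters
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols in corpus:
            pair_counts.update(zip(symbols, symbols[1:]))
        if not pair_counts:
            break  # every word is already a single chunk
        best_pair = max(pair_counts, key=pair_counts.get)
        merges.append(best_pair)
        corpus = [merge_pair(symbols, best_pair) for symbols in corpus]
    return merges, corpus


# Made-up training text: "play" and "ing" both show up a lot.
words = ["play", "play", "play", "playing", "going", "doing", "seeing"]
merges, corpus = learn_merges(words, num_merges=5)
print(corpus[3])  # ['play', 'ing']
print(corpus[4])  # ['g', 'o', 'ing']
```

After five merges, the common chunks "play" and "ing" have each become a single token, while the rarer "going" is still in pieces. That is the "short common words get their own token, rare words get split up" pattern in miniature.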
Here’s why this matters for you:
When companies say ChatGPT can handle “128,000 tokens,” that’s not 128,000 words. For English text, a common rule of thumb is about three-quarters of a word per token, so it’s closer to 96,000 words, because many words turn into multiple tokens.
And when you’re paying per token? That “ing” at the end of “playing” costs you money.
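The math behind those two points is just multiplication. Here is a rough sketch. The 0.75 words-per-token figure is a rule of thumb for English, not an exact number, and the price below is completely made up for illustration, so check your provider's real pricing.

```python
WORDS_PER_TOKEN = 0.75  # rough rule of thumb for English text (an assumption)


def estimate_words(tokens, words_per_token=WORDS_PER_TOKEN):
    """Roughly how many English words fit in this many tokens."""
    return round(tokens * words_per_token)


def estimate_cost(tokens, price_per_million_tokens):
    """Cost of this many tokens at a given price per one million tokens."""
    return tokens / 1_000_000 * price_per_million_tokens


print(estimate_words(128_000))  # 96000

# Hypothetical price of 2.00 per million tokens, purely for illustration.
print(estimate_cost(128_000, price_per_million_tokens=2.00))
```

Text with lots of rare words, code, or non-English languages usually splits into more tokens per word, so treat the estimate as a ballpark.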
Also, AI can sometimes make funny spelling mistakes because it’s not thinking letter-by-letter. It’s thinking chunk-by-chunk, and sometimes the chunks don’t add up quite right.
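To see why letters get lost, it helps to know that the AI never actually sees the chunks as text. Each chunk is swapped for an ID number first. The tiny vocabulary below is invented for this example (real ones have tens of thousands of entries, and the real IDs and splits will differ), but it shows what the model is handed.

```python
# Hypothetical mini-vocabulary: chunk -> ID number (made up for illustration).
vocab = {"straw": 101, "berry": 102, "play": 103, "ing": 104}


def encode(chunks):
    """Swap each chunk for its ID number, which is all the model gets to see."""
    return [vocab[chunk] for chunk in chunks]


print(encode(["straw", "berry"]))  # [101, 102]
```

From the model's side, "strawberry" here is just the pair 101, 102. Nothing in those two numbers says how many times the letter "r" appears, which is why letter-level questions can trip it up.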
One thing to remember: Tokens aren’t words and they aren’t letters. They’re the AI’s own weird alphabet — and once you understand that, a lot of AI behavior suddenly makes sense.
See Also
- AI Hallucinations: ChatGPT sometimes makes up facts with total confidence. Here's the weird reason why, and why it's not as simple as 'the AI lied.'
- Artificial Intelligence: What is AI really? Think of it as a dog that learned tricks: impressive, but it doesn't know why it's doing them.
- Bias-Variance Tradeoff: The fundamental tension in machine learning between being wrong in the same way vs. being wrong in different ways, and why the simplest model isn't always best.
- Deep Learning: Why your phone can spot your face in a messy photo album, and why that trick comes from practice, not magic.
- Embeddings: How do computers know that 'dog' and 'puppy' mean almost the same thing? They don't read definitions. They turn words into secret map coordinates, and nearby coordinates mean nearby meanings.