Reinforcement Learning — Explain It Like I'm 5

How do you teach a dog to sit without talking? That's basically how computers learn to play chess, drive cars, and beat world champions.

Reinforcement Learning (ELI5)

Imagine you have a puppy who doesn’t speak English.

You can’t say “sit.” You can’t explain what you want. All you can do is give the puppy a treat when it does something good, and say “no” when it does something wrong.

That’s it. That’s reinforcement learning.

The puppy tries stuff. Rolls around. Barks. Knocks things over. Eventually, by accident, it sits. You give it a treat. The puppy thinks: Oh! That thing I just did — more of that.

It tries sitting again. More treats. Pretty soon the puppy is sitting constantly, because it figured out the pattern on its own — not because anyone explained it.
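For readers who like to see it in code, here is a minimal sketch of the puppy story in Python. It is an illustration, not how any real training system works: the action names, the `treat_for` rule, and the 10% "try something random" rate are all assumptions made up for this example. The technique is usually called an epsilon-greedy bandit: mostly repeat what has earned treats so far, and occasionally try something else.

```python
import random

# Hypothetical things the puppy can do (made up for this example).
ACTIONS = ["roll", "bark", "knock_over", "sit"]

def treat_for(action):
    """The owner's rule: a treat (reward 1) only for sitting."""
    return 1.0 if action == "sit" else 0.0

def train_puppy(tries=500, explore=0.1, seed=0):
    rng = random.Random(seed)
    value = {a: 0.0 for a in ACTIONS}  # puppy's running guess of how good each action is
    count = {a: 0 for a in ACTIONS}    # how often each action has been tried
    for _ in range(tries):
        if rng.random() < explore:
            action = rng.choice(ACTIONS)  # occasionally try something random
        else:
            best = max(value.values())
            # Otherwise do what has worked best so far (ties broken at random).
            action = rng.choice([a for a in ACTIONS if value[a] == best])
        reward = treat_for(action)
        count[action] += 1
        # Running average of the rewards seen for this action.
        value[action] += (reward - value[action]) / count[action]
    return value, count

value, count = train_puppy()
favorite = max(value, key=value.get)
print(favorite, count)
```

Nothing in the code tells the puppy that sitting is the goal. It only ever sees treats, and sitting ends up as the favorite because it is the only action that earns them.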

Computers can learn in much the same way.

A computer program (we call it an “agent”) tries things inside a pretend world. Maybe it’s a video game. Maybe it’s a fake city where it’s learning to drive. Every time it does something good — like staying on the road, or scoring a point — it gets a reward. Every time it crashes or loses, it gets a penalty.

The program tries millions of times. Gets rewarded. Gets penalized. Slowly, it starts figuring out what works.
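The try, get rewarded, get penalized loop can be sketched in a few lines of Python. This is a toy under stated assumptions, not a real driving simulator: the five-square "road", the reward values, and names such as `step` and `train` are invented for illustration. The learning rule used here is tabular Q-learning, one common reinforcement learning method among many.

```python
import random

LEFT, RIGHT = 0, 1
CRASH, START, GOAL = 0, 2, 4  # positions on a tiny, made-up five-square road

def step(position, action):
    """Move one square; return (new_position, reward, done)."""
    new_position = position + (1 if action == RIGHT else -1)
    if new_position == GOAL:
        return new_position, 1.0, True    # reward: reached the goal
    if new_position == CRASH:
        return new_position, -1.0, True   # penalty: drove off the road
    return new_position, 0.0, False

def train(episodes=2000, learning_rate=0.5, discount=0.9, explore=0.2, seed=0):
    rng = random.Random(seed)
    # q[position][action] is the agent's current guess of how good that action is there.
    q = [[0.0, 0.0] for _ in range(GOAL + 1)]
    for _ in range(episodes):
        position, done = START, False
        while not done:
            if rng.random() < explore or q[position][LEFT] == q[position][RIGHT]:
                action = rng.choice([LEFT, RIGHT])  # explore, or break a tie at random
            else:
                action = LEFT if q[position][LEFT] > q[position][RIGHT] else RIGHT
            new_position, reward, done = step(position, action)
            # Q-learning update: nudge the guess toward reward plus discounted future value.
            future = 0.0 if done else max(q[new_position])
            target = reward + discount * future
            q[position][action] += learning_rate * (target - q[position][action])
            position = new_position
    return q

q = train()
policy = {p: ("right" if q[p][RIGHT] > q[p][LEFT] else "left") for p in (1, 2, 3)}
print(policy)
```

After enough tries, the table `q` scores moving toward the goal higher than moving toward the crash on every square, even though the agent was never told which direction is correct.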

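In 2016, a program called AlphaGo mastered the ancient board game Go largely this way. It first studied games played by human experts, then played against itself millions of times, getting better and better with no human telling it which moves to make. Then it beat world champion Lee Sedol, a feat many experts thought was about 10 years away.

The weird part: nobody programmed AlphaGo with winning strategies. It worked those out by trial and error, treat and penalty. A year later its successor, AlphaGo Zero, skipped the human games entirely: given only the rules, it learned purely by playing against itself.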

Your brain works a bit like this too. Touch a hot stove once? Pain is your penalty. You learn fast. Eat a great meal? Good feeling is your reward. You remember where the restaurant is. We just made that process happen inside a computer.

One thing to remember: Reinforcement learning is how you teach without explaining. The computer figures it out the same way a puppy learns to sit — by trying things and seeing what gets rewarded.

