Machine Learning — Core Concepts

How computers actually learn from data: the three types of machine learning, what training really means, and why your spam filter keeps getting smarter.

What Is Machine Learning?

Machine learning (ML) is a method of building software that learns from data rather than from hand-written rules. Instead of programming a computer to recognize spam by listing keywords, you show it 100,000 examples of spam and non-spam emails, and it figures out the distinguishing patterns itself.

The result isn’t a list of rules you can read. It’s a mathematical structure — a model — that has encoded what spam tends to look like, in a form only the computer can use.

How Training Actually Works

Think of training like this: you start with a model that knows nothing. You feed it an example (say, a photo of a cat) and ask “what is this?” It guesses randomly — maybe “car.” You tell it it’s wrong, and by how much. The model adjusts its internal settings slightly in the direction of being less wrong. Then you feed it the next example. Repeat this millions of times.

The standard algorithm behind this feedback loop is called gradient descent: at each step, the model nudges its settings in whichever direction reduces its error. By the end of training, it hasn’t been told “cats have pointy ears.” It has learned that pointy ears correlate with “cat” from seeing enough evidence.

Three elements make this work:

Data — the examples you train on
A model architecture — the mathematical structure that does the learning
A loss function — how you measure “wrong” so the model knows which direction to adjust
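
As a rough illustration of how these three elements fit together, here is a minimal sketch of gradient descent in Python. The toy data, the straight-line model, the learning rate, and the step count are all assumptions chosen for this example. For simplicity it uses every example on each step rather than one example at a time.

```python
# Data: toy examples generated from y = 2x + 1 (an assumption for this illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

def predict(w, b, x):
    """Model architecture: a straight line with two adjustable settings, w and b."""
    return w * x + b

def loss(w, b):
    """Loss function: mean squared error over all examples."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train(steps=2000, learning_rate=0.05):
    w, b = 0.0, 0.0  # the model starts out knowing nothing
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (predict(w, b, x) - y) for x, y in zip(xs, ys)) / n
        # Nudge each setting a small step in the direction that reduces the loss.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

w, b = train()
print(f"learned w={w:.3f}, b={b:.3f}, loss={loss(w, b):.6f}")
```

The model is never told that the rule is "double it and add one"; it arrives at settings close to w = 2 and b = 1 only by repeatedly reducing its error.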

The Three Types of Machine Learning

1. Supervised Learning

You provide labeled examples: this email is spam, that one isn’t. This photo is a dog, that one is a cat. The model learns to map inputs to the right label.
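
To make the idea of mapping inputs to labels concrete, here is a small sketch of one very simple supervised method, a nearest-centroid classifier. The two features (link count and exclamation-mark count) and the six labeled emails are hypothetical, and real spam filters use far richer models.

```python
# Hypothetical labeled examples: (number of links, number of exclamation marks) -> label.
training_data = [
    ((8, 5), "spam"),
    ((7, 6), "spam"),
    ((9, 4), "spam"),
    ((1, 0), "not spam"),
    ((0, 1), "not spam"),
    ((2, 0), "not spam"),
]

def fit(examples):
    """Learn one average point (centroid) per label from the labeled examples."""
    sums, counts = {}, {}
    for features, label in examples:
        total = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}

def predict(centroids, features):
    """Return the label whose centroid is closest to the input."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

centroids = fit(training_data)
print(predict(centroids, (6, 5)))
print(predict(centroids, (1, 1)))
```

Neither of the two emails passed to predict appears in the training data; the labels come from what the model summarized about each class.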

Real use: Google Photos recognizing your face in pictures. Netflix predicting whether you’ll like a movie. Credit card fraud detection (Visa reportedly uses machine learning models that score hundreds of millions of transactions per day).

2. Unsupervised Learning

You provide examples with no labels. The model finds structure on its own — grouping similar things together, finding unusual patterns, compressing information.
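
One common way to group similar things without labels is k-means clustering. The sketch below is a minimal one-dimensional version; the listening-hours data and the way the starting centers are picked are assumptions made for this illustration.

```python
def k_means_1d(values, k, iterations=10):
    """Group numbers into k clusters without any labels (a minimal k-means sketch)."""
    ordered = sorted(values)
    # Assumption: start each center at the middle of one of k equal slices of the sorted data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: put every value in the cluster with the nearest center.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster (keep it if empty).
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical data: hours of music listened to per week by six users.
hours = [1, 2, 3, 20, 22, 24]
centers, clusters = k_means_1d(hours, k=2)
print(centers, clusters)
```

Nobody told the algorithm that there are "casual" and "heavy" listeners; the two groups fall out of the data.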

Real use: Spotify grouping listeners by taste without anyone defining the genres. Customer segmentation in e-commerce (Amazon grouping shoppers by purchase behavior to target promotions).

3. Reinforcement Learning

The model takes actions in an environment and receives rewards or penalties. No labeled data — just trial and error.
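
Trial and error can be sketched with a toy problem far simpler than Go: an agent choosing among three actions with unknown payoff rates. This is an epsilon-greedy bandit, a deliberately small stand-in for reinforcement learning; the win rates, the exploration rate, and the seed are assumptions for the example.

```python
import random

def run_bandit(true_win_rates, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly exploit the best-looking action, sometimes explore."""
    rng = random.Random(seed)
    estimates = [0.0] * len(true_win_rates)  # the agent's learned value of each action
    pulls = [0] * len(true_win_rates)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(true_win_rates))  # explore: try an action at random
        else:
            action = max(range(len(estimates)), key=lambda a: estimates[a])  # exploit
        # The environment answers with a reward of 1 or 0; the agent never sees the true rates.
        reward = 1.0 if rng.random() < true_win_rates[action] else 0.0
        pulls[action] += 1
        # Incremental average: move the estimate toward the observed reward.
        estimates[action] += (reward - estimates[action]) / pulls[action]
    return estimates, pulls

estimates, pulls = run_bandit([0.2, 0.5, 0.8])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print(best_action, [round(e, 2) for e in estimates], pulls)
```

The agent is given no labeled answers, only rewards, and it still ends up preferring the action with the highest payoff rate.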

Real use: DeepMind’s AlphaGo, which studied human expert games and then improved by playing millions of games against itself until it could beat world champions at Go. OpenAI’s bots that learned to play Dota 2 at superhuman level, discovering strategies professional players had never tried.

Key Concepts

Features

The inputs the model uses to make decisions. In a house price model, features might be: square footage, number of bedrooms, zip code, year built. Choosing the right features — called feature engineering — is often what separates a mediocre model from a great one.
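
As a small sketch of what feature engineering can look like for the house price example, the function below turns a raw record into numeric inputs. The field names, the list of known zip codes, and the reference year are all hypothetical.

```python
KNOWN_ZIP_CODES = ["10001", "94107"]  # hypothetical list of zip codes seen in the data

def engineer_features(house, current_year=2024):
    """Turn a raw house record into numeric features a model can use."""
    features = {
        "square_feet": house["square_feet"],
        "bedrooms": house["bedrooms"],
        # Derived feature: age is often more directly useful than the raw build year.
        "age_years": current_year - house["year_built"],
        # Derived feature: living space per bedroom.
        "sqft_per_bedroom": house["square_feet"] / max(house["bedrooms"], 1),
    }
    # Zip codes are categories, not quantities, so encode them as 0/1 flags.
    for zip_code in KNOWN_ZIP_CODES:
        features["zip_" + zip_code] = 1 if house["zip_code"] == zip_code else 0
    return features

house = {"square_feet": 1800, "bedrooms": 3, "zip_code": "94107", "year_built": 1994}
features = engineer_features(house)
print(features)
```

Treating the zip code as a plain number would suggest that 94107 is "bigger" than 10001, which is meaningless; encoding it as flags avoids that.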

Overfitting

When a model memorizes the training data instead of learning patterns that generalize. Like a student who memorizes every answer from last year’s exam but can’t solve a new problem. The model gets nearly perfect scores on data it’s seen, and fails on data it hasn’t. Overfitting is detected by evaluating on held-out data (for example with cross-validation) and reduced through techniques like regularization, simpler models, or more training data.
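
The gap between memorizing and generalizing can be shown with two toy models. This is a deliberately extreme sketch: the data, the "small"/"big" labels, and both model classes are invented for the illustration.

```python
# Hypothetical task: label a number "small" or "big".
train_set = [(12, "small"), (35, "small"), (48, "small"), (55, "big"), (71, "big"), (90, "big")]
test_set = [(5, "small"), (40, "small"), (60, "big"), (99, "big")]  # never seen in training

class Memorizer:
    """Overfit extreme: stores every training answer and knows nothing else."""
    def fit(self, examples):
        self.answers = dict(examples)
        return self

    def predict(self, x):
        return self.answers.get(x, "small")  # unseen input: fall back to a fixed guess

class ThresholdModel:
    """Learns one general rule: a cutoff halfway between the two groups."""
    def fit(self, examples):
        largest_small = max(x for x, label in examples if label == "small")
        smallest_big = min(x for x, label in examples if label == "big")
        self.cutoff = (largest_small + smallest_big) / 2
        return self

    def predict(self, x):
        return "big" if x >= self.cutoff else "small"

def accuracy(model, examples):
    """Fraction of examples the model labels correctly."""
    return sum(model.predict(x) == label for x, label in examples) / len(examples)

memorizer = Memorizer().fit(train_set)
threshold = ThresholdModel().fit(train_set)
print(accuracy(memorizer, train_set), accuracy(memorizer, test_set))
print(accuracy(threshold, train_set), accuracy(threshold, test_set))
```

Both models are perfect on the data they were trained on. Only the one that learned a rule holds up on new inputs.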

Training vs. Inference

Training is the slow, expensive process of learning from data. Inference is applying the trained model to new inputs: this is what happens in real time when you ask Siri a question or Shazam recognizes a song. Training might take days on expensive hardware; inference typically takes milliseconds, whether it runs on a server or on your phone.

The Train/Test Split

Before training begins, you hold back a chunk of your data — say 20% — that the model never sees during training. After training, you test the model on this held-out data to measure how well it generalizes. This is the closest thing to an honest performance measurement.
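
A hold-out split can be sketched in a few lines of standard-library Python. The function name, the fixed seed, and the stand-in dataset of 100 numbers are assumptions for this example; libraries such as scikit-learn provide their own versions.

```python
import random

def train_test_split(examples, test_fraction=0.2, seed=42):
    """Shuffle a copy of the data, then hold back the first test_fraction as a test set."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is reproducible
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

examples = list(range(100))  # stand-in for 100 labeled examples
train_set, test_set = train_test_split(examples)
print(len(train_set), len(test_set))
```

Shuffling before splitting matters: if the data is sorted (by date or by label, say), taking the last 20% unshuffled would give a test set that does not resemble the training set.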

Common Misconception

“Machine learning models understand what they’re doing.”

They don’t. A language model that generates fluent text doesn’t understand language; it has learned statistical patterns over billions of words. A model that detects tumors in X-rays hasn’t learned medicine; it has found pixel patterns that correlate with the labels radiologists assigned.

This matters because it explains why ML models can fail in bizarre ways. A well-known 2016 study showed that a classifier trained to tell wolves from huskies was actually detecting snow, because the wolf photos in its training set had snowy backgrounds. The model “learned” the wrong pattern because the pattern worked on training data.

Why This Changed Everything

Before ML, automating a task meant someone had to understand and encode all the rules. A programmer had to know what spam looks like. A doctor had to define what a tumor looks like in pixels.

ML removed that bottleneck. Tasks that were impossible to rule-engineer — recognizing handwriting, translating languages, predicting protein structures — became tractable once you could throw enough labeled data at a model and let it find its own rules.

DeepMind’s AlphaFold2 (2020) predicted protein 3D structures with near-experimental accuracy, a breakthrough on a 50-year-old problem in biology, and by 2022 its public database held predicted structures for nearly every catalogued protein. It did it not by understanding biochemistry, but by learning patterns from the roughly 170,000 known protein structures in scientific databases.

One Thing to Remember

Machine learning is the art of letting data write the rules. It works remarkably well — and fails in ways that can be remarkably hard to predict, because the model doesn’t know why its rules work.
