TensorFlow Keras API — Core Concepts
What Keras Actually Is
Keras is TensorFlow’s high-level API for building and training neural networks. Since TensorFlow 2.0, it ships as tf.keras — the official way to define models. Think of it as the difference between assembly language and Python: same computer underneath, radically different developer experience.
Keras gives you three ways to build models, each suited to different levels of complexity.
The Three APIs
Sequential API — The Straight Line
The Sequential API is for models where data flows in a straight line: input → layer → layer → output. No branches, no loops, no merging.
You create a Sequential() object and add layers one by one. Each layer automatically connects to the previous one. This covers many common use cases, such as image classifiers, simple text models, and tabular data predictors.
When to use it: Your model is a pipeline with one input and one output, and each step feeds into the next.
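As a minimal sketch of that pattern, the snippet below stacks two Dense layers on a 20-feature input. The feature count, layer sizes, and three-class output are illustrative assumptions, not values from any particular project.

```python
import numpy as np
import tensorflow as tf

# Hypothetical task: classify rows of 20 features into 3 classes.
model = tf.keras.Sequential()
model.add(tf.keras.Input(shape=(20,)))                     # declares the input shape
model.add(tf.keras.layers.Dense(64, activation="relu"))    # hidden layer
model.add(tf.keras.layers.Dense(3, activation="softmax"))  # class probabilities

# Forward pass on a random batch of 4 rows.
x = np.random.rand(4, 20).astype("float32")
probs = model(x)
print(probs.shape)
```

Each `add` call wires the new layer to the output of the previous one, so no tensors are passed around by hand.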
Functional API — The Graph
When your model needs multiple inputs (say, an image and a text caption), multiple outputs, or shared layers, the Functional API steps in. You define each layer as a function call, passing tensors explicitly. This creates a directed acyclic graph (DAG) rather than a straight line.
Real-world example: Google’s Wide & Deep model uses both a “wide” linear path and a “deep” neural network path, then merges them. The Functional API makes this natural.
When to use it: Your architecture has branches, skip connections, shared layers, or multiple inputs/outputs.
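A two-input model might be sketched as follows. To keep the example small, the image and the caption are assumed to be already encoded as fixed-length feature vectors; the input names and sizes are hypothetical.

```python
import numpy as np
import tensorflow as tf

# Two symbolic inputs (assumed pre-encoded feature vectors).
image_in = tf.keras.Input(shape=(32,), name="image_features")
text_in = tf.keras.Input(shape=(16,), name="text_features")

# Each branch is built by calling a layer on a tensor.
img = tf.keras.layers.Dense(16, activation="relu")(image_in)
txt = tf.keras.layers.Dense(16, activation="relu")(text_in)

# Merge the branches, then produce a single score.
merged = tf.keras.layers.Concatenate()([img, txt])
score = tf.keras.layers.Dense(1, activation="sigmoid", name="score")(merged)

model = tf.keras.Model(inputs=[image_in, text_in], outputs=score)

preds = model([
    np.random.rand(4, 32).astype("float32"),
    np.random.rand(4, 16).astype("float32"),
])
print(preds.shape)
```

The graph is defined entirely by which tensors feed which layers, which is what allows branches and merges that Sequential cannot express.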
Model Subclassing — Full Control
Subclassing tf.keras.Model gives you a Python class where you define layers in __init__ and write the forward pass in call(). This is closest to PyTorch’s style and is essential when your forward pass has conditional logic, loops, or dynamic behavior that cannot be expressed as a static graph.
When to use it: Research models, architectures with dynamic control flow, or when you need maximum flexibility.
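The sketch below shows the shape of a subclassed model. The class name `RepeatedBlock`, its sizes, and the idea of applying one shared layer several times in a Python loop are assumptions chosen only to illustrate a forward pass written as ordinary code.

```python
import numpy as np
import tensorflow as tf

class RepeatedBlock(tf.keras.Model):
    """Hypothetical model that reuses one layer a configurable number of times."""

    def __init__(self, units=32, num_classes=3, steps=2):
        super().__init__()
        self.steps = steps
        self.proj = tf.keras.layers.Dense(units, activation="relu")
        self.shared = tf.keras.layers.Dense(units, activation="relu")
        self.dropout = tf.keras.layers.Dropout(0.5)
        self.head = tf.keras.layers.Dense(num_classes)

    def call(self, inputs, training=False):
        x = self.proj(inputs)
        # Plain Python control flow inside the forward pass.
        for _ in range(self.steps):
            x = self.shared(x)
        # Dropout is only active when training=True.
        x = self.dropout(x, training=training)
        return self.head(x)

model = RepeatedBlock(steps=3)
logits = model(np.random.rand(4, 10).astype("float32"))
print(logits.shape)
```

Layers are created once in `__init__` so their weights are tracked, and `call()` decides how they are used on each forward pass.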
How Training Works
Regardless of which API you use, training follows the same three-step pattern:
- Compile — Specify the optimizer (how to update weights), the loss function (what to minimize), and metrics (what to report).
- Fit — Feed training data, set epochs and batch size, and optionally provide validation data.
- Evaluate / Predict — Test on unseen data or generate predictions.
The compile-fit-evaluate loop is Keras’s signature workflow. It hides the gradient computation, backpropagation, and weight update mechanics behind clean method calls.
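The three steps above can be sketched end to end on synthetic data. The dataset, the model size, and the choices of `adam` and binary cross-entropy are assumptions made for illustration; a real project would substitute its own.

```python
import numpy as np
import tensorflow as tf

# Synthetic binary task: is the sum of 10 features above 5?
rng = np.random.default_rng(0)
x = rng.random((200, 10)).astype("float32")
y = (x.sum(axis=1, keepdims=True) > 5.0).astype("float32")
x_train, y_train = x[:160], y[:160]
x_test, y_test = x[160:], y[160:]

model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# 1. Compile: optimizer, loss, and metrics.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# 2. Fit: epochs, batch size, and optional validation data.
history = model.fit(
    x_train, y_train,
    epochs=3, batch_size=32,
    validation_data=(x_test, y_test),
    verbose=0,
)

# 3. Evaluate and predict.
loss, acc = model.evaluate(x_test, y_test, verbose=0)
preds = model.predict(x_test[:5], verbose=0)
print(sorted(history.history.keys()))
```

`history.history` holds one value per epoch for each tracked quantity, which is handy for plotting learning curves.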
Layers: The Building Blocks
Keras includes dozens of built-in layers:
| Layer Type | Purpose | Example |
|---|---|---|
| Dense | Fully connected | Classification heads |
| Conv2D | Spatial feature extraction | Image recognition |
| LSTM / GRU | Sequential memory | Time series, text |
| Embedding | Map integers to vectors | Word representations |
| Dropout | Regularization | Preventing overfitting |
| BatchNormalization | Stabilize training | Faster convergence |
Each layer manages its own weights and configuration. You can inspect, save, and reload individual layers independently.
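One way to see this is to pull the configuration and weights out of a single layer and rebuild an equivalent copy. The sketch below uses a small Dense layer; the sizes are arbitrary, and copying through `get_config` and `get_weights` is just one of several possible approaches.

```python
import numpy as np
import tensorflow as tf

layer = tf.keras.layers.Dense(4, activation="relu")
_ = layer(tf.zeros((1, 3)))  # the first call creates the weights for 3 input features

# Inspect: weights are NumPy arrays, config is a plain dict.
kernel, bias = layer.get_weights()
config = layer.get_config()
print(kernel.shape, bias.shape, config["units"])

# Rebuild an equivalent layer from the config, then copy the weights in.
clone = tf.keras.layers.Dense.from_config(config)
clone.build((None, 3))
clone.set_weights([kernel, bias])

x = np.random.rand(2, 3).astype("float32")
same = np.allclose(layer(x).numpy(), clone(x).numpy())
```

Because the config is a plain dictionary and the weights are arrays, both can be stored with whatever serialization the project already uses.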
Callbacks: Hooks Into Training
Callbacks let you inject behavior at specific points during training — after each epoch, before each batch, or when a metric stops improving. Built-in callbacks include:
- EarlyStopping — Halt training when validation loss plateaus, saving hours of wasted computation.
- ModelCheckpoint — Save the best model weights automatically.
- TensorBoard — Log metrics for visualization.
- ReduceLROnPlateau — Lower the learning rate when progress stalls.
You can also write custom callbacks by subclassing tf.keras.callbacks.Callback.
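A short sketch combining a built-in callback with a custom one follows. `EpochLogger` is a hypothetical class written for this example, and the data, model, and patience value are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf

class EpochLogger(tf.keras.callbacks.Callback):
    """Hypothetical callback that records validation loss after each epoch."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        self.seen.append((epoch, logs.get("val_loss")))

rng = np.random.default_rng(0)
x = rng.random((120, 8)).astype("float32")
y = x.sum(axis=1, keepdims=True)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Stop if val_loss has not improved for 2 epochs and keep the best weights.
early = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=2, restore_best_weights=True
)
logger = EpochLogger()

model.fit(
    x[:96], y[:96],
    validation_data=(x[96:], y[96:]),
    epochs=5, verbose=0,
    callbacks=[early, logger],
)
print(logger.seen)
```

Callbacks are passed as a list to `fit`, so built-in and custom ones can be mixed freely.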
Common Misconception
“Keras is a separate library from TensorFlow.” This was true before TensorFlow 2.0, when Keras existed as a standalone package supporting multiple backends (Theano, CNTK, TensorFlow). With TensorFlow 2.0, tf.keras became TensorFlow’s official high-level API, tightly integrated with the rest of the framework. The standalone keras package (version 3+) supports multiple backends again (TensorFlow, JAX, and PyTorch), and recent TensorFlow releases (2.16 and later) expose Keras 3 as tf.keras. Within the TensorFlow ecosystem, tf.keras remains the standard entry point.
When Keras Is Not Enough
Keras handles most production and research use cases, but you may drop to lower-level TensorFlow APIs when you need:
- Custom training loops with fine-grained gradient manipulation
- Non-standard automatic differentiation patterns
- Direct hardware placement control across multiple GPUs or TPUs
Even then, you typically mix Keras layers with lower-level code rather than abandoning Keras entirely.
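As an illustration of that mix, the sketch below trains a Keras layer with a hand-written loop built on `tf.GradientTape`, clipping gradients before applying them. The toy regression data, learning rate, and clipping threshold are assumptions for the example.

```python
import tensorflow as tf

tf.random.set_seed(0)

# Toy regression: the target is the sum of 4 input features.
x = tf.random.normal((64, 4))
y = tf.reduce_sum(x, axis=1, keepdims=True)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(1),
])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
loss_fn = tf.keras.losses.MeanSquaredError()

losses = []
for step in range(20):
    with tf.GradientTape() as tape:
        pred = model(x, training=True)
        loss = loss_fn(y, pred)
    grads = tape.gradient(loss, model.trainable_weights)
    # Manual gradient manipulation: clip by global norm before the update.
    grads, _ = tf.clip_by_global_norm(grads, 1.0)
    optimizer.apply_gradients(zip(grads, model.trainable_weights))
    losses.append(float(loss))

print(losses[0], losses[-1])
```

The model, optimizer, and loss are still Keras objects; only the loop that `fit` would normally run has been written out by hand.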
The one thing to remember: Keras gives you three APIs — Sequential, Functional, and Subclassing — so you always use the simplest one that fits your architecture, and you can switch when your needs grow.