TensorFlow Keras API — Core Concepts

What Keras Actually Is

Keras is TensorFlow’s high-level API for building and training neural networks. It ships with TensorFlow as tf.keras, and since TensorFlow 2.0 it has been the recommended way to define models. Think of it as the difference between assembly language and Python: same computer underneath, radically different developer experience.

Keras gives you three ways to build models, each suited to different levels of complexity.

The Three APIs

Sequential API — The Straight Line

The Sequential API is for models where data flows in a straight line: input → layer → layer → output. No branches, no loops, no merging.

You create a Sequential() object and add layers one by one. Each layer automatically connects to the previous one. This covers a large share of common use cases, such as image classifiers, simple text models, and tabular data predictors.

When to use it: Your model is a pipeline with one input and one output, and each step feeds into the next.
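
As a rough illustration of the "add layers one by one" pattern, here is a minimal sketch. The layer sizes (20 input features, 64 hidden units, 3 output classes) are made up for the example and do not come from any particular model.

```python
import tensorflow as tf

# Hypothetical sizes for illustration: 20 input features, 3 output classes.
model = tf.keras.Sequential()
model.add(tf.keras.Input(shape=(20,)))                      # declares the input shape
model.add(tf.keras.layers.Dense(64, activation="relu"))     # hidden layer
model.add(tf.keras.layers.Dense(3, activation="softmax"))   # class probabilities

# Print a layer-by-layer overview with output shapes and parameter counts.
model.summary()
```

Each add() call attaches the new layer to the output of the previous one, so the order of the calls is the architecture.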

Functional API — The Graph

When your model needs multiple inputs (say, an image and a text caption), multiple outputs, or shared layers, the Functional API steps in. You define each layer as a function call, passing tensors explicitly. This creates a directed acyclic graph (DAG) rather than a straight line.

Real-world example: Google’s Wide & Deep model uses both a “wide” linear path and a “deep” neural network path, then merges them. The Functional API makes this natural.

When to use it: Your architecture has branches, skip connections, shared layers, or multiple inputs/outputs.
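
To make the "layers as function calls" idea concrete, here is a small two-input sketch loosely inspired by the wide-and-deep layout described above. It is not Google's actual model: the input names, feature counts, and layer sizes are assumptions chosen for the example.

```python
import tensorflow as tf

# Hypothetical feature sizes: 10 "wide" features and 30 "deep" features.
wide_in = tf.keras.Input(shape=(10,), name="wide")
deep_in = tf.keras.Input(shape=(30,), name="deep")

# Deep path: each layer is called on a tensor and returns a new tensor.
x = tf.keras.layers.Dense(32, activation="relu")(deep_in)
x = tf.keras.layers.Dense(16, activation="relu")(x)

# Merge the raw wide features with the deep path, then predict one probability.
merged = tf.keras.layers.Concatenate()([wide_in, x])
out = tf.keras.layers.Dense(1, activation="sigmoid")(merged)

# The model is defined by its input and output tensors; Keras traces the graph between them.
model = tf.keras.Model(inputs=[wide_in, deep_in], outputs=out)
```

Because the tensors are passed around explicitly, branching and merging are just ordinary variable assignments.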

Model Subclassing — Full Control

Subclassing tf.keras.Model gives you a Python class where you define layers in __init__ and write the forward pass in call(). This is closest to PyTorch’s style and is essential when your forward pass has conditional logic, loops, or dynamic behavior that cannot be expressed as a static graph.

When to use it: Research models, architectures with dynamic control flow, or when you need maximum flexibility.
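
The sketch below shows the general shape of a subclassed model. TinyResidualNet, its constructor arguments, and the layer sizes are hypothetical names invented for illustration; the point is only that call() is plain Python, so a loop can live inside the forward pass.

```python
import tensorflow as tf

class TinyResidualNet(tf.keras.Model):
    """Hypothetical model that applies one shared block several times."""

    def __init__(self, units=16, num_steps=3):
        super().__init__()
        self.num_steps = num_steps
        # Layers are created once in __init__ so their weights are tracked.
        self.project = tf.keras.layers.Dense(units)
        self.block = tf.keras.layers.Dense(units, activation="relu")
        self.head = tf.keras.layers.Dense(1)

    def call(self, inputs, training=False):
        x = self.project(inputs)
        # An ordinary Python loop: the same block is reused num_steps times,
        # with a skip connection around each application.
        for _ in range(self.num_steps):
            x = x + self.block(x)
        return self.head(x)

model = TinyResidualNet()
y = model(tf.zeros((2, 8)))  # weights are created on the first call
```

The trade-off is that Keras cannot see the architecture until the model is called, so tools that rely on a known graph (such as a full summary before the first call) are less informative than with the other two APIs.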

How Training Works

Regardless of which API you use, training follows the same three-step pattern:

  1. Compile — Specify the optimizer (how to update weights), the loss function (what to minimize), and metrics (what to report).
  2. Fit — Feed training data, set epochs and batch size, and optionally provide validation data.
  3. Evaluate / Predict — Test on unseen data or generate predictions.

The compile-fit-evaluate loop is Keras’s signature workflow. It hides the gradient computation, backpropagation, and weight update mechanics behind clean method calls.
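
The three steps can be sketched end to end as follows. The data here is synthetic and exists only to make the example runnable; the model size, optimizer, and epoch count are arbitrary choices, not recommendations.

```python
import numpy as np
import tensorflow as tf

# Synthetic data for illustration only: 200 samples, 8 features, binary labels.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 8)).astype("float32")
y = (x.sum(axis=1, keepdims=True) > 0).astype("float32")
x_train, y_train, x_test, y_test = x[:160], y[:160], x[160:], y[160:]

model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# 1. Compile: choose optimizer, loss, and metrics.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# 2. Fit: train for a few epochs, holding out 20% of the training data for validation.
history = model.fit(x_train, y_train, epochs=3, batch_size=32,
                    validation_split=0.2, verbose=0)

# 3. Evaluate / predict on data the model has not seen.
loss, acc = model.evaluate(x_test, y_test, verbose=0)
preds = model.predict(x_test, verbose=0)
```

The history object returned by fit() holds per-epoch values for the loss and each metric, which is handy for plotting learning curves.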

Layers: The Building Blocks

Keras includes dozens of built-in layers:

Layer Type         | Purpose                    | Example
Dense              | Fully connected            | Classification heads
Conv2D             | Spatial feature extraction | Image recognition
LSTM / GRU         | Sequential memory          | Time series, text
Embedding          | Map integers to vectors    | Word representations
Dropout            | Regularization             | Preventing overfitting
BatchNormalization | Stabilize training         | Faster convergence

Each layer manages its own weights and configuration. You can inspect, save, and reload individual layers independently.
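
One way to see this in practice is to pull the configuration and weights out of a single layer and rebuild an equivalent one. This is a minimal sketch using a Dense layer with made-up sizes; it shows one possible approach (config plus weights), not the only way to persist a layer.

```python
import numpy as np
import tensorflow as tf

layer = tf.keras.layers.Dense(4, activation="relu")
layer.build((None, 3))  # create weights for 3 input features

config = layer.get_config()          # plain dict of constructor arguments
kernel, bias = layer.get_weights()   # NumPy arrays with shapes (3, 4) and (4,)

# Rebuild an equivalent layer from the config, then copy the weights across.
clone = tf.keras.layers.Dense.from_config(config)
clone.build((None, 3))
clone.set_weights([kernel, bias])
```

After set_weights(), the clone produces the same outputs as the original layer for the same inputs.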

Callbacks: Hooks Into Training

Callbacks let you inject behavior at specific points during training — after each epoch, before each batch, or when a metric stops improving. Built-in callbacks include:

  • EarlyStopping — Halt training when validation loss plateaus, saving hours of wasted computation.
  • ModelCheckpoint — Save the best model weights automatically.
  • TensorBoard — Log metrics for visualization.
  • ReduceLROnPlateau — Lower the learning rate when progress stalls.

You can also write custom callbacks by subclassing tf.keras.callbacks.Callback.
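
Here is a short sketch combining two of the built-in callbacks with a custom one. EpochCounter is a hypothetical class written for this example, and the data, patience values, and learning-rate factor are arbitrary illustration choices.

```python
import numpy as np
import tensorflow as tf

class EpochCounter(tf.keras.callbacks.Callback):
    """Hypothetical custom callback that records how many epochs actually ran."""

    def on_train_begin(self, logs=None):
        self.epochs_seen = 0

    def on_epoch_end(self, epoch, logs=None):
        self.epochs_seen += 1

# Synthetic regression data for illustration only.
rng = np.random.default_rng(0)
x = rng.normal(size=(128, 4)).astype("float32")
y = x.sum(axis=1, keepdims=True)

model = tf.keras.Sequential([tf.keras.Input(shape=(4,)), tf.keras.layers.Dense(1)])
model.compile(optimizer="sgd", loss="mse")

counter = EpochCounter()
callbacks = [
    # Stop if validation loss has not improved for 2 epochs; keep the best weights.
    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=2,
                                     restore_best_weights=True),
    # Halve the learning rate after 1 epoch without improvement.
    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=1),
    counter,
]

history = model.fit(x, y, epochs=10, validation_split=0.25,
                    callbacks=callbacks, verbose=0)
```

Callbacks are passed as a list, so built-in and custom ones can be mixed freely in a single fit() call.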

Common Misconception

“Keras is a separate library from TensorFlow.” This was largely true before TensorFlow 2.0, when Keras was best known as a standalone package supporting multiple backends (Theano, CNTK, TensorFlow). Today, tf.keras is the Keras implementation that ships with TensorFlow and is tightly integrated with it. The standalone keras package (version 3+) supports multiple backends again (TensorFlow, JAX, and PyTorch), but in the TensorFlow ecosystem, tf.keras remains the standard entry point.

When Keras Is Not Enough

Keras handles most production and research use cases, but you may drop to lower-level TensorFlow APIs when you need:

  • Custom training loops with fine-grained gradient manipulation
  • Non-standard automatic differentiation patterns
  • Direct hardware placement control across multiple GPUs or TPUs

Even then, you typically mix Keras layers with lower-level code rather than abandoning Keras entirely.
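
As an example of that mix, the sketch below keeps a Keras model, optimizer, and loss but replaces fit() with a hand-written loop built on tf.GradientTape. The data is synthetic, and the gradient clipping step is just one illustration of what "fine-grained gradient manipulation" could mean.

```python
import tensorflow as tf

# Keras still provides the model, optimizer, and loss.
model = tf.keras.Sequential([tf.keras.Input(shape=(4,)), tf.keras.layers.Dense(1)])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
loss_fn = tf.keras.losses.MeanSquaredError()

# Synthetic regression data for illustration only.
x = tf.random.normal((64, 4), seed=0)
y = tf.reduce_sum(x, axis=1, keepdims=True)

losses = []
for step in range(20):
    # Record the forward pass so gradients can be computed afterwards.
    with tf.GradientTape() as tape:
        preds = model(x, training=True)
        loss = loss_fn(y, preds)
    grads = tape.gradient(loss, model.trainable_weights)
    # Manual gradient manipulation: clip by global norm before the update.
    grads, _ = tf.clip_by_global_norm(grads, 1.0)
    optimizer.apply_gradients(zip(grads, model.trainable_weights))
    losses.append(float(loss))
```

Everything between computing the gradients and applying them is ordinary code, which is where custom behavior such as clipping, masking, or accumulation would go.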

The one thing to remember: Keras gives you three APIs — Sequential, Functional, and Subclassing — so you always use the simplest one that fits your architecture, and you can switch when your needs grow.
