Stationarity Testing in Python — ELI5

Why time series data needs to 'settle down' before you can predict it, and how Python checks if it has.

Imagine trying to measure how tall a child is when they will not stop jumping on a trampoline.

You could take a measurement, but it would be meaningless because the baseline keeps changing — sometimes they are two feet off the ground, sometimes five. You need them to stand still on flat ground first. Then your measurement means something.

Stationarity testing in time series works the same way. A stationary series is one that has “settled down.” Its average value does not drift upward or downward over time, and its ups and downs stay roughly the same size. Think of the temperature inside a well-regulated building — it bounces around a set point but never wanders off.

A non-stationary series has not settled down. Stock prices are a classic example — they wander up and down with no fixed average to return to. So is the population of a growing city or total COVID cases during a pandemic.
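To make the difference concrete, here is a minimal sketch that uses only Python's standard library. It builds one made-up "thermostat" series that keeps getting pulled back to a set point, and one made-up random walk that never gets pulled back. It then compares the average of the first half with the average of the second half. The function names, the set point of 20, and the pull strength of 0.5 are illustrative assumptions, not anything from a real dataset.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable


def make_stationary(n, set_point=20.0, pull=0.5):
    """Noisy readings that are pulled back toward a set point at every step."""
    values = [set_point]
    for _ in range(n - 1):
        previous = values[-1]
        # Move part of the way back to the set point, then add random noise.
        values.append(previous + pull * (set_point - previous) + random.gauss(0, 1))
    return values


def make_random_walk(n, start=100.0):
    """Each value is the previous value plus noise; nothing pulls it back."""
    values = [start]
    for _ in range(n - 1):
        values.append(values[-1] + random.gauss(0, 1))
    return values


def half_means(values):
    """Return the average of the first half and the average of the second half."""
    middle = len(values) // 2
    return statistics.mean(values[:middle]), statistics.mean(values[middle:])


thermostat = make_stationary(1000)
walk = make_random_walk(1000)

print("thermostat half means:", half_means(thermostat))
print("random walk half means:", half_means(walk))
```

For the thermostat series the two halves should have nearly the same average. For the random walk they will usually differ noticeably, because nothing ties the series to a fixed level. Comparing halves like this is only a rough eyeball check, not a formal test.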

Why does this matter? Many classic forecasting tools assume the data is stationary and give unreliable answers when it is not. Feeding them a non-stationary series is like asking someone to predict the next bounce of a child on a trampoline: the answer depends entirely on where they are in the bounce, so a pattern learned from one stretch of the data may not hold in the next.

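Python has tests that check whether your data is stationary. The most popular one is the Augmented Dickey-Fuller (ADF) test, available as adfuller in the statsmodels library. You feed your data in, and it gives you back a few numbers; the one most people look at is the p-value. If the p-value is low enough (below 0.05 is the usual cutoff), the test has found good evidence that your data is stationary and you can model it directly. If not, the test could not rule out non-stationarity, and you need to transform the data first, usually by looking at the changes from one point to the next instead of the levels. This step is called differencing.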

The one thing to remember: Stationarity means your data’s behavior stays consistent over time, and checking for it is an essential early step before applying many classic time series models.
