Python statistics Module — Core Concepts

What Is the statistics Module?

Python’s statistics module (added in Python 3.4) provides functions for calculating basic statistical properties of numerical data. It’s part of the standard library — no pip install required.

It’s designed for correctness and readability over speed, making it ideal for small to medium datasets and scripts where you don’t want to pull in NumPy.

Central Tendency: Where’s the Middle?

Mean

import statistics

data = [4, 8, 6, 5, 3, 7, 8, 9]
statistics.mean(data)      # 6.25 (arithmetic mean)
statistics.fmean(data)     # 6.25 (faster, float-only, Python 3.8+)

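fmean is considerably faster than mean because it converts every value to float and sums with math.fsum instead of doing Fraction-based exact arithmetic. The exact speedup depends on the data and the Python version, and the result is always a float.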
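As a small sketch of that trade-off: mean keeps the input type where it can, while fmean always hands back a float. The sample values below are illustrative and not taken from the dataset above.

```python
import statistics
from fractions import Fraction

# Illustrative values, chosen so the float results are exactly representable.
quarters = [Fraction(1, 4), Fraction(3, 4)]

exact_mean = statistics.mean(quarters)    # exact arithmetic, stays a Fraction
float_mean = statistics.fmean(quarters)   # values are converted to float first

print(repr(exact_mean))   # Fraction(1, 2)
print(repr(float_mean))   # 0.5
```

If the result feeds further exact calculations, mean is the safer choice; if it is only going to be printed or plotted, fmean is usually enough.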

Median

statistics.median(data)           # 6.5 (average of two middle values)
statistics.median_low(data)       # 6 (lower of two middle values)
statistics.median_high(data)      # 7 (higher of two middle values)
statistics.median_grouped(data)   # 6.5 (for continuous/grouped data)

When the dataset has an even number of items, the regular median returns the average of the two middle values, which may not be a value that occurs in the data. Use median_low or median_high when you need an actual value from the dataset.
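A quick way to see the difference, using a hypothetical list of star ratings with an even number of entries:

```python
import statistics

# Hypothetical 1-5 star ratings; an even count, so there are two middle values.
ratings = [1, 2, 4, 5]

middle = statistics.median(ratings)       # 3.0, a rating nobody actually gave
lower = statistics.median_low(ratings)    # 2
upper = statistics.median_high(ratings)   # 4

print(middle in ratings, lower in ratings, upper in ratings)  # False True True
```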

Mode

statistics.mode([1, 2, 2, 3, 3, 3])    # 3 (most common)
statistics.multimode([1, 1, 2, 2, 3])   # [1, 2] (all modes, Python 3.8+)

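Before Python 3.8, mode raised StatisticsError when there was no single most common value. Since 3.8 it returns the first mode it encounters, and it still raises StatisticsError on empty input. Use multimode when you need every tied value.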
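The tie and empty-input behavior can be checked with a short sketch like this one, which assumes Python 3.8 or later (the votes list is made up for illustration):

```python
import statistics

votes = [1, 1, 2, 2, 3]   # two values tie for most common

all_modes = statistics.multimode(votes)   # [1, 2], in order of first appearance
first_mode = statistics.mode(votes)       # 1 on Python 3.8+ (first mode encountered)

# Empty input is still an error for mode(); multimode() returns an empty list.
try:
    statistics.mode([])
    empty_raises = False
except statistics.StatisticsError:
    empty_raises = True

print(all_modes, first_mode, empty_raises, statistics.multimode([]))
```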

Spread: How Scattered Are the Values?

Variance and Standard Deviation

statistics.variance(data)    # 4.5 (sample variance, divides by n-1)
statistics.stdev(data)       # 2.121... (sample standard deviation)

statistics.pvariance(data)   # 3.9375 (population variance, divides by n)
statistics.pstdev(data)      # 1.984... (population standard deviation)

When to use which: Use variance/stdev (sample) when your data is a subset of a larger population. Use pvariance/pstdev (population) when you have the entire dataset.
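The only difference between the two families is the divisor. A minimal sketch that recomputes both variances by hand for the same data and checks them against the module:

```python
import math
import statistics

data = [4, 8, 6, 5, 3, 7, 8, 9]
n = len(data)
mean = statistics.fmean(data)

# Sum of squared deviations from the mean; shared by both formulas.
squared_deviations = sum((x - mean) ** 2 for x in data)

sample_variance = squared_deviations / (n - 1)   # matches statistics.variance
population_variance = squared_deviations / n     # matches statistics.pvariance

assert math.isclose(sample_variance, statistics.variance(data))
assert math.isclose(population_variance, statistics.pvariance(data))
print(sample_variance, population_variance)   # 4.5 3.9375
```

Dividing by n-1 makes the sample estimate slightly larger, which compensates for the sample mean sitting closer to the sample points than the true population mean would.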

Quantiles (Python 3.8+)

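statistics.quantiles(data, n=4)    # [4.25, 6.5, 8.0] (the 3 quartile cut points)
statistics.quantiles(data, n=10)   # 9 decile cut points

quantiles returns the n-1 cut points that split the data into n equal-probability groups, not the groups themselves.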
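As a sketch of how the cut points are typically used, the snippet below derives the interquartile range and compares the default method="exclusive" with method="inclusive" on the same data:

```python
import statistics

data = [4, 8, 6, 5, 3, 7, 8, 9]

# Default method="exclusive": treats data as a sample from a larger population.
q1, q2, q3 = statistics.quantiles(data, n=4)
# method="inclusive": treats data as the whole population (min and max included).
inc_q1, inc_q2, inc_q3 = statistics.quantiles(data, n=4, method="inclusive")

iqr = q3 - q1   # interquartile range, a spread measure that resists outliers
print(q1, q2, q3)               # 4.25 6.5 8.0
print(inc_q1, inc_q2, inc_q3)   # 4.75 6.5 8.0
print(iqr)                      # 3.75
```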

Correlation and Regression (Python 3.10+)

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

statistics.correlation(x, y)        # ~0.999 (Pearson correlation)
slope, intercept = statistics.linear_regression(x, y)
# slope ≈ 1.99, intercept ≈ 0.05

These additions make the module useful for quick exploratory analysis without installing any third-party packages.

Common Misconception

“The statistics module is just a toy.” For small datasets and scripts, it is often a better fit than NumPy. mean, variance, and related functions accumulate their sums exactly (using Fraction internally) and accept Fraction and Decimal inputs, which avoids many floating-point rounding surprises. fmean is the float-only exception. For a script that summarizes 20 server response times, statistics.mean() is the right tool.
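A minimal sketch of the Decimal support, using made-up response times stored as Decimal values:

```python
import statistics
from decimal import Decimal

# Hypothetical response times in seconds, stored as Decimal to keep them exact.
response_times = [Decimal("0.10"), Decimal("0.20"), Decimal("0.30")]

average = statistics.mean(response_times)   # result stays a Decimal
print(repr(average))                        # Decimal('0.2')

# The float 0.1 is only a binary approximation of the decimal value 0.1.
print(Decimal(0.1) == Decimal("0.1"))       # False
```

The exactness only holds if the inputs are exact to begin with, so build Decimal values from strings rather than from floats.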

statistics vs NumPy vs pandas

Feature             statistics               NumPy                pandas
Install needed      No                       Yes                  Yes
Speed (large data)  Slow                     Fast                 Fast
Exact arithmetic    Yes (Fraction)           No (float64)         No
Distributions       NormalDist only (3.8+)   Basic                No
DataFrames          No                       No                   Yes
Best for            Scripts, small data      Arrays, computation  Tabular data

When to Use It

  • Quick scripts that analyze a handful of numbers
  • Teaching and learning statistics concepts
  • Situations where exact arithmetic matters (financial calculations)
  • Environments where installing third-party packages isn’t an option

When to Graduate

  • Datasets with more than ~10,000 values (performance)
  • Need for advanced statistical tests (SciPy)
  • Working with DataFrames or time series (pandas)
  • Array broadcasting and vectorized operations (NumPy)

One Thing to Remember

The statistics module is Python’s built-in answer to “what’s the average?” — small, correct, and always available, perfect for quick analysis without any dependencies.

