Python Energy Consumption Modeling — Deep Dive

Master advanced Python techniques for building production-grade energy forecasting systems, from feature pipelines to deployed models.

Technical foundation

Energy consumption modeling in Python combines time-series analysis, regression, and domain-specific engineering. Production systems must handle streaming meter data, adapt to regime changes, and serve forecasts with latency requirements measured in seconds. This deep dive covers the architecture patterns, code techniques, and tradeoffs that separate notebook experiments from deployable pipelines.

Data ingestion and cleaning

Smart meters typically report at 15-minute or hourly intervals via protocols like Green Button (XML) or direct database feeds. A robust ingestion layer needs to handle:

import pandas as pd
import numpy as np

def load_and_clean_meter_data(path: str, freq: str = "1h") -> pd.DataFrame:
    """Load meter CSV with timestamp index, resample, handle gaps."""
    df = pd.read_csv(
        path,
        parse_dates=["timestamp"],
        index_col="timestamp",
    )
    # Resample to uniform frequency
    df = df.resample(freq).mean()

    # Flag gaps longer than 2x the expected interval
    gap_threshold = pd.Timedelta(freq) * 2
    gaps = df.index.to_series().diff() > gap_threshold
    if gaps.any():
        print(f"Warning: {gaps.sum()} gaps detected exceeding {gap_threshold}")

    # Interpolate short gaps, leave long gaps as NaN for imputation
    df["kwh"] = df["kwh"].interpolate(method="time", limit=3)
    return df

Key decisions at this stage:

Interpolation limit — Fill gaps of 1–3 periods; longer gaps need model-based imputation or exclusion.
Outlier detection — Energy readings can spike from meter resets or data errors. A rolling z-score filter (scipy.stats.zscore on a 7-day window) catches most anomalies.
Timezone alignment — Meter data may arrive in UTC while weather data is local. Always normalize to a single timezone before joining.

Feature engineering for energy models

Feature quality dominates model performance. The core feature categories:

Temporal features

def add_temporal_features(df: pd.DataFrame) -> pd.DataFrame:
    df["hour"] = df.index.hour
    df["day_of_week"] = df.index.dayofweek
    df["month"] = df.index.month
    df["is_weekend"] = (df.index.dayofweek >= 5).astype(int)
    # Cyclical encoding to avoid discontinuities
    df["hour_sin"] = np.sin(2 * np.pi * df["hour"] / 24)
    df["hour_cos"] = np.cos(2 * np.pi * df["hour"] / 24)
    df["month_sin"] = np.sin(2 * np.pi * df["month"] / 12)
    df["month_cos"] = np.cos(2 * np.pi * df["month"] / 12)
    return df

Cyclical encoding prevents the model from treating hour 23 and hour 0 as maximally distant.

Weather features

Temperature is the single most important external variable. Beyond raw temperature, derived features improve accuracy:

def add_weather_features(df: pd.DataFrame, base_heat: float = 18.0, base_cool: float = 24.0) -> pd.DataFrame:
    df["hdd"] = np.maximum(base_heat - df["temp_c"], 0)
    df["cdd"] = np.maximum(df["temp_c"] - base_cool, 0)
    df["temp_lag_24h"] = df["temp_c"].shift(24)
    df["temp_rolling_72h"] = df["temp_c"].rolling(72).mean()
    # Thermal mass effect: buildings respond slowly to temperature changes
    df["temp_ewm_12h"] = df["temp_c"].ewm(halflife=12).mean()
    return df

The exponentially weighted mean captures thermal inertia — buildings don’t instantly cool when temperature drops.

Lag and autoregressive features

def add_lag_features(df: pd.DataFrame, target: str = "kwh") -> pd.DataFrame:
    for lag in [1, 24, 48, 168]:  # 1h, 1day, 2days, 1week
        df[f"{target}_lag_{lag}"] = df[target].shift(lag)
    df[f"{target}_rolling_24h_mean"] = df[target].rolling(24).mean()
    df[f"{target}_rolling_168h_mean"] = df[target].rolling(168).mean()
    return df

The 168-hour lag (one week) captures weekly seasonality directly.

Model architectures

Gradient-boosted trees for production forecasting

LightGBM is the workhorse of production energy forecasting. It handles mixed feature types, trains in minutes on millions of rows, and provides feature importance for interpretability.

import lightgbm as lgb
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import mean_absolute_percentage_error

def train_energy_model(df: pd.DataFrame, target: str = "kwh", n_splits: int = 5):
    feature_cols = [c for c in df.columns if c != target]
    X = df[feature_cols].dropna()
    y = df.loc[X.index, target]

    tscv = TimeSeriesSplit(n_splits=n_splits)
    scores = []

    for train_idx, val_idx in tscv.split(X):
        X_train, X_val = X.iloc[train_idx], X.iloc[val_idx]
        y_train, y_val = y.iloc[train_idx], y.iloc[val_idx]

        model = lgb.LGBMRegressor(
            n_estimators=1000,
            learning_rate=0.05,
            num_leaves=63,
            min_child_samples=20,
            subsample=0.8,
            colsample_bytree=0.8,
            reg_alpha=0.1,
            reg_lambda=0.1,
        )
        model.fit(
            X_train, y_train,
            eval_set=[(X_val, y_val)],
            callbacks=[lgb.early_stopping(50), lgb.log_evaluation(0)],
        )
        preds = model.predict(X_val)
        mape = mean_absolute_percentage_error(y_val, preds)
        scores.append(mape)

    print(f"Mean MAPE across folds: {np.mean(scores):.4f}")
    return model

Critical detail: TimeSeriesSplit preserves temporal order. Using KFold would leak future data and produce unrealistically optimistic metrics.

SARIMAX for interpretable baselines

from statsmodels.tsa.statespace.sarimax import SARIMAX

def fit_sarimax(series: pd.Series, exog: pd.DataFrame = None):
    model = SARIMAX(
        series,
        exog=exog,
        order=(2, 1, 1),
        seasonal_order=(1, 1, 1, 24),  # 24-hour seasonality
        enforce_stationarity=False,
        enforce_invertibility=False,
    )
    results = model.fit(disp=False, maxiter=200)
    print(results.summary().tables[1])
    return results

SARIMAX is slower to train but produces confidence intervals natively, which matters for risk-aware grid planning.

Temporal Fusion Transformer for multi-horizon forecasts

For organizations needing 1-hour to 7-day forecasts simultaneously:

from pytorch_forecasting import TemporalFusionTransformer, TimeSeriesDataSet

# Define the dataset with known and unknown future inputs
training = TimeSeriesDataSet(
    df_train,
    time_idx="time_idx",
    target="kwh",
    group_ids=["building_id"],
    max_encoder_length=168,   # 1 week lookback
    max_prediction_length=48, # 2 day forecast
    time_varying_known_reals=["hour_sin", "hour_cos", "temp_forecast"],
    time_varying_unknown_reals=["kwh"],
    static_categoricals=["building_type"],
)

TFT provides variable importance scores and temporal attention weights, making it interpretable despite being a deep learning model.

Change-point detection

Energy systems undergo structural changes. Detecting these prevents stale-model drift:

import ruptures

def detect_regime_changes(series: np.ndarray, penalty: float = 10.0) -> list:
    """Detect change points in daily energy consumption."""
    algo = ruptures.Pelt(model="rbf").fit(series)
    change_points = algo.predict(pen=penalty)
    return change_points

When a change point is detected, the pipeline should retrain using only post-change data. Some production systems automate this with a monitoring service that compares rolling prediction error against a threshold.

Deployment architecture

A production energy forecasting system typically has three layers:

Batch pipeline — Runs nightly, retrains models on latest data, generates day-ahead and week-ahead forecasts, stores results in a time-series database (InfluxDB, TimescaleDB).
Real-time adjustment — An online model (lightweight linear regression or exponential smoothing) corrects batch forecasts using the latest few hours of actuals.
API layer — FastAPI or Flask endpoint serves forecasts with building/meter ID and time horizon as parameters.

from fastapi import FastAPI, Query
from datetime import datetime

app = FastAPI()

@app.get("/forecast/{building_id}")
async def get_forecast(
    building_id: str,
    start: datetime = Query(...),
    hours: int = Query(default=24, le=168),
):
    forecast = forecast_service.predict(building_id, start, hours)
    return {
        "building_id": building_id,
        "start": start.isoformat(),
        "horizon_hours": hours,
        "predictions": forecast.to_dict(),
        "model_version": forecast_service.model_version,
    }

Evaluation and monitoring

Beyond MAPE, production systems track:

CV(RMSE) — The ASHRAE standard for building energy models. Calculated as RMSE divided by mean observed consumption.
Bias — Systematic over- or under-prediction signals model drift.
Peak error — Maximum absolute error matters for grid capacity planning.
Prediction interval calibration — If you say there’s a 90% confidence interval, actual values should fall within it roughly 90% of the time.

def cvrmse(actual: np.ndarray, predicted: np.ndarray) -> float:
    rmse = np.sqrt(np.mean((actual - predicted) ** 2))
    return rmse / np.mean(actual)

def peak_error(actual: np.ndarray, predicted: np.ndarray) -> float:
    return np.max(np.abs(actual - predicted))

Tradeoffs to consider

Decision	Option A	Option B
Model complexity	Simple (degree-day regression) — fast, interpretable, limited accuracy	Complex (TFT) — high accuracy, requires GPU, harder to debug
Retraining frequency	Weekly — stable but slow to adapt	Daily — responsive but higher compute cost
Forecast horizon	Short (1–24h) — higher accuracy, operational decisions	Long (1–12 months) — lower accuracy, strategic planning
Feature scope	Weather only — robust, minimal data dependencies	Weather + occupancy + calendar + price — higher accuracy, more data pipelines to maintain

Real-world scale

Enel, the Italian utility serving 70 million customers, uses Python-based ML pipelines to forecast demand across distribution networks. Their system processes terabytes of smart meter data using PySpark for distributed feature engineering and LightGBM for per-substation forecasts. The result: 3–5% MAPE on day-ahead predictions, translating to hundreds of millions in avoided over-procurement costs annually.

One thing to remember: Production energy modeling is an engineering system, not just a model. Data quality, retraining automation, change-point detection, and monitoring matter as much as algorithm choice.

pythonenergydata-sciencesustainability