Lifelines for Survival Analysis — Deep Dive

Mathematical foundations

The survival function

The survival function S(t) = P(T > t) gives the probability that the event time T exceeds t. It is a non-increasing function starting at 1 and approaching 0. The median survival time is the value t where S(t) = 0.5.

The hazard function

The hazard function h(t) = lim(Δt→0) P(t ≤ T < t+Δt | T ≥ t) / Δt represents the instantaneous failure rate. The cumulative hazard H(t) = ∫₀ᵗ h(u)du relates to survival via S(t) = exp(-H(t)).

Censoring mechanisms

  • Right censoring — subject leaves the study or study ends before event. Most common.
  • Left censoring — event happened before observation started (e.g., infection before first test).
  • Interval censoring — event happened between two observation times.

Lifelines handles right censoring natively. Left and interval censoring require special model classes.

Kaplan-Meier estimation

import pandas as pd
from lifelines import KaplanMeierFitter
import matplotlib.pyplot as plt

# Sample clinical trial data
data = pd.DataFrame({
    "duration_months": [6, 7, 10, 15, 19, 22, 25, 30, 32, 40,
                        5, 8, 12, 14, 20, 24, 28, 35, 38, 45],
    "event_observed":  [1, 1, 0, 1, 1, 0, 1, 0, 1, 0,
                        1, 1, 1, 0, 1, 1, 0, 1, 0, 0],
    "treatment":       ["A"]*10 + ["B"]*10,
})

kmf = KaplanMeierFitter()

fig, ax = plt.subplots(figsize=(10, 6))

for group in ["A", "B"]:
    mask = data["treatment"] == group
    kmf.fit(
        durations=data.loc[mask, "duration_months"],
        event_observed=data.loc[mask, "event_observed"],
        label=f"Treatment {group}",
    )
    kmf.plot_survival_function(ax=ax, ci_show=True)
    print(f"Treatment {group} — median survival: {kmf.median_survival_time_:.1f} months")

ax.set_xlabel("Months")
ax.set_ylabel("Survival Probability")
ax.set_title("Kaplan-Meier Survival Curves by Treatment Group")
plt.tight_layout()
plt.savefig("km_curves.png", dpi=150)

Confidence intervals

Lifelines computes confidence intervals using Greenwood’s formula by default. The width of the interval depends on the number of subjects at risk — it widens at later time points as fewer subjects remain.

# Access the survival table directly
kmf.fit(data["duration_months"], data["event_observed"])
table = kmf.survival_function_
print(table.head(10))

# Confidence bounds
ci = kmf.confidence_interval_survival_function_
print(ci.head(10))

Log-rank test: comparing groups

from lifelines.statistics import logrank_test

group_a = data[data["treatment"] == "A"]
group_b = data[data["treatment"] == "B"]

results = logrank_test(
    group_a["duration_months"], group_b["duration_months"],
    event_observed_A=group_a["event_observed"],
    event_observed_B=group_b["event_observed"],
)

print(f"Test statistic: {results.test_statistic:.3f}")
print(f"P-value: {results.p_value:.4f}")

if results.p_value < 0.05:
    print("Statistically significant difference in survival")
else:
    print("No significant difference detected")

The log-rank test is non-parametric and compares the full survival curves, not just median values. It has maximum power when hazards are proportional.

Cox proportional hazards regression

Basic fitting

import pandas as pd
from lifelines import CoxPHFitter

# Rossi recidivism dataset (classic survival dataset)
from lifelines.datasets import load_rossi

rossi = load_rossi()
print(rossi.columns.tolist())
# ['week', 'arrest', 'fin', 'age', 'race', 'wexp', 'mar', 'paro', 'prio']

cph = CoxPHFitter()
cph.fit(rossi, duration_col="week", event_col="arrest")

cph.print_summary()

Interpreting coefficients

# Hazard ratios
print(cph.summary[["coef", "exp(coef)", "p"]])

# exp(coef) is the hazard ratio:
# - exp(coef) > 1: higher hazard (worse survival)
# - exp(coef) < 1: lower hazard (better survival)
# - exp(coef) = 1: no effect

# Example: if age has exp(coef) = 0.94,
# each additional year of age reduces the hazard by 6%

Prediction

import pandas as pd

# Predict survival for a new subject
new_subject = pd.DataFrame({
    "fin": [1],
    "age": [30],
    "race": [1],
    "wexp": [1],
    "mar": [0],
    "paro": [1],
    "prio": [2],
})

# Survival function for this subject
surv = cph.predict_survival_function(new_subject)
print(f"P(survival > 26 weeks): {surv.loc[26].values[0]:.3f}")

# Median survival time
median = cph.predict_median(new_subject)
print(f"Predicted median survival: {median.values[0]:.1f} weeks")

# Partial hazard (relative risk score)
risk = cph.predict_partial_hazard(new_subject)
print(f"Relative risk: {risk.values[0]:.3f}")

Checking the proportional hazards assumption

The Cox model assumes hazard ratios are constant over time. Violations lead to biased estimates.

# Schoenfeld residuals test
cph.check_assumptions(rossi, p_value_threshold=0.05, show_plots=True)

# If a covariate violates the assumption, options include:
# 1. Stratify by that variable
# 2. Include a time-interaction term
# 3. Use a parametric model instead

Stratified Cox model

from lifelines import CoxPHFitter

# Stratify by a variable that violates proportional hazards
cph_strat = CoxPHFitter()
cph_strat.fit(
    rossi,
    duration_col="week",
    event_col="arrest",
    strata=["race"],  # Separate baseline hazard per stratum
)
cph_strat.print_summary()

Parametric survival models

When you can assume a distribution, parametric models provide smoother estimates and extrapolation:

from lifelines import WeibullFitter, LogNormalFitter, LogLogisticFitter

data_duration = rossi["week"]
data_event = rossi["arrest"]

# Weibull — versatile, handles increasing/decreasing/constant hazard
wf = WeibullFitter()
wf.fit(data_duration, data_event)
print(f"Weibull — lambda: {wf.lambda_:.3f}, rho: {wf.rho_:.3f}")
print(f"  AIC: {wf.AIC_:.1f}")

# Log-Normal — good for processes with multiplicative random effects
lnf = LogNormalFitter()
lnf.fit(data_duration, data_event)
print(f"Log-Normal — mu: {lnf.mu_:.3f}, sigma: {lnf.sigma_:.3f}")
print(f"  AIC: {lnf.AIC_:.1f}")

# Log-Logistic — S-shaped hazard, common in cancer studies
llf = LogLogisticFitter()
llf.fit(data_duration, data_event)
print(f"  AIC: {llf.AIC_:.1f}")

# Compare using AIC (lower is better)
print("\nBest model by AIC:")
models = {"Weibull": wf.AIC_, "LogNormal": lnf.AIC_, "LogLogistic": llf.AIC_}
best = min(models, key=models.get)
print(f"  {best}: {models[best]:.1f}")

Accelerated Failure Time (AFT) models

AFT models are parametric regression models — the alternative to Cox when proportional hazards does not hold:

from lifelines import WeibullAFTFitter

aft = WeibullAFTFitter()
aft.fit(rossi, duration_col="week", event_col="arrest")
aft.print_summary()

# AFT coefficients have a different interpretation:
# Positive coef → event takes LONGER (protective)
# Negative coef → event happens SOONER (harmful)

Real-world application: SaaS churn analysis

import pandas as pd
import numpy as np
from lifelines import KaplanMeierFitter, CoxPHFitter
from lifelines.statistics import logrank_test

# Simulate subscription data
np.random.seed(42)
n = 500

churn_data = pd.DataFrame({
    "months_active": np.random.exponential(12, n).round(1),
    "churned": np.random.binomial(1, 0.6, n),
    "plan": np.random.choice(["free", "pro", "enterprise"], n, p=[0.5, 0.35, 0.15]),
    "monthly_logins": np.random.poisson(15, n),
    "support_tickets": np.random.poisson(2, n),
})

# Kaplan-Meier by plan
kmf = KaplanMeierFitter()
for plan in ["free", "pro", "enterprise"]:
    mask = churn_data["plan"] == plan
    kmf.fit(churn_data.loc[mask, "months_active"],
            churn_data.loc[mask, "churned"],
            label=plan)
    print(f"{plan} — median retention: {kmf.median_survival_time_:.1f} months")

# Cox model with covariates
churn_data["plan_pro"] = (churn_data["plan"] == "pro").astype(int)
churn_data["plan_enterprise"] = (churn_data["plan"] == "enterprise").astype(int)

cph = CoxPHFitter()
cph.fit(
    churn_data[["months_active", "churned", "plan_pro", "plan_enterprise",
                "monthly_logins", "support_tickets"]],
    duration_col="months_active",
    event_col="churned",
)
cph.print_summary()

# Actionable insight: which factors protect against churn?
significant = cph.summary[cph.summary["p"] < 0.05]
print("\nSignificant predictors of churn:")
print(significant[["exp(coef)", "p"]])

Model validation

from lifelines import CoxPHFitter
from lifelines.utils import k_fold_cross_validation

# Cross-validated concordance index
scores = k_fold_cross_validation(
    CoxPHFitter(),
    rossi,
    duration_col="week",
    event_col="arrest",
    k=5,
    scoring_method="concordance_index",
)

print(f"Cross-validated C-index: {scores.mean():.3f} ± {scores.std():.3f}")
# C-index > 0.7 is generally considered useful
# C-index = 0.5 is random

Performance and scaling

Dataset sizeApproachNotes
< 10k rowsAll Lifelines modelsFast, no issues
10k - 100kCox and parametric modelsKM still fast, Cox may take seconds
100k - 1MUse CoxPHFitter(penalizer=0.01)Regularization stabilizes large models
> 1Mscikit-survival or custom implementationLifelines is not optimized for this scale

Memory-efficient patterns

# For large datasets, avoid computing the full survival table
# Use predict methods directly instead of calling .plot()

# Regularized Cox for high-dimensional data
cph = CoxPHFitter(penalizer=0.1, l1_ratio=0.5)  # Elastic net
cph.fit(rossi, duration_col="week", event_col="arrest")

Common pitfalls

  1. Ignoring censoring. Treating censored observations as events biases survival estimates downward. Dropping them biases estimates upward. Always use survival-specific methods.
  2. Violating proportional hazards. If the log-rank test is significant but the Cox model gives nonsensical hazard ratios, check the PH assumption with Schoenfeld residuals.
  3. Immortal time bias. If group membership depends on surviving long enough to receive treatment, the analysis is biased. Landmark analysis or time-varying covariates can fix this.
  4. Competing risks. If subjects can experience multiple types of events (death from cancer vs. death from heart disease), standard survival analysis treats non-cancer deaths as censoring, which may overestimate cancer-specific risk. Use the competing risks approach in that scenario.

The one thing to remember: Lifelines provides a complete, Pythonic toolkit for survival analysis — from Kaplan-Meier curves through Cox regression to parametric models — with built-in assumption checking that prevents the most common statistical mistakes.

pythonstatisticsdata-science

See Also