Lifelines for Survival Analysis — ELI5

Imagine you buy a pack of lightbulbs and want to know how long each one will last. You screw them all in at the same time and wait. Some burn out after a month. Some last a year. Some are still going when you get bored and stop watching.

That “how long until something happens” question is called survival analysis, and it shows up everywhere:

  • How long do patients live after starting a new treatment?
  • How many months before a customer cancels their subscription?
  • How many miles before a car part breaks?

The tricky part is the lightbulbs that were still working when you stopped watching. You do not know when they will eventually burn out; you only know they lasted at least this long. Statisticians call these observations censored. Throwing them away wastes valuable information and makes the bulbs look shorter-lived than they really are. Survival analysis has special math to include them.
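To make that special math less mysterious, here is a minimal sketch of one common approach, the Kaplan-Meier estimator, in plain Python. The function name `kaplan_meier` and the lightbulb numbers are made up for illustration, and this is not meant to mirror how any particular library implements it.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each time a failure was seen.

    durations: how long each item was watched
    observed:  True if it failed at that time, False if it was still working
    """
    # Only times with an actual failure change the curve.
    failure_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in failure_times:
        # Everything watched at least until t was still "at risk" at t,
        # including items that were later recorded as still working.
        at_risk = sum(1 for d in durations if d >= t)
        failures = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1 - failures / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical data: months each bulb was watched, and whether it burned out.
months = [1, 3, 3, 6, 12, 12]
burned_out = [True, True, False, True, False, False]

curve = kaplan_meier(months, burned_out)
print([(t, round(s, 3)) for t, s in curve])
# → [(1, 0.833), (3, 0.667), (6, 0.444)]
```

If you simply dropped the three bulbs that were still working, every remaining bulb would be dead by month 6 and you would conclude that none last longer. Keeping them in the at-risk counts gives an estimate of about 44% instead.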

Lifelines is a Python library built for exactly this kind of question. You give it a list of times (how long each lightbulb was watched) and a list of flags (did it burn out, or was it still working?). It estimates a survival curve, which shows the probability of lasting past any given time, and can plot it for you.

For a hospital, that curve might show: “80% of patients are still alive after one year, 60% after two years, 35% after five years.” For a streaming service, it might show: “Half of new subscribers cancel within the first three months.”

Lifelines also compares groups. Did patients who took Drug A survive longer than those who took Drug B? Did premium customers stick around longer than free-tier users? It answers these questions with statistical tests, such as the log-rank test, rather than guesswork.

The one thing to remember: Lifelines lets you answer “how long until something happens?” in Python, even when your data is incomplete because some things have not happened yet.
