Python Saga Pattern — ELI5

Imagine you’re planning a vacation. You need to book three things: a flight, a hotel, and a rental car. You book the flight first — great! Then the hotel — done! But when you try to rent a car, nothing is available.

Now you have a problem. You don’t want to fly somewhere and stay in a hotel with no way to get around. So you undo your hotel booking and cancel your flight. Each step has a backup plan.

That’s exactly what the saga pattern does in software.

When your app needs to do something that involves multiple services — like placing an order that requires charging a credit card, reserving inventory, and scheduling shipping — it can’t wrap everything in one “all-or-nothing” transaction the way a single database can. Each service has its own database, its own rules.

A saga breaks the big operation into steps. Each step has a compensating action — a way to undo it if a later step fails:

  1. Charge payment → if later steps fail, refund payment
  2. Reserve inventory → if later steps fail, release inventory
  3. Schedule shipping → if this fails, undo steps 1 and 2

If everything succeeds, you’re done. If step 3 fails, the saga runs the undo actions for steps 2 and 1, in reverse order. The system ends up in a consistent state either way.

The key insight is that “undo” in distributed systems isn’t a magical rollback — it’s a real action (issuing a refund, releasing a reservation) that your code explicitly defines.

The one thing to remember: The saga pattern coordinates multi-step operations across services by pairing each step with an undo action, so failures at any point leave the system in a clean state.

pythonarchitecturepatterns

See Also