Route Optimization in Python — Deep Dive
Production route optimization lives at the intersection of combinatorial optimization, geospatial data engineering, and real-time systems. The academic TSP formulation is a starting point, but shipping software that dispatchers trust requires handling messy real-world inputs and tight latency budgets.
Distance matrix construction
The foundation of any route solver is a reliable distance matrix. Two dominant open-source options:
- OSRM (Open Source Routing Machine) — self-hosted, uses OpenStreetMap data, returns driving distance and duration. The
/tableendpoint computes an N×N matrix in a single HTTP call. - Valhalla — another OSM-based engine with richer costing models (truck profiles, avoid tolls).
import requests
import numpy as np
def build_distance_matrix(coords: list[tuple[float, float]], osrm_url: str) -> np.ndarray:
"""Build a duration matrix via OSRM table service."""
coord_str = ";".join(f"{lon},{lat}" for lat, lon in coords)
resp = requests.get(f"{osrm_url}/table/v1/driving/{coord_str}?annotations=duration")
resp.raise_for_status()
return np.array(resp.json()["durations"])
For fewer than 25 waypoints, the Google Directions API works but costs scale with N². Beyond that, a self-hosted OSRM instance on a 4-core VM handles 500-point matrices in under two seconds.
OR-Tools solver setup
Google OR-Tools provides the RoutingModel class that handles TSP and its constrained variants (CVRP, VRPTW, PDPTW).
from ortools.constraint_solver import routing_enums_pb2, pywrapcp
def solve_tsp(distance_matrix: list[list[int]], depot: int = 0):
manager = pywrapcp.RoutingIndexManager(len(distance_matrix), 1, depot)
routing = pywrapcp.RoutingModel(manager)
def distance_callback(from_index, to_index):
from_node = manager.IndexToNode(from_index)
to_node = manager.IndexToNode(to_index)
return distance_matrix[from_node][to_node]
transit_callback_index = routing.RegisterTransitCallback(distance_callback)
routing.SetArcCostEvaluatorOfAllVehicles(transit_callback_index)
search_params = pywrapcp.DefaultRoutingSearchParameters()
search_params.first_solution_strategy = (
routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC
)
search_params.local_search_metaheuristic = (
routing_enums_pb2.LocalSearchMetaheuristic.GUIDED_LOCAL_SEARCH
)
search_params.time_limit.FromSeconds(5)
solution = routing.SolveWithParameters(search_params)
if not solution:
return None
route = []
index = routing.Start(0)
while not routing.IsEnd(index):
route.append(manager.IndexToNode(index))
index = solution.Value(routing.NextVar(index))
route.append(manager.IndexToNode(index))
return route, solution.ObjectiveValue()
Key tuning levers:
| Parameter | Effect |
|---|---|
first_solution_strategy | Controls the initial route. PATH_CHEAPEST_ARC is a safe default. SAVINGS works well for multi-vehicle. |
local_search_metaheuristic | GUIDED_LOCAL_SEARCH usually outperforms SIMULATED_ANNEALING for VRPTW. |
time_limit | Longer budgets improve quality logarithmically — diminishing returns past 30 seconds for under 500 nodes. |
Adding time windows
Time windows require a time dimension on the routing model:
def add_time_windows(routing, manager, transit_callback_index, time_windows, depot_window):
time = "Time"
routing.AddDimension(
transit_callback_index,
30, # max waiting time at a stop (minutes)
1440, # max total route time (minutes)
False,
time,
)
time_dimension = routing.GetDimensionOrDie(time)
for location_idx, (start, end) in enumerate(time_windows):
if location_idx == manager.GetNumberOfNodes() - 1:
continue
index = manager.NodeToIndex(location_idx)
time_dimension.CumulVar(index).SetRange(start, end)
# Depot constraints
depot_idx = routing.Start(0)
time_dimension.CumulVar(depot_idx).SetRange(*depot_window)
When a stop cannot be reached within its window, the solver either drops it (if allowed via penalty) or returns no solution. Setting appropriate drop penalties lets the system make tradeoff decisions automatically.
Real-time re-routing
Static routes break when a driver gets stuck in traffic or a customer cancels. Re-routing strategies:
- Periodic re-solve — every 15 minutes, rebuild the matrix with updated travel times and solve from the driver’s current position. Works for fleets under 100 vehicles.
- Insertion heuristic — when a new order arrives, find the cheapest position to insert it into existing routes without re-solving everything.
- Event-driven triggers — re-route only when delay exceeds a threshold (e.g., 10 minutes behind schedule).
def cheapest_insertion(route: list[int], new_stop: int, matrix: np.ndarray) -> tuple[list[int], int]:
best_cost = float("inf")
best_pos = -1
for i in range(1, len(route)):
added = matrix[route[i-1]][new_stop] + matrix[new_stop][route[i]] - matrix[route[i-1]][route[i]]
if added < best_cost:
best_cost = added
best_pos = i
new_route = route[:best_pos] + [new_stop] + route[best_pos:]
return new_route, best_cost
Performance benchmarks
On a 2023 M2 MacBook Pro with OR-Tools 9.8:
| Stops | Vehicles | Time limit | Gap to known optimal |
|---|---|---|---|
| 50 | 1 | 2 s | < 1% |
| 200 | 5 | 10 s | 2-4% |
| 500 | 20 | 30 s | 5-8% |
| 1000 | 50 | 60 s | 8-15% |
For 1000+ stops, consider decomposing by geographic cluster (K-Means on coordinates) and solving sub-problems in parallel.
Visualization and dispatcher tools
Folium renders routes on interactive Leaflet maps:
import folium
def plot_route(coords, route_order):
m = folium.Map(location=coords[0], zoom_start=12)
ordered = [coords[i] for i in route_order]
folium.PolyLine(ordered, weight=4, color="blue").add_to(m)
for idx, pos in enumerate(ordered):
folium.CircleMarker(pos, radius=6, popup=f"Stop {idx}").add_to(m)
return m
Production systems often pair this with a FastAPI backend that dispatchers hit to trigger re-optimization, view ETAs, and manually override stop order when drivers call in exceptions.
Common pitfalls
- Symmetric distance assumption — driving A→B may differ from B→A due to one-way streets. Always use an asymmetric matrix.
- Ignoring service time — the five minutes a driver spends at each stop adds up. Model it as part of the time dimension.
- Over-trusting the solver — OR-Tools finds good solutions, not proven optimal ones (for large N). Validate with driver feedback.
- Stale map data — road closures and new construction invalidate cached matrices. Refresh at least weekly.
The one thing to remember: A production route optimizer chains a real-road distance matrix, a constraint-aware solver like OR-Tools, and a re-routing layer that adapts when reality deviates from the plan.
See Also
- Python Adaptive Learning Systems How Python builds learning apps that adjust to each student like a personal tutor who knows exactly what you need next.
- Python Airflow Learn Airflow as a timetable manager that makes sure data tasks run in the right order every day.
- Python Altair Learn Altair through the idea of drawing charts by describing rules, not by hand-placing every visual element.
- Python Automated Grading How Python grades homework and exams automatically, from simple answer keys to understanding written essays.
- Python Batch Vs Stream Processing Batch processing is like doing laundry once a week; stream processing is like a self-cleaning shirt that cleans itself constantly.