Nomad Job Scheduling with Python — Core Concepts
What Nomad solves
When you have multiple Python services that need to run across a cluster of servers, you face scheduling decisions: which server has enough CPU and memory, how to handle server failures, how to roll out updates without downtime, and how to scale services up during peak traffic.
Nomad handles all of this. It’s a workload orchestrator — a single binary that acts as both the cluster manager and the scheduler. HashiCorp designed it to be operationally simpler than Kubernetes while handling the same core scheduling problem.
Core concepts
Jobs are the top-level unit. A job describes what you want to run. For a Python service, a job might say “run my FastAPI container with 3 instances.”
Task Groups contain one or more tasks that must run on the same machine. A common pattern: your Python API server and a logging sidecar in the same task group.
Tasks are individual workloads — a Docker container, a raw binary, or a Python script. Each task specifies resource requirements (CPU, memory) and configuration.
Allocations are the binding of task groups to specific machines. When you submit a job, Nomad creates allocations by matching resource requests against available capacity.
Evaluations happen when the cluster state changes (new job, node failure, scaling event). Nomad’s scheduler evaluates what needs to change and creates a plan.
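To make the allocation idea concrete, here is a toy first-fit placement in Python. It is only a sketch of "match resource requests against available capacity", not Nomad's scheduler, which ranks feasible nodes by score rather than taking the first one that fits. The `Node`, `TaskGroup`, and `place` names are hypothetical; CPU is in MHz and memory in MB, the units Nomad's resources block uses.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A client machine with its remaining capacity (hypothetical model)."""
    name: str
    cpu_mhz: int
    memory_mb: int


@dataclass
class TaskGroup:
    """Resources one instance of a task group needs, and how many instances to run."""
    name: str
    cpu_mhz: int
    memory_mb: int
    count: int


def place(group: TaskGroup, nodes: list[Node]) -> list[tuple[str, str]]:
    """Return (instance, node) pairs using first-fit; raise if capacity runs out."""
    allocations = []
    for index in range(group.count):
        for node in nodes:
            if node.cpu_mhz >= group.cpu_mhz and node.memory_mb >= group.memory_mb:
                # Reserve the resources so later instances see the reduced capacity.
                node.cpu_mhz -= group.cpu_mhz
                node.memory_mb -= group.memory_mb
                allocations.append((f"{group.name}[{index}]", node.name))
                break
        else:
            raise RuntimeError(f"no node has room for {group.name}[{index}]")
    return allocations


nodes = [
    Node("node-a", cpu_mhz=1000, memory_mb=512),
    Node("node-b", cpu_mhz=2000, memory_mb=1024),
]
api = TaskGroup("api", cpu_mhz=500, memory_mb=256, count=3)
placements = place(api, nodes)
print(placements)
```

Each pair in `placements` plays the role of an allocation: one instance of the task group bound to one machine.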
Job types
Nomad has three main job types (a fourth, sysbatch, combines system-style placement with batch semantics):
- Service — long-running processes that should always be up (web servers, APIs, workers). Nomad restarts them on failure and reschedules on node loss.
- Batch — jobs that run to completion (data processing, migrations, report generation). Nomad doesn’t restart them after they finish successfully.
- System — one instance per eligible node (monitoring agents, log collectors). Automatically placed on new nodes when they join.
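The job type is a single field on the job. As a rough sketch, the helper below builds a minimal payload in the shape of Nomad's JSON job format and rejects types other than the three listed above. The `build_job` helper, the job name, the image, and the resource figures are illustrative assumptions, not values from a real deployment.

```python
import json

JOB_TYPES = {"service", "batch", "system"}


def build_job(job_id: str, job_type: str, image: str, count: int = 1) -> dict:
    """Build a minimal job payload shaped like Nomad's JSON job format."""
    if job_type not in JOB_TYPES:
        raise ValueError(f"unknown job type: {job_type!r}")
    return {
        "Job": {
            "ID": job_id,
            "Name": job_id,
            "Type": job_type,
            "Datacenters": ["dc1"],
            "TaskGroups": [
                {
                    "Name": job_id,
                    "Count": count,
                    "Tasks": [
                        {
                            "Name": job_id,
                            "Driver": "docker",
                            "Config": {"image": image},
                            # Illustrative figures: CPU in MHz, memory in MB.
                            "Resources": {"CPU": 500, "MemoryMB": 256},
                        }
                    ],
                }
            ],
        }
    }


# A run-to-completion job, so the type is "batch" (hypothetical image name).
report_job = build_job("nightly-report", "batch", "ghcr.io/myorg/report:v1")
print(json.dumps(report_job, indent=2))
```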
Deploying Python services
A Nomad job specification for a Python service:
job "api-service" {
  datacenters = ["dc1"]
  type        = "service"
  group "api" {
    count = 3
    network {
      port "http" { to = 8000 }
    }
    service {
      name = "api"
      port = "http"
      check {
        type     = "http"
        path     = "/health"
        interval = "10s"
        timeout  = "3s"
      }
    }
    task "fastapi" {
      driver = "docker"
      config {
        image = "ghcr.io/myorg/api:v2.1"
        ports = ["http"]
      }
      env {
        # NOMAD_SECRET_DB_PASS is a placeholder, not a variable Nomad defines.
        # In practice, render the password with a template block (Vault or Nomad Variables).
        DATABASE_URL = "postgresql://app:${NOMAD_SECRET_DB_PASS}@db.internal:5432/app"
        WORKERS      = "4"
      }
      resources {
        cpu    = 500 # MHz
        memory = 256 # MB
      }
    }
  }
}
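On the Python side, the values from the env block arrive as ordinary environment variables. The following is a minimal sketch of a settings loader, assuming the service reads `DATABASE_URL` and `WORKERS` as named in the job spec. The `Settings` class and `load_settings` function are hypothetical names; `NOMAD_ALLOC_ID` is one of the variables Nomad sets for every task.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    workers: int
    alloc_id: str


def load_settings(environ=os.environ) -> Settings:
    """Read the values that the job spec's env block and Nomad itself inject."""
    return Settings(
        # Required: raises KeyError at startup if the variable is missing.
        database_url=environ["DATABASE_URL"],
        workers=int(environ.get("WORKERS", "1")),
        # Falls back to "local" when the process runs outside Nomad.
        alloc_id=environ.get("NOMAD_ALLOC_ID", "local"),
    )


# Example with an explicit mapping instead of the real environment.
example = load_settings({
    "DATABASE_URL": "postgresql://app:secret@db.internal:5432/app",
    "WORKERS": "4",
})
print(example.workers, example.alloc_id)
```

Passing the mapping as a parameter keeps the loader easy to test without touching the real process environment.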
Python SDK integration
The python-nomad library wraps Nomad’s HTTP API:
import nomad

client = nomad.Nomad(host="nomad.internal", timeout=10)

# List running jobs
jobs = client.jobs.get_jobs()
for job in jobs:
    print(f"{job['ID']}: {job['Status']}")

# Get allocation details for a job
allocations = client.job.get_allocations("api-service")
for alloc in allocations:
    print(f"  {alloc['ID'][:8]} on {alloc['NodeID'][:8]}: {alloc['ClientStatus']}")
This API access lets Python scripts automate deployment workflows, build monitoring dashboards, and implement custom scaling logic.
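As one small example of such automation, a deployment script might check whether enough allocations are running before it proceeds. The sketch below works on plain dictionaries with a `ClientStatus` field, like the ones printed above. The `summarize` and `is_healthy` helpers and the sample IDs are made up for illustration.

```python
from collections import Counter


def summarize(allocations: list[dict]) -> Counter:
    """Count allocations by ClientStatus (for example "running" or "pending")."""
    return Counter(alloc["ClientStatus"] for alloc in allocations)


def is_healthy(allocations: list[dict], expected: int) -> bool:
    """True when at least `expected` allocations report ClientStatus "running"."""
    return summarize(allocations)["running"] >= expected


# Sample data shaped like the allocation entries above (IDs are made up).
sample = [
    {"ID": "a1b2c3d4", "NodeID": "n1", "ClientStatus": "running"},
    {"ID": "e5f6a7b8", "NodeID": "n2", "ClientStatus": "running"},
    {"ID": "c9d0e1f2", "NodeID": "n3", "ClientStatus": "pending"},
]
print(dict(summarize(sample)))
print(is_healthy(sample, expected=3))
```

Against a live cluster, the same check would be `is_healthy(client.job.get_allocations("api-service"), expected=3)`, polled until it returns true or a timeout expires.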
Common misconception
People often assume Nomad only runs Docker containers. In fact, Nomad supports multiple task drivers: Docker, exec and raw_exec (plain binaries, with and without isolation), Java, QEMU, and more. You can run a Python script directly with the exec driver, with no Docker required. This flexibility makes Nomad useful for mixed workloads where some services are containerized and others aren't.
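For illustration, a task that runs a Python script through the exec driver could be described like this, in the shape of a task entry in Nomad's JSON job format. This is a sketch under assumptions: the `python_script_task` helper is hypothetical, the interpreter is assumed to live at `/usr/bin/python3` on the client node, and the script is assumed to have been placed in the task's `local/` directory (for example by an artifact block).

```python
def python_script_task(name: str, script_path: str, *args: str) -> dict:
    """Build a task entry that runs a script with the exec driver (no container)."""
    return {
        "Name": name,
        "Driver": "exec",
        "Config": {
            # Assumption: python3 is installed at this path on the client node.
            "command": "/usr/bin/python3",
            "args": [script_path, *args],
        },
        # Illustrative figures: CPU in MHz, memory in MB.
        "Resources": {"CPU": 200, "MemoryMB": 128},
    }


task = python_script_task("cleanup", "local/cleanup.py", "--dry-run")
print(task["Config"]["args"])
```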
Nomad vs Kubernetes
Kubernetes is more feature-rich: built-in service discovery and load balancing, custom resource definitions, and a massive ecosystem that includes service meshes as add-ons. Nomad is simpler: a single binary, easier to operate, faster to learn. Teams choose Nomad when they want straightforward workload scheduling without the operational overhead of Kubernetes. Companies like Cloudflare, Roblox, and Trivago run Nomad in production at significant scale.
Service discovery and networking
Nomad integrates with Consul (another HashiCorp tool) for service discovery. Nomad registers each service block with Consul when the allocation starts, so other services find your Python service by name rather than by hardcoded addresses. Nomad also has its own built-in service discovery (available since Nomad 1.3) for simpler setups.
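A client can look up healthy instances through Consul's HTTP health endpoint. The sketch below assumes a Consul agent reachable at `http://127.0.0.1:8500` (Consul's default HTTP address). The `parse_instances` and `discover` helpers and the sample addresses are hypothetical.

```python
import json
from urllib.request import urlopen


def parse_instances(entries: list[dict]) -> list[tuple[str, int]]:
    """Extract (address, port) pairs from Consul health-endpoint entries."""
    instances = []
    for entry in entries:
        service = entry["Service"]
        # Service.Address can be empty; fall back to the node's address.
        address = service.get("Address") or entry["Node"]["Address"]
        instances.append((address, service["Port"]))
    return instances


def discover(name: str, consul_url: str = "http://127.0.0.1:8500") -> list[tuple[str, int]]:
    """Ask a Consul agent for the instances of a service that pass health checks."""
    url = f"{consul_url}/v1/health/service/{name}?passing=true"
    with urlopen(url, timeout=5) as response:
        return parse_instances(json.load(response))


# Sample entries trimmed to the fields used here (addresses are made up).
sample = [
    {"Node": {"Address": "10.0.0.5"}, "Service": {"Address": "", "Port": 23456}},
    {"Node": {"Address": "10.0.0.6"}, "Service": {"Address": "10.0.0.16", "Port": 8000}},
]
print(parse_instances(sample))
```

With an agent running, `discover("api")` would return the current address and port pairs for the service named in the job spec.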
The one thing to remember: Nomad gives Python teams a simpler path to cluster-level job scheduling — submit a job spec, and Nomad handles placement, health checking, restarts, and rolling updates across your infrastructure.