Nomad Job Scheduling with Python — Core Concepts

What Nomad solves

When you have multiple Python services that need to run across a cluster of servers, you face scheduling decisions: which server has enough CPU and memory, how to handle server failures, how to roll out updates without downtime, and how to scale services up during peak traffic.

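Nomad addresses these problems. It is a workload orchestrator distributed as a single binary, which runs either in server mode (cluster management and scheduling) or in client mode (running tasks on a node). Placement, failure handling, and rolling updates are built in; scaling means changing a group's count, whether by hand, through the API, or with the separate Nomad Autoscaler. HashiCorp designed it to be operationally simpler than Kubernetes while handling the same core scheduling problem.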

Core concepts

Jobs are the top-level unit. A job describes what you want to run. For a Python service, a job might say “run my FastAPI container with 3 instances.”

Task Groups contain one or more tasks that must run on the same machine. A common pattern: your Python API server and a logging sidecar in the same task group.

Tasks are individual workloads — a Docker container, a raw binary, or a Python script. Each task specifies resource requirements (CPU, memory) and configuration.

Allocations are the binding of task groups to specific machines. When you submit a job, Nomad creates allocations by matching resource requests against available capacity.

Evaluations happen when the cluster state changes (new job, node failure, scaling event). Nomad’s scheduler evaluates what needs to change and creates a plan.
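
To make the idea of an allocation concrete, here is a toy placement sketch in plain Python. It is not Nomad's scheduler, which scores candidate nodes and honors constraints; it is a greedy first-fit loop under simplified assumptions, and the Node class and place function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    cpu: int      # free CPU in MHz
    memory: int   # free memory in MB
    allocations: list = field(default_factory=list)


def place(group_name, count, cpu, memory, nodes):
    """Greedy first-fit placement. Returns (placed, unplaced_count)."""
    placed = []
    for index in range(count):
        for node in nodes:
            if node.cpu >= cpu and node.memory >= memory:
                # Reserve the resources on the first node with room.
                node.cpu -= cpu
                node.memory -= memory
                alloc = f"{group_name}[{index}]"
                node.allocations.append(alloc)
                placed.append((alloc, node.name))
                break
    return placed, count - len(placed)


nodes = [
    Node("node-a", cpu=1000, memory=512),
    Node("node-b", cpu=600, memory=1024),
]
placed, unplaced = place("api", count=3, cpu=500, memory=256, nodes=nodes)
for alloc, node_name in placed:
    print(f"{alloc} -> {node_name}")
```

Each (instance, node) pair in the result plays the role of an allocation. A non-zero unplaced count corresponds roughly to instances Nomad could not place until capacity frees up.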

Job types

Nomad supports three job types:

  • Service — long-running processes that should always be up (web servers, APIs, workers). Nomad restarts them on failure and reschedules on node loss.
  • Batch — jobs that run to completion (data processing, migrations, report generation). Nomad doesn’t restart them after they finish successfully.
  • System — one instance per eligible node (monitoring agents, log collectors). Automatically placed on new nodes when they join.
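
When auditing a cluster it can help to see which jobs fall under each type. The sketch below groups job entries by their Type field. The sample list is made-up data shaped like the job listing returned by Nomad's HTTP API; with a live cluster the same function could be fed the result of python-nomad's jobs.get_jobs().

```python
from collections import defaultdict


def jobs_by_type(job_stubs):
    """Group job IDs by their Nomad job type."""
    grouped = defaultdict(list)
    for stub in job_stubs:
        grouped[stub["Type"]].append(stub["ID"])
    return dict(grouped)


# Made-up sample data; the job names are hypothetical.
sample = [
    {"ID": "api-service", "Type": "service"},
    {"ID": "nightly-report", "Type": "batch"},
    {"ID": "log-collector", "Type": "system"},
    {"ID": "worker", "Type": "service"},
]

for job_type, ids in jobs_by_type(sample).items():
    print(f"{job_type}: {', '.join(ids)}")
```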

Deploying Python services

A Nomad job specification for a Python service:

job "api-service" {
  datacenters = ["dc1"]
  type = "service"

  group "api" {
    count = 3

    network {
      port "http" { to = 8000 }
    }

    service {
      name = "api"
      port = "http"
      check {
        type     = "http"
        path     = "/health"
        interval = "10s"
        timeout  = "3s"
      }
    }

    task "fastapi" {
      driver = "docker"

      config {
        image = "ghcr.io/myorg/api:v2.1"
        ports = ["http"]
      }

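      env {
        WORKERS = "4"
      }

      # Nomad does not define a NOMAD_SECRET_* variable, so the password is
      # rendered from a Nomad Variable instead. The variable path and the
      # db_pass key are assumptions; create them to match your setup.
      template {
        destination = "${NOMAD_SECRETS_DIR}/db.env"
        env         = true
        data        = <<EOT
DATABASE_URL=postgresql://app:{{ with nomadVar "nomad/jobs/api-service" }}{{ .db_pass }}{{ end }}@db.internal:5432/app
EOT
      }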

      resources {
        cpu    = 500
        memory = 256
      }
    }
  }
}
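
The same job can also be expressed as the JSON payload that Nomad's HTTP API accepts, which is convenient when a Python script generates jobs. The following is a partial sketch: it mirrors only the group, port, task, and resource settings from the spec above and leaves out the service check and environment. The build_job_payload and submit helpers are hypothetical, and submit assumes its client argument is a python-nomad nomad.Nomad instance.

```python
def build_job_payload(job_id, image, count=3, cpu=500, memory_mb=256):
    """Build a Nomad API job payload for a single Docker task."""
    return {
        "Job": {
            "ID": job_id,
            "Name": job_id,
            "Type": "service",
            "Datacenters": ["dc1"],
            "TaskGroups": [
                {
                    "Name": "api",
                    "Count": count,
                    "Networks": [
                        {"DynamicPorts": [{"Label": "http", "To": 8000}]}
                    ],
                    "Tasks": [
                        {
                            "Name": "fastapi",
                            "Driver": "docker",
                            "Config": {"image": image, "ports": ["http"]},
                            "Resources": {"CPU": cpu, "MemoryMB": memory_mb},
                        }
                    ],
                }
            ],
        }
    }


def submit(client, payload):
    """Register the job; client is assumed to be a python-nomad Nomad object."""
    return client.job.register_job(payload["Job"]["ID"], payload)


payload = build_job_payload("api-service", "ghcr.io/myorg/api:v2.1")
print(payload["Job"]["TaskGroups"][0]["Count"])
```

Building the payload in a pure function keeps it easy to unit-test before anything is sent to a cluster.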

Python SDK integration

The python-nomad library wraps Nomad’s HTTP API:

import nomad

client = nomad.Nomad(host="nomad.internal", timeout=10)

# List running jobs
jobs = client.jobs.get_jobs()
for job in jobs:
    print(f"{job['ID']}: {job['Status']}")

# Get allocation details for a job
allocations = client.job.get_allocations("api-service")
for alloc in allocations:
    print(f"  {alloc['ID'][:8]} on {alloc['NodeID'][:8]}: {alloc['ClientStatus']}")

This API access lets Python scripts automate deployment workflows, build monitoring dashboards, and implement custom scaling logic.
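
As one example of such automation, the sketch below summarizes allocation states and flags a job that has fewer running instances than expected. It works on a plain list of allocation dictionaries, such as the one returned by get_allocations above; the sample data and the function names are hypothetical.

```python
from collections import Counter


def summarize(allocations):
    """Count allocations by their ClientStatus."""
    return Counter(alloc["ClientStatus"] for alloc in allocations)


def needs_attention(allocations, desired_running):
    """True when fewer allocations are running than desired."""
    return summarize(allocations)["running"] < desired_running


# Made-up sample shaped like the allocation entries used above.
sample_allocs = [
    {"ID": "a1b2c3d4", "ClientStatus": "running"},
    {"ID": "e5f6a7b8", "ClientStatus": "running"},
    {"ID": "c9d0e1f2", "ClientStatus": "failed"},
]

print(dict(summarize(sample_allocs)))
print(needs_attention(sample_allocs, desired_running=3))
```

A check like this could feed an alert or a dashboard; what to do about a shortfall is left to the caller.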

Common misconception

People often assume Nomad only runs Docker containers. In fact, Nomad supports multiple task drivers: Docker, exec (plain binaries with filesystem and resource isolation), raw_exec (plain binaries with no isolation, disabled by default), Java, QEMU, and more. You can run a Python script directly with the exec driver, with no Docker required. This flexibility makes Nomad useful for mixed workloads where some services are containerized and others aren’t.
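
For illustration, a task that runs a Python script without Docker might look like this in the API's JSON form, built here as a Python dictionary. This is a sketch under assumptions: the interpreter path, the script location, and the python_exec_task helper are hypothetical, and the script has to be made available inside the task's filesystem by some other means.

```python
def python_exec_task(name, script, args=(), cpu=200, memory_mb=128):
    """Build a task definition that runs a script with the exec driver."""
    return {
        "Name": name,
        "Driver": "exec",
        "Config": {
            # Assumed interpreter path; adjust to the client nodes.
            "command": "/usr/bin/python3",
            "args": [script, *args],
        },
        "Resources": {"CPU": cpu, "MemoryMB": memory_mb},
    }


task = python_exec_task("report", "local/report.py", args=["--daily"])
print(task["Config"])
```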

Nomad vs Kubernetes

Kubernetes is more feature-rich: custom resource definitions and a massive ecosystem of operators and add-ons, including service meshes such as Istio and Linkerd. Nomad is simpler: a single binary, easier to operate, faster to learn. Teams choose Nomad when they want straightforward workload scheduling, for containers and non-containerized programs alike, without the operational overhead of Kubernetes. Companies like Cloudflare, Roblox, and Trivago run Nomad in production at significant scale.

Service discovery and networking

Nomad integrates with Consul (another HashiCorp tool) for service discovery. The service block in a job spec tells Nomad to register each running instance with Consul, so other services find it by name rather than by hardcoded addresses. Since version 1.3, Nomad also ships its own built-in service discovery for simpler setups that do not run Consul.
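
A Python client can resolve a service name through Consul's HTTP health endpoint. The sketch below assumes a Consul agent reachable at its default local address on port 8500 and a service named "api"; the function names are hypothetical. The parsing step is kept separate from the network call so it can be tested with sample data.

```python
import json
import urllib.request


def healthy_endpoints(entries):
    """Turn a Consul health response into a list of (host, port) pairs."""
    endpoints = []
    for entry in entries:
        service = entry["Service"]
        # An empty service address means the node's address applies.
        host = service.get("Address") or entry["Node"]["Address"]
        endpoints.append((host, service["Port"]))
    return endpoints


def lookup(name, consul_url="http://127.0.0.1:8500"):
    """Query Consul for instances of a service that pass their checks."""
    url = f"{consul_url}/v1/health/service/{name}?passing=true"
    with urllib.request.urlopen(url, timeout=5) as response:
        return healthy_endpoints(json.load(response))


# Made-up response fragment with only the fields used above.
sample_response = [
    {"Node": {"Address": "10.0.0.5"}, "Service": {"Address": "", "Port": 23456}},
    {"Node": {"Address": "10.0.0.6"}, "Service": {"Address": "10.0.0.6", "Port": 28001}},
]
print(healthy_endpoints(sample_response))
```

Against a running agent, lookup("api") would return the current healthy instances, which a caller could pick from at random or round-robin.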

The one thing to remember: Nomad gives Python teams a simpler path to cluster-level job scheduling — submit a job spec, and Nomad handles placement, health checking, restarts, and rolling updates across your infrastructure.
