SaltStack Configuration with Python — Deep Dive
Salt’s communication internals
Salt uses ZeroMQ for its transport layer (with TCP and WebSocket alternatives available). The master runs two ZeroMQ channels:
- Publisher (port 4505) — broadcasts commands to all minions using PUB/SUB
- Return (port 4506) — receives results from minions using REQ/REP
When you run salt '*' test.ping, the master publishes a message on port 4505. Every minion receives it, checks whether it matches the target, and if so executes the function and sends the result back on port 4506. The master collects the responses and displays them.
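This publish-subscribe model is a large part of why Salt scales better than SSH-based tools. The master publishes each job once instead of opening a connection per host, so the cost of sending a command stays roughly flat as minions are added. Returns still arrive one per minion, so the master is not free of per-host work, but it avoids per-host connection setup.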
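The flow above can be sketched as a toy simulation. This is not Salt's code and uses no ZeroMQ sockets: the Minion class and publish function are hypothetical names, and fnmatch stands in for Salt's default glob matching on minion IDs.

```python
from fnmatch import fnmatch


class Minion:
    """Toy stand-in for a minion: sees every job, filters by target."""

    def __init__(self, minion_id):
        self.minion_id = minion_id

    def handle(self, job):
        # Every subscriber sees the published job, but only matching
        # minions run it and send a return.
        if not fnmatch(self.minion_id, job["tgt"]):
            return None
        if job["fun"] == "test.ping":
            return {"id": self.minion_id, "return": True}
        return {"id": self.minion_id, "return": f"unknown function {job['fun']}"}


def publish(minions, tgt, fun):
    """Publish one job to all subscribers and collect the returns."""
    job = {"tgt": tgt, "fun": fun}
    returns = {}
    for minion in minions:  # stands in for the single broadcast on 4505
        reply = minion.handle(job)  # stands in for a return on 4506
        if reply is not None:
            returns[reply["id"]] = reply["return"]
    return returns


minions = [Minion("web1"), Minion("web2"), Minion("db1")]
print(publish(minions, "*", "test.ping"))
print(publish(minions, "web*", "test.ping"))
```

The job is built once regardless of how many minions exist; only the number of returns grows with the fleet.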
Writing execution modules
Execution modules are the fundamental building block. Every salt command maps to a Python function:
# /srv/salt/_modules/deployment.py
"""Custom deployment module for Python applications."""
import logging
import os
import subprocess

log = logging.getLogger(__name__)

__virtualname__ = "deploy"


def __virtual__():
    """Only load on Linux systems with systemd."""
    if __grains__["kernel"] == "Linux" and os.path.exists("/usr/bin/systemctl"):
        return __virtualname__
    return False


def release(app_name, version, repo_url, venv_path="/opt/venvs"):
    """
    Deploy a new version of a Python application.

    CLI Example:
        salt 'web*' deploy.release myapp v2.3.1 https://github.com/org/myapp
    """
    app_dir = f"/opt/apps/{app_name}"
    venv = f"{venv_path}/{app_name}"
    steps = []  # Each step is recorded; a failed step does not stop later ones

    # Clone or update repository
    if os.path.exists(app_dir):
        result = _run(f"git -C {app_dir} fetch && git -C {app_dir} checkout {version}")
    else:
        result = _run(f"git clone {repo_url} {app_dir} && git -C {app_dir} checkout {version}")
    steps.append({"git": result})

    # Update virtual environment
    result = _run(f"{venv}/bin/pip install -r {app_dir}/requirements.txt --quiet")
    steps.append({"pip": result})

    # Run migrations
    result = _run(
        f"{venv}/bin/python {app_dir}/manage.py migrate --noinput",
        env={"DJANGO_SETTINGS_MODULE": f"{app_name}.settings.production"},
    )
    steps.append({"migrate": result})

    # Restart service
    __salt__["service.restart"](app_name)
    steps.append({"restart": "completed"})

    # Health check
    health = __salt__["http.query"](
        "http://localhost:8000/health",
        status=True,
    )
    steps.append({"health": health.get("status", "unknown")})

    return {
        "app": app_name,
        "version": version,
        "steps": steps,
        "success": health.get("status") == 200,
    }


def _run(cmd, env=None):
    """Run a shell command and return structured output."""
    full_env = os.environ.copy()
    if env:
        full_env.update(env)
    try:
        result = subprocess.run(
            cmd, shell=True, capture_output=True, text=True,
            timeout=300, env=full_env,
        )
        return {
            "retcode": result.returncode,
            # Keep only the tail of each stream so returns stay small
            "stdout": result.stdout[-500:] if result.stdout else "",
            "stderr": result.stderr[-500:] if result.stderr else "",
        }
    except subprocess.TimeoutExpired:
        return {"retcode": -1, "error": "timeout"}
Key patterns:
- __virtual__() conditionally loads the module based on the target system
- __grains__ accesses system information (injected by Salt’s loader)
- __salt__ calls other execution modules (Salt’s inter-module API)
- Structured return values: always return dicts, not strings, so downstream tooling can parse results
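One caveat about the module above: _run passes interpolated strings to a shell, so a version or path containing shell metacharacters would be interpreted by that shell. A minimal sketch of one way to harden it, assuming the values are quoted with the standard library's shlex.quote before interpolation (git_checkout_cmd is a hypothetical helper, not part of the module):

```python
import shlex


def git_checkout_cmd(app_dir, version):
    """Build the fetch-and-checkout command with each value shell-quoted."""
    q_dir = shlex.quote(app_dir)
    q_version = shlex.quote(version)
    return f"git -C {q_dir} fetch && git -C {q_dir} checkout {q_version}"


print(git_checkout_cmd("/opt/apps/myapp", "v2.3.1"))
print(git_checkout_cmd("/opt/apps/myapp", "v1; rm -rf /"))
```

Ordinary values pass through unchanged, while the second call wraps the hostile version string in single quotes so the shell treats it as one argument to git checkout.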
Custom state modules
State modules define new declarative state types:
# /srv/salt/_states/python_app.py
"""State module for managing Python application deployments."""


def deployed(name, version, repo_url, health_endpoint="/health"):
    """
    Ensure a Python application is deployed at the specified version.

    name:
        Application name
    version:
        Git tag or commit to deploy
    repo_url:
        Git repository URL
    health_endpoint:
        Health check path. Accepted here but not yet forwarded:
        deploy.release as written always checks /health.
    """
    ret = {"name": name, "changes": {}, "result": True, "comment": ""}

    # Check current version
    current = _get_current_version(name)
    if current == version:
        ret["comment"] = f"{name} already at version {version}"
        return ret

    # In test mode, report the pending change without applying it
    if __opts__["test"]:
        ret["result"] = None
        ret["comment"] = f"Would update {name} from {current} to {version}"
        ret["changes"] = {"old": current, "new": version}
        return ret

    # Perform deployment using our execution module
    result = __salt__["deploy.release"](name, version, repo_url)
    if result["success"]:
        ret["changes"] = {"old": current, "new": version}
        ret["comment"] = f"Deployed {name} {version} successfully"
    else:
        ret["result"] = False
        ret["comment"] = f"Deployment failed: {result}"
    return ret


def _get_current_version(app_name):
    app_dir = f"/opt/apps/{app_name}"
    # The redirect and || fallback need a shell, so request one explicitly
    result = __salt__["cmd.run_stdout"](
        f"git -C {app_dir} describe --tags --always 2>/dev/null || echo 'none'",
        python_shell=True,
    )
    return result.strip()
Used in state files:
# /srv/salt/apps/myapp.sls
myapp:
  python_app.deployed:
    - version: v2.3.1
    - repo_url: https://github.com/org/myapp
    - health_endpoint: /api/health
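The state function above always returns a dict with the keys name, changes, result, and comment, and sets result to None in test mode. A small sketch of a checker for that shape, useful in unit tests for custom states. The function check_state_return and its rules are illustrative assumptions based on the conventions shown here, not a Salt API:

```python
REQUIRED_KEYS = {"name", "changes", "result", "comment"}


def check_state_return(ret, test_mode=False):
    """Return a list of problems with a state return dict (empty if none)."""
    missing = REQUIRED_KEYS - set(ret)
    if missing:
        return [f"missing keys: {sorted(missing)}"]
    problems = []
    if not (isinstance(ret["result"], bool) or ret["result"] is None):
        problems.append("result must be True, False, or None")
    if not isinstance(ret["changes"], dict):
        problems.append("changes must be a dict")
    if ret["result"] is None and not test_mode:
        # By convention, None signals "would change" during a test run.
        problems.append("result None is only expected in test mode")
    return problems


ok = {"name": "myapp", "changes": {}, "result": True, "comment": ""}
pending = {
    "name": "myapp",
    "changes": {"old": "v2.3.0", "new": "v2.3.1"},
    "result": None,
    "comment": "Would update myapp from v2.3.0 to v2.3.1",
}
print(check_state_return(ok))
print(check_state_return(pending, test_mode=True))
print(check_state_return(pending))
```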
Event-driven automation with reactors
Salt’s event bus enables real-time responses to infrastructure changes:
# /etc/salt/master.d/reactor.conf
reactor:
  - "salt/minion/*/start":
    - /srv/reactor/minion_start.sls
  - "myapp/deployment/failed":
    - /srv/reactor/deployment_rollback.sls
  - "salt/beacon/*/diskusage/*":
    - /srv/reactor/disk_alert.sls

# /srv/reactor/deployment_rollback.sls
rollback_failed_deploy:
  runner.state.orchestrate:
    - args:
      - mods: orchestration.rollback
      - pillar:
          app_name: {{ data['app_name'] }}
          failed_version: {{ data['version'] }}
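The reactor configuration listens for myapp/deployment/failed, but something has to fire that event. A minimal sketch of one way to do it, assuming the sender is injected as a callable: inside an execution or state module that callable would typically be __salt__["event.send"], which takes a tag and a data dict. The helper name report_failed_deploy is hypothetical.

```python
FAILED_TAG = "myapp/deployment/failed"


def report_failed_deploy(send, app_name, version):
    """Fire the failure event through the supplied send callable."""
    payload = {"app_name": app_name, "version": version}
    send(FAILED_TAG, payload)
    return payload


# Stand-in for __salt__["event.send"] so the sketch runs outside Salt.
sent = []
report_failed_deploy(lambda tag, data: sent.append((tag, data)), "myapp", "v2.3.1")
print(sent)
```

Events sent from a minion this way usually reach the master with the payload nested one level down, so the reactor template may need data['data']['app_name'] rather than data['app_name']. Watching the bus with salt-run state.event pretty=True shows the actual structure before you rely on it.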
Custom beacons (Python event generators)
Beacons run on minions and fire events when conditions are met:
# /srv/salt/_beacons/app_latency.py
"""Beacon that monitors application response latency."""
import logging
import time

import requests

log = logging.getLogger(__name__)


def validate(config):
    if not isinstance(config, list) or not config:
        return False, "Configuration must be a non-empty list"
    return True, "Valid beacon configuration"


def beacon(config):
    """Check application latency and fire event if threshold exceeded."""
    ret = []
    for entry in config:
        url = entry.get("url", "http://localhost:8000/health")
        threshold_ms = entry.get("threshold_ms", 500)
        start = time.monotonic()
        try:
            resp = requests.get(url, timeout=5)
            latency_ms = (time.monotonic() - start) * 1000
            if latency_ms > threshold_ms:
                ret.append({
                    "url": url,
                    "latency_ms": round(latency_ms, 1),
                    "threshold_ms": threshold_ms,
                    "status_code": resp.status_code,
                    "tag": "high_latency",
                })
        except requests.RequestException as e:
            ret.append({
                "url": url,
                "error": str(e),
                "tag": "unreachable",
            })
    return ret
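The validate function above only checks that the configuration is a non-empty list, so a malformed entry would surface later as an exception inside beacon. A stricter variant might look like the sketch below. The per-entry rules (http or https URL, positive numeric threshold) are assumptions chosen to match how beacon reads each entry:

```python
def validate(config):
    """Check that every entry is a dict with a sane url and threshold_ms."""
    if not isinstance(config, list) or not config:
        return False, "Configuration must be a non-empty list"
    for index, entry in enumerate(config):
        if not isinstance(entry, dict):
            return False, f"Entry {index} must be a mapping"
        url = entry.get("url", "http://localhost:8000/health")
        if not isinstance(url, str) or not url.startswith(("http://", "https://")):
            return False, f"Entry {index}: url must be an http(s) URL"
        threshold = entry.get("threshold_ms", 500)
        # bool is a subclass of int, so reject it explicitly.
        is_number = isinstance(threshold, (int, float)) and not isinstance(threshold, bool)
        if not is_number or threshold <= 0:
            return False, f"Entry {index}: threshold_ms must be a positive number"
    return True, "Valid beacon configuration"


print(validate([{"url": "http://localhost:8000/health", "threshold_ms": 250}]))
print(validate([{"threshold_ms": "fast"}]))
```

Rejecting bad configuration up front means the beacon reports one clear message at load time instead of failing on every polling interval.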
Orchestration runners
Runners execute on the master and coordinate multi-minion workflows:
# /srv/salt/_runners/canary_deploy.py
"""Canary deployment runner."""
import time


def execute(app_name, version, repo_url, canary_target="web-canary*", full_target="web*"):
    """
    Deploy to canary hosts first, verify, then roll out to all.

    CLI Example:
        salt-run canary_deploy.execute myapp v2.3.1 https://github.com/org/myapp
    """
    # deploy.release requires repo_url, so it is passed through to every call
    release_kwargs = {"app_name": app_name, "version": version, "repo_url": repo_url}

    # Step 1: Deploy to canary
    canary_result = __salt__["salt.execute"](
        canary_target, "deploy.release",
        kwarg=release_kwargs,
    )
    canary_success = all(
        r.get("success", False) for r in canary_result.values()
    )
    if not canary_success:
        return {
            "stage": "canary",
            "success": False,
            "message": "Canary deployment failed, aborting",
            "details": canary_result,
        }

    # Step 2: Wait and monitor canary
    time.sleep(60)  # Wait for metrics to settle
    # deploy.health_check and deploy.rollback are assumed to exist in the
    # deploy module; they are not shown in the module listing above
    health_results = __salt__["salt.execute"](
        canary_target, "deploy.health_check",
        kwarg={"app_name": app_name},
    )
    canary_healthy = all(
        r.get("healthy", False) for r in health_results.values()
    )
    if not canary_healthy:
        # Rollback canary
        __salt__["salt.execute"](
            canary_target, "deploy.rollback",
            kwarg={"app_name": app_name},
        )
        return {
            "stage": "canary_verification",
            "success": False,
            "message": "Canary health check failed, rolled back",
        }

    # Step 3: Full rollout
    full_result = __salt__["salt.execute"](
        full_target, "deploy.release",
        kwarg=release_kwargs,
    )
    return {
        "stage": "complete",
        "success": all(r.get("success", False) for r in full_result.values()),
        "canary_hosts": list(canary_result.keys()),
        "total_hosts": list(full_result.keys()),
        "version": version,
    }
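The runner aggregates results with all(...), which has two edge cases worth guarding: all() over an empty result set is True, so a target that matches no minions would count as a successful canary, and a reply that is not a dict (for example an error string) would raise on .get. A sketch of a stricter aggregator follows. The name all_ok is hypothetical and the sample replies are made up for illustration:

```python
def all_ok(results, key="success"):
    """True only if at least one minion replied and every reply has key truthy."""
    if not results:
        # all() over an empty dict is True, which would hide a target
        # that matched no minions.
        return False
    return all(
        isinstance(reply, dict) and bool(reply.get(key, False))
        for reply in results.values()
    )


print(all_ok({}))
print(all_ok({"web-canary1": {"success": True}}))
print(all_ok({"web-canary1": {"success": True}, "web-canary2": "no response"}))
print(all_ok({"web-canary1": {"healthy": True}}, key="healthy"))
```

Swapping this in for the bare all(...) calls makes the runner fail closed: no replies, or any malformed reply, stops the rollout instead of letting it proceed.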
Salt vs Ansible: technical comparison
| Dimension | Salt | Ansible |
|---|---|---|
| Architecture | Master + agents (ZeroMQ) | Agentless (SSH) |
| Speed at scale | Parallel fan-out; commands reach large fleets in seconds | Per-host SSH sessions; run time grows with host count |
| Real-time events | Built-in event bus + reactors | Not native |
| Learning curve | Steeper (master setup, PKI) | Gentler (just SSH) |
| Agentless mode | salt-ssh (slower) | Default |
| State language | YAML + Jinja2 | YAML + Jinja2 |
| Python extensibility | Modules, states, beacons, runners, returners | Modules, plugins, filters |
| Community size | Smaller | Larger |
Performance tuning
- Worker threads: Increase worker_threads on the master for parallel job handling (default: 5)
- Batch mode: salt --batch 20% processes hosts in waves to avoid overwhelming the master
- Minion cache: Keep minion_data_cache: True (the default) so the master can reuse cached grains instead of collecting them repeatedly
- Returner offloading: Send results to Redis or PostgreSQL via returner modules instead of processing everything on the master
- Syndic architecture: For very large deployments (50K+ minions), use syndic masters as intermediaries
The one thing to remember: Salt’s Python-native architecture, from execution modules to event reactors to orchestration runners, makes it one of the most programmable configuration management tools available, especially for teams that think in Python.