Prometheus Metrics in Python — Core Concepts
Prometheus is a time-series monitoring system that scrapes metrics from your applications at regular intervals. Unlike push-based systems, where apps send data to a collector, Prometheus pulls data: your Python service exposes an HTTP endpoint, and Prometheus fetches it. Because the service only answers incoming scrapes, it needs no collector address or outbound network rules of its own; scrape targets are configured centrally in Prometheus.
Metric types
The prometheus_client library provides the four core Prometheus metric types, each suited to a different kind of measurement.
Counter
A number that only goes up. Use it for totals: requests served, errors encountered, bytes processed.
from prometheus_client import Counter
REQUEST_COUNT = Counter("http_requests_total", "Total HTTP requests", ["method", "endpoint", "status"])
# In your request handler
REQUEST_COUNT.labels(method="GET", endpoint="/api/orders", status="200").inc()
Counters reset to zero when the process restarts. Prometheus handles this gracefully: the PromQL functions rate() and increase() detect the drop in value and treat it as a reset rather than as a negative change.
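To make the reset handling concrete, here is a minimal pure-Python sketch of the idea. The function name increase_with_resets and the sample values are hypothetical, and real PromQL also extrapolates to the edges of the query window, which this sketch leaves out.

```python
def increase_with_resets(samples):
    """Total increase across scraped counter values, treating any drop as a restart from zero."""
    total = 0.0
    for prev, curr in zip(samples, samples[1:]):
        if curr >= prev:
            total += curr - prev
        else:
            # The counter went down, so assume the process restarted
            # and counted up from 0 to the current value.
            total += curr
    return total


# Hypothetical scrapes: the process restarts between the third and fourth sample.
scraped = [100, 130, 160, 20, 50]
print(increase_with_resets(scraped))
```

A naive last-minus-first calculation would report a negative change here; the reset-aware version counts the 20 requests served after the restart instead.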
Gauge
A number that goes up and down. Use it for current state: active connections, queue depth, temperature, memory usage.
from prometheus_client import Gauge
ACTIVE_CONNECTIONS = Gauge("active_connections", "Current active connections")
ACTIVE_CONNECTIONS.inc() # Connection opened
ACTIVE_CONNECTIONS.dec() # Connection closed
ACTIVE_CONNECTIONS.set(42) # Set to exact value
Histogram
Measures the distribution of values, typically request durations or response sizes. It sorts observations into cumulative buckets and exposes a total count, the sum of all observed values, and a count per bucket.
from prometheus_client import Histogram
REQUEST_DURATION = Histogram(
    "http_request_duration_seconds",
    "Request duration in seconds",
    ["endpoint"],
    buckets=[0.01, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0],
)
# Time a request; the elapsed seconds are observed when the block exits
with REQUEST_DURATION.labels(endpoint="/api/orders").time():
    process_request()  # placeholder for your handler logic
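The buckets determine the precision of percentile calculations: histogram_quantile() only knows how many observations fell into each bucket and interpolates linearly inside it. Place bucket boundaries at your SLA thresholds so that those values are measured exactly.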
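The effect of bucket choice on percentiles can be illustrated with a simplified pure-Python version of the interpolation that histogram_quantile() performs. This is a sketch under assumptions, not the Prometheus implementation: the function name estimate_quantile and the bucket counts are made up, and the lowest bucket is assumed to start at 0.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate a quantile from (upper_bound, cumulative_count) pairs.

    buckets must be sorted by upper bound and end with (math.inf, total).
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for upper, count in buckets:
        if count >= rank:
            if math.isinf(upper):
                # The quantile lies beyond the last finite bucket,
                # so the best available answer is that bucket's bound.
                return prev_bound
            # Linear interpolation inside the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (upper - prev_bound) * fraction
        prev_bound, prev_count = upper, count
    return math.nan


# Hypothetical cumulative counts for 100 requests.
buckets = [(0.1, 50), (0.25, 80), (0.5, 95), (1.0, 99), (math.inf, 100)]
print(estimate_quantile(0.9, buckets))
```

The 90th percentile lands somewhere in the 0.25 to 0.5 second bucket, and the estimate is only a straight-line guess within that range. A boundary placed exactly at an SLA threshold removes that guesswork for the value you care about.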
Summary
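Similar to a histogram, but in the Prometheus model a summary calculates quantiles on the client side. It is less commonly used because client-side quantiles cannot be aggregated across instances. Note that the Python client's Summary does not compute quantiles at all; it exposes only the count and the sum of observations.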
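In the Python client specifically, a Summary tracks the count and the sum of observed values. The following minimal sketch shows what it records; the metric name task_duration_seconds is hypothetical, and a separate CollectorRegistry is used only to keep the example isolated from the default registry.

```python
from prometheus_client import CollectorRegistry, Summary

# A dedicated registry keeps this example independent of other metrics.
registry = CollectorRegistry()
TASK_DURATION = Summary(
    "task_duration_seconds",  # hypothetical metric name
    "Time spent running a task",
    registry=registry,
)

TASK_DURATION.observe(0.25)
TASK_DURATION.observe(0.5)

# The summary exposes two samples: <name>_count and <name>_sum.
count = registry.get_sample_value("task_duration_seconds_count")
total = registry.get_sample_value("task_duration_seconds_sum")
print(count, total)
```

Dividing the rate of the sum by the rate of the count in PromQL gives an average duration, which aggregates across instances without trouble.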
Exposing metrics
Your Python app needs an HTTP endpoint that Prometheus can scrape:
from prometheus_client import start_http_server
start_http_server(8000) # Metrics available at http://localhost:8000/metrics
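For Flask or FastAPI apps, serve the metrics from the existing application instead of opening a second port. Flask can mount the WSGI metrics app with Werkzeug's dispatcher; for ASGI frameworks such as FastAPI, prometheus_client provides make_asgi_app() for the same purpose:
# Flask (app is your existing Flask instance)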
from prometheus_client import make_wsgi_app
from werkzeug.middleware.dispatcher import DispatcherMiddleware
app.wsgi_app = DispatcherMiddleware(app.wsgi_app, {"/metrics": make_wsgi_app()})
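To see what a scrape of the metrics endpoint returns, you can render the text exposition format directly, without starting a server. This is a small sketch; the metric name demo_requests_total and the separate registry are assumptions made for the example.

```python
from prometheus_client import CollectorRegistry, Counter, generate_latest

registry = CollectorRegistry()
demo_requests = Counter(
    "demo_requests_total",  # hypothetical metric name
    "Requests handled by the demo",
    ["method"],
    registry=registry,
)
demo_requests.labels(method="GET").inc()

# generate_latest() returns the same bytes a scrape would receive.
exposition = generate_latest(registry).decode("utf-8")
print(exposition)
```

The output is plain text: HELP and TYPE comment lines followed by one line per time series, with labels in curly braces and the current value at the end.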
Labels
Labels add dimensions to metrics. A single http_requests_total counter with labels for method, endpoint, and status replaces dozens of separate metrics.
Rules for labels:
- Keep cardinality manageable. Every distinct combination of label values becomes its own time series, so do not use user IDs or request IDs as labels: they can create millions of series and overwhelm Prometheus.
- Use labels for dimensions you will filter or group by in dashboards and alerts.
- Good labels: HTTP method, status code, service name, region.
- Bad labels: user email, session token, request body hash.
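A rough way to reason about cardinality is to multiply the number of distinct values per label. The sketch below does that arithmetic; the function name series_count and the value counts are illustrative assumptions, and the result is an upper bound, since not every combination necessarily occurs.

```python
from math import prod


def series_count(distinct_values_per_label):
    """Upper bound on time series for one metric: the product of distinct values per label."""
    return prod(distinct_values_per_label.values())


# Hypothetical counts of distinct values for each label.
bounded = {"method": 5, "endpoint": 20, "status": 10}
with_user_id = {**bounded, "user_id": 100_000}

print(series_count(bounded))
print(series_count(with_user_id))
```

One unbounded label multiplies everything else, which is why a single user ID label can turn a thousand series into a hundred million.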
Querying with PromQL
Prometheus includes a query language for analyzing metrics:
# Per-second request rate, averaged over the last 5 minutes
rate(http_requests_total[5m])
# 95th percentile latency per endpoint, aggregated across instances
histogram_quantile(0.95, sum by (le, endpoint) (rate(http_request_duration_seconds_bucket[5m])))
# Error rate as a percentage
sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) * 100
These queries power Grafana dashboards and alerting rules.
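The arithmetic behind the rate and error-percentage queries can be sketched in plain Python. The function names and numbers are hypothetical, and returning 0.0 when there is no traffic is a choice made for this sketch; the PromQL query itself yields no usable value in that case.

```python
def per_second_rate(increase, window_seconds):
    """Average per-second rate for a counter increase observed over a window."""
    return increase / window_seconds


def error_percentage(error_increase, total_increase, window_seconds=300):
    """Share of requests that failed, as a percentage of all requests in the window."""
    total_rate = per_second_rate(total_increase, window_seconds)
    if total_rate == 0:
        return 0.0  # sketch-only choice for the no-traffic case
    return per_second_rate(error_increase, window_seconds) / total_rate * 100


# Hypothetical 5-minute window: 6000 requests, 30 of them with a 5xx status.
print(per_second_rate(6000, 300))
print(error_percentage(30, 6000))
```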
Common misconception
“Prometheus adds significant overhead to my Python application.” In practice the cost is small. The prometheus_client library guards metric updates with a lock, so they are thread-safe, and incrementing a counter typically costs on the order of a microsecond or less. Serializing the scrape response happens only when Prometheus pulls (typically every 15-30 seconds), and for most services the response is a few kilobytes of text.
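Rather than taking an overhead figure on trust, you can measure it on your own hardware. The following is a minimal timeit sketch; the metric name bench_events_total is hypothetical, and the result will vary with the machine, the Python version, and whether labels are used.

```python
import timeit

from prometheus_client import CollectorRegistry, Counter

registry = CollectorRegistry()
bench_counter = Counter(
    "bench_events_total",  # hypothetical metric name
    "Events counted during the benchmark",
    registry=registry,
)

N = 100_000
# Time N unlabelled increments and report the average cost per call.
elapsed = timeit.timeit(bench_counter.inc, number=N)
per_call_ns = elapsed / N * 1e9
print(f"{per_call_ns:.0f} ns per inc()")
```

Labelled metrics add a lookup for each labels() call, so it may be worth timing that path separately if it sits in a hot loop.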
When to use Prometheus
Prometheus excels at operational monitoring: is the service healthy, how fast is it, what is the error rate. It is not designed for event logging (use structured logging), request tracing (use OpenTelemetry), or business analytics (use a data warehouse). The sweet spot is real-time operational visibility with alerting.
One thing to remember: Prometheus metrics in Python follow a simple pattern — define counters, gauges, and histograms in your code, expose them on an endpoint, and let Prometheus scrape, store, and alert on the data.