Python Metrics Collection — Core Concepts

Metrics are numerical measurements collected at regular intervals. Unlike logs (text descriptions of events) or traces (request timelines), metrics are cheap to store and fast to query — making them the first line of defense for detecting problems.

The four metric types

Counters

A counter only goes up (it returns to zero only when the process restarts). It tracks cumulative totals.

  • Requests served: 14,832
  • Errors encountered: 47
  • Bytes sent: 2,340,000

You typically care about the rate (requests per second) rather than the raw number.
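As a rough illustration of turning a raw counter into a rate, the sketch below compares two samples of the same counter. The function name and the sample values are hypothetical, and the reset handling is a simplification of what a metrics backend does for you.

```python
def counter_rate(prev_value, prev_time, curr_value, curr_time):
    """Per-second rate between two samples of a counter."""
    elapsed = curr_time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    if curr_value < prev_value:
        # The counter went backwards: assume the process restarted
        # and the counter began again from zero.
        increase = curr_value
    else:
        increase = curr_value - prev_value
    return increase / elapsed


# Two hypothetical samples taken 60 seconds apart
print(counter_rate(14_232, 0.0, 14_832, 60.0))  # 10.0 requests per second
```

In practice the backend performs this calculation at query time (for example with a rate function), so application code only needs to increment the counter.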

Gauges

A gauge can go up or down. It represents a current value.

  • Active connections: 142
  • Memory usage: 512 MB
  • Queue depth: 23

Histograms

A histogram tracks the distribution of values, like response times. It answers: “What percentage of requests took less than 200ms?”

Histograms use buckets. Each bucket counts observations at or below a threshold, so the counts are cumulative:

Bucket (ms)    Count
≤ 50           8,200
≤ 100          9,100
≤ 250          9,800
≤ 500          9,950
≤ 1000         10,000

From this, you can estimate percentiles (p50, p95, p99). The estimate is only as precise as the bucket boundaries allow.
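One way to see how percentiles fall out of bucket counts is to interpolate linearly inside the bucket that contains the target rank. The sketch below uses the bucket table above and assumes the lowest bucket starts at 0 ms. The function name is hypothetical, and real backends may differ in edge-case handling.

```python
# Cumulative buckets from the table: (upper bound in ms, cumulative count)
BUCKETS = [(50, 8_200), (100, 9_100), (250, 9_800), (500, 9_950), (1000, 10_000)]


def estimate_percentile(buckets, q):
    """Estimate the q-quantile (0 < q <= 1) by linear interpolation."""
    total = buckets[-1][1]
    if total == 0:
        raise ValueError("histogram has no observations")
    rank = q * total
    lower_bound, lower_count = 0, 0
    for upper_bound, count in buckets:
        if count >= rank:
            # Fraction of this bucket's observations that sit below the rank
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return buckets[-1][0]


print(f"p50 ~ {estimate_percentile(BUCKETS, 0.50):.1f} ms")  # p50 ~ 30.5 ms
print(f"p95 ~ {estimate_percentile(BUCKETS, 0.95):.1f} ms")  # p95 ~ 185.7 ms
print(f"p99 ~ {estimate_percentile(BUCKETS, 0.99):.1f} ms")  # p99 ~ 416.7 ms
```

The interpolation assumes observations are spread evenly within each bucket, which is why narrower buckets around the latencies you care about give better estimates.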

Summaries

Summaries compute percentiles on the client side. They’re less common because precomputed percentiles can’t be meaningfully combined (for example, averaged) across instances. Prefer histograms unless you have a specific reason.
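The aggregation problem is easier to see with numbers. The sketch below uses made-up latencies for two instances. It compares the average of per-instance p95 values with the p95 of all requests, then shows that cumulative bucket counts can simply be added. All names and data are hypothetical.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of raw values (pct is an integer 1-100)."""
    ordered = sorted(values)
    index = max(0, math.ceil(pct * len(ordered) / 100) - 1)
    return ordered[index]


def bucket_counts(values, bounds):
    """Cumulative count of values at or below each upper bound."""
    return [sum(1 for v in values if v <= b) for b in bounds]


# Hypothetical latencies (ms): a busy, fast instance and a quiet, slow one
instance_a = [10] * 95 + [20] * 5  # 100 requests
instance_b = [400] * 10            # 10 requests

avg_of_p95 = (percentile(instance_a, 95) + percentile(instance_b, 95)) / 2
true_p95 = percentile(instance_a + instance_b, 95)
print(avg_of_p95)  # 205.0, which matches no real request
print(true_p95)    # 400

# Bucket counts from each instance add up to the counts for the whole fleet
bounds = [50, 100, 250, 500]
merged = [a + b for a, b in zip(bucket_counts(instance_a, bounds),
                                bucket_counts(instance_b, bounds))]
print(merged == bucket_counts(instance_a + instance_b, bounds))  # True
```

Averaging ignores how much traffic each instance served, while summed buckets keep that information.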

Implementing metrics in Python

With prometheus_client

A common approach for Python services scraped by Prometheus:

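import time

from fastapi import FastAPI  # assumption: the middleware below uses FastAPI's decorator style
from prometheus_client import Counter, Histogram, Gauge, start_http_server

app = FastAPI()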

# Define metrics
REQUEST_COUNT = Counter(
    "http_requests_total",
    "Total HTTP requests",
    ["method", "endpoint", "status"]
)

REQUEST_LATENCY = Histogram(
    "http_request_duration_seconds",
    "Request latency in seconds",
    ["endpoint"],
    buckets=[0.01, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0]
)

ACTIVE_REQUESTS = Gauge(
    "http_active_requests",
    "Currently active requests"
)

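# Use in code (assumes `app` is a FastAPI/Starlette application)
@app.middleware("http")
async def metrics_middleware(request, call_next):
    ACTIVE_REQUESTS.inc()
    start = time.perf_counter()
    try:
        response = await call_next(request)
    finally:
        # Always decrement, even if the handler raises
        ACTIVE_REQUESTS.dec()
    duration = time.perf_counter() - start

    # Caution: a raw URL path that contains IDs creates one series per ID;
    # prefer a route template such as "/users/{id}" where one is available.
    endpoint = request.url.path
    REQUEST_COUNT.labels(
        method=request.method,
        endpoint=endpoint,
        status=response.status_code
    ).inc()
    REQUEST_LATENCY.labels(endpoint=endpoint).observe(duration)
    return response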

# Expose /metrics for Prometheus to scrape (use a port the app itself is not bound to)
start_http_server(8000)

With OpenTelemetry Metrics

from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider

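# Note: no metric reader is configured here, so nothing is exported
# until a reader/exporter is attached to the MeterProvider.
provider = MeterProvider()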
metrics.set_meter_provider(provider)
meter = metrics.get_meter("my-service")

request_counter = meter.create_counter(
    "http.requests",
    description="Total HTTP requests"
)

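request_duration = meter.create_histogram(
    "http.request.duration",
    unit="s",
    description="Request duration"
)

# Record measurements; attributes play the role of labels
request_counter.add(1, {"method": "GET", "status": "200"})
request_duration.record(0.042, {"endpoint": "/users/{id}"})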

OpenTelemetry metrics can export to Prometheus, OTLP, or other backends.

Labels (dimensions)

Labels add context to metrics. A counter http_requests_total with labels method and status lets you query:

  • Total requests: sum(http_requests_total)
  • GET requests: http_requests_total{method="GET"}
  • Error rate (5xx responses per second): sum(rate(http_requests_total{status=~"5.."}[5m]))

Cardinality warning: Every unique combination of label values creates a separate time series. Adding a user_id label with millions of users will overwhelm your metrics backend. Keep label cardinality low (method, status code, endpoint pattern — not individual IDs).
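To make the cardinality arithmetic concrete, the sketch below multiplies the number of distinct values per label to get a worst-case series count, then shows one simple way to collapse raw paths into endpoint patterns. The label values, the one-million user count, and both helper names are illustrative assumptions. The regular expression only handles purely numeric path segments.

```python
import math
import re


def series_count(label_values):
    """Worst-case number of time series: product of distinct values per label."""
    return math.prod(len(values) for values in label_values.values())


low = {
    "method": ["GET", "POST", "PUT", "DELETE"],
    "status": ["200", "400", "404", "500"],
    "endpoint": ["/users/{id}", "/orders/{id}", "/health"],
}
print(series_count(low))  # 48

# One million hypothetical user IDs multiply every existing series
high = dict(low, user_id=range(1_000_000))
print(series_count(high))  # 48000000


def endpoint_pattern(path):
    """Replace purely numeric path segments with a placeholder."""
    return re.sub(r"/\d+(?=/|$)", "/{id}", path)


print(endpoint_pattern("/users/12345/orders/987"))  # /users/{id}/orders/{id}
```

If the web framework exposes the matched route template, using that is usually more robust than rewriting paths with a regular expression.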

Pull vs. push

  • Pull (Prometheus model): Your app exposes a /metrics endpoint. Prometheus scrapes it on a configured interval (15 seconds is a common setting).
  • Push (StatsD / OTLP model): Your app actively sends metrics to a collector.

Pull is simpler for services with stable endpoints. Push works better for short-lived jobs (batch scripts, Lambda functions) that may finish before a scrape happens.
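For a short-lived job, the push side can be as small as one UDP datagram. The sketch below formats metrics in the plain StatsD line protocol (name:value|type) and sends them before the script exits. The metric names, the localhost address, and the helper functions are assumptions for illustration, and 8125 is the conventional StatsD port.

```python
import socket

# StatsD type codes for the metric kinds used here
TYPE_CODES = {"counter": "c", "gauge": "g", "timer": "ms"}


def statsd_line(name, value, metric_type):
    """Format one metric in the StatsD line protocol: name:value|type."""
    return f"{name}:{value}|{TYPE_CODES[metric_type]}"


def push(lines, host="127.0.0.1", port=8125):
    """Send newline-separated metrics as a single UDP datagram (fire and forget)."""
    payload = "\n".join(lines).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


if __name__ == "__main__":
    # Hypothetical results from a batch job, pushed once before exit
    push([
        statsd_line("batch.rows_processed", 1200, "counter"),
        statsd_line("batch.duration", 8450, "timer"),
    ])
```

UDP gives no delivery guarantee, which is usually an acceptable trade-off for metrics but worth knowing when a job's final push matters.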

Common misconception

“Metrics tell you what’s wrong.” Metrics tell you something is wrong and help you narrow down where. To understand why, you need logs and traces. The three together — metrics, logs, traces — form the “three pillars of observability.”

One thing to remember: Start with four metrics for any Python service: request count, error count, latency histogram, and active connections. Together they surface many production issues early, often before users notice.

