Python Metrics Collection — Core Concepts

Learn the four metric types — counters, gauges, histograms, and summaries — and how to expose them from Python services using Prometheus and OpenTelemetry.

Metrics are numerical measurements collected at regular intervals. Unlike logs (text descriptions of events) or traces (request timelines), metrics are cheap to store and fast to query — making them the first line of defense for detecting problems.

The four metric types

Counters

A counter only goes up. It tracks cumulative totals.

Requests served: 14,832
Errors encountered: 47
Bytes sent: 2,340,000

You typically care about the rate (requests per second) rather than the raw number.
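As a minimal sketch of how a rate falls out of a cumulative counter, the function below takes two samples and the time between them. The name counter_rate and the sample values are made up for illustration. The reset handling assumes that a counter which goes backwards was restarted from zero.

```python
def counter_rate(prev_value, curr_value, interval_seconds):
    """Per-second rate between two samples of a cumulative counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_value - prev_value
    if delta < 0:
        # The counter went backwards, so the process probably restarted and
        # the counter began again from zero; count only the new value.
        delta = curr_value
    return delta / interval_seconds


# Hypothetical samples taken 60 seconds apart.
print(counter_rate(14_232, 14_832, 60))  # 10.0 requests per second
print(counter_rate(14_832, 120, 60))     # 2.0 after a restart
```

In practice the metrics backend does this for you (for example, rate() in PromQL), which is why the application only needs to increment the counter.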

Gauges

A gauge goes up and down. It represents a current value.

Active connections: 142
Memory usage: 512 MB
Queue depth: 23

Histograms

A histogram tracks the distribution of values, like response times. It answers: “What percentage of requests took less than 200ms?”

Histograms use buckets. Each bucket counts the observations at or below its threshold, so the counts are cumulative:

Bucket (ms)	Count
≤ 50	8,200
≤ 100	9,100
≤ 250	9,800
≤ 500	9,950
≤ 1000	10,000

From these counts, you can estimate percentiles (p50, p95, p99). The estimate is only as precise as the bucket boundaries allow.
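To make the percentile step concrete, here is a rough sketch that estimates quantiles from the bucket table above by linear interpolation inside the matching bucket. The function name estimate_quantile is hypothetical, and it assumes observations are spread evenly within each bucket and that the lowest bucket starts at 0. Real backends use a similar idea, but this is not their exact implementation.

```python
def estimate_quantile(q, buckets):
    """Estimate the q-quantile (0 < q <= 1) from cumulative buckets.

    buckets is a list of (upper_bound, cumulative_count) pairs sorted by
    upper bound. Values are assumed to be spread evenly inside each bucket.
    """
    total = buckets[-1][1]
    rank = q * total  # how many observations lie at or below the quantile
    lower_bound, lower_count = 0.0, 0
    for upper_bound, count in buckets:
        if count >= rank:
            in_bucket = count - lower_count
            if in_bucket == 0:
                return upper_bound
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return buckets[-1][0]


# The cumulative bucket counts from the table above (bounds in ms).
BUCKETS = [(50, 8200), (100, 9100), (250, 9800), (500, 9950), (1000, 10000)]

print(round(estimate_quantile(0.50, BUCKETS), 1))  # 30.5
print(round(estimate_quantile(0.95, BUCKETS), 1))  # 185.7
print(round(estimate_quantile(0.99, BUCKETS), 1))  # 416.7
```

Note how coarse the p50 estimate is: more than 80% of requests land in the first bucket, so all the sketch can do is interpolate between 0 and 50 ms. Choosing bucket boundaries near the latencies you care about gives tighter estimates.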

Summaries

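Summaries compute percentiles on the client side. They’re less common because a percentile computed on one instance can’t be meaningfully combined with percentiles from other instances. Note that the Python prometheus_client Summary currently exposes only a count and a sum, not quantiles. Prefer histograms unless you have a specific reason.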
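The aggregation problem is easy to see with numbers. The sketch below uses invented latencies for two instances with very different traffic. It compares averaging the per-instance p99 values against the p99 of the combined data, then shows that histogram bucket counts can simply be added. All names and values here are illustrative assumptions.

```python
import math

# Hypothetical latencies (ms): a busy, fast instance and a quiet, slow one.
instance_a = [20] * 900 + [40] * 100   # 1,000 requests
instance_b = [800] * 10                # 10 requests

BOUNDS_MS = [50, 100, 1000]            # hypothetical bucket upper bounds


def p99(values):
    """99th percentile by the nearest-rank method."""
    ordered = sorted(values)
    index = max(0, math.ceil(0.99 * len(ordered)) - 1)
    return ordered[index]


def cumulative_counts(values):
    """Cumulative histogram: observations at or below each bound."""
    return [sum(1 for v in values if v <= bound) for bound in BOUNDS_MS]


# Averaging precomputed percentiles ignores how much traffic each instance saw.
averaged_p99 = (p99(instance_a) + p99(instance_b)) / 2
true_p99 = p99(instance_a + instance_b)
print(averaged_p99)  # 420.0
print(true_p99)      # 40

# Bucket counts, by contrast, add up exactly across instances.
merged = [a + b for a, b in zip(cumulative_counts(instance_a),
                                cumulative_counts(instance_b))]
print(merged)        # [1000, 1000, 1010]
```

The merged buckets are identical to what a single histogram over all 1,010 requests would contain, so any percentile estimated from them reflects the whole fleet.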

Implementing metrics in Python

With prometheus_client

The most common approach for Python services:

import time

from fastapi import FastAPI
from prometheus_client import Counter, Gauge, Histogram, start_http_server

# Assumption: the snippet uses `app` without defining it. The
# @app.middleware("http") decorator suggests a FastAPI/Starlette app.
app = FastAPI()

# Define metrics
REQUEST_COUNT = Counter(
    "http_requests_total",
    "Total HTTP requests",
    ["method", "endpoint", "status"]
)

REQUEST_LATENCY = Histogram(
    "http_request_duration_seconds",
    "Request latency in seconds",
    ["endpoint"],
    buckets=[0.01, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0]
)

ACTIVE_REQUESTS = Gauge(
    "http_active_requests",
    "Currently active requests"
)

# Use in code
@app.middleware("http")
async def metrics_middleware(request, call_next):
    ACTIVE_REQUESTS.inc()
    start = time.perf_counter()
    try:
        response = await call_next(request)
    finally:
        # Decrement even if the handler raises, so the gauge cannot drift upward
        ACTIVE_REQUESTS.dec()
    duration = time.perf_counter() - start

    # Caution: the raw URL path is used as a label value here. Paths that
    # embed IDs (/users/42) create one time series per ID.
    REQUEST_COUNT.labels(
        method=request.method,
        endpoint=request.url.path,
        status=response.status_code
    ).inc()
    REQUEST_LATENCY.labels(endpoint=request.url.path).observe(duration)
    return response

# Expose /metrics on port 8000 for Prometheus to scrape.
# Serve the application itself on a different port.
start_http_server(8000)

With OpenTelemetry Metrics

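from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider

# A MeterProvider with no metric readers records nothing for export;
# pass metric_readers=[...] to send data to a backend.
provider = MeterProvider()
metrics.set_meter_provider(provider)
meter = metrics.get_meter("my-service")

request_counter = meter.create_counter(
    "http.requests",
    description="Total HTTP requests"
)

request_duration = meter.create_histogram(
    "http.request.duration",
    unit="s",
    description="Request duration"
)

# Record measurements; attributes play the role of Prometheus labels
request_counter.add(1, {"method": "GET", "status": 200})
request_duration.record(0.123, {"endpoint": "/users"})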

OpenTelemetry metrics can export to Prometheus, OTLP, or other backends.

Labels (dimensions)

Labels add context to metrics. A counter http_requests_total with labels method and status lets you query:

Total requests: sum(http_requests_total)
GET requests: http_requests_total{method="GET"}
5xx errors per second (over 5 minutes): sum(rate(http_requests_total{status=~"5.."}[5m]))

Cardinality warning: Every unique combination of label values creates a separate time series. Adding a user_id label with millions of users will overwhelm your metrics backend. Keep label cardinality low (method, status code, endpoint pattern — not individual IDs).
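A back-of-the-envelope sketch of the cardinality math: the worst-case number of time series is the product of the distinct values of each label. The label counts below are invented, and endpoint_pattern is a hypothetical helper showing one way to turn raw paths into low-cardinality patterns, assuming IDs appear as purely numeric path segments.

```python
import math
import re


def series_count(label_cardinalities):
    """Upper bound on time series: every combination of label values."""
    return math.prod(label_cardinalities.values())


def endpoint_pattern(path):
    """Collapse numeric path segments so /users/42 and /users/7 share a label."""
    return re.sub(r"/\d+(?=/|$)", "/{id}", path)


# Hypothetical number of distinct values per label.
low = {"method": 5, "endpoint": 20, "status": 10}
high = {**low, "user_id": 1_000_000}

print(series_count(low))   # 1000
print(series_count(high))  # 1000000000

print(endpoint_pattern("/users/42/orders/7"))  # /users/{id}/orders/{id}
print(endpoint_pattern("/health"))             # /health
```

Not every combination occurs in practice, so the real count is usually lower, but a single unbounded label is enough to multiply it by orders of magnitude.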

Pull vs. push

Pull (Prometheus model): Your app exposes a /metrics endpoint. Prometheus scrapes it at a configured interval (15 seconds is a common setting).
Push (StatsD / OTLP model): Your app sends metrics to a collector actively.

Pull is simpler for services with stable endpoints. Push works better for short-lived jobs (batch scripts, Lambda functions) that may finish before a scrape happens.
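For a feel of what push looks like from a short-lived job, here is a minimal sketch of the StatsD plain-text format sent over UDP using only the standard library. The metric names are invented, and the host and port (127.0.0.1:8125, the conventional StatsD port) are assumptions about where a collector would be listening.

```python
import socket


def statsd_line(name, value, metric_type):
    """Format one StatsD datagram: <name>:<value>|<type>.

    Common types: "c" (counter), "g" (gauge), "ms" (timer).
    """
    return f"{name}:{value}|{metric_type}"


def push_metric(line, host="127.0.0.1", port=8125):
    """Send one metric line to a StatsD-compatible collector over UDP.

    host and port are assumptions; UDP is fire-and-forget, so this does
    not fail if nothing is listening.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


# Hypothetical metrics a batch job might report just before exiting.
print(statsd_line("batch.rows_processed", 1500, "c"))  # batch.rows_processed:1500|c
print(statsd_line("batch.duration", 320, "ms"))        # batch.duration:320|ms
```

A batch script would call push_metric(statsd_line("batch.rows_processed", 1500, "c")) as its last step, so the data leaves the process even if no scrape ever reaches it.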

Common misconception

“Metrics tell you what’s wrong.” Metrics tell you something is wrong and help you narrow down where. To understand why, you need logs and traces. The three together — metrics, logs, traces — form the “three pillars of observability.”

One thing to remember: Start with four metrics for any Python service: request count, error count, latency histogram, and active connections. These four numbers catch most production issues before users notice.
