Distributed Tracing with OpenTelemetry in Python — Core Concepts

Understand traces, spans, context propagation, and exporters to instrument Python services with OpenTelemetry for full request visibility.

When a single user action triggers work across multiple services, diagnosing performance problems or errors becomes difficult. Logs tell you what happened in one service; metrics tell you aggregate statistics. Distributed tracing fills the gap by connecting events across services into a single timeline.

OpenTelemetry (OTel) is the industry-standard framework for producing traces, metrics, and logs. For Python developers, it provides libraries that instrument your code and export telemetry to backends like Jaeger, Zipkin, or Grafana Tempo.

Traces and spans

A trace represents the full journey of a request through your system. It is identified by a unique trace ID.

A span represents one unit of work within that trace. Each span has:

A name (e.g., “process_payment”)
Start and end timestamps
A parent span, which links spans into a tree (the root span has none)
Attributes (key-value metadata like user.id=42)
Status (OK, ERROR)

When service A calls service B, service A records a span for the outgoing call, and service B creates its own span as a child of it. These parent-child links build the tree that visualization tools render as a waterfall diagram.
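
The tree-to-waterfall step can be sketched with nothing but the standard library. The snippet below is a toy model, not OpenTelemetry code: ToySpan, render_waterfall, the span IDs, and the timings are all made up for illustration. It only shows that parent IDs are enough to rebuild the tree a tracing UI draws.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToySpan:
    """Hypothetical stand-in for a finished span; not an OpenTelemetry class."""
    name: str
    span_id: str
    parent_id: Optional[str]  # None marks the root span
    start_ms: int
    end_ms: int


def render_waterfall(spans):
    """Return one indented line per span, walking the tree depth-first."""
    # Group spans by the ID of their parent
    children = {}
    for s in spans:
        children.setdefault(s.parent_id, []).append(s)

    lines = []

    def walk(parent_id, depth):
        # Visit siblings in start-time order, then descend into each one
        for s in sorted(children.get(parent_id, []), key=lambda x: x.start_ms):
            lines.append(f"{'  ' * depth}{s.name} [{s.start_ms}-{s.end_ms} ms]")
            walk(s.span_id, depth + 1)

    walk(None, 0)
    return lines


# Made-up spans for one request; every span shares the same (implicit) trace
spans = [
    ToySpan("GET /checkout", "a1", None, 0, 200),
    ToySpan("validate_order", "b2", "a1", 0, 10),
    ToySpan("process_payment", "c3", "a1", 10, 190),
    ToySpan("db.query", "d4", "c3", 20, 180),
]

for line in render_waterfall(spans):
    print(line)
# GET /checkout [0-200 ms]
#   validate_order [0-10 ms]
#   process_payment [10-190 ms]
#     db.query [20-180 ms]
```

Each level of indentation corresponds to one parent-child link. A real backend does the same grouping using the span IDs and parent span IDs it receives.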

Context propagation

The trace ID must travel between services. This happens through context propagation — typically via HTTP headers.

When service A makes an HTTP request to service B, OpenTelemetry injects the trace ID and the current span ID into the request headers (by default the W3C Trace Context traceparent header). Service B extracts these headers and creates its spans as children of the incoming span.

This works automatically with OpenTelemetry’s instrumentation libraries for common frameworks.
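
To make the header mechanics concrete, here is a standard-library sketch of writing and reading a traceparent value in the W3C Trace Context version 00 layout (version, trace ID, parent span ID, flags). The inject and extract helpers are hypothetical names for this example, not the OpenTelemetry propagator API, and the IDs are sample values. The instrumentation libraries do the equivalent work for you.

```python
import re

# Version 00 layout: 00-<32 hex trace id>-<16 hex parent span id>-<2 hex flags>
_TRACEPARENT = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def inject(headers, trace_id, span_id, sampled=True):
    """Caller side: write the current span's identity into outgoing headers."""
    flags = "01" if sampled else "00"
    headers["traceparent"] = f"00-{trace_id}-{span_id}-{flags}"
    return headers


def extract(headers):
    """Callee side: return (trace_id, parent_span_id, sampled), or None."""
    # Simplification: assumes a plain dict with a lowercase header name
    match = _TRACEPARENT.fullmatch(headers.get("traceparent", ""))
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None  # all-zero IDs are invalid
    # Bit 0 of the flags byte is the "sampled" flag
    return trace_id, parent_id, bool(int(flags, 16) & 0x01)


# Service A: inside a span, about to call service B (sample IDs)
trace_id = "4bf92f3577b34da6a3ce929d0e0e4736"
span_a = "00f067aa0ba902b7"
outgoing = inject({}, trace_id, span_a)

# Service B: reuse the trace ID and treat A's span as the parent
incoming_trace_id, parent_span_id, sampled = extract(outgoing)
print(outgoing["traceparent"])
print(incoming_trace_id == trace_id, parent_span_id == span_a, sampled)
```

Because service B reuses the trace ID and records service A's span ID as its parent, the backend can stitch spans from both services into one trace.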

Instrumenting Python services

OpenTelemetry provides auto-instrumentation for popular frameworks:

pip install opentelemetry-api opentelemetry-sdk \
    opentelemetry-instrumentation-flask \
    opentelemetry-instrumentation-requests \
    opentelemetry-exporter-otlp

Basic setup:

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

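# Create a provider that batches finished spans and sends them to the collector over OTLP/gRPC
provider = TracerProvider()
processor = BatchSpanProcessor(OTLPSpanExporter(endpoint="http://collector:4317"))
provider.add_span_processor(processor)
# Register the provider globally so trace.get_tracer() uses it
trace.set_tracer_provider(provider)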

With Flask auto-instrumentation:

from flask import Flask
from opentelemetry.instrumentation.flask import FlaskInstrumentor

app = Flask(__name__)
# Wrap the app so each incoming request starts a server span
FlaskInstrumentor().instrument_app(app)

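Every incoming request now automatically creates a span. Outgoing HTTP calls made with requests or httpx propagate the trace context once their instrumentors are enabled, for example by calling RequestsInstrumentor().instrument() from opentelemetry.instrumentation.requests at startup. httpx needs the separate opentelemetry-instrumentation-httpx package.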

Adding custom spans

Auto-instrumentation covers HTTP boundaries. For internal logic, add manual spans:

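from opentelemetry import trace
from opentelemetry.trace import Status, StatusCode

tracer = trace.get_tracer("order-service")

class ValidationError(Exception):
    """Placeholder exception for this example."""

def validate_order(order_id, items, address):
    with tracer.start_as_current_span("validate_order") as span:
        span.set_attribute("order.id", order_id)
        span.set_attribute("order.items", len(items))
        if not address:
            # Mark the span failed and attach the exception as a span event
            span.set_status(Status(StatusCode.ERROR, "Invalid order"))
            span.record_exception(ValidationError("Missing address"))
            return False
        return True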

Custom spans add granularity. Without them, you see “service A called service B in 200ms.” With them, you see “service A spent 10ms validating, 180ms querying the database, and 10ms formatting the response.”

The collector

The OpenTelemetry Collector is a separate process that receives telemetry, processes it (sampling, enrichment), and exports it to backends. Python services typically send data to the collector rather than directly to Jaeger or Grafana Tempo.

This architecture means you can switch backends without changing application code. It also centralizes sampling decisions and reduces the number of connections each service maintains.

Common misconception

“Tracing adds significant overhead to every request.” With proper sampling (e.g., recording 1% of requests in high-traffic services), the overhead is usually negligible, typically under 1ms of added latency per request. Head-based sampling decides at the start of a trace whether to record it, so unsampled requests skip span recording and export and carry almost zero cost.
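
A head-based decision can be sketched as a pure function of the trace ID, so every service that sees the same trace reaches the same answer without coordination. The function below is an illustration in the spirit of trace-ID-ratio sampling, not the SDK's implementation; should_sample and the 1% ratio are assumptions for this example.

```python
import random


def should_sample(trace_id: int, ratio: float) -> bool:
    """Deterministic head-based decision: same trace ID, same answer."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    # Threshold in the 64-bit space; IDs below it are kept
    bound = round(ratio * (1 << 64))
    # Compare only the low 64 bits of the 128-bit trace ID
    return (trace_id & ((1 << 64) - 1)) < bound


# Simulate 100,000 random 128-bit trace IDs and keep roughly 1% of them
rng = random.Random(42)
trace_ids = [rng.getrandbits(128) for _ in range(100_000)]
kept = sum(should_sample(t, 0.01) for t in trace_ids)
print(f"kept {kept} of {len(trace_ids)} traces")
```

In the OpenTelemetry Python SDK, the analogous knob is a sampler passed to TracerProvider, such as ParentBased(TraceIdRatioBased(0.01)) from opentelemetry.sdk.trace.sampling.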

When distributed tracing matters most

Tracing pays for itself when you have more than two or three services interacting per request. A monolith rarely needs it; application profiling tools usually work better there. But once requests fan out across services, tracing is often the most reliable way to understand end-to-end latency and failure cascades.

One thing to remember: OpenTelemetry connects the dots between services by propagating trace context. Install the auto-instrumentors, point the exporter at a collector, and you get request-level visibility across your entire Python architecture.
