Content Negotiation — Deep Dive

Accept header parsing

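The Accept header follows the syntax defined in RFC 7231 (since superseded by RFC 9110): a comma-separated list of media ranges, each with optional parameters and a quality value. A parser has to handle the quality values, wildcards, and malformed items that a plain comma split ignores: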

from dataclasses import dataclass

@dataclass
class MediaRange:
    type: str
    subtype: str
    quality: float
    params: dict

def parse_accept(header: str) -> list[MediaRange]:
    ranges = []
    for item in header.split(","):
        parts = item.strip().split(";")
        media = parts[0].strip().lower()
        if not media:
            continue  # Skip empty items such as a trailing comma
        # partition() tolerates a missing or repeated "/" where split() would raise
        type_, _, subtype = media.partition("/")
        subtype = subtype or "*"

        params = {}
        quality = 1.0
        for param in parts[1:]:
            key, _, value = param.strip().partition("=")
            if key.strip().lower() == "q":
                try:
                    quality = float(value.strip())
                except ValueError:
                    quality = 0.0  # Treat a malformed q-value as not acceptable
            else:
                params[key.strip()] = value.strip()

        ranges.append(MediaRange(type_, subtype, quality, params))

    # Sort by quality descending, then specificity (concrete before wildcard)
    return sorted(ranges, key=lambda r: (
        -r.quality,
        0 if r.type == "*" else -1,
        0 if r.subtype == "*" else -1,
    ))

# Example
ranges = parse_accept("text/html, application/json;q=0.9, */*;q=0.1")
# Returns: [text/html (q=1.0), application/json (q=0.9), */* (q=0.1)]

The specificity tie-breaking matters: application/json beats application/* which beats */* at equal quality values. Real implementations also handle Accept: (empty) by defaulting to */*.
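
To see the tie-breaking in isolation, the sketch below ranks a few media ranges that share a quality value. The `specificity` and `rank` helpers are illustrative names, not part of the parser above, and media type parameters are ignored to keep the example short.

```python
def specificity(media_range: str) -> int:
    """Score a media range: 2 for type/subtype, 1 for type/*, 0 for */*."""
    type_, _, subtype = media_range.partition("/")
    if type_ == "*":
        return 0
    if subtype == "*":
        return 1
    return 2


def rank(ranges: list[tuple[str, float]]) -> list[str]:
    """Order (media_range, quality) pairs by quality, then by specificity."""
    ordered = sorted(ranges, key=lambda r: (-r[1], -specificity(r[0])))
    return [name for name, _ in ordered]


candidates = [
    ("*/*", 0.8),
    ("application/*", 0.8),
    ("application/json", 0.8),
    ("text/html", 0.9),
]
print(rank(candidates))
# ['text/html', 'application/json', 'application/*', '*/*']
```

The higher quality value always wins first; specificity only decides the order among ranges with the same q.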

Multi-format renderer in FastAPI

A clean architecture uses a renderer registry that maps media types to serialization functions:

from fastapi import FastAPI, Request, HTTPException
from fastapi.responses import Response, JSONResponse
import msgpack
import csv
import io
import dicttoxml

app = FastAPI()

RENDERERS = {}

def renderer(media_type: str):
    def decorator(fn):
        RENDERERS[media_type] = fn
        return fn
    return decorator

@renderer("application/json")
def render_json(data) -> Response:
    return JSONResponse(content=data)

@renderer("application/msgpack")
def render_msgpack(data) -> Response:
    return Response(
        content=msgpack.packb(data, use_bin_type=True),
        media_type="application/msgpack"
    )

@renderer("application/xml")
def render_xml(data) -> Response:
    xml_bytes = dicttoxml.dicttoxml(data, custom_root="response")
    return Response(content=xml_bytes, media_type="application/xml")

@renderer("text/csv")
def render_csv(data) -> Response:
    if isinstance(data, dict) and "items" in data:
        items = data["items"]
    elif isinstance(data, list):
        items = data
    else:
        items = [data]

    if not items:
        return Response(content="", media_type="text/csv")

    output = io.StringIO()
    writer = csv.DictWriter(output, fieldnames=items[0].keys())
    writer.writeheader()
    writer.writerows(items)
    return Response(content=output.getvalue(), media_type="text/csv")

def negotiate(request: Request):
    accept = request.headers.get("accept", "application/json")
    ranges = parse_accept(accept)

    for media_range in ranges:
        if media_range.quality <= 0:
            continue  # q=0 marks the range as explicitly not acceptable
        media_key = f"{media_range.type}/{media_range.subtype}"
        if media_key in RENDERERS:
            return RENDERERS[media_key]
        # Handle wildcards
        if media_range.subtype == "*":
            for key, renderer_fn in RENDERERS.items():
                if key.startswith(f"{media_range.type}/"):
                    return renderer_fn
        if media_range.type == "*":
            return RENDERERS.get("application/json", next(iter(RENDERERS.values())))

    raise HTTPException(status_code=406, detail="Not Acceptable")

Usage in endpoints:

@app.get("/users")
async def list_users(request: Request):
    users = await fetch_all_users()
    data = {"items": [u.dict() for u in users], "total": len(users)}
    render = negotiate(request)
    return render(data)

This approach keeps endpoint logic format-agnostic. Adding a new format (Protocol Buffers, YAML) requires only a new renderer function.
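
As a rough illustration of that extension point, the sketch below adds a newline-delimited JSON renderer to a registry. It is a framework-free stand-in: `REGISTRY`, `register`, and `render_ndjson` are hypothetical names, each renderer returns a `(body, media_type)` pair instead of a FastAPI `Response`, and `application/x-ndjson` is assumed here as the media type string.

```python
import json
from typing import Any, Callable

# Hypothetical stand-in for the RENDERERS registry: each renderer returns
# (body bytes, media type) so the sketch runs without FastAPI installed.
Renderer = Callable[[Any], tuple[bytes, str]]
REGISTRY: dict[str, Renderer] = {}


def register(media_type: str) -> Callable[[Renderer], Renderer]:
    def decorator(fn: Renderer) -> Renderer:
        REGISTRY[media_type] = fn
        return fn
    return decorator


@register("application/json")
def render_json(data: Any) -> tuple[bytes, str]:
    return json.dumps(data).encode(), "application/json"


# The only code needed for the new format: one decorated function.
@register("application/x-ndjson")
def render_ndjson(data: Any) -> tuple[bytes, str]:
    items = data["items"] if isinstance(data, dict) and "items" in data else [data]
    body = "\n".join(json.dumps(item) for item in items)
    return body.encode(), "application/x-ndjson"


payload = {"items": [{"id": 1}, {"id": 2}], "total": 2}
body, media_type = REGISTRY["application/x-ndjson"](payload)
print(media_type)
print(body.decode())
```

Neither the registry nor the existing JSON renderer changes when the new format is added, which is the property the decorator pattern is meant to preserve.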

Vendor media types for API versioning

GitHub uses vendor media types to combine content negotiation with versioning:

Accept: application/vnd.github.v3+json

Implement this in Python by parsing the vendor type:

import re
from typing import Optional

def parse_vendor_type(accept: str) -> tuple[Optional[str], Optional[int], str]:
    """Parse vendor media type into (vendor, version, format).

    Returns (None, None, "json") when the header carries no vendor type.
    """
    pattern = r"application/vnd\.(\w+)\.v(\d+)\+(\w+)"
    match = re.search(pattern, accept)
    if match:
        return match.group(1), int(match.group(2)), match.group(3)
    return None, None, "json"

@app.get("/repos/{owner}/{repo}")
async def get_repo(owner: str, repo: str, request: Request):
    accept = request.headers.get("accept", "")
    vendor, version, fmt = parse_vendor_type(accept)

    repo_data = await fetch_repo(owner, repo)

    if version == 2:
        # Legacy format
        return RepoResponseV2.from_orm(repo_data)
    else:
        # Default to latest (v3)
        return RepoResponseV3.from_orm(repo_data)

This is more elegant than URL-based versioning (/v1/repos/...) because the URL stays stable while the representation changes. The tradeoff: it’s harder to test in a browser and requires clients to set custom Accept headers.
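
To make the parsing step concrete, here is a standalone sketch that applies the same regular expression to a few sample header values. The `parse_vendor` name and the `example` vendor are made up for illustration; only the GitHub media type comes from the text above.

```python
import re
from typing import Optional

VENDOR_PATTERN = re.compile(r"application/vnd\.(\w+)\.v(\d+)\+(\w+)")


def parse_vendor(accept: str) -> tuple[Optional[str], Optional[int], str]:
    """Return (vendor, version, format), or (None, None, "json") if absent."""
    match = VENDOR_PATTERN.search(accept)
    if match is None:
        return None, None, "json"
    return match.group(1), int(match.group(2)), match.group(3)


for header in (
    "application/vnd.github.v3+json",
    "application/vnd.example.v2+xml, application/json;q=0.5",
    "application/json",
):
    print(header, "->", parse_vendor(header))
```

Because the pattern uses `search`, it picks up the first vendor type anywhere in the header and ignores quality values, so a production version would likely run it on each parsed media range instead.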

Language negotiation

HTTP language negotiation follows the same pattern but uses Accept-Language:

SUPPORTED_LOCALES = ["en", "es", "fr", "de", "ja"]
MESSAGES = {
    "en": {"not_found": "Resource not found", "unauthorized": "Authentication required"},
    "es": {"not_found": "Recurso no encontrado", "unauthorized": "Autenticación requerida"},
    "fr": {"not_found": "Ressource non trouvée", "unauthorized": "Authentification requise"},
}

def get_locale(request: Request) -> str:
    accept_lang = request.headers.get("accept-language", "en")
    # Parse quality values: "fr;q=0.9, en;q=0.8" -> ["fr", "en"]
    preferred = []
    for item in accept_lang.split(","):
        parts = item.strip().split(";")
        lang = parts[0].strip()
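        q = 1.0
        for p in parts[1:]:
            if p.strip().startswith("q="):
                try:
                    q = float(p.strip()[2:])
                except ValueError:
                    q = 0.0  # Ignore languages with a malformed q-value
        if lang and q > 0:  # q=0 means the language is not acceptable
            preferred.append((lang, q))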

    preferred.sort(key=lambda x: -x[1])

    for lang, _ in preferred:
        short = lang.split("-")[0]
        if short in SUPPORTED_LOCALES:
            return short
    return "en"

def localized_error(request: Request, key: str) -> str:
    locale = get_locale(request)
    return MESSAGES.get(locale, MESSAGES["en"]).get(key, key)
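
The matching logic can be exercised without a request object. The sketch below is a minimal standalone variant that takes the header value directly; `pick_language` is a hypothetical name, and it assumes that malformed or zero q-values should exclude a language.

```python
def pick_language(accept_language: str, supported: list[str], default: str = "en") -> str:
    """Return the best supported language for an Accept-Language header value."""
    preferred = []
    for item in accept_language.split(","):
        tag, _, params = item.strip().partition(";")
        if not tag:
            continue
        quality = 1.0
        name, _, value = params.strip().partition("=")
        if name.strip().lower() == "q":
            try:
                quality = float(value)
            except ValueError:
                quality = 0.0  # Treat a malformed q-value as "not wanted"
        if quality > 0:
            preferred.append((tag.strip().lower(), quality))

    # sorted() is stable, so equal q-values keep the order the client sent
    for tag, _ in sorted(preferred, key=lambda pair: -pair[1]):
        primary = tag.split("-")[0]
        if primary in supported:
            return primary
    return default


supported = ["en", "es", "fr", "de", "ja"]
print(pick_language("fr-CA, fr;q=0.9, en;q=0.8", supported))      # fr
print(pick_language("pt-BR, es;q=0.7, *;q=0.1", supported))       # es
print(pick_language("zh;q=0.9, en;q=0", supported, default="de")) # de
```

Region subtags such as `fr-CA` fall back to their primary language, which is the same simplification the function above makes.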

Encoding negotiation and compression

Accept-Encoding is usually handled by middleware or a reverse proxy rather than by endpoint code, but you can add custom compression for specific endpoints:

from starlette.middleware.gzip import GZipMiddleware

# Standard approach: middleware handles it
app.add_middleware(GZipMiddleware, minimum_size=500)

# Custom approach for specific endpoints
import gzip
import json

import brotli

@app.get("/large-dataset")
async def large_dataset(request: Request):
    data = await generate_large_response()
    json_bytes = json.dumps(data).encode()

    accept_encoding = request.headers.get("accept-encoding", "")
    # Compare whole coding names; q-values are ignored in this simple version
    accepted = {item.split(";")[0].strip().lower() for item in accept_encoding.split(",")}

    if "br" in accepted:
        compressed = brotli.compress(json_bytes)
        return Response(
            content=compressed,
            media_type="application/json",
            headers={"Content-Encoding": "br"}
        )
    elif "gzip" in accepted:
        compressed = gzip.compress(json_bytes)
        return Response(
            content=compressed,
            media_type="application/json",
            headers={"Content-Encoding": "gzip"}
        )
    return Response(content=json_bytes, media_type="application/json")

Brotli typically achieves 15-25% better compression than gzip for JSON payloads, at the cost of higher CPU usage during compression.
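
Clients can also rank or refuse codings with q-values, for example `br;q=0` to opt out of Brotli. The sketch below shows one way to choose a coding with that in mind, using only the standard library. `parse_codings` and `choose_encoding` are hypothetical helpers, and the server preference order `("br", "gzip")` is an assumption.

```python
from typing import Optional


def parse_codings(accept_encoding: str) -> dict[str, float]:
    """Map each content coding in an Accept-Encoding value to its q-value."""
    codings = {}
    for item in accept_encoding.split(","):
        name, _, params = item.strip().partition(";")
        name = name.strip().lower()
        if not name:
            continue
        quality = 1.0
        key, _, value = params.strip().partition("=")
        if key.strip().lower() == "q":
            try:
                quality = float(value)
            except ValueError:
                quality = 0.0  # Malformed q-value: treat the coding as refused
        codings[name] = quality
    return codings


def choose_encoding(accept_encoding: str, supported=("br", "gzip")) -> Optional[str]:
    """Pick the supported coding with the highest q-value, or None for identity."""
    codings = parse_codings(accept_encoding)
    wildcard = codings.get("*", 0.0)
    best, best_q = None, 0.0
    for name in supported:  # Server preference breaks ties
        quality = codings.get(name, wildcard)
        if quality > best_q:
            best, best_q = name, quality
    return best


print(choose_encoding("gzip, deflate, br"))   # br
print(choose_encoding("br;q=0, gzip;q=0.8"))  # gzip
print(choose_encoding("deflate"))             # None
```

A return value of `None` means the response should be sent uncompressed.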

Content negotiation with Django REST Framework

DRF has built-in content negotiation that’s more mature than most FastAPI patterns:

# settings.py
REST_FRAMEWORK = {
    'DEFAULT_RENDERER_CLASSES': [
        'rest_framework.renderers.JSONRenderer',
        'rest_framework.renderers.BrowsableAPIRenderer',
        'rest_framework_xml.renderers.XMLRenderer',
        'rest_framework_msgpack.renderers.MessagePackRenderer',
    ],
    'DEFAULT_CONTENT_NEGOTIATION_CLASS':
        'rest_framework.negotiation.DefaultContentNegotiation',
}

DRF’s browsable API renderer is a powerful example of content negotiation: when you visit an API endpoint in a browser (which sends Accept: text/html), DRF renders an interactive HTML page. When an API client sends Accept: application/json, the same endpoint returns raw JSON. Zero code changes in the view.

Caching and Vary headers

Content negotiation creates caching complexity. The same URL returns different content based on headers, so caches must store multiple variants:

@app.middleware("http")
async def vary_middleware(request: Request, call_next):
    response = await call_next(request)
    # Tell caches: response varies by these headers
    response.headers["Vary"] = "Accept, Accept-Language, Accept-Encoding"
    return response

Without the Vary header, a CDN might cache the JSON response and serve it to a client requesting XML. The Vary header tells intermediate caches to store separate copies per Accept value.
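
The effect of Vary on a cache can be modeled in a few lines. The sketch below is a simplified illustration, not how any particular CDN works: `variant_key` is a hypothetical function that builds a lookup key from the URL plus the request headers named in Vary.

```python
def variant_key(url: str, request_headers: dict[str, str], vary: str) -> tuple:
    """Build the key a cache could use for one stored variant of a URL."""
    headers = {name.lower(): value for name, value in request_headers.items()}
    varying = [name.strip().lower() for name in vary.split(",") if name.strip()]
    # One slot per header named in Vary; a missing header is its own variant
    return (url, *[(name, headers.get(name)) for name in sorted(varying)])


vary = "Accept, Accept-Language"
json_client = {"Accept": "application/json", "Accept-Language": "en"}
xml_client = {"Accept": "application/xml", "Accept-Language": "en"}
other_json_client = {"accept": "application/json", "accept-language": "en", "User-Agent": "demo"}

# Different Accept values map to different cache entries
print(variant_key("/users", json_client, vary) == variant_key("/users", xml_client, vary))  # False
# Headers not listed in Vary do not split the cache
print(variant_key("/users", json_client, vary) == variant_key("/users", other_json_client, vary))  # True
# Without Vary, both clients collapse onto the same cache entry
print(variant_key("/users", json_client, "") == variant_key("/users", xml_client, ""))  # True
```

The last comparison is the failure mode described above: with no Vary header, the JSON and XML clients share one cached response.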

For CDNs that handle Vary poorly (some strip it entirely), an alternative is to normalize the Accept header into a cache key using a middleware layer:

# Normalize Accept to a cache-friendly key
def cache_key_for_request(request: Request) -> str:
    accept = request.headers.get("accept", "json")
    if "msgpack" in accept:
        fmt = "msgpack"
    elif "xml" in accept:
        fmt = "xml"
    else:
        fmt = "json"
    return f"{request.url.path}:{fmt}"

Performance benchmarks

Response format affects both serialization time and payload size. Benchmarks on a typical user profile object (10 fields, 2 nested objects):

Format              Serialization time    Payload size
JSON (stdlib)       12 μs                 485 bytes
JSON (orjson)       3 μs                  481 bytes
MessagePack         5 μs                  312 bytes
XML (dicttoxml)     45 μs                 892 bytes
Protocol Buffers    2 μs                  198 bytes

Protocol Buffers are fastest and smallest but require schema files and compiled code on both sides. For most Python APIs, orjson as the default with MessagePack as an opt-in gives the best balance of compatibility and performance.
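
Numbers like these depend heavily on hardware, library versions, and payload shape, so it is worth measuring on your own data. The sketch below shows one way to do that with `timeit` and the standard library `json` module only. The `profile` object and the `measure` helper are made up for illustration, and other serializers can be added to the list in the same way.

```python
import json
import timeit

# Hypothetical profile object: 10 top-level fields, 2 of them nested
profile = {
    "id": 42,
    "username": "ada",
    "email": "ada@example.com",
    "display_name": "Ada Lovelace",
    "active": True,
    "followers": 1280,
    "following": 311,
    "created_at": "2021-03-14T09:26:53Z",
    "address": {"city": "London", "country": "GB"},
    "preferences": {"theme": "dark", "language": "en"},
}


def measure(serialize, runs: int = 10_000) -> tuple[float, int]:
    """Return (microseconds per call, payload size in bytes) for a serializer."""
    seconds = timeit.timeit(lambda: serialize(profile), number=runs)
    return seconds / runs * 1e6, len(serialize(profile))


def json_default(obj) -> bytes:
    return json.dumps(obj).encode()


def json_compact(obj) -> bytes:
    # Dropping the spaces after "," and ":" shrinks the payload slightly
    return json.dumps(obj, separators=(",", ":")).encode()


for name, fn in [("json (default)", json_default), ("json (compact)", json_compact)]:
    micros, size = measure(fn)
    print(f"{name:<16} {micros:6.2f} μs  {size} bytes")
```

Timings will vary from run to run; the payload sizes are deterministic for a given object.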

One thing to remember: Content negotiation is the mechanism that lets a single URL serve multiple representations — but the real power comes from choosing the right formats for your actual client ecosystem, not supporting every format because you can.
