Content Negotiation — Deep Dive

Build Python APIs that serve JSON, XML, MessagePack, and CSV from the same endpoint using Accept headers, quality parsing, and custom renderers.

Accept header parsing

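The Accept header follows the syntax defined in RFC 7231 (since superseded by RFC 9110): a comma-separated list of media ranges, each with optional parameters and a quality value. A parser has to cope with whitespace, empty list elements, missing subtypes, and malformed quality values, all of which a bare split on commas gets wrong: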

from dataclasses import dataclass

@dataclass
class MediaRange:
    type: str
    subtype: str
    quality: float
    params: dict

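def parse_accept(header: str) -> list[MediaRange]:
    ranges = []
    # An empty header is treated as "accept anything"
    for item in (header.strip() or "*/*").split(","):
        item = item.strip()
        if not item:
            continue  # Skip empty list elements such as "a/b,,c/d"
        parts = item.split(";")
        media = parts[0].strip().lower()  # Media types are case-insensitive
        type_, _, subtype = media.partition("/")
        subtype = subtype or "*"

        params = {}
        quality = 1.0
        for param in parts[1:]:
            key, _, value = param.strip().partition("=")
            key, value = key.strip().lower(), value.strip()
            if key == "q":
                try:
                    # Clamp to the valid range 0.0-1.0
                    quality = min(max(float(value), 0.0), 1.0)
                except ValueError:
                    pass  # Malformed q value: keep the default of 1.0
            elif key:
                params[key] = value

        ranges.append(MediaRange(type_, subtype, quality, params))

    # Sort by quality descending, then specificity (concrete before wildcard)
    return sorted(ranges, key=lambda r: (
        -r.quality,
        0 if r.type == "*" else -1,
        0 if r.subtype == "*" else -1,
    ))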

# Example
ranges = parse_accept("text/html, application/json;q=0.9, */*;q=0.1")
# Returns: [text/html (q=1.0), application/json (q=0.9), */* (q=0.1)]

The specificity tie-breaking matters: application/json beats application/* which beats */* at equal quality values. Real implementations also handle Accept: (empty) by defaulting to */*.
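
To see the tie-breaking in isolation, here is a minimal sketch that ranks bare media ranges sharing the same quality value. The helper name `specificity` is an assumption made for illustration; it is not part of the parser above.

```python
def specificity(media_range: str) -> int:
    """Return 2 for type/subtype, 1 for type/*, and 0 for */*."""
    type_, _, subtype = media_range.partition("/")
    if type_ == "*":
        return 0
    if subtype == "*":
        return 1
    return 2

# All three ranges carry the implicit default quality of 1.0,
# so only specificity decides the order.
candidates = ["*/*", "application/*", "application/json"]
ranked = sorted(candidates, key=specificity, reverse=True)
print(ranked)
```

The most specific range ends up first, which is the order a server should try them in.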

Multi-format renderer in FastAPI

A clean architecture uses a renderer registry that maps media types to serialization functions:

from fastapi import FastAPI, Request, HTTPException
from fastapi.responses import Response, JSONResponse
import msgpack
import csv
import io
import dicttoxml

app = FastAPI()

RENDERERS = {}

def renderer(media_type: str):
    def decorator(fn):
        RENDERERS[media_type] = fn
        return fn
    return decorator

@renderer("application/json")
def render_json(data) -> Response:
    return JSONResponse(content=data)

@renderer("application/msgpack")
def render_msgpack(data) -> Response:
    return Response(
        content=msgpack.packb(data, use_bin_type=True),
        media_type="application/msgpack"
    )

@renderer("application/xml")
def render_xml(data) -> Response:
    xml_bytes = dicttoxml.dicttoxml(data, custom_root="response")
    return Response(content=xml_bytes, media_type="application/xml")

@renderer("text/csv")
def render_csv(data) -> Response:
    if isinstance(data, dict) and "items" in data:
        items = data["items"]
    elif isinstance(data, list):
        items = data
    else:
        items = [data]

    if not items:
        return Response(content="", media_type="text/csv")

    output = io.StringIO()
    writer = csv.DictWriter(output, fieldnames=items[0].keys())
    writer.writeheader()
    writer.writerows(items)
    return Response(content=output.getvalue(), media_type="text/csv")

def negotiate(request: Request):
    accept = request.headers.get("accept", "application/json")
    ranges = parse_accept(accept)

    for media_range in ranges:
        if media_range.quality <= 0:
            continue  # q=0 marks a media range as explicitly not acceptable
        media_key = f"{media_range.type}/{media_range.subtype}"
        if media_key in RENDERERS:
            return RENDERERS[media_key]
        # */* matches anything: prefer JSON, else the first registered renderer
        if media_range.type == "*":
            return RENDERERS.get("application/json", next(iter(RENDERERS.values())))
        # type/* matches the first registered renderer of that type
        if media_range.subtype == "*":
            for key, renderer_fn in RENDERERS.items():
                if key.startswith(f"{media_range.type}/"):
                    return renderer_fn

    # Nothing the client accepts can be produced
    raise HTTPException(status_code=406, detail="Not Acceptable")

Usage in endpoints:

@app.get("/users")
async def list_users(request: Request):
    users = await fetch_all_users()
    data = {"items": [u.dict() for u in users], "total": len(users)}
    render = negotiate(request)
    return render(data)

This approach keeps endpoint logic format-agnostic. Adding a new format (Protocol Buffers, YAML) requires only a new renderer function.
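
As a framework-free sketch of that extension point, the registry below mirrors the decorator pattern above but returns raw bytes instead of FastAPI `Response` objects, so it runs with only the standard library. The newline-delimited JSON format, the `application/x-ndjson` key, and the name `render_ndjson` are illustrative assumptions, not part of the renderer set described in this article.

```python
import json
from typing import Callable

# Maps a media type to a function that serializes data to bytes.
RENDERERS: dict[str, Callable[[object], bytes]] = {}

def renderer(media_type: str):
    def decorator(fn):
        RENDERERS[media_type] = fn
        return fn
    return decorator

@renderer("application/json")
def render_json(data) -> bytes:
    return json.dumps(data).encode("utf-8")

# Adding a format touches nothing but the registry.
@renderer("application/x-ndjson")
def render_ndjson(data) -> bytes:
    # One JSON document per line, taken from the "items" list when present.
    items = data["items"] if isinstance(data, dict) and "items" in data else [data]
    return "".join(json.dumps(item) + "\n" for item in items).encode("utf-8")

payload = {"items": [{"id": 1}, {"id": 2}], "total": 2}
body = RENDERERS["application/x-ndjson"](payload)
print(body)
```

The endpoint and the negotiation function stay untouched; only the registry grows.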

Vendor media types for API versioning

GitHub uses vendor media types to combine content negotiation with versioning:

Accept: application/vnd.github.v3+json

Implement this in Python by parsing the vendor type:

import re
from typing import Optional

def parse_vendor_type(accept: str) -> tuple[Optional[str], Optional[int], str]:
    """Parse vendor media type into (vendor, version, format).

    Returns (None, None, "json") when no vendor type is present.
    """
    pattern = r"application/vnd\.(\w+)\.v(\d+)\+(\w+)"
    match = re.search(pattern, accept)
    if match:
        return match.group(1), int(match.group(2)), match.group(3)
    return None, None, "json"

@app.get("/repos/{owner}/{repo}")
async def get_repo(owner: str, repo: str, request: Request):
    accept = request.headers.get("accept", "")
    vendor, version, fmt = parse_vendor_type(accept)

    repo_data = await fetch_repo(owner, repo)

    if version == 2:
        # Legacy format
        return RepoResponseV2.from_orm(repo_data)
    else:
        # Default to latest (v3)
        return RepoResponseV3.from_orm(repo_data)

This is more elegant than URL-based versioning (/v1/repos/...) because the URL stays stable while the representation changes. The tradeoff: it’s harder to test in a browser and requires clients to set custom Accept headers.
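
One way to keep the version branching out of the endpoint is a dispatch table keyed by version number. The sketch below is an assumption-laden illustration: the serializer functions, the two response shapes, and the sample repository dict are all hypothetical, and only the regular expression comes from the parser above.

```python
import re

VENDOR_PATTERN = re.compile(r"application/vnd\.(\w+)\.v(\d+)\+(\w+)")

def requested_version(accept: str, default: int = 3) -> int:
    """Extract the version from a vendor media type, or fall back to the default."""
    match = VENDOR_PATTERN.search(accept)
    return int(match.group(2)) if match else default

def serialize_v2(repo: dict) -> dict:
    # Hypothetical legacy shape: the owner is a plain string.
    return {"name": repo["name"], "owner": repo["owner"]["login"]}

def serialize_v3(repo: dict) -> dict:
    # Hypothetical current shape: the owner is a nested object.
    return {"name": repo["name"], "owner": {"login": repo["owner"]["login"]}}

SERIALIZERS = {2: serialize_v2, 3: serialize_v3}

def represent(repo: dict, accept: str) -> dict:
    version = requested_version(accept)
    # Unknown versions fall back to the latest serializer.
    serializer = SERIALIZERS.get(version, SERIALIZERS[max(SERIALIZERS)])
    return serializer(repo)

repo = {"name": "demo", "owner": {"login": "octocat"}}
legacy = represent(repo, "application/vnd.github.v2+json")
latest = represent(repo, "application/json")
```

The same resource yields two representations depending only on the Accept header, which is the property that keeps the URL stable.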

Language negotiation

HTTP language negotiation follows the same pattern but uses Accept-Language:

# Hand-rolled for clarity; Babel's negotiate_locale is a maintained alternative

SUPPORTED_LOCALES = ["en", "es", "fr", "de", "ja"]
MESSAGES = {
    "en": {"not_found": "Resource not found", "unauthorized": "Authentication required"},
    "es": {"not_found": "Recurso no encontrado", "unauthorized": "Autenticación requerida"},
    "fr": {"not_found": "Ressource non trouvée", "unauthorized": "Authentification requise"},
}

def get_locale(request: Request) -> str:
    accept_lang = request.headers.get("accept-language", "en")
    # Parse quality values: "fr;q=0.9, en;q=0.8" -> ["fr", "en"]
    preferred = []
    for item in accept_lang.split(","):
        parts = item.strip().split(";")
        lang = parts[0].strip()
        q = 1.0
        for p in parts[1:]:
            if p.strip().startswith("q="):
                q = float(p.strip()[2:])
        preferred.append((lang, q))

    preferred.sort(key=lambda x: -x[1])

    for lang, _ in preferred:
        short = lang.split("-")[0]
        if short in SUPPORTED_LOCALES:
            return short
    return "en"

def localized_error(request: Request, key: str) -> str:
    locale = get_locale(request)
    return MESSAGES.get(locale, MESSAGES["en"]).get(key, key)
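
The function above treats every listed language as acceptable, including entries marked `q=0`, and it does not recognize the `*` wildcard. A minimal sketch that covers both cases follows; the name `pick_language` and its `default` parameter are assumptions for illustration.

```python
def pick_language(header: str, supported: list[str], default: str = "en") -> str:
    candidates = []
    for item in header.split(","):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        quality = 1.0
        for param in params.split(";"):
            key, _, value = param.strip().partition("=")
            if key.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # Assumption: a malformed q rejects the entry
        if quality > 0:  # q=0 means "never send me this language"
            candidates.append((tag, quality))

    # Stable sort keeps header order among entries with equal quality.
    candidates.sort(key=lambda pair: -pair[1])
    for tag, _ in candidates:
        if tag == "*":
            return default  # Any language is fine, so use the server default
        primary = tag.split("-")[0]  # "fr-CA" falls back to "fr"
        if primary in supported:
            return primary
    return default

print(pick_language("de;q=0, es;q=0.5", ["en", "es", "de"]))
```

German is supported here but explicitly refused by the client, so the sketch settles on Spanish.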

Encoding negotiation and compression

Accept-Encoding is usually handled by middleware or a reverse proxy rather than in endpoint code, but you can add custom compression for specific formats:

from starlette.middleware.gzip import GZipMiddleware

# Standard approach: middleware handles it
app.add_middleware(GZipMiddleware, minimum_size=500)

# Custom approach for specific endpoints
import gzip
import json

import brotli

@app.get("/large-dataset")
async def large_dataset(request: Request):
    data = await generate_large_response()
    json_bytes = json.dumps(data).encode()

    # Note: substring checks ignore q-values, so "br;q=0" still selects Brotli
    accept_encoding = request.headers.get("accept-encoding", "")

    if "br" in accept_encoding:
        compressed = brotli.compress(json_bytes)
        return Response(
            content=compressed,
            media_type="application/json",
            headers={"Content-Encoding": "br", "Vary": "Accept-Encoding"}
        )
    elif "gzip" in accept_encoding:
        compressed = gzip.compress(json_bytes)
        return Response(
            content=compressed,
            media_type="application/json",
            headers={"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
        )
    # The uncompressed variant also depends on Accept-Encoding
    return Response(
        content=json_bytes,
        media_type="application/json",
        headers={"Vary": "Accept-Encoding"}
    )

Brotli typically achieves 15-25% better compression than gzip for JSON payloads, at the cost of higher CPU usage during compression.
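
Checking for a substring such as "br" in the Accept-Encoding header ignores quality values, so a client sending `br;q=0` would still receive Brotli. The sketch below selects a coding while honoring q-values and the `*` wildcard. The function name `choose_encoding` and the server preference order are assumptions, and the function only picks a coding; it does not compress anything.

```python
from typing import Optional

def choose_encoding(header: str, supported=("br", "gzip")) -> Optional[str]:
    """Return the best supported content coding, or None to send uncompressed."""
    qualities = {}
    for item in header.split(","):
        coding, _, params = item.strip().partition(";")
        coding = coding.strip().lower()
        if not coding:
            continue
        quality = 1.0
        for param in params.split(";"):
            key, _, value = param.strip().partition("=")
            if key.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # Assumption: a malformed q rejects the coding
        qualities[coding] = quality

    best, best_q = None, 0.0
    for coding in supported:  # Server preference order breaks ties
        q = qualities.get(coding, qualities.get("*", 0.0))
        if q > best_q:
            best, best_q = coding, q
    return best

print(choose_encoding("br;q=0, gzip"))
```

With this in place the endpoint can branch on the returned coding instead of on raw substring matches.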

Content negotiation with Django REST Framework

DRF has built-in content negotiation that’s more mature than most FastAPI patterns:

# settings.py
REST_FRAMEWORK = {
    'DEFAULT_RENDERER_CLASSES': [
        'rest_framework.renderers.JSONRenderer',
        'rest_framework.renderers.BrowsableAPIRenderer',
        'rest_framework_xml.renderers.XMLRenderer',
        'rest_framework_msgpack.renderers.MessagePackRenderer',
    ],
    'DEFAULT_CONTENT_NEGOTIATION_CLASS':
        'rest_framework.negotiation.DefaultContentNegotiation',
}

DRF’s browsable API renderer is a powerful example of content negotiation: when you visit an API endpoint in a browser (which sends Accept: text/html), DRF renders an interactive HTML page. When an API client sends Accept: application/json, the same endpoint returns raw JSON. Zero code changes in the view.

Caching and Vary headers

Content negotiation creates caching complexity. The same URL returns different content based on headers, so caches must store multiple variants:

@app.middleware("http")
async def vary_middleware(request: Request, call_next):
    response = await call_next(request)
    # Tell caches: response varies by these headers.
    # add_vary_header appends to an existing Vary value instead of replacing it.
    response.headers.add_vary_header("Accept, Accept-Language, Accept-Encoding")
    return response

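Without the Vary header, a CDN might cache the JSON response and serve it to a client requesting XML. The Vary header tells intermediate caches to key stored responses on the listed request headers, so each distinct combination of Accept, Accept-Language, and Accept-Encoding values gets its own copy. Every header added to Vary therefore lowers the cache hit rate.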
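
To make the mechanism concrete, here is a toy model of how a shared cache could derive a storage key from the Vary header. It is a simplified sketch: real caches apply their own normalization and eviction rules, and the name `variant_key` is hypothetical.

```python
def variant_key(url: str, vary: str, request_headers: dict[str, str]) -> tuple:
    """Build the key a shared cache could use for one stored variant."""
    names = sorted(name.strip().lower() for name in vary.split(",") if name.strip())
    lowered = {key.lower(): value for key, value in request_headers.items()}
    return (url, tuple((name, lowered.get(name, "")) for name in names))

cache = {}
vary = "Accept, Accept-Language"

json_req = {"Accept": "application/json", "Accept-Language": "en"}
xml_req = {"Accept": "application/xml", "Accept-Language": "en"}

# With Vary, the two requests map to different cache entries.
cache[variant_key("/users", vary, json_req)] = b'{"items": []}'
cache[variant_key("/users", vary, xml_req)] = b"<response/>"

# Without Vary, both requests collapse onto the same key.
collides = variant_key("/users", "", json_req) == variant_key("/users", "", xml_req)
print(len(cache), collides)
```

The collision in the last step is exactly the situation in which a JSON body gets served to an XML client.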

For CDNs that handle Vary poorly (some strip it entirely), an alternative is to normalize the Accept header into a cache key using a middleware layer:

# Normalize Accept to a cache-friendly key
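def cache_key_for_request(request: Request) -> str:
    accept = request.headers.get("accept", "json").lower()
    # Collapse the many possible Accept strings into one key per rendered format
    if "msgpack" in accept:
        fmt = "msgpack"
    elif "xml" in accept:
        fmt = "xml"
    elif "csv" in accept:
        fmt = "csv"
    else:
        fmt = "json"
    return f"{request.url.path}:{fmt}"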

Performance benchmarks

Response format affects both serialization time and payload size. The figures below are for a typical user profile object (10 fields, 2 nested objects):

Format	Serialization time	Payload size
JSON (stdlib)	12μs	485 bytes
JSON (orjson)	3μs	481 bytes
MessagePack	5μs	312 bytes
XML (dicttoxml)	45μs	892 bytes
Protocol Buffers	2μs	198 bytes

Protocol Buffers are fastest and smallest but require schema files and compiled code on both sides. For most Python APIs, orjson as the default with MessagePack as an opt-in gives the best balance of compatibility and performance.
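
Numbers like these depend on hardware, library versions, and payload shape, so it is worth measuring on your own data. The harness below is a minimal standard-library sketch that times only `json.dumps`; the `profile` object is a made-up stand-in with 10 fields and 2 nested objects, and `measure` is a hypothetical helper. Other serializers can be passed to `measure` in the same way once they are installed.

```python
import json
import timeit

# Hypothetical sample payload: 10 top-level fields, 2 of them nested objects.
profile = {
    "id": 42,
    "username": "jdoe",
    "email": "jdoe@example.com",
    "full_name": "Jane Doe",
    "active": True,
    "created_at": "2024-01-15T09:30:00Z",
    "followers": 128,
    "bio": "Backend developer and occasional writer.",
    "address": {"city": "Lisbon", "country": "PT"},
    "preferences": {"theme": "dark", "newsletter": False},
}

def measure(serialize, payload, runs: int = 2000):
    """Return (microseconds per call, payload size in bytes)."""
    encoded = serialize(payload)
    seconds = timeit.timeit(lambda: serialize(payload), number=runs)
    return seconds / runs * 1e6, len(encoded)

def to_json(payload) -> bytes:
    return json.dumps(payload).encode("utf-8")

micros, size = measure(to_json, profile)
print(f"JSON (stdlib): {micros:.1f} microseconds per call, {size} bytes")
```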

One thing to remember: Content negotiation is the mechanism that lets a single URL serve multiple representations — but the real power comes from choosing the right formats for your actual client ecosystem, not supporting every format because you can.
