Python Response Serialization — Deep Dive

Technical foundation

Response serialization sits on the critical path of every API request. It converts Python objects into wire-format bytes, and its implementation affects latency, payload size, memory usage, and API contract stability. Getting it right means balancing performance with correctness and flexibility.

JSON serialization performance

Python’s built-in json module is dependency-free and predictable, but comparatively slow. High-throughput APIs often swap in a faster third-party encoder:

import json
import orjson
import ujson
from pydantic import BaseModel

class Order(BaseModel):
    id: int
    product: str
    quantity: int
    total_cents: int
    created_at: str

orders = [Order(id=i, product=f"Widget {i}", quantity=i % 10, total_cents=i * 100, created_at="2026-03-28T12:00:00Z") for i in range(1000)]
data = [o.model_dump() for o in orders]

# Benchmark (1000-item list); illustrative timings that vary by
# machine and library version:
# json.dumps:   ~4.2 ms
# ujson.dumps:  ~1.1 ms
# orjson.dumps: ~0.3 ms (returns bytes, not str)
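
The timings above depend on hardware and library versions, so it is worth measuring locally. The following is a minimal sketch of such a measurement using timeit. It assumes plain dicts shaped like the Order records as stand-in data, treats orjson as optional, and uses a hypothetical helper named time_ms.

```python
import json
import timeit

try:
    import orjson  # optional third-party dependency
except ImportError:
    orjson = None

# Plain dicts shaped like the Order records above (stand-in data).
data = [
    {
        "id": i,
        "product": f"Widget {i}",
        "quantity": i % 10,
        "total_cents": i * 100,
        "created_at": "2026-03-28T12:00:00Z",
    }
    for i in range(1000)
]


def time_ms(func, repeat=5, number=20):
    """Return the best per-call time in milliseconds over several runs."""
    best = min(timeit.repeat(lambda: func(data), repeat=repeat, number=number))
    return best / number * 1000


results = {"json.dumps": time_ms(json.dumps)}
if orjson is not None:
    results["orjson.dumps"] = time_ms(orjson.dumps)

for name, ms in results.items():
    print(f"{name}: {ms:.2f} ms")
```

Taking the minimum over several repeats reduces noise from other processes; the absolute numbers matter less than the ratio between encoders on the payloads your API actually returns.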

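In the benchmark above, orjson is roughly 10–15x faster than stdlib json. It serializes dataclass, datetime, enum, and UUID instances natively, and numpy arrays when the OPT_SERIALIZE_NUMPY option is passed. Decimal is not supported natively and needs a default callable. FastAPI can use orjson via a custom response class: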

from fastapi import FastAPI
from fastapi.responses import ORJSONResponse

app = FastAPI(default_response_class=ORJSONResponse)

This replaces the default JSONResponse for every route that does not set its own response class. The orjson package must be installed for ORJSONResponse to work.

Pydantic v2 serialization modes

Pydantic v2 offers two serialization paths:

from datetime import datetime

from pydantic import BaseModel

class UserResponse(BaseModel):
    id: int
    name: str
    email: str
    created_at: datetime

user = UserResponse(id=1, name="Alice", email="a@b.com", created_at=datetime.now())

# Python dict (for further processing)
user.model_dump()
# {"id": 1, "name": "Alice", "email": "a@b.com", "created_at": datetime(...)}

# JSON string (for wire format)
user.model_dump_json()
# '{"id":1,"name":"Alice","email":"a@b.com","created_at":"2026-03-28T12:00:00"}'

model_dump_json() is faster than model_dump() followed by json.dumps() because pydantic-core serializes straight to a JSON string in its Rust layer, without building an intermediate Python dict.
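
The two-step path has a correctness wrinkle as well as a speed cost: in its default mode, model_dump() keeps values such as datetime as Python objects, which the stdlib encoder rejects. The sketch below uses a plain dict as a stand-in for that output (an assumption made to keep the example dependency-free) and shows the failure plus one workaround. In Pydantic v2, model_dump(mode="json") is another way to get JSON-compatible values.

```python
import json
from datetime import datetime

# Stand-in for the dict that model_dump() returns in its default mode.
record = {"id": 1, "name": "Alice", "created_at": datetime(2026, 3, 28, 12, 0, 0)}

try:
    json.dumps(record)
    stdlib_ok = True
except TypeError:
    # The stdlib encoder does not know how to serialize datetime.
    stdlib_ok = False

# Supplying a fallback converter makes the two-step path work.
# This lambda only handles objects that have isoformat().
encoded = json.dumps(record, default=lambda value: value.isoformat())

print(stdlib_ok)
print(encoded)
```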

Conditional field inclusion

Real APIs need dynamic field inclusion based on permissions, query parameters, or feature flags:

from pydantic import BaseModel

class UserResponse(BaseModel):
    id: int
    name: str
    email: str | None = None
    phone: str | None = None
    internal_notes: str | None = None

def serialize_user(user_db, include_contact: bool = False, is_admin: bool = False):
    exclude = set()
    if not include_contact:
        exclude.update({"email", "phone"})
    if not is_admin:
        exclude.add("internal_notes")
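    # from_attributes=True lets user_db be an ORM object as well as a dict
    user = UserResponse.model_validate(user_db, from_attributes=True)
    return user.model_dump(exclude=exclude, exclude_none=True)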

The exclude_none=True parameter removes fields that are None, keeping responses clean. The exclude set dynamically controls which fields appear.

Computed and virtual fields

Some response fields do not exist in the database but are calculated:

from pydantic import BaseModel, computed_field

class OrderResponse(BaseModel):
    id: int
    subtotal_cents: int
    tax_cents: int
    shipping_cents: int

    @computed_field
    @property
    def total_cents(self) -> int:
        return self.subtotal_cents + self.tax_cents + self.shipping_cents

    @computed_field
    @property
    def total_display(self) -> str:
        return f"${self.total_cents / 100:.2f}"

Computed fields appear in JSON output and OpenAPI schemas but are not stored.

Streaming large responses

For endpoints that return thousands of items, building the entire response in memory is wasteful. Use streaming:

from fastapi.responses import StreamingResponse
import orjson

async def stream_orders(user_id: int):
    yield b"["
    first = True
    async for order in db.stream_orders(user_id):
        if not first:
            yield b","
        yield orjson.dumps(OrderResponse.model_validate(order).model_dump())
        first = False
    yield b"]"

@app.get("/users/{user_id}/orders/export")
async def export_orders(user_id: int):
    return StreamingResponse(
        stream_orders(user_id),
        media_type="application/json",
    )

This serializes one order at a time, so memory use stays roughly constant regardless of result set size.
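
The comma bookkeeping in a hand-built JSON array is easy to get wrong, so it helps to test the generator in isolation. The sketch below reproduces the same pattern with only the standard library. The fake_order_rows source and the collect helper are hypothetical stand-ins for the database stream and the HTTP layer.

```python
import asyncio
import json


async def fake_order_rows(count):
    """Hypothetical stand-in for a database cursor that yields rows lazily."""
    for i in range(count):
        yield {"id": i, "subtotal_cents": 100 * i}


async def stream_json_array(rows):
    """Yield a JSON array as byte chunks, one element at a time."""
    yield b"["
    first = True
    async for row in rows:
        if not first:
            yield b","
        yield json.dumps(row).encode("utf-8")
        first = False
    yield b"]"


async def collect(count):
    """Gather every chunk, as an HTTP client would after the stream ends."""
    return [chunk async for chunk in stream_json_array(fake_order_rows(count))]


chunks = asyncio.run(collect(3))
body = b"".join(chunks)
print(json.loads(body))
```

One caveat with this approach: if the source raises partway through, the client has already received a 200 status and will see a truncated, invalid JSON document, so streaming endpoints usually need their own error-signalling convention.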

Content negotiation

Some APIs serve multiple formats. Content negotiation lets clients choose:

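import csv
import io

from fastapi import Request, Response
from fastapi.responses import ORJSONResponse

@app.get("/reports/sales")
async def sales_report(request: Request):
    data = await fetch_sales_data()  # assumed to return a list of dicts

    accept = request.headers.get("accept", "application/json")

    # Simple substring check; it ignores q-values in the Accept header
    if "text/csv" in accept:
        output = io.StringIO()
        # Guard against an empty result set before reading the header row
        fieldnames = list(data[0].keys()) if data else []
        writer = csv.DictWriter(output, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(data)
        return Response(content=output.getvalue(), media_type="text/csv")

    return ORJSONResponse(content=data)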
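
A substring check on the Accept header treats "application/json, text/csv;q=0.5" as a request for CSV, even though the client ranked JSON higher. The following is a minimal sketch of q-value-aware selection. The function names parse_accept and choose_media_type are hypothetical, and the sketch deliberately ignores partial wildcards such as text/* and falls back to a default instead of returning 406.

```python
def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending q-value."""
    ranges = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        ranges.append((media_type, q))
    # sorted() is stable, so equal q-values keep their header order
    return sorted(ranges, key=lambda item: item[1], reverse=True)


def choose_media_type(header, supported, default="application/json"):
    """Pick the client's most preferred supported type, else the default."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        if media_type in supported:
            return media_type
        if media_type == "*/*":
            return default
    return default


SUPPORTED = {"application/json", "text/csv"}
print(choose_media_type("application/json, text/csv;q=0.5", SUPPORTED))
```

In the handler, the result of choose_media_type could replace the substring test, with the CSV branch taken only when it returns "text/csv".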

Versioned response shapes

As APIs evolve, response shapes change. Handle this without breaking clients:

class UserResponseV1(BaseModel):
    id: int
    name: str
    email: str

class UserResponseV2(BaseModel):
    id: int
    display_name: str  # renamed from 'name'
    email: str
    avatar_url: str | None = None  # new field

def get_response_model(version: int):
    return {1: UserResponseV1, 2: UserResponseV2}.get(version, UserResponseV2)

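from fastapi import HTTPException, Request

@app.get("/users/{user_id}")
async def get_user(user_id: int, request: Request):
    try:
        version = int(request.headers.get("X-API-Version", "2"))
    except ValueError:
        # A non-numeric header would otherwise surface as a 500
        raise HTTPException(status_code=400, detail="Invalid X-API-Version")
    model = get_response_model(version)
    user = await db.get_user(user_id)
    return model.model_validate(user, from_attributes=True)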
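
A renamed field needs an explicit mapping somewhere, because a record that stores name will not populate display_name by itself. The sketch below shows one way to keep that mapping and the version parsing in small, testable functions using plain dicts. It assumes the stored record keeps the field name, and the helpers parse_api_version and shape_user are hypothetical. Unlike the lookup above, which falls back to the latest version, this sketch rejects unknown versions; either policy can be reasonable.

```python
SUPPORTED_VERSIONS = (1, 2)
LATEST_VERSION = 2


def parse_api_version(raw):
    """Parse an X-API-Version header value, defaulting to the latest version."""
    if raw is None:
        return LATEST_VERSION
    try:
        version = int(raw.strip())
    except ValueError:
        raise ValueError(f"X-API-Version must be an integer, got {raw!r}")
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"Unsupported API version: {version}")
    return version


def shape_user(user, version):
    """Map a stored user record (assumed to use 'name') to a versioned shape."""
    if version == 1:
        return {"id": user["id"], "name": user["name"], "email": user["email"]}
    return {
        "id": user["id"],
        "display_name": user["name"],  # renamed in v2
        "email": user["email"],
        "avatar_url": user.get("avatar_url"),  # new in v2, may be absent
    }


stored = {"id": 7, "name": "Alice", "email": "a@b.com"}
print(shape_user(stored, parse_api_version("1")))
print(shape_user(stored, parse_api_version(None)))
```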

Serialization with SQLAlchemy models

Converting ORM models to response models is a common pattern:

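from fastapi import Depends
from pydantic import BaseModel, ConfigDict
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship

class Base(DeclarativeBase):
    pass

class UserORM(Base):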
    __tablename__ = "users"
    id: Mapped[int] = mapped_column(primary_key=True)
    name: Mapped[str]
    email: Mapped[str]
    password_hash: Mapped[str]
    orders: Mapped[list["OrderORM"]] = relationship()

class UserResponse(BaseModel):
    model_config = ConfigDict(from_attributes=True)
    
    id: int
    name: str
    email: str
    # password_hash intentionally excluded
    # orders serialized separately to control N+1

@app.get("/users/{user_id}", response_model=UserResponse)
async def get_user(user_id: int, db: AsyncSession = Depends(get_db)):
    user = await db.get(UserORM, user_id)
    return user  # Pydantic reads attributes, excludes password_hash

The from_attributes=True config lets Pydantic read SQLAlchemy model attributes directly. Because response_model=UserResponse filters the output, only fields defined in UserResponse are serialized, and password_hash never reaches the client.

Caching serialized responses

For high-traffic endpoints with stable data, cache the serialized bytes:

import hashlib
from fastapi import Response

CACHE_TTL_SECONDS = 300

def cached_json_response(body: bytes) -> Response:
    # Headers must be set on the Response that is actually returned:
    # headers placed on an injected Response parameter are dropped
    # when the handler returns its own Response object.
    etag = hashlib.md5(body).hexdigest()
    headers = {
        "ETag": f'"{etag}"',
        "Cache-Control": f"public, max-age={CACHE_TTL_SECONDS}",
    }
    return Response(content=body, media_type="application/json", headers=headers)

@app.get("/products/{product_id}")
async def get_product(product_id: int):
    cache_key = f"product:{product_id}:json"
    cached = await redis.get(cache_key)  # assumed to return bytes or None

    if cached:
        return cached_json_response(cached)

    product = await db.get_product(product_id)
    serialized = ProductResponse.model_validate(product).model_dump_json().encode("utf-8")
    await redis.setex(cache_key, CACHE_TTL_SECONDS, serialized)
    return cached_json_response(serialized)

Caching serialized bytes avoids both the database query and the serialization cost on cache hits.
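
An ETag only saves bandwidth if the server also honors If-None-Match and answers 304 Not Modified when the client's copy is current. The following is a minimal, framework-free sketch of that check. The helpers make_etag and conditional_response are hypothetical, and SHA-256 is swapped in for MD5 here purely as an illustration, since any stable digest works for this purpose.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Build a quoted strong ETag from a digest of the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:32] + '"'


def conditional_response(body: bytes, if_none_match=None):
    """Return (status, headers, payload) for a cached JSON body."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "public, max-age=300"}
    if if_none_match is not None:
        candidates = set()
        for tag in if_none_match.split(","):
            tag = tag.strip()
            # If-None-Match uses weak comparison, so drop any W/ prefix
            if tag.startswith("W/"):
                tag = tag[2:]
            candidates.add(tag)
        if "*" in candidates or etag in candidates:
            return 304, headers, b""
    return 200, headers, body


cached_body = b'{"id":1,"name":"Widget"}'
status, headers, payload = conditional_response(cached_body)
revalidated = conditional_response(cached_body, if_none_match=headers["ETag"])
print(status, revalidated[0])
```

In a FastAPI handler, the incoming value would come from request.headers.get("if-none-match"), and the returned tuple would be turned into a Response with the matching status code.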

Tradeoffs

Fast serializers like orjson sacrifice some flexibility: orjson returns bytes instead of str, has no JSONEncoder subclass hook, and replaces stdlib parameters such as ensure_ascii and arbitrary indent widths with a fixed set of option flags. Streaming responses trade simplicity for memory efficiency. Versioned models add maintenance cost but prevent breaking changes. The right balance depends on traffic volume, response size, and how quickly your API evolves.

The one thing to remember: Serialize through explicit response models using orjson or model_dump_json() for performance, cache serialized bytes for hot endpoints, and keep response shapes versioned so clients never face surprise breaking changes.
