Python Request Validation Patterns — Deep Dive

Technical foundation

Request validation is your API’s first line of defense, and it runs on every request. Because all input passes through it before reaching business logic, validator efficiency, composability, and error quality are architectural concerns.

Pydantic v2 under the hood

Pydantic v2 rewrote its core in Rust (pydantic-core); the project reports validation roughly 5–50x faster than v1, depending on the model and input. Understanding its internals helps you write more efficient validators.

The validation pipeline for each model:

  1. Schema compilation — At class definition time, Pydantic compiles a CoreSchema from type hints and field metadata. This schema is a tree of validation steps.
  2. Rust-side validation — At runtime, pydantic-core walks the schema tree in compiled Rust code, performing type coercion and constraint checks.
  3. Python-side validators — Custom @field_validator and @model_validator functions run as Python callbacks from the Rust pipeline.

The performance implication: keep validation logic in type hints and Field() constraints whenever possible. Each Python callback adds overhead. A Field(ge=0, le=1000) runs entirely in Rust. A @field_validator that does the same check runs in Python.
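
To make the difference concrete, here is a minimal sketch of the same range rule written both ways. The model names and the accepts helper are made up for illustration; both models reject the same inputs, but only the second one calls back into Python on every validation.

```python
from typing import Type

from pydantic import BaseModel, Field, ValidationError, field_validator


class QuantityDeclarative(BaseModel):
    # Bounds are part of the compiled schema, so pydantic-core checks them in Rust.
    quantity: int = Field(ge=0, le=1000)


class QuantityCallback(BaseModel):
    quantity: int

    # Same rule, but expressed as a Python callback invoked from the Rust pipeline.
    @field_validator("quantity")
    @classmethod
    def check_range(cls, v: int) -> int:
        if not 0 <= v <= 1000:
            raise ValueError("quantity must be between 0 and 1000")
        return v


def accepts(model: Type[BaseModel], value: int) -> bool:
    """Illustrative helper: True if the model validates the given quantity."""
    try:
        model(quantity=value)
        return True
    except ValidationError:
        return False
```

Prefer the declarative form whenever the rule can be expressed as a constraint, and keep callbacks for logic that genuinely needs Python.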

Advanced model validators

Model-level validation with access to all fields

from typing import Any

from pydantic import BaseModel, model_validator

class TransferRequest(BaseModel):
    from_account: int
    to_account: int
    amount: int
    currency: str

    @model_validator(mode="after")
    def accounts_must_differ(self) -> "TransferRequest":
        if self.from_account == self.to_account:
            raise ValueError("Cannot transfer to the same account")
        return self

    @model_validator(mode="before")
    @classmethod
    def normalize_currency(cls, data: Any) -> Any:
        # Raw input is not guaranteed to be a dict, and its values are unvalidated.
        if isinstance(data, dict) and isinstance(data.get("currency"), str):
            # Copy instead of mutating the caller's dict.
            data = {**data, "currency": data["currency"].upper()}
        return data

The mode="before" validator runs before type coercion — useful for normalizing raw input. The mode="after" validator runs after all fields are validated — useful for cross-field checks.
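
The ordering can be observed directly. The following sketch uses a cut-down, hypothetical Transfer model (not the full TransferRequest) and feeds it string input: the before validator sees the raw lowercase currency, while the after validator compares account numbers that have already been coerced to int.

```python
from typing import Any

from pydantic import BaseModel, ValidationError, model_validator


class Transfer(BaseModel):
    from_account: int
    to_account: int
    currency: str

    @model_validator(mode="before")
    @classmethod
    def normalize_currency(cls, data: Any) -> Any:
        # Runs on raw input, before any field is coerced or checked.
        if isinstance(data, dict) and isinstance(data.get("currency"), str):
            data = {**data, "currency": data["currency"].upper()}
        return data

    @model_validator(mode="after")
    def accounts_must_differ(self) -> "Transfer":
        # Runs on the validated instance: "7" has already become 7 here.
        if self.from_account == self.to_account:
            raise ValueError("Cannot transfer to the same account")
        return self


ok = Transfer(from_account="1", to_account=2, currency="usd")

try:
    # The string "7" and the int 7 are equal after coercion, so this is rejected.
    Transfer(from_account="7", to_account=7, currency="eur")
    rejected = False
except ValidationError:
    rejected = True
```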

Reusable validation types

Create custom types that carry their own validation:

from typing import Annotated
from pydantic import AfterValidator, BaseModel, Field

def must_be_positive(v: int) -> int:
    if v <= 0:
        raise ValueError("Must be positive")
    return v

PositiveInt = Annotated[int, AfterValidator(must_be_positive)]

def sanitize_html(v: str) -> str:
    import bleach
    return bleach.clean(v, tags=[], strip=True)

SafeString = Annotated[str, AfterValidator(sanitize_html)]

class CommentRequest(BaseModel):
    post_id: PositiveInt
    body: SafeString = Field(min_length=1, max_length=5000)
    parent_id: PositiveInt | None = None

PositiveInt and SafeString are reusable across your entire codebase. The validation logic is defined once and applied everywhere.
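
Annotated types are not tied to models. As a small sketch, Pydantic's TypeAdapter can validate a bare value or a list of them against the same type, which is handy for query parameters, CLI arguments, or tests. The adapter variable names below are illustrative.

```python
from typing import Annotated

from pydantic import AfterValidator, TypeAdapter, ValidationError


def must_be_positive(v: int) -> int:
    if v <= 0:
        raise ValueError("Must be positive")
    return v


PositiveInt = Annotated[int, AfterValidator(must_be_positive)]

# Adapters let the same type validate values outside of any BaseModel.
positive_int = TypeAdapter(PositiveInt)
positive_ids = TypeAdapter(list[PositiveInt])

value = positive_int.validate_python("5")  # coerced to int first, then checked
ids = positive_ids.validate_python([1, 2, 3])

try:
    positive_ids.validate_python([1, -2])
    rejected = False
except ValidationError:
    rejected = True
```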

Async validation patterns

Some validations require database lookups. Pydantic validators are synchronous, so async checks belong in a separate layer:

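from fastapi import Depends, FastAPI, HTTPException
from pydantic import BaseModel, Field
from sqlalchemy.ext.asyncio import AsyncSession  # assumed: SQLAlchemy's async session

# get_db and Product are assumed to come from your application's database layer.
app = FastAPI()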

class CreateOrderRequest(BaseModel):
    product_id: int
    quantity: int = Field(ge=1, le=100)

async def validate_product_exists(
    request: CreateOrderRequest,
    db: AsyncSession = Depends(get_db),
) -> CreateOrderRequest:
    product = await db.get(Product, request.product_id)
    if not product:
        raise HTTPException(
            status_code=422,
            detail=[{"field": "product_id", "message": "Product does not exist"}]
        )
    if product.stock < request.quantity:
        raise HTTPException(
            status_code=422,
            detail=[{"field": "quantity", "message": f"Only {product.stock} in stock"}]
        )
    return request

@app.post("/orders")
async def create_order(
    request: CreateOrderRequest = Depends(validate_product_exists),
):
    ...

This pattern separates structural validation (Pydantic, synchronous, fast) from business validation (async, database-dependent) while keeping both at the API boundary.
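
The same layering can be exercised without a web framework, which makes the business rules easy to unit test. This is a sketch under stated assumptions: STOCK, fetch_stock, OrderIn, and business_errors are hypothetical names, and the in-memory dict stands in for a real database query.

```python
import asyncio
from typing import Dict, List, Optional

from pydantic import BaseModel, Field

# Hypothetical in-memory stand-in for a product table: product_id -> units in stock.
STOCK: Dict[int, int] = {1: 3}


class OrderIn(BaseModel):
    # Layer 1: structural validation, synchronous, handled by Pydantic.
    product_id: int
    quantity: int = Field(ge=1, le=100)


async def fetch_stock(product_id: int) -> Optional[int]:
    await asyncio.sleep(0)  # placeholder for an awaited database call
    return STOCK.get(product_id)


async def business_errors(order: OrderIn) -> List[dict]:
    # Layer 2: async business validation on an already well-formed request.
    stock = await fetch_stock(order.product_id)
    if stock is None:
        return [{"field": "product_id", "message": "Product does not exist"}]
    if stock < order.quantity:
        return [{"field": "quantity", "message": f"Only {stock} in stock"}]
    return []


no_errors = asyncio.run(business_errors(OrderIn(product_id=1, quantity=2)))
too_many = asyncio.run(business_errors(OrderIn(product_id=1, quantity=5)))
missing = asyncio.run(business_errors(OrderIn(product_id=99, quantity=1)))
```

Returning a list of errors instead of raising keeps the function independent of any HTTP exception type; the dependency at the API boundary can translate a non-empty list into a 422 response.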

File upload validation

File uploads need validation beyond size limits:

from fastapi import File, HTTPException, UploadFile

ALLOWED_TYPES = {"image/jpeg", "image/png", "image/webp"}
MAX_SIZE = 10 * 1024 * 1024  # 10 MB

async def validate_image(file: UploadFile = File(...)) -> UploadFile:
    if file.content_type not in ALLOWED_TYPES:
        raise HTTPException(422, detail=f"File type {file.content_type} not allowed")

    # Read at most one byte past the limit so an oversized upload
    # is not loaded into memory in full.
    content = await file.read(MAX_SIZE + 1)
    if len(content) > MAX_SIZE:
        raise HTTPException(422, detail=f"File exceeds {MAX_SIZE // 1024 // 1024} MB limit")

    # Verify magic bytes match the claimed content type.
    # Only JPEG is checked here; PNG and WebP need their own signatures.
    if file.content_type == "image/jpeg" and not content.startswith(b"\xff\xd8\xff"):
        raise HTTPException(422, detail="File content does not match declared JPEG type")

    await file.seek(0)  # Reset for downstream processing
    return file

Always verify file content matches the declared MIME type. Trusting content_type alone is a security gap.
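
One way to extend the check to all three allowed types is a small sniffing function over the first bytes of the file. The function names here are illustrative; the signatures are the standard leading bytes of JPEG, PNG, and WebP (a RIFF container whose form type is WEBP).

```python
from typing import Optional


def sniff_image_type(head: bytes) -> Optional[str]:
    """Guess the MIME type from the leading bytes, or None if unrecognized."""
    if head.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if head.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    # WebP: "RIFF", a 4-byte little-endian size, then "WEBP".
    if len(head) >= 12 and head[:4] == b"RIFF" and head[8:12] == b"WEBP":
        return "image/webp"
    return None


def matches_declared_type(head: bytes, declared: str) -> bool:
    """True only when the sniffed type equals the client-declared type."""
    return sniff_image_type(head) == declared
```

Twelve bytes are enough for all three signatures, so the caller only needs to pass the start of the file. A signature match shows the file starts like an image, not that it is a well-formed one; decode it with an image library if you need that guarantee.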

Request size and depth limits

Protect against oversized payloads and deeply nested structures:

from starlette.middleware.base import BaseHTTPMiddleware
from starlette.responses import JSONResponse

class RequestSizeLimitMiddleware(BaseHTTPMiddleware):
    def __init__(self, app, max_body_size: int = 1_048_576):
        super().__init__(app)
        self.max_body_size = max_body_size

    async def dispatch(self, request, call_next):
        # Only the declared Content-Length is checked. Bodies sent without one
        # (for example chunked uploads) are not limited by this middleware.
        content_length = request.headers.get("content-length")
        if content_length and int(content_length) > self.max_body_size:
            return JSONResponse(
                status_code=413,
                content={"detail": f"Request body exceeds {self.max_body_size} bytes"},
            )
        return await call_next(request)
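
A Content-Length check does not cover bodies sent without that header. A complementary approach is to count bytes while the body streams in and stop once the cap is crossed. The sketch below is framework-free: read_capped, BodyTooLarge, and fake_stream are hypothetical names, and in Starlette the chunks could come from request.stream().

```python
import asyncio
from typing import AsyncIterator, Iterable


class BodyTooLarge(Exception):
    """Raised when the streamed body exceeds the configured cap."""


async def read_capped(chunks: AsyncIterator[bytes], max_bytes: int) -> bytes:
    received = bytearray()
    async for chunk in chunks:
        received.extend(chunk)
        if len(received) > max_bytes:
            # Stop as soon as the cap is crossed instead of reading the rest.
            raise BodyTooLarge(f"body exceeds {max_bytes} bytes")
    return bytes(received)


async def fake_stream(parts: Iterable[bytes]) -> AsyncIterator[bytes]:
    # Hypothetical stand-in for a real request body stream.
    for part in parts:
        yield part
```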

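For nested JSON, enforce a depth limit yourself. Pydantic’s ConfigDict has no documented nesting-depth setting, and fields typed as dict, list, or Any accept arbitrarily deep input. Check the depth of the parsed payload before model validation, or use a JSON parser that rejects excessive depth.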
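
A depth check over already-parsed JSON needs only the standard library. This is a minimal sketch with hypothetical function names; it walks the structure iteratively so the check itself cannot hit Python's recursion limit, and it treats a RecursionError from json.loads as "too deep".

```python
import json
from typing import Any


def json_depth(value: Any) -> int:
    """Nesting depth of parsed JSON: scalars are 0, each dict or list adds 1."""
    deepest = 0
    stack = [(value, 0)]
    while stack:
        node, depth = stack.pop()
        if isinstance(node, dict):
            children = list(node.values())
        elif isinstance(node, list):
            children = node
        else:
            continue
        depth += 1
        deepest = max(deepest, depth)
        stack.extend((child, depth) for child in children)
    return deepest


def exceeds_depth(raw: str, max_depth: int) -> bool:
    """True if the JSON text nests deeper than max_depth."""
    try:
        parsed = json.loads(raw)
    except RecursionError:
        # The parser itself gave up on extremely deep input.
        return True
    return json_depth(parsed) > max_depth
```

Apply the body size limit first: the depth check runs after parsing, so it bounds what reaches your models, not the cost of parsing a huge payload.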

Coercion vs strict mode

Pydantic v2 coerces by default: the string "42" becomes the integer 42. This is convenient but can hide bugs. For APIs that demand exact types:

from pydantic import BaseModel, ConfigDict

class StrictRequest(BaseModel):
    model_config = ConfigDict(strict=True)
    
    user_id: int      # "42" will now fail validation
    amount: float     # "19.99" will fail
    active: bool      # "true" will fail

Strict mode is recommended for APIs where clients are machines (other services, SDKs) rather than browsers or forms.
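
The contrast is easy to see side by side. In this sketch the model names are illustrative: the same payload of strings passes the default (lax) model and fails the strict one. Strictness can also be requested for a single call through the strict argument of model_validate.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class LaxPayment(BaseModel):
    user_id: int
    active: bool


class StrictPayment(BaseModel):
    model_config = ConfigDict(strict=True)

    user_id: int
    active: bool


raw = {"user_id": "42", "active": "true"}

# Default mode coerces both strings.
lax = LaxPayment(**raw)

try:
    StrictPayment(**raw)
    strict_errors = 0
except ValidationError as exc:
    # One error per field that arrived with the wrong type.
    strict_errors = exc.error_count()

try:
    # Per-call strictness on a model that is lax by default.
    LaxPayment.model_validate(raw, strict=True)
    per_call_rejected = False
except ValidationError:
    per_call_rejected = True
```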

Validation error formatting

FastAPI’s default validation error format is detailed but verbose. Customize it for cleaner client consumption:

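from fastapi import Request
from fastapi.exceptions import RequestValidationError
from fastapi.responses import JSONResponse

# app is your FastAPI application instance.
@app.exception_handler(RequestValidationError)
async def custom_validation_handler(request: Request, exc: RequestValidationError):
    simplified = []
    for error in exc.errors():
        raw_loc = error["loc"]
        # Drop only a leading "body" marker so a nested field named "body" is kept.
        parts = raw_loc[1:] if raw_loc and raw_loc[0] == "body" else raw_loc
        loc = ".".join(str(part) for part in parts)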
        simplified.append({
            "field": loc,
            "message": error["msg"],
            "type": error["type"],
        })
    return JSONResponse(
        status_code=422,
        content={
            "type": "VALIDATION_ERROR",
            "detail": f"{len(simplified)} validation error(s)",
            "errors": simplified,
        },
    )
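
Pulling the reshaping into a plain function makes the format testable without starting an app. The sketch below assumes error dicts with loc, msg, and type keys, as in the handler above; simplify_errors is a hypothetical helper and the sample errors are hand-written for illustration, not captured from a real request.

```python
from typing import Any, Dict, List, Sequence


def simplify_errors(errors: Sequence[Dict[str, Any]]) -> Dict[str, Any]:
    simplified: List[Dict[str, str]] = []
    for error in errors:
        loc = tuple(error.get("loc", ()))
        # Strip only a leading "body" marker; other prefixes such as "query" stay.
        parts = loc[1:] if loc[:1] == ("body",) else loc
        simplified.append({
            "field": ".".join(str(part) for part in parts),
            "message": error["msg"],
            "type": error["type"],
        })
    return {
        "type": "VALIDATION_ERROR",
        "detail": f"{len(simplified)} validation error(s)",
        "errors": simplified,
    }


# Hand-written sample input in the assumed shape.
sample = [
    {"loc": ("body", "items", 0, "quantity"), "msg": "Input should be greater than or equal to 1", "type": "greater_than_equal"},
    {"loc": ("query", "page"), "msg": "Field required", "type": "missing"},
]
payload = simplify_errors(sample)
```

The first sample error maps to the field items.0.quantity and the second to query.page, so clients can tell body fields from query parameters.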

Security-focused validation

Validation is a security boundary. Key patterns:

  • SQL injection: Typed fields narrow what can reach a query (an int field cannot carry SQL text), but a str field can hold anything, so validation complements parameterized queries and does not replace them. Separately, check any regex patterns used in validators for ReDoS (catastrophic backtracking).
  • XSS: Sanitize any user-provided string that will be rendered in HTML. Use bleach or nh3 in a custom validator.
  • Mass assignment: Define separate models for create and update to prevent clients from setting internal fields like is_admin or created_at.
  • Numeric overflow: Constrain integers with le and ge to prevent values that overflow database columns.

from pydantic import BaseModel, EmailStr, Field  # EmailStr needs the email-validator package

class CreateUserRequest(BaseModel):
    name: str = Field(max_length=100)
    email: EmailStr
    # No 'role' or 'is_admin' — those are set server-side

class AdminUpdateUserRequest(BaseModel):
    role: str = Field(pattern=r"^(admin|editor|viewer)$")
    is_active: bool
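
What happens to an unexpected is_admin key depends on the model's extra setting. By default Pydantic ignores unknown keys, which already keeps them out of the validated object; setting extra="forbid" rejects the request outright, which can surface buggy or probing clients. The model names in this sketch are illustrative.

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError


class CreateUser(BaseModel):
    # Default behavior: unknown keys are silently dropped.
    name: str = Field(max_length=100)


class CreateUserStrictKeys(BaseModel):
    # Unknown keys become validation errors instead.
    model_config = ConfigDict(extra="forbid")

    name: str = Field(max_length=100)


payload = {"name": "Ada", "is_admin": True}

dumped = CreateUser(**payload).model_dump()

try:
    CreateUserStrictKeys(**payload)
    rejected = False
except ValidationError:
    rejected = True
```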

Performance benchmarking

For high-throughput APIs, measure validation cost:

import timeit
from pydantic import BaseModel

class Order(BaseModel):
    product_id: int
    quantity: int
    price_cents: int

data = {"product_id": 1, "quantity": 5, "price_cents": 9999}

# Rough expectation for a small model on Pydantic v2: on the order of
# 0.5-2 microseconds per validation. Measure on your own hardware.
result = timeit.timeit(lambda: Order(**data), number=100_000)
print(f"{result / 100_000 * 1_000_000:.2f} µs per validation")

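At 2 microseconds per validation and 10,000 requests per second, validation costs about 20 milliseconds of CPU time per second, roughly 2% of one core. That is small compared to typical database queries, but larger models and Python-side validators cost more, so measure your own.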
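
The arithmetic generalizes to any measured cost and request rate. This tiny helper, with a made-up name, returns the fraction of one CPU core spent on validation so you can plug in your own benchmark result.

```python
def validation_cpu_share(per_validation_us: float, requests_per_second: float) -> float:
    """Fraction of one CPU core spent validating (0.02 means 2%)."""
    seconds_per_validation = per_validation_us * 1e-6
    return seconds_per_validation * requests_per_second


# The figures from the text: 2 microseconds at 10,000 requests per second.
share = validation_cpu_share(2.0, 10_000)
```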

The one thing to remember: Structure your validation in layers — Pydantic for types and constraints (compiled Rust, fast), dependency injection for async business rules, and separate request models per operation to prevent mass assignment vulnerabilities.
