FastAPI Middleware Patterns — Core Concepts
What middleware does
Middleware sits between the incoming HTTP request and your route handler. Every request passes through all middleware layers in order, and every response passes back through them in reverse. This “onion” model means middleware can:
- Inspect or modify the request before the route sees it
- Inspect or modify the response before the client receives it
- Short-circuit the request entirely (rejecting it before it reaches a route)
- Measure timing across the full request lifecycle
The two middleware styles in FastAPI
Function-based middleware uses the @app.middleware("http") decorator. You get the request and a call_next function. You do pre-processing, call call_next(request) to pass it through, then do post-processing on the response.
ASGI middleware (class-based) follows the ASGI protocol that Starlette is built on. This is more powerful but more complex. You implement __init__, which receives the next ASGI app, and an async __call__(scope, receive, send), which handles each connection at the ASGI level.
For most use cases, function-based middleware is enough. ASGI middleware is needed when you want to manipulate raw ASGI messages (like streaming responses or WebSocket connections).
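To make the ASGI style concrete, here is a small sketch that needs no framework at all. AddHeaderMiddleware, hello_app, and run_once are hypothetical names; hello_app is a stand-in for the real application so the example can run without a server.

```python
import asyncio


class AddHeaderMiddleware:
    """Pure ASGI middleware that appends one header to every HTTP response."""

    def __init__(self, app, header_name: str = "x-example", header_value: str = "1"):
        self.app = app  # the next ASGI app in the chain
        # ASGI headers are (name, value) pairs of bytes with lowercase names.
        self.header = (header_name.lower().encode("latin-1"), header_value.encode("latin-1"))

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Pass WebSocket and lifespan traffic through untouched.
            await self.app(scope, receive, send)
            return

        async def send_with_header(message):
            # Only the response-start message carries status and headers.
            if message["type"] == "http.response.start":
                headers = list(message.get("headers", []))
                headers.append(self.header)
                message = {**message, "headers": headers}
            await send(message)

        await self.app(scope, receive, send_with_header)


async def hello_app(scope, receive, send):
    """Tiny stand-in ASGI app."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"hello"})


async def run_once():
    """Drive one fake HTTP request through the middleware and collect the messages sent."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await AddHeaderMiddleware(hello_app)({"type": "http"}, receive, send)
    return sent


messages = asyncio.run(run_once())
```

In a FastAPI app the same class would typically be registered with app.add_middleware(AddHeaderMiddleware, header_name="x-example", header_value="1"). Because the wrapper sees each message as it is sent, body chunks pass through one at a time rather than as a finished response object.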
Common middleware patterns
Request timing: Start a timer before call_next, measure elapsed time after, and add it as a response header. This is often the first middleware a FastAPI app adds. A timing header lets clients tell server-side processing time apart from network latency when they debug slow requests.
Request ID injection: Generate a unique ID for each request, attach it to the request state, and include it in the response headers and all log messages. When a user reports an error, the request ID traces the exact path through your system.
CORS (Cross-Origin Resource Sharing): FastAPI includes CORSMiddleware from Starlette. It handles the preflight OPTIONS requests that browsers send and adds the correct Access-Control-* headers. Misconfigured CORS is one of the most common reasons a frontend can’t talk to a backend.
Authentication gate: Check for an API key or token in the request headers. If it’s missing or invalid, return a 401 immediately without calling call_next. This protects all routes at once.
Rate limiting: Track request counts per client (by IP or API key) and reject requests that exceed the limit. Production rate limiters usually store counts in Redis rather than in-process memory.
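The counting logic behind a rate limiter can be sketched without any framework. SlidingWindowLimiter is a hypothetical in-memory class, which means it only works within a single process; it is meant to show the bookkeeping, not to replace a Redis-backed limiter.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within the last window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so tests can control time
        self._hits = defaultdict(deque)  # client key -> timestamps of accepted requests

    def allow(self, client_key: str) -> bool:
        now = self.clock()
        hits = self._hits[client_key]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit; the caller should reject the request
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=2, window_seconds=10.0)
first = limiter.allow("203.0.113.7")
second = limiter.allow("203.0.113.7")
third = limiter.allow("203.0.113.7")  # third request inside the window is refused
```

Inside an @app.middleware("http") function, one would call limiter.allow() with the client IP or API key and return a 429 response when it comes back False.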
Middleware execution order
Middleware wraps the application in the reverse of the order you add it. Each new middleware, whether registered with @app.middleware("http") or app.add_middleware(), wraps everything added before it, so the last middleware added is the outermost layer: it runs first on the way in and last on the way out.
If you add an auth middleware first and a timing middleware second, timing is the outer layer and its clock includes the auth check time. If auth rejects the request, timing still measures how long the rejection took.
This ordering matters. General-purpose middleware (timing, request IDs) should be outermost. Specific middleware (auth, rate limiting) should be inner, closer to the routes.
Middleware vs dependencies
FastAPI’s dependency injection system can do many of the same things as middleware. So when should you use which?
Use middleware when:
- The logic applies to every request, no exceptions
- You need to modify the response (add headers, transform body)
- You need to run code both before and after the route handler
- The logic is cross-cutting (logging, timing, CORS)
Use dependencies when:
- The logic only applies to certain routes
- You need access to route-specific parameters (path params, request body)
- You want per-route customization (different auth levels)
- You need to participate in FastAPI’s dependency injection chain
Common misconception
Many developers assume middleware is the right place for authentication. It can work, but FastAPI’s dependency injection is usually better for auth because it integrates with OpenAPI documentation, provides per-route flexibility, and gives type-safe access to the authenticated user object. Use middleware for auth only when you need a blanket check across the entire application with no exceptions.
Pitfalls
- Reading the request body in middleware can consume it. If you call await request.body() in middleware, the route handler may find the stream already drained, depending on your Starlette version. On affected versions you need to cache and re-attach the body, which is awkward.
- Streaming responses are awkward with function-based middleware. call_next hands back a response whose body is only available as an iterator, so inspecting or rewriting it means consuming the stream yourself. For chunk-level control over streamed output, use ASGI middleware.
- Exception handling in middleware can mask route errors. If your middleware catches exceptions too broadly, it swallows the original error, and with it the error report and the proper HTTP status code.
The one thing to remember: Middleware is for cross-cutting concerns that apply to every request — timing, logging, CORS, request IDs — while route-specific logic like authentication is often better handled by FastAPI’s dependency injection.