REST vs GraphQL vs gRPC in Python — Deep Dive
Technical foundation
Choosing an API style in Python is not a matter of fashion — it shapes serialization overhead, error propagation, tooling choices, and team workflow. This deep dive examines each style through working code, performance characteristics, and production patterns.
REST with FastAPI — the baseline
FastAPI is the dominant choice for REST in modern Python. It combines Pydantic models with OpenAPI generation:
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

class Order(BaseModel):
    id: int
    product: str
    quantity: int
    total_cents: int

@app.get("/users/{user_id}/orders", response_model=list[Order])
async def get_user_orders(user_id: int):
    # fetch_orders is the application's own data-access coroutine (not shown).
    orders = await fetch_orders(user_id)
    if not orders:
        raise HTTPException(status_code=404, detail="No orders found")
    return orders
Key characteristics of REST in Python:
- Serialization cost: JSON encoding via orjson or ujson typically takes 0.1–0.5 ms per response for moderate payloads.
- HTTP caching: Cache-Control and ETag headers let CDNs and browsers cache responses without custom caching logic in the application.
- Versioning: URL-based (/v2/users) or header-based (Accept: application/vnd.api.v2+json). FastAPI routers cover the URL-based form directly; the header-based form needs a dependency that inspects the Accept header.
- Error handling: HTTP status codes provide a universal vocabulary. Clients need no special knowledge to interpret a 404 or 429.
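To make the caching point concrete, here is a minimal, framework-free sketch of ETag revalidation using only the standard library. The function names (`make_etag`, `conditional_get`) and the `Cache-Control` value are assumptions chosen for illustration; a real FastAPI app would set the same headers on its response object.

```python
import hashlib
import json
from typing import Optional


def make_etag(payload: bytes) -> str:
    """Derive an ETag from the response body (ETag values are quoted strings)."""
    return '"' + hashlib.sha256(payload).hexdigest()[:16] + '"'


def conditional_get(body: dict, if_none_match: Optional[str]) -> tuple:
    """Return (status, headers, payload) for a cacheable GET response."""
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode()
    etag = make_etag(payload)
    headers = {"ETag": etag, "Cache-Control": "private, max-age=60"}
    if if_none_match == etag:
        # The client's copy is still current: answer with headers only.
        return 304, headers, b""
    return 200, headers, payload


orders = {"user_id": 42, "orders": [{"id": 1, "product": "widget"}]}
status, headers, payload = conditional_get(orders, None)
revalidated = conditional_get(orders, headers["ETag"])
print(status, revalidated[0])
```

The second call sends back no body at all, which is the saving a browser or CDN gets from revalidation.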
The hidden cost is over-fetching. A mobile client displaying a user’s name and avatar still receives the full 30-field user object. Multiply that by a list endpoint returning 50 items and you waste bandwidth on fields nobody renders.
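The over-fetching cost is easy to estimate. The sketch below builds a hypothetical 30-field user record (the field names and values are invented placeholders) and compares the JSON size of a 50-item list against a list containing only the two rendered fields. The exact ratio depends entirely on the real data.

```python
import json


def make_user(i: int) -> dict:
    """Hypothetical 30-field user record; only name and avatar_url get rendered."""
    user = {
        "id": i,
        "name": f"User {i}",
        "avatar_url": f"https://example.com/a/{i}.png",
    }
    # Pad the record out to 30 fields with placeholder attributes.
    user.update({f"attr_{n}": f"value-{n}" for n in range(27)})
    return user


users = [make_user(i) for i in range(50)]
needed = [{"name": u["name"], "avatar_url": u["avatar_url"]} for u in users]

full_bytes = len(json.dumps(users).encode())
needed_bytes = len(json.dumps(needed).encode())
wasted_fraction = 1 - needed_bytes / full_bytes
print(f"full={full_bytes} B, needed={needed_bytes} B, wasted={wasted_fraction:.0%}")
```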
GraphQL with Strawberry — typed queries
Strawberry integrates with FastAPI and uses Python type hints as the schema source:
import strawberry
from strawberry.fastapi import GraphQLRouter

@strawberry.type
class Order:
    id: int
    product: str
    quantity: int
    total_cents: int

@strawberry.type
class User:
    id: int
    name: str
    orders: list[Order]

@strawberry.type
class Query:
    @strawberry.field
    async def user(self, id: int) -> User:
        # Assumes fetch_user_with_orders returns a dict whose "orders"
        # entry already holds Order instances.
        data = await fetch_user_with_orders(id)
        return User(**data)

schema = strawberry.Schema(query=Query)
graphql_app = GraphQLRouter(schema)
# Mount on the FastAPI app: app.include_router(graphql_app, prefix="/graphql")
The client asks for exactly what it needs:
query {
  user(id: 42) {
    name
    orders {
      product
      totalCents
    }
  }
}
The N+1 problem and DataLoaders
Without care, a query for 50 users that each have orders will fire 50 separate database queries. Strawberry supports DataLoaders:
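from collections import defaultdict
from strawberry.dataloader import DataLoader

async def load_orders(user_ids: list[int]) -> list[list[Order]]:
    # One query for every requested user instead of one query per user.
    rows = await db.fetch_all(
        "SELECT * FROM orders WHERE user_id = ANY($1)", user_ids
    )
    grouped = defaultdict(list)
    for r in rows:
        # Build Order explicitly: the row also carries user_id, which Order lacks.
        order = Order(id=r["id"], product=r["product"],
                      quantity=r["quantity"], total_cents=r["total_cents"])
        grouped[r["user_id"]].append(order)
    # Results must come back in the same order as the requested keys.
    return [grouped.get(uid, []) for uid in user_ids]

order_loader = DataLoader(load_fn=load_orders)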
This batches the 50 queries into a single WHERE user_id = ANY(...) call.
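The batching mechanism itself is small. The following is a simplified sketch of the idea, not Strawberry's implementation: `TinyLoader`, `FAKE_ORDERS`, and `batch_calls` are hypothetical names, and error handling and caching are left out. Keys requested before the event loop gets control again are collected and handed to the batch function in one call.

```python
import asyncio

# Stand-in data store and call log, both invented for this illustration.
FAKE_ORDERS = {1: ["book"], 2: ["pen", "ink"], 3: []}
batch_calls = []


async def load_orders(user_ids):
    """Batch function: one call serves every key collected so far."""
    batch_calls.append(list(user_ids))
    return [FAKE_ORDERS.get(uid, []) for uid in user_ids]


class TinyLoader:
    """Collects keys until the event loop runs the dispatch task, then loads them together."""

    def __init__(self, batch_fn):
        self.batch_fn = batch_fn
        self.pending = {}  # key -> Future awaiting its result
        self._task = None

    def load(self, key):
        loop = asyncio.get_running_loop()
        if not self.pending:
            # First key of a new batch: schedule exactly one dispatch.
            self._task = loop.create_task(self._dispatch())
        if key not in self.pending:
            self.pending[key] = loop.create_future()
        return self.pending[key]

    async def _dispatch(self):
        batch, self.pending = self.pending, {}
        keys = list(batch)
        results = await self.batch_fn(keys)
        for key, result in zip(keys, results):
            batch[key].set_result(result)


async def main():
    loader = TinyLoader(load_orders)
    # Three independent loads, as three resolvers would issue them.
    return await asyncio.gather(loader.load(1), loader.load(2), loader.load(3))


results = asyncio.run(main())
print(results, batch_calls)
```

Three `load` calls produce a single entry in `batch_calls`, which is the same effect the DataLoader has on the 50 per-user queries.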
Query complexity limits
GraphQL’s flexibility is also its attack surface. A malicious query can request deeply nested relations:
query {
  user(id: 1) {
    orders { product }
    friends {
      orders { product }
      friends {
        orders { product }
      }
    }
  }
}
Defend with query depth limits and cost analysis. Strawberry supports extensions for this:
from strawberry.extensions import QueryDepthLimiter

schema = strawberry.Schema(
    query=Query,
    extensions=[QueryDepthLimiter(max_depth=5)],
)
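To show what a depth limit measures, the sketch below models a selection set as nested dictionaries and counts its deepest path. This is one plausible way to count, written for illustration; Strawberry's own extension works on the parsed query and may count levels differently. `selection_depth` and `MAX_DEPTH` are hypothetical names.

```python
def selection_depth(selection: dict) -> int:
    """Depth of a selection set modeled as nested dicts; leaf fields map to {}."""
    if not selection:
        return 0
    return 1 + max(selection_depth(child) for child in selection.values())


# The nested friends query from above, written as nested dicts.
nested_query = {
    "user": {
        "orders": {"product": {}},
        "friends": {
            "orders": {"product": {}},
            "friends": {"orders": {"product": {}}},
        },
    }
}

MAX_DEPTH = 5
depth = selection_depth(nested_query)
allowed = depth <= MAX_DEPTH
print(depth, allowed)
```

Adding one more level of `friends` would push the depth past the limit and the query would be rejected before any resolver runs.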
Caching strategies
Because GraphQL requests typically arrive as POST /graphql, URL-keyed HTTP caching does not apply out of the box. Instead:
- Persisted queries: The client sends a hash; the server looks up the full query. This enables GET-based CDN caching.
- Response-level caching: Hash the normalized query + variables, cache in Redis.
- Field-level caching: Use DataLoaders with TTL caches for expensive resolvers.
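The first two strategies both reduce to hashing. Below is a minimal sketch, assuming a plain dictionary as a stand-in for Redis and a naive whitespace-collapsing normalizer (a production version would normalize the parsed query instead, since this one would also alter whitespace inside string literals). All function and variable names are hypothetical.

```python
import hashlib
import json


def normalize_query(query: str) -> str:
    """Collapse whitespace so formatting differences share one cache entry."""
    return " ".join(query.split())


def cache_key(query: str, variables: dict) -> str:
    """Response-level cache key: hash of the normalized query plus variables."""
    canonical = json.dumps(
        {"query": normalize_query(query), "variables": variables},
        sort_keys=True,
        separators=(",", ":"),
    )
    return "gql:" + hashlib.sha256(canonical.encode()).hexdigest()


# Persisted queries: the client sends only the hash, the server looks up the text.
persisted = {}


def register_query(query: str) -> str:
    digest = hashlib.sha256(normalize_query(query).encode()).hexdigest()
    persisted[digest] = normalize_query(query)
    return digest


compact = "query { user(id: 42) { name } }"
spread_out = """
query {
  user(id: 42) { name }
}
"""
key_a = cache_key(compact, {"id": 42})
key_b = cache_key(spread_out, {"id": 42})
key_c = cache_key(compact, {"id": 43})
query_hash = register_query(spread_out)
print(key_a == key_b, key_a == key_c, persisted[query_hash])
```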
gRPC with grpcio — binary speed
Define the service contract in a .proto file:
syntax = "proto3";

service OrderService {
  rpc GetUserOrders (UserRequest) returns (OrderList);
  rpc StreamOrders (UserRequest) returns (stream Order);
}

message UserRequest {
  int32 user_id = 1;
}

message Order {
  int32 id = 1;
  string product = 2;
  int32 quantity = 3;
  int32 total_cents = 4;
}

message OrderList {
  repeated Order orders = 1;
}
Generate Python stubs: python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. order.proto
Server implementation:
import asyncio

import grpc
import order_pb2
import order_pb2_grpc

class OrderServicer(order_pb2_grpc.OrderServiceServicer):
    async def GetUserOrders(self, request, context):
        orders = await fetch_orders(request.user_id)
        return order_pb2.OrderList(
            orders=[order_pb2.Order(**o) for o in orders]
        )

    async def StreamOrders(self, request, context):
        # An async generator: each yielded message is sent to the client.
        async for order in stream_orders(request.user_id):
            yield order_pb2.Order(**order)

async def serve():
    server = grpc.aio.server()
    order_pb2_grpc.add_OrderServiceServicer_to_server(OrderServicer(), server)
    server.add_insecure_port("[::]:50051")
    await server.start()
    await server.wait_for_termination()

if __name__ == "__main__":
    asyncio.run(serve())
Performance characteristics
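Protocol Buffers serialize to a compact binary format, commonly cited as 3–10x smaller than equivalent JSON and 5–20x faster to parse, although the actual ratio depends on the schema, the data, and the JSON library. As an illustration, for a payload of 1,000 orders: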
- REST (JSON via orjson): ~2.1 ms serialize, ~450 KB payload
- gRPC (protobuf): ~0.3 ms serialize, ~85 KB payload
This gap widens with larger payloads and higher throughput. At 10,000 requests per second between two Python services, gRPC’s binary format and HTTP/2 multiplexing save measurable CPU and bandwidth.
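Where the size difference comes from is visible in the wire format. The sketch below hand-encodes one Order message following the field numbers in the .proto above and compares it with compact JSON. It is an illustration only: real code would use the generated `order_pb2` classes, the encoder skips negative numbers, and it always writes every field, whereas proto3 omits fields that hold their default value.

```python
import json


def encode_varint(value: int) -> bytes:
    """Base-128 varint for a non-negative integer."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def tag(field_number: int, wire_type: int) -> bytes:
    """Field key: the field number shifted left by 3, plus the wire type."""
    return encode_varint((field_number << 3) | wire_type)


def encode_order(order: dict) -> bytes:
    """Encode an Order by hand: wire type 0 is varint, 2 is length-delimited."""
    product = order["product"].encode()
    return b"".join([
        tag(1, 0), encode_varint(order["id"]),
        tag(2, 2), encode_varint(len(product)), product,
        tag(3, 0), encode_varint(order["quantity"]),
        tag(4, 0), encode_varint(order["total_cents"]),
    ])


order = {"id": 1, "product": "ab", "quantity": 3, "total_cents": 150}
wire = encode_order(order)
as_json = json.dumps(order, separators=(",", ":")).encode()
print(wire.hex(), len(wire), len(as_json))
```

JSON repeats every field name in every object, while the binary form spends one byte per field key. That is why the ratio grows with long field names and shrinks when string values dominate the payload.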
Streaming patterns
gRPC streaming is native, not bolted on:
- Server streaming: The server sends a stream of messages (live order updates).
- Client streaming: The client sends a stream (batch upload of telemetry events).
- Bidirectional: Both sides stream simultaneously (real-time chat, collaborative editing).
In Python, grpc.aio expresses all three with async iteration: a handler yields responses from an async generator and reads a client stream as an async iterator.
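The three shapes can be sketched with plain asyncio, independent of gRPC. The function names below (`order_updates`, `count_events`, `shout`) are hypothetical, and the request iterator stands in for what a gRPC handler would receive from the client.

```python
import asyncio


async def order_updates(user_id: int):
    """Server-streaming shape: one request in, many responses yielded."""
    for seq in range(3):
        await asyncio.sleep(0)  # stand-in for waiting on a real event
        yield {"user_id": user_id, "seq": seq}


async def count_events(request_iterator) -> int:
    """Client-streaming shape: consume a stream, return a single response."""
    total = 0
    async for _event in request_iterator:
        total += 1
    return total


async def shout(request_iterator):
    """Bidirectional shape: yield a response for each incoming message."""
    async for message in request_iterator:
        yield message.upper()


async def events():
    """Stand-in for the client's outgoing message stream."""
    for event in ("a", "b", "c"):
        yield event


async def demo():
    streamed = [msg async for msg in order_updates(42)]
    counted = await count_events(events())
    echoed = [msg async for msg in shout(events())]
    return streamed, counted, echoed


streamed, counted, echoed = asyncio.run(demo())
print(streamed, counted, echoed)
```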
Error handling
gRPC uses status codes (OK, NOT_FOUND, INTERNAL, DEADLINE_EXCEEDED) with optional detail messages. Rich error details use google.rpc.Status with typed payloads:
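from google.protobuf import any_pb2
from google.rpc import code_pb2, error_details_pb2, status_pb2
from grpc_status import rpc_status

# Inside an async servicer method, where `context` is available:
bad_request = error_details_pb2.BadRequest(field_violations=[
    error_details_pb2.BadRequest.FieldViolation(
        field="user_id", description="Must be positive"
    )
])
detail = any_pb2.Any()
detail.Pack(bad_request)  # Pack() records the type URL so clients can unpack it
status = status_pb2.Status(
    code=code_pb2.INVALID_ARGUMENT,
    message="Validation failed",
    details=[detail],
)
# to_status() yields the code, message, and trailing metadata that carry the details.
rich = rpc_status.to_status(status)
await context.abort(rich.code, rich.details, rich.trailing_metadata)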
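When a REST or GraphQL edge sits in front of gRPC services, these status codes have to be translated for HTTP clients. The table below is a sketch of such a mapping, following the HTTP equivalents documented alongside google.rpc.Code. It keys on code names as strings so it runs without grpcio installed, and the fallback to 500 for unlisted codes is an assumption of this example.

```python
from http import HTTPStatus

GRPC_TO_HTTP = {
    "OK": HTTPStatus.OK,
    "INVALID_ARGUMENT": HTTPStatus.BAD_REQUEST,
    "UNAUTHENTICATED": HTTPStatus.UNAUTHORIZED,
    "PERMISSION_DENIED": HTTPStatus.FORBIDDEN,
    "NOT_FOUND": HTTPStatus.NOT_FOUND,
    "ALREADY_EXISTS": HTTPStatus.CONFLICT,
    "RESOURCE_EXHAUSTED": HTTPStatus.TOO_MANY_REQUESTS,
    "UNIMPLEMENTED": HTTPStatus.NOT_IMPLEMENTED,
    "INTERNAL": HTTPStatus.INTERNAL_SERVER_ERROR,
    "UNAVAILABLE": HTTPStatus.SERVICE_UNAVAILABLE,
    "DEADLINE_EXCEEDED": HTTPStatus.GATEWAY_TIMEOUT,
}


def http_status_for(grpc_code_name: str) -> int:
    """Translate a gRPC status code name; unlisted codes fall back to 500."""
    return int(GRPC_TO_HTTP.get(grpc_code_name, HTTPStatus.INTERNAL_SERVER_ERROR))


print(http_status_for("NOT_FOUND"), http_status_for("DATA_LOSS"))
```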
Architecture patterns in practice
The BFF pattern (Backend for Frontend)
A common production setup uses all three:
- gRPC between Python microservices (user-service, order-service, payment-service).
- GraphQL gateway that aggregates gRPC calls and exposes them to the frontend.
- REST endpoints for webhooks, third-party integrations, and health checks.
The GraphQL gateway translates frontend queries into targeted gRPC calls, combining the flexibility of GraphQL with the speed of gRPC.
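A gateway resolver in this setup might look like the sketch below. The two coroutines stand in for generated gRPC stub calls to user-service and order-service; their names and return values are invented. The resolver calls only the services the query actually needs and runs those calls concurrently.

```python
import asyncio


# Hypothetical stand-ins for gRPC stub calls to two backend services.
async def get_user(user_id: int) -> dict:
    return {"id": user_id, "name": "Ada"}


async def get_orders(user_id: int) -> list:
    return [{"product": "book", "total_cents": 1500}]


async def resolve_user(user_id: int, wanted: set) -> dict:
    """Fan out to the needed services concurrently, then shape the response."""
    calls = {"user": get_user(user_id)}
    if "orders" in wanted:
        calls["orders"] = get_orders(user_id)
    results = dict(zip(calls, await asyncio.gather(*calls.values())))
    response = {"name": results["user"]["name"]}
    if "orders" in results:
        response["orders"] = results["orders"]
    return response


with_orders = asyncio.run(resolve_user(42, {"name", "orders"}))
name_only = asyncio.run(resolve_user(42, {"name"}))
print(with_orders, name_only)
```

A query that asks only for the name never touches order-service, which is where the gateway recovers the bandwidth and latency that a fixed REST response would spend.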
Migration strategies
Moving from REST to GraphQL does not require a rewrite. A proven approach:
- Add a GraphQL endpoint alongside existing REST routes.
- GraphQL resolvers call the same service layer as REST handlers.
- Frontend teams migrate screens one at a time.
- Deprecate REST endpoints only after all consumers have moved.
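The second step is what keeps the migration cheap. Here is a minimal sketch of a shared service layer with two thin adapters on top. The in-memory `_ORDERS` store, the `OrderSummary` type, and both adapter functions are hypothetical; in a real project they would be a database-backed service, a FastAPI route, and a Strawberry resolver.

```python
from dataclasses import asdict, dataclass


@dataclass
class OrderSummary:
    id: int
    product: str
    total_cents: int


class OrdersNotFound(Exception):
    """Raised by the service layer; each transport maps it its own way."""


# Invented in-memory store standing in for the database.
_ORDERS = {42: [OrderSummary(1, "book", 1500)]}


def list_orders(user_id: int) -> list:
    """Transport-agnostic business logic shared by every API style."""
    orders = _ORDERS.get(user_id)
    if not orders:
        raise OrdersNotFound(user_id)
    return orders


def rest_handler(user_id: int) -> tuple:
    """REST adapter: full objects, errors expressed as HTTP status codes."""
    try:
        return 200, {"orders": [asdict(o) for o in list_orders(user_id)]}
    except OrdersNotFound:
        return 404, {"detail": "No orders found"}


def graphql_resolver(user_id: int, fields: list) -> list:
    """GraphQL adapter: only the requested fields, empty list when none exist."""
    try:
        return [{f: getattr(o, f) for f in fields} for o in list_orders(user_id)]
    except OrdersNotFound:
        return []


print(rest_handler(42), rest_handler(7), graphql_resolver(42, ["product"]))
```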
For adding gRPC between services:
- Define proto files that match existing REST response schemas.
- Run both REST and gRPC servers in the same process (FastAPI + grpc.aio).
- Migrate internal callers to gRPC first; keep REST for external consumers.
- Remove REST internal endpoints once migration is complete.
Observability across styles
Each style needs different instrumentation:
- REST: OpenTelemetry auto-instruments FastAPI with route-level spans, HTTP status tracking, and latency histograms.
- GraphQL: Instrument per-resolver spans via Strawberry extensions. Track query complexity as a metric.
- gRPC: grpc.aio supports interceptors for tracing. Add opentelemetry-instrumentation-grpc for automatic span creation per RPC method.
Unified tracing across all three requires propagating trace context (W3C Trace Context headers for REST/GraphQL, gRPC metadata for gRPC).
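Context propagation comes down to carrying one header value across the hop. The sketch below parses a W3C `traceparent` value from an incoming HTTP request and derives the value for an outgoing gRPC call, keeping the trace ID and minting a new span ID. In practice the OpenTelemetry propagators handle this. The helper names are hypothetical and the validation is deliberately minimal.

```python
import re
import secrets

# version "00", 32-hex trace id, 16-hex parent span id, 2-hex flags
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent() -> str:
    """Start a fresh trace with the sampled flag set."""
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-01"


def child_traceparent(parent: str) -> str:
    """Keep the trace id, mint a new span id for the outgoing call."""
    match = TRACEPARENT_RE.match(parent)
    if match is None:
        # Missing or malformed header: start a new trace instead.
        return new_traceparent()
    trace_id, _parent_span, flags = match.groups()
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"


incoming_headers = {
    "traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
}
outgoing = child_traceparent(incoming_headers["traceparent"])

# gRPC metadata is a sequence of (key, value) pairs with lowercase keys.
grpc_metadata = (("traceparent", outgoing),)
print(grpc_metadata)
```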
Tradeoffs summary
REST pays a tax in over-fetching but gains universal tooling and cacheability. GraphQL eliminates over-fetching but introduces query complexity management and gives up HTTP caching. gRPC wins on raw performance but sacrifices browser compatibility and human-readable debugging.
The right answer is rarely one style for everything. Mature Python architectures use each where it excels and invest in a shared service layer so switching transport does not require rewriting business logic.
The one thing to remember: Build your business logic in a transport-agnostic service layer, then expose it through whichever API style each consumer needs — REST for simplicity, GraphQL for flexibility, gRPC for speed.