REST vs GraphQL vs gRPC in Python — Deep Dive

Technical foundation

Choosing an API style in Python is not a matter of fashion — it shapes serialization overhead, error propagation, tooling choices, and team workflow. This deep dive examines each style through working code, performance characteristics, and production patterns.

REST with FastAPI — the baseline

FastAPI is the dominant choice for REST in modern Python. It combines Pydantic models with OpenAPI generation:

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

class Order(BaseModel):
    id: int
    product: str
    quantity: int
    total_cents: int

@app.get("/users/{user_id}/orders", response_model=list[Order])
async def get_user_orders(user_id: int):
    orders = await fetch_orders(user_id)
    if not orders:
        raise HTTPException(status_code=404, detail="No orders found")
    return orders

Key characteristics of REST in Python:

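  • Serialization cost: JSON encoding with a fast encoder such as orjson or ujson typically takes around 0.1–0.5 ms per response for moderate payloads; exact numbers depend on payload shape and hardware.
  • HTTP caching: Cache-Control and ETag headers let CDNs and browsers cache responses. Once the handler sets them, no further application code is needed on the caching path.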
  • Versioning: URL-based (/v2/users) or header-based (Accept: application/vnd.api.v2+json). FastAPI supports both through routers.
  • Error handling: HTTP status codes provide a universal vocabulary. Clients need no special knowledge to interpret a 404 or 429.
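
As a rough illustration of the caching bullet, the sketch below shows the conditional-request logic that an ETag enables, using only the standard library. The helper names make_etag and conditional_response are hypothetical; a real FastAPI handler would read If-None-Match from the request and set these headers on its response object.

```python
import hashlib
import json
from typing import Optional

def make_etag(payload: dict) -> str:
    """Derive an ETag from the canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def conditional_response(payload: dict, if_none_match: Optional[str]):
    """Return (status, headers, body) for a GET with an optional If-None-Match."""
    etag = make_etag(payload)
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}
    if if_none_match == etag:
        # The client's cached copy is still valid: send no body.
        return 304, headers, None
    return 200, headers, payload

order = {"id": 1, "product": "widget", "quantity": 2, "total_cents": 1999}
status, headers, _ = conditional_response(order, None)
revalidated, _, body = conditional_response(order, headers["ETag"])
```

The first call returns the full payload with an ETag; the second call, which echoes that ETag back, is answered with a bodiless 304.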

The hidden cost is over-fetching. A mobile client displaying a user’s name and avatar still receives the full 30-field user object. Multiply that by a list endpoint returning 50 items and you waste bandwidth on fields nobody renders.
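
To get a feel for the size of that waste, the following sketch builds a made-up 30-field user record and compares the full list response with one that carries only the two rendered fields. The field names and values are invented for illustration, so the resulting ratio says nothing about any particular API.

```python
import json

def make_user(i: int) -> dict:
    """Hypothetical 30-field user; only name and avatar_url are rendered."""
    user = {"id": i, "name": f"user{i}", "avatar_url": f"https://example.com/a/{i}.png"}
    user.update({f"field_{n}": "x" * 20 for n in range(27)})  # 27 unused fields
    return user

users = [make_user(i) for i in range(50)]

# What a typical list endpoint returns versus what the screen needs.
full = json.dumps(users).encode()
trimmed = json.dumps(
    [{"name": u["name"], "avatar_url": u["avatar_url"]} for u in users]
).encode()

wasted_fraction = 1 - len(trimmed) / len(full)
print(f"full={len(full)} B, trimmed={len(trimmed)} B, wasted={wasted_fraction:.0%}")
```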

GraphQL with Strawberry — typed queries

Strawberry integrates with FastAPI and uses Python type hints as the schema source:

import strawberry
from strawberry.fastapi import GraphQLRouter

@strawberry.type
class Order:
    id: int
    product: str
    quantity: int
    total_cents: int

@strawberry.type
class User:
    id: int
    name: str
    orders: list[Order]

@strawberry.type
class Query:
    @strawberry.field
    async def user(self, id: int) -> User:
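        data = await fetch_user_with_orders(id)
        # Build nested Order objects explicitly; User(**data) would leave raw dicts in `orders`.
        orders = [Order(**o) for o in data["orders"]]
        return User(id=data["id"], name=data["name"], orders=orders)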

schema = strawberry.Schema(query=Query)
graphql_app = GraphQLRouter(schema)

The client asks for exactly what it needs:

query {
  user(id: 42) {
    name
    orders {
      product
      totalCents
    }
  }
}

The N+1 problem and DataLoaders

Without care, a query for 50 users that each have orders will fire 50 separate order queries on top of the query that loads the users. Strawberry supports DataLoaders:

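from collections import defaultdict

from strawberry.dataloader import DataLoader

async def load_orders(user_ids: list[int]) -> list[list[Order]]:
    # One query for the whole batch; `db` is the application's async database handle.
    rows = await db.fetch_all(
        "SELECT * FROM orders WHERE user_id = ANY($1)", user_ids
    )
    grouped = defaultdict(list)
    for r in rows:
        # Pass only the fields Order declares; the row also carries user_id.
        grouped[r["user_id"]].append(
            Order(id=r["id"], product=r["product"],
                  quantity=r["quantity"], total_cents=r["total_cents"])
        )
    # Results must line up with user_ids: one list per key, in the same order.
    return [grouped.get(uid, []) for uid in user_ids]

# In production, create one loader per request so cached results are not shared across users.
order_loader = DataLoader(load_fn=load_orders)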

This batches the 50 queries into a single WHERE user_id = ANY(...) call.
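
The batching idea itself is small enough to sketch with plain asyncio. The TinyLoader class below is a toy written for this article, not Strawberry's implementation: it collects every key requested during one event-loop tick and hands them to the batch function in a single call.

```python
import asyncio

class TinyLoader:
    """Toy batching loader: coalesces keys requested in the same loop tick."""

    def __init__(self, batch_fn):
        self._batch_fn = batch_fn
        self._pending = []  # (key, future) pairs waiting for the next dispatch

    def load(self, key):
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        if not self._pending:
            # First key of this tick: schedule one dispatch for the whole batch.
            loop.call_soon(lambda: loop.create_task(self._dispatch()))
        self._pending.append((key, future))
        return future

    async def _dispatch(self):
        batch, self._pending = self._pending, []
        results = await self._batch_fn([key for key, _ in batch])
        for (_, future), result in zip(batch, results):
            future.set_result(result)

calls = []

async def fake_load_orders(user_ids):
    calls.append(list(user_ids))  # record each batch that reaches the "database"
    return [[f"order-for-{uid}"] for uid in user_ids]

async def main():
    loader = TinyLoader(fake_load_orders)
    return await asyncio.gather(*(loader.load(uid) for uid in range(50)))

results = asyncio.run(main())
```

All 50 load() calls are issued before the loop gets a chance to run the dispatch, so fake_load_orders sees one batch of 50 keys.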

Query complexity limits

GraphQL’s flexibility is also its attack surface. A malicious query can request deeply nested relations:

query {
  user(id: 1) {
    orders { product }
    friends {
      orders { product }
      friends {
        orders { product }
      }
    }
  }
}

Defend with query depth limits and cost analysis. Strawberry supports extensions for this:

from strawberry.extensions import QueryDepthLimiter
schema = strawberry.Schema(
    query=Query,
    extensions=[QueryDepthLimiter(max_depth=5)]
)
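
To make the idea of a depth limit concrete, here is a deliberately naive sketch that measures nesting by counting braces. It is not how Strawberry computes depth (a real limiter walks the parsed query and handles fragments), and the function names are invented for this example.

```python
def brace_depth(query: str) -> int:
    """Return the maximum brace nesting in a query string.

    Naive on purpose: it does not skip braces inside string literals or
    input objects and does not expand fragments.
    """
    depth = max_depth = 0
    for ch in query:
        if ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
    return max_depth

def check_depth(query: str, limit: int) -> None:
    """Reject a query before execution if it nests deeper than the limit."""
    depth = brace_depth(query)
    if depth > limit:
        raise ValueError(f"query depth {depth} exceeds limit {limit}")

shallow = "query { user(id: 1) { orders { product } } }"
nested = (
    "query { user(id: 1) { friends { friends { friends "
    "{ orders { product } } } } } }"
)
```

With a limit of 5, check_depth accepts the shallow query and raises for the nested one.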

Caching strategies

Since GraphQL requests are typically sent as POST /graphql, URL-based HTTP caching does not apply out of the box. Instead:

  • Persisted queries: The client sends a hash; the server looks up the full query. This enables GET-based CDN caching.
  • Response-level caching: Hash the normalized query + variables, cache in Redis.
  • Field-level caching: Use DataLoaders with TTL caches for expensive resolvers.
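
Response-level caching can be sketched as follows. The normalization here is crude (it only collapses whitespace), a plain dict stands in for Redis, and the names cache_key and cached_execute are hypothetical.

```python
import hashlib
import json
import re

def normalize(query: str) -> str:
    """Collapse whitespace runs so formatting differences share one key."""
    return re.sub(r"\s+", " ", query).strip()

def cache_key(query: str, variables=None) -> str:
    """Hash the normalized query together with its variables."""
    payload = json.dumps(
        {"q": normalize(query), "v": variables or {}}, sort_keys=True
    )
    return "gql:" + hashlib.sha256(payload.encode()).hexdigest()

_cache = {}  # stand-in for Redis; a real cache would also set a TTL

def cached_execute(query: str, variables, execute):
    """Run `execute` only on a cache miss."""
    key = cache_key(query, variables)
    if key not in _cache:
        _cache[key] = execute(query, variables)
    return _cache[key]

executions = []

def fake_execute(query, variables):
    executions.append(query)  # count how often the resolver path really runs
    return {"data": {"user": {"name": "Ada"}}}

first = cached_execute("query {\n  user(id: $id) { name }\n}", {"id": 42}, fake_execute)
second = cached_execute("query { user(id: $id) { name } }", {"id": 42}, fake_execute)
```

If responses depend on who is asking, the key would also need to include the caller's identity or permissions.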

gRPC with grpcio — binary speed

Define the service contract in a .proto file:

syntax = "proto3";

service OrderService {
  rpc GetUserOrders (UserRequest) returns (OrderList);
  rpc StreamOrders (UserRequest) returns (stream Order);
}

message UserRequest {
  int32 user_id = 1;
}

message Order {
  int32 id = 1;
  string product = 2;
  int32 quantity = 3;
  int32 total_cents = 4;
}

message OrderList {
  repeated Order orders = 1;
}

Generate Python stubs: python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. order.proto

Server implementation:

import asyncio

import grpc
import order_pb2
import order_pb2_grpc

class OrderServicer(order_pb2_grpc.OrderServiceServicer):
    async def GetUserOrders(self, request, context):
        orders = await fetch_orders(request.user_id)
        return order_pb2.OrderList(
            orders=[order_pb2.Order(**o) for o in orders]
        )

    async def StreamOrders(self, request, context):
        # Server streaming: yield one Order message at a time.
        async for order in stream_orders(request.user_id):
            yield order_pb2.Order(**order)

async def serve():
    # Create the server inside the running event loop, then start it.
    server = grpc.aio.server()
    order_pb2_grpc.add_OrderServiceServicer_to_server(OrderServicer(), server)
    server.add_insecure_port("[::]:50051")
    await server.start()
    await server.wait_for_termination()

if __name__ == "__main__":
    asyncio.run(serve())

Performance characteristics

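Protocol Buffers serialize to a compact binary format, which is typically 3–10x smaller than equivalent JSON and often 5–20x faster to parse; in Python the speed advantage depends on the protobuf backend in use and on which JSON encoder it is compared against. Indicative figures for a payload of 1,000 orders (benchmark your own messages before relying on them):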

  • REST (JSON via orjson): ~2.1 ms serialize, ~450 KB payload
  • gRPC (protobuf): ~0.3 ms serialize, ~85 KB payload

This gap widens with larger payloads and higher throughput. At 10,000 requests per second between two Python services, gRPC’s binary format and HTTP/2 multiplexing save measurable CPU and bandwidth.
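
Where the size difference comes from is easiest to see by encoding one message by hand. The sketch below implements just enough of the protobuf wire format (varints and length-delimited strings) to encode the Order message defined earlier; it handles non-negative integers only and does not skip zero-valued fields the way a real proto3 encoder would.

```python
import json

def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("this sketch handles non-negative values only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_int_field(number: int, value: int) -> bytes:
    # Wire type 0 (varint): the tag is (field_number << 3) | 0.
    return encode_varint(number << 3) + encode_varint(value)

def encode_string_field(number: int, text: str) -> bytes:
    # Wire type 2 (length-delimited): tag, byte length, then UTF-8 bytes.
    data = text.encode("utf-8")
    return encode_varint((number << 3) | 2) + encode_varint(len(data)) + data

def encode_order(order: dict) -> bytes:
    """Hand-encode the four fields of the Order message."""
    return (
        encode_int_field(1, order["id"])
        + encode_string_field(2, order["product"])
        + encode_int_field(3, order["quantity"])
        + encode_int_field(4, order["total_cents"])
    )

order = {"id": 1, "product": "widget", "quantity": 2, "total_cents": 1999}
binary = encode_order(order)
text = json.dumps(order, separators=(",", ":")).encode()
print(len(binary), len(text))
```

The binary form carries one-byte field tags instead of quoted field names, which is where most of the saving comes from for small numeric messages like this one.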

Streaming patterns

gRPC streaming is native, not bolted on:

  • Server streaming: The server sends a stream of messages (live order updates).
  • Client streaming: The client sends a stream (batch upload of telemetry events).
  • Bidirectional: Both sides stream simultaneously (real-time chat, collaborative editing).

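In Python, grpc.aio expresses all three patterns with async iteration: a handler that streams responses is written as an async generator, and a streamed request arrives as an async iterator the handler can loop over.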
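
The three handler shapes can be illustrated without gRPC installed. The functions below are plain asyncio stand-ins with invented names; real grpc.aio handlers are servicer methods that also receive a context argument, and their messages are protobuf objects rather than dicts or strings.

```python
import asyncio

async def stream_orders(user_id: int):
    """Server-streaming shape: an async generator yielding one message at a time."""
    for n in range(3):
        await asyncio.sleep(0)  # stand-in for waiting on a new order
        yield {"user_id": user_id, "order_id": n}

async def upload_events(request_iterator) -> int:
    """Client-streaming shape: consume an async iterator, return one summary."""
    count = 0
    async for _event in request_iterator:
        count += 1
    return count

async def echo_upper(request_iterator):
    """Bidirectional shape: read requests and yield responses as they arrive."""
    async for message in request_iterator:
        yield message.upper()

async def events():
    """Hypothetical client-side stream of telemetry events."""
    for name in ("login", "click", "logout"):
        yield name

async def main():
    streamed = [msg async for msg in stream_orders(42)]
    uploaded = await upload_events(events())
    echoed = [msg async for msg in echo_upper(events())]
    return streamed, uploaded, echoed

streamed, uploaded, echoed = asyncio.run(main())
```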

Error handling

gRPC uses status codes (OK, NOT_FOUND, INTERNAL, DEADLINE_EXCEEDED) with optional detail messages. Rich error details use google.rpc.Status with typed payloads:

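# Requires the grpcio-status package.
from google.protobuf import any_pb2
from google.rpc import code_pb2, error_details_pb2, status_pb2
from grpc_status import rpc_status

# Inside an async servicer method, where `context` is the ServicerContext:
violation = error_details_pb2.BadRequest.FieldViolation(
    field="user_id", description="Must be positive"
)
detail = any_pb2.Any()
# Pack() records the type URL so clients can unpack the payload.
detail.Pack(error_details_pb2.BadRequest(field_violations=[violation]))
status = status_pb2.Status(
    code=code_pb2.INVALID_ARGUMENT,
    message="Validation failed",
    details=[detail],
)
await context.abort_with_status(rpc_status.to_status(status))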

Architecture patterns in practice

The BFF pattern (Backend for Frontend)

A common production setup uses all three:

  1. gRPC between Python microservices (user-service, order-service, payment-service).
  2. GraphQL gateway that aggregates gRPC calls and exposes them to the frontend.
  3. REST endpoints for webhooks, third-party integrations, and health checks.

The GraphQL gateway translates frontend queries into targeted gRPC calls, combining the flexibility of GraphQL with the speed of gRPC.
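
A gateway resolver in this setup might look roughly like the sketch below. The get_user and get_user_orders coroutines are hypothetical stand-ins for generated gRPC stubs, and the want_orders flag stands in for inspecting which fields the GraphQL query selected.

```python
import asyncio

# Hypothetical stand-ins for calls to user-service and order-service.
async def get_user(user_id: int) -> dict:
    await asyncio.sleep(0.01)  # simulated network round trip
    return {"id": user_id, "name": "Ada"}

async def get_user_orders(user_id: int) -> list:
    await asyncio.sleep(0.01)
    return [{"product": "widget", "total_cents": 1999}]

async def resolve_user(user_id: int, want_orders: bool) -> dict:
    """Call only the services the query selected, and call them concurrently."""
    calls = [get_user(user_id)]
    if want_orders:
        calls.append(get_user_orders(user_id))
    user, *rest = await asyncio.gather(*calls)
    result = dict(user)
    if want_orders:
        result["orders"] = rest[0]
    return result

with_orders = asyncio.run(resolve_user(42, want_orders=True))
name_only = asyncio.run(resolve_user(42, want_orders=False))
```

A query that asks only for the name never touches order-service, and one that asks for both pays for the slower of the two calls rather than their sum.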

Migration strategies

Moving from REST to GraphQL does not require a rewrite. A proven approach:

  1. Add a GraphQL endpoint alongside existing REST routes.
  2. GraphQL resolvers call the same service layer as REST handlers.
  3. Frontend teams migrate screens one at a time.
  4. Deprecate REST endpoints only after all consumers have moved.
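
Step 2 is the part that makes the migration cheap, and it can be sketched in a few lines. The class and function names below are illustrative only: the two adapters are plain functions rather than FastAPI routes or Strawberry resolvers, and in a real GraphQL server the library trims fields itself.

```python
from dataclasses import asdict, dataclass

@dataclass
class Order:
    id: int
    product: str
    quantity: int
    total_cents: int

class OrderService:
    """Transport-agnostic business logic; knows nothing about HTTP or GraphQL."""

    def __init__(self, orders_by_user: dict):
        self._orders_by_user = orders_by_user  # stand-in for a repository

    def list_orders(self, user_id: int) -> list:
        return self._orders_by_user.get(user_id, [])

service = OrderService({42: [Order(1, "widget", 2, 1999)]})

def rest_get_user_orders(user_id: int) -> list:
    """REST adapter: returns JSON-ready dicts with every field."""
    return [asdict(o) for o in service.list_orders(user_id)]

def graphql_resolve_orders(user_id: int, selected_fields: set) -> list:
    """GraphQL adapter: same service call, trimmed to the selected fields."""
    return [
        {k: v for k, v in asdict(o).items() if k in selected_fields}
        for o in service.list_orders(user_id)
    ]
```

Because both adapters delegate to OrderService.list_orders, a screen can move from one to the other without any change to business logic.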

For adding gRPC between services:

  1. Define proto files that match existing REST response schemas.
  2. Run both REST and gRPC servers in the same process (FastAPI + grpc.aio).
  3. Migrate internal callers to gRPC first; keep REST for external consumers.
  4. Remove REST internal endpoints once migration is complete.

Observability across styles

Each style needs different instrumentation:

  • REST: OpenTelemetry auto-instruments FastAPI with route-level spans, HTTP status tracking, and latency histograms.
  • GraphQL: Instrument per-resolver spans via Strawberry extensions. Track query complexity as a metric.
  • gRPC: grpc.aio supports interceptors for tracing. Add opentelemetry-instrumentation-grpc for automatic span creation per RPC method.

Unified tracing across all three requires propagating trace context (W3C Trace Context headers for REST/GraphQL, gRPC metadata for gRPC).
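
As a simplified illustration of that propagation, the sketch below parses a W3C traceparent header and copies the trace headers into a gRPC-style metadata tuple. The function names are invented here; in practice the OpenTelemetry propagators do this work and also start a child span instead of forwarding the parent ID unchanged.

```python
import re

_TRACEPARENT = re.compile(
    r"^(?P<version>[0-9a-f]{2})-(?P<trace_id>[0-9a-f]{32})-"
    r"(?P<parent_id>[0-9a-f]{16})-(?P<flags>[0-9a-f]{2})$"
)

def parse_traceparent(header: str) -> dict:
    """Split a W3C traceparent header into its four fields."""
    match = _TRACEPARENT.match(header.strip().lower())
    if match is None:
        raise ValueError(f"malformed traceparent: {header!r}")
    return match.groupdict()

def to_grpc_metadata(http_headers: dict) -> tuple:
    """Copy trace headers from an incoming HTTP request into gRPC call metadata.

    gRPC metadata keys must be lowercase; values are passed through unchanged.
    """
    wanted = ("traceparent", "tracestate")
    lowered = {k.lower(): v for k, v in http_headers.items()}
    return tuple((k, lowered[k]) for k in wanted if k in lowered)

incoming = {"Traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"}
fields = parse_traceparent(incoming["Traceparent"])
metadata = to_grpc_metadata(incoming)
```

The resulting tuple has the shape that grpc.aio stubs accept through their metadata argument on each call.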

Tradeoffs summary

REST pays a tax in over-fetching but gains universal tooling and cacheability. GraphQL eliminates over-fetching but introduces query complexity management and gives up HTTP caching. gRPC wins on raw performance but sacrifices browser compatibility and human-readable debugging.

The right answer is rarely one style for everything. Mature Python architectures use each where it excels and invest in a shared service layer so switching transport does not require rewriting business logic.

The one thing to remember: Build your business logic in a transport-agnostic service layer, then expose it through whichever API style each consumer needs — REST for simplicity, GraphQL for flexibility, gRPC for speed.
