N+1 Query Problem — Deep Dive

Eliminate N+1 queries with advanced SQLAlchemy and Django techniques including lazy='raise', custom prefetching, and automated detection in CI.

Anatomy of N+1 in SQL

To understand the fixes, you need to see what the database actually does.

The N+1 pattern

-- Query 1: fetch all orders
SELECT id, customer_id, total FROM orders;
-- Returns 100 rows

-- Queries 2-101: one per order
SELECT id, name, email FROM customers WHERE id = 1;
SELECT id, name, email FROM customers WHERE id = 2;
SELECT id, name, email FROM customers WHERE id = 3;
-- ... 97 more

The JOIN fix (select_related / joinedload)

-- Single query with LEFT JOIN
SELECT orders.id, orders.total, customers.id, customers.name, customers.email
FROM orders
LEFT JOIN customers ON orders.customer_id = customers.id;

One query, all data. The trade-off: if a customer has 50 orders, the customer’s data is repeated 50 times in the result set, increasing network transfer size.

The IN-query fix (prefetch_related / selectinload)

-- Query 1: fetch orders
SELECT id, customer_id, total FROM orders;

-- Query 2: fetch all relevant customers in one shot
SELECT id, name, email FROM customers WHERE id IN (1, 2, 3, ..., 47);

Two queries regardless of order count. The ORM matches customers to orders in Python memory. This avoids data duplication and works well with one-to-many and many-to-many relationships.

SQLAlchemy strategies in depth

Preventing N+1 with lazy=“raise”

The most effective prevention is making lazy loading impossible:

class Order(Base):
    __tablename__ = "orders"

    id: Mapped[int] = mapped_column(primary_key=True)
    customer_id: Mapped[int] = mapped_column(ForeignKey("customers.id"))

    customer: Mapped["Customer"] = relationship(lazy="raise")
    items: Mapped[list["OrderItem"]] = relationship(lazy="raise")

Now accessing order.customer without explicit eager loading raises InvalidRequestError:

# This crashes — forces you to think about loading
orders = session.scalars(select(Order)).all()
orders[0].customer  # InvalidRequestError!

# This works — explicit loading strategy
stmt = select(Order).options(selectinload(Order.customer))
orders = session.scalars(stmt).all()
orders[0].customer  # Customer object, loaded efficiently

This shifts N+1 detection from runtime monitoring to development-time errors. Every relationship access requires an explicit loading decision.

Chaining loading strategies for nested relationships

from sqlalchemy.orm import selectinload, joinedload

# Orders → Items → Product → Category
stmt = select(Order).options(
    selectinload(Order.items)
    .joinedload(OrderItem.product)
    .selectinload(Product.category)
)

This executes:

SELECT * FROM orders (main query)
SELECT * FROM order_items WHERE order_id IN (...) (selectinload)
For each item, the product is JOINed in query 2 (joinedload)
SELECT * FROM categories WHERE id IN (...) (selectinload)

Total: 3 queries regardless of data size.

contains_eager for custom JOINs

When you need to filter on a related table AND load it:

from sqlalchemy.orm import contains_eager

# Find orders with expensive items, and load those items
stmt = (
    select(Order)
    .join(Order.items)
    .where(OrderItem.price > 100)
    .options(contains_eager(Order.items))
)

contains_eager tells SQLAlchemy that the JOIN data should populate the relationship, avoiding a redundant second query.

Hybrid approach: load_only for partial columns

Sometimes you don’t need all columns from related objects:

from sqlalchemy.orm import selectinload, load_only

stmt = select(Order).options(
    selectinload(Order.customer).load_only(Customer.name, Customer.email)
)

This fetches only name and email from customers, reducing data transfer.

Django strategies in depth

Prefetch with custom querysets

prefetch_related supports Prefetch objects for filtered or annotated related data:

from django.db.models import Prefetch

# Only prefetch active items with their reviews
orders = Order.objects.prefetch_related(
    Prefetch(
        'items',
        queryset=OrderItem.objects.filter(
            status='active'
        ).select_related('product')
    )
)

Django’s N+1 in templates

Templates are particularly vulnerable because relationship access is invisible:

{% for order in orders %}
    {{ order.customer.name }}  {# lazy load per order #}
    {% for item in order.items.all %}  {# lazy load per order #}
        {{ item.product.name }}  {# lazy load per item #}
    {% endfor %}
{% endfor %}

The view must set up all eager loading before passing data to the template:

def order_list(request):
    orders = Order.objects.select_related('customer').prefetch_related(
        Prefetch('items', queryset=OrderItem.objects.select_related('product'))
    )
    return render(request, 'orders/list.html', {'orders': orders})

django-auto-prefetch

For projects with widespread N+1 issues, django-auto-prefetch replaces Django’s model base class and automatically batches lazy loads:

import auto_prefetch

class Order(auto_prefetch.Model):
    customer = auto_prefetch.ForeignKey(Customer, on_delete=models.CASCADE)

Instead of executing one query per lazy access, it batches all pending lazy loads for the same relationship into a single IN query. It’s not as efficient as explicit prefetching but eliminates the worst N+1 cases automatically.

Automated detection in CI

Using pytest-django with assertNumQueries

def test_order_list_no_n_plus_one(client, django_assert_num_queries):
    # Create test data: 10 orders with 3 items each
    create_orders(count=10, items_per_order=3)

    with django_assert_num_queries(3):  # orders + customers + items
        response = client.get("/api/orders/")
        assert response.status_code == 200

If someone removes an eager load, the test fails because query count increases.

SQLAlchemy query counting

from sqlalchemy import event

class QueryCounter:
    def __init__(self, engine):
        self.count = 0
        event.listen(engine, "before_cursor_execute", self._increment)

    def _increment(self, conn, cursor, stmt, params, context, executemany):
        self.count += 1

    def __enter__(self):
        self.count = 0
        return self

    def __exit__(self, *args):
        pass

# In tests
def test_no_n_plus_one():
    with QueryCounter(engine) as counter:
        orders = get_orders_with_items()
        serialize_orders(orders)
    assert counter.count <= 3, f"Expected ≤3 queries, got {counter.count}"

nplusone in CI

# conftest.py
import pytest
from nplusone.core import profiler

@pytest.fixture(autouse=True)
def no_n_plus_one(request):
    with profiler.Profiler() as p:
        yield
    if p.queries:
        # Filter for N+1 patterns only
        n_plus_ones = [q for q in p.queries if q.type == 'lazy_load']
        if n_plus_ones:
            pytest.fail(
                f"N+1 queries detected: {len(n_plus_ones)} lazy loads"
            )

Performance comparison: real numbers

Benchmark with 100 orders, each with 5 items, PostgreSQL on localhost:

Strategy	Queries	Time
Lazy loading (N+1)	101	45ms
joinedload	1	4ms
selectinload	2	5ms
subqueryload	2	6ms

With remote database (5ms network latency):

Strategy	Queries	Time
Lazy loading (N+1)	101	520ms
joinedload	1	12ms
selectinload	2	18ms
subqueryload	2	19ms

Network latency is the multiplier. On localhost the difference is 10x; across a network it’s 30-40x.

GraphQL and N+1

GraphQL APIs are especially prone to N+1 because clients choose which fields to request, and each field resolver may trigger independent database queries.

The standard solution is DataLoader — a batching utility that collects all IDs requested in a single event loop tick and fetches them in one query:

from aiodataloader import DataLoader

async def batch_load_customers(customer_ids):
    customers = await fetch_customers_by_ids(customer_ids)
    return [customers.get(cid) for cid in customer_ids]

customer_loader = DataLoader(batch_load_customers)

# Resolvers use the loader instead of direct queries
async def resolve_customer(order, info):
    return await customer_loader.load(order.customer_id)

Even with 100 orders resolving their customers simultaneously, DataLoader collapses it to a single WHERE id IN (...) query.

One thing to remember: The cost of N+1 is N × round_trip_latency. On localhost it’s annoying; across a network it’s catastrophic. Make relationship loading decisions explicit — use lazy="raise" in SQLAlchemy or assertNumQueries in Django tests to catch N+1 before production.

pythondatabasesperformance