Python Pagination Patterns — Deep Dive

Offset Pagination Implementation

FastAPI with SQLAlchemy

from fastapi import FastAPI, Query
from sqlalchemy import select, func
from sqlalchemy.ext.asyncio import AsyncSession

app = FastAPI()

@app.get("/users")
async def list_users(
    page: int = Query(1, ge=1),
    per_page: int = Query(20, ge=1, le=100),
    db: AsyncSession = Depends(get_db),
):
    offset = (page - 1) * per_page

    # Count total (expensive on large tables)
    total = await db.scalar(select(func.count()).select_from(User))

    # Fetch page
    query = select(User).order_by(User.id).offset(offset).limit(per_page)
    result = await db.execute(query)
    users = result.scalars().all()

    return {
        "data": [user.to_dict() for user in users],
        "pagination": {
            "total": total,
            "page": page,
            "per_page": per_page,
            "pages": (total + per_page - 1) // per_page,
        },
    }

The COUNT(*) query is the hidden cost. On PostgreSQL with millions of rows, this alone can take hundreds of milliseconds. Consider caching the count or returning an approximate count using reltuples from pg_stat_user_tables.

Django Paginator

Django has built-in offset pagination:

from django.core.paginator import Paginator
from django.http import JsonResponse

def user_list(request):
    page_num = int(request.GET.get("page", 1))
    per_page = min(int(request.GET.get("per_page", 20)), 100)

    users = User.objects.all().order_by("id")
    paginator = Paginator(users, per_page)
    page = paginator.get_page(page_num)

    return JsonResponse({
        "data": list(page.object_list.values("id", "name", "email")),
        "pagination": {
            "total": paginator.count,
            "page": page.number,
            "per_page": per_page,
            "pages": paginator.num_pages,
            "has_next": page.has_next(),
            "has_prev": page.has_previous(),
        },
    })

Django’s Paginator wraps OFFSET/LIMIT and handles edge cases like out-of-range pages.

Cursor Pagination Implementation

Encoding and Decoding Cursors

import base64
import json
from datetime import datetime

def encode_cursor(data: dict) -> str:
    json_str = json.dumps(data, default=str)
    return base64.urlsafe_b64encode(json_str.encode()).decode()

def decode_cursor(cursor: str) -> dict:
    try:
        json_str = base64.urlsafe_b64decode(cursor.encode()).decode()
        return json.loads(json_str)
    except Exception:
        raise ValueError("Invalid cursor")

Use urlsafe_b64encode so cursors work in URL query parameters without escaping.

FastAPI Cursor Pagination

@app.get("/events")
async def list_events(
    cursor: str | None = None,
    limit: int = Query(20, ge=1, le=100),
    db: AsyncSession = Depends(get_db),
):
    query = select(Event).order_by(Event.created_at.desc(), Event.id.desc())

    if cursor:
        cursor_data = decode_cursor(cursor)
        # Seek past the cursor position
        query = query.where(
            (Event.created_at < cursor_data["created_at"]) |
            (
                (Event.created_at == cursor_data["created_at"]) &
                (Event.id < cursor_data["id"])
            )
        )

    # Fetch one extra to check for next page
    result = await db.execute(query.limit(limit + 1))
    events = result.scalars().all()

    has_next = len(events) > limit
    events = events[:limit]

    next_cursor = None
    if has_next and events:
        last = events[-1]
        next_cursor = encode_cursor({
            "created_at": last.created_at.isoformat(),
            "id": last.id,
        })

    return {
        "data": [e.to_dict() for e in events],
        "pagination": {
            "has_next": has_next,
            "next_cursor": next_cursor,
            "limit": limit,
        },
    }

The limit + 1 trick is critical: fetch one extra row to determine if there’s a next page without a separate count query.

Multi-Column Cursor Sort

Sorting by multiple columns requires compound cursor conditions. For ORDER BY created_at DESC, id DESC:

WHERE (created_at, id) < (:cursor_created_at, :cursor_id)

PostgreSQL supports tuple comparison natively. For databases that don’t, expand into the OR condition shown above.

Keyset Pagination

Keyset is cursor pagination with transparent, human-readable parameters:

@app.get("/products")
async def list_products(
    after_id: int | None = None,
    limit: int = Query(20, ge=1, le=100),
    db: AsyncSession = Depends(get_db),
):
    query = select(Product).order_by(Product.id)

    if after_id is not None:
        query = query.where(Product.id > after_id)

    result = await db.execute(query.limit(limit + 1))
    products = result.scalars().all()

    has_next = len(products) > limit
    products = products[:limit]

    return {
        "data": [p.to_dict() for p in products],
        "pagination": {
            "has_next": has_next,
            "next_after_id": products[-1].id if has_next else None,
        },
    }

Keyset works beautifully when you have a natural unique ordered key (like auto-increment IDs or timestamps with tie-breakers).

Performance Analysis

The critical difference shows up on deep pages:

Offset pagination generates: SELECT * FROM events ORDER BY id LIMIT 20 OFFSET 100000

The database engine scans 100,020 rows, discards 100,000, and returns 20. On a table with 10 million rows, this gets progressively slower.

Cursor pagination generates: SELECT * FROM events WHERE id > 100000 ORDER BY id LIMIT 20

With an index on id, this is an index seek directly to row 100,001, then reads 20 rows. Constant time regardless of position.

Benchmark on a PostgreSQL table with 5 million rows:

Page depthOffset timeCursor time
Page 12ms2ms
Page 1008ms2ms
Page 10,000450ms2ms
Page 100,0004,200ms2ms

The difference becomes dramatic beyond a few thousand pages.

Django REST Framework Integration

DRF provides built-in pagination classes:

# settings.py
REST_FRAMEWORK = {
    "DEFAULT_PAGINATION_CLASS": "rest_framework.pagination.CursorPagination",
    "PAGE_SIZE": 20,
}

# Or per-view
from rest_framework.pagination import CursorPagination

class EventPagination(CursorPagination):
    page_size = 20
    ordering = "-created_at"
    cursor_query_param = "cursor"

class EventViewSet(viewsets.ReadOnlyModelViewSet):
    queryset = Event.objects.all()
    serializer_class = EventSerializer
    pagination_class = EventPagination

DRF’s CursorPagination handles encoding, decoding, and reverse pagination automatically.

Edge Cases

Empty results: Always return a valid response structure even when there are no results. {"data": [], "pagination": {"has_next": false}} is correct.

Deleted items: If a user paginates through results and items are deleted between requests, cursor pagination handles this gracefully (it just picks up from the cursor position). Offset pagination may skip items or show duplicates.

Concurrent inserts: For real-time feeds where new items appear frequently, cursor pagination prevents the “shifting window” problem where offset-based pages show the same item twice because a new item pushed everything down.

Bidirectional cursors: To support “previous page” in cursor pagination, include both next_cursor and prev_cursor:

prev_cursor = None
if cursor:  # Not the first page
    first = events[0]
    prev_cursor = encode_cursor({
        "created_at": first.created_at.isoformat(),
        "id": first.id,
        "direction": "prev",
    })

The server uses the direction field to flip the sort and comparison operators.

Hybrid Approach

Some APIs use offset for the first N pages and switch to cursor for deeper results:

@app.get("/search")
async def search(
    q: str,
    page: int = Query(1, ge=1, le=100),  # Max 100 pages via offset
    cursor: str | None = None,
):
    if cursor:
        return await _cursor_search(q, cursor)
    if page <= 100:
        return await _offset_search(q, page)
    raise HTTPException(400, "Use cursor for pages beyond 100")

Google Search does something similar — you can’t access page 50 of results directly. This avoids the performance cliff of deep offset pagination while keeping the simplicity of page numbers for common use cases.

The one thing to remember: Implement cursor pagination from the start using the limit + 1 trick and compound sort keys — it’s barely more complex than offset but scales to any dataset size without performance degradation.

pythonwebapis

See Also

  • Python Aiohttp Client Understand Aiohttp Client through a practical analogy so your Python decisions become faster and clearer.
  • Python Api Client Design Why building your own API client in Python is like creating a TV remote that only has the buttons you actually need.
  • Python Api Documentation Swagger Swagger turns your Python API into an interactive playground where anyone can click buttons to try it out — no coding required.
  • Python Api Mocking Responses Why testing with fake API responses is like rehearsing a play with stand-ins before the real actors show up.
  • Python Api Pagination Clients Why APIs send data in pages, and how Python handles it — like reading a book one chapter at a time instead of swallowing the whole thing.