Database Pool Sizing — Core Concepts

Why pool sizing matters

A connection pool pre-creates and reuses database connections instead of opening and closing them for every query. This avoids the overhead of TCP handshakes, authentication, and SSL negotiation — which can take 20–100ms each. But the number of connections in that pool has a dramatic effect on both performance and stability.

Instagram’s engineering team famously discovered that reducing their PostgreSQL connection count actually improved throughput. The reason: fewer connections meant less context-switching overhead on the database server.

Key concepts

Pool minimum and maximum

Most pool implementations have two critical settings:

  • pool_size (min): Connections kept alive even when idle. These are ready immediately when a request arrives.
  • max_overflow: Additional connections created under load, closed when demand drops.

In SQLAlchemy, for example, the total maximum is pool_size + max_overflow.

The bottleneck isn’t where you think

Applications often default to large pool sizes (50, 100, or more), assuming more connections means more throughput. The actual bottleneck is usually the database server’s ability to handle concurrent work:

  • CPU-bound queries: Each active query needs CPU time. With 8 cores, more than 8–12 truly concurrent queries often cause CPU contention.
  • I/O-bound queries: Queries waiting on disk reads can overlap more, but disk throughput has limits too.
  • Lock contention: More concurrent transactions means more potential for row-level or table-level locks to collide.

The PostgreSQL rule of thumb

The PostgreSQL wiki suggests a starting formula:

connections = (core_count × 2) + effective_spindle_count

For a server with 4 CPU cores and an SSD (1 effective spindle), that’s (4 × 2) + 1 = 9. This seems surprisingly low, but testing consistently shows it’s near-optimal for most workloads.

How it works in practice

When your Python web app receives a request:

  1. The handler asks the pool for a connection
  2. If an idle connection exists, it’s handed over instantly
  3. If all connections are busy but max isn’t reached, a new one is created
  4. If max is reached, the request waits (with a configurable timeout)
  5. When the handler finishes, the connection returns to the pool

Step 4 is where problems surface. If your pool is too small for your traffic, requests queue up and response times spike. If it’s too large, the database itself slows down under the weight of concurrent work.

Common misconception

“My app has 100 concurrent users, so I need 100 connections.”

This confuses concurrent users with concurrent queries. A typical web request might hold a database connection for 5–50ms out of a 200ms total response time. That means each connection can serve 4–40 requests per second. With a pool of 10 connections, you can often handle hundreds of concurrent users.

Sizing for multiple application instances

If you run 4 copies of your app (common in production behind a load balancer), each with a pool of 10, your database sees 40 connections. This is why pool sizing must account for the total across all instances, not just one.

Tools like PgBouncer sit between your apps and the database, multiplexing many application connections into fewer database connections. This lets each app have a generous local pool while keeping database-side connections tight.

One thing to remember: Start with a small pool (5–10 per instance), measure query latency and queue wait times under realistic load, then adjust. The right size is found through testing, not guessing.

pythondatabasesperformance

See Also

  • Python Aioredis Understand Aioredis through a practical analogy so your Python decisions become faster and clearer.
  • Python Alembic Understand Alembic through a practical analogy so your Python decisions become faster and clearer.
  • Python Asyncpg Database asyncpg is the fastest way for Python to talk to PostgreSQL without making your program sit around waiting.
  • Python Asyncpg Understand Asyncpg through a practical analogy so your Python decisions become faster and clearer.
  • Python Cassandra Python Understand Cassandra Python through a practical analogy so your Python decisions become faster and clearer.