Builder Pattern — Deep Dive
Anatomy of a Pythonic builder
The classic Gang of Four builder involves a Director, an abstract Builder interface, and concrete builders. In Python, this ceremony is rarely needed. Instead, we lean on method chaining, dataclasses for the product, and a single builder class.
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Query:
    table: str
    columns: list[str]
    filters: list[str] = field(default_factory=list)
    order_by: str | None = None
    limit: int | None = None


class QueryBuilder:
    def __init__(self, table: str):
        self._table = table
        self._columns: list[str] = []
        self._filters: list[str] = []
        self._order_by: str | None = None
        self._limit: int | None = None

    def select(self, *columns: str) -> "QueryBuilder":
        self._columns.extend(columns)
        return self

    def where(self, condition: str) -> "QueryBuilder":
        self._filters.append(condition)
        return self

    def order(self, column: str) -> "QueryBuilder":
        self._order_by = column
        return self

    def cap(self, n: int) -> "QueryBuilder":
        self._limit = n
        return self

    def build(self) -> Query:
        if not self._columns:
            raise ValueError("At least one column required")
        return Query(
            table=self._table,
            columns=list(self._columns),
            filters=list(self._filters),
            order_by=self._order_by,
            limit=self._limit,
        )
Usage reads almost like a sentence:
query = (
    QueryBuilder("users")
    .select("name", "email")
    .where("active = true")
    .order("created_at")
    .cap(50)
    .build()
)
Key design decisions
Immutable products
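The product (Query) is a frozen dataclass. Once built, its attributes can’t be reassigned. This prevents bugs where someone modifies a query object after construction, invalidating assumptions downstream. Note that frozen only blocks attribute assignment: the list fields themselves can still be mutated in place, so use tuples if the contents must be locked down too. The builder is the only piece meant to be mutable, and it’s typically discarded after .build().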
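A minimal sketch of what a frozen product buys you. FrozenQuery is a hypothetical stand-in for the Query class above, and it stores columns as a tuple on the assumption that the contents should be immutable as well:

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class FrozenQuery:  # hypothetical stand-in for the article's Query
    table: str
    columns: tuple[str, ...] = ()  # a tuple, so the contents cannot change either


query = FrozenQuery(table="users", columns=("name", "email"))

try:
    query.table = "orders"  # attribute reassignment is rejected
except FrozenInstanceError:
    reassigned = False
else:
    reassigned = True
```

After this runs, reassigned is False and query.table is still "users". Had columns been a list, query.columns.append(...) would still have succeeded, which is why the tuple matters.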
Defensive copies
Notice the list(self._columns) in build(). Without this, the product would share the builder’s internal list. If someone reuses the builder, the previously built object could change. Always copy mutable state into the product.
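The aliasing problem is easy to reproduce. In this sketch, LeakyBuilder and CopyingBuilder are hypothetical classes that differ only in whether build() copies the internal list:

```python
class LeakyBuilder:
    """Hypothetical builder that hands out its internal list directly."""

    def __init__(self) -> None:
        self._columns: list[str] = []

    def select(self, *columns: str) -> "LeakyBuilder":
        self._columns.extend(columns)
        return self

    def build(self) -> list[str]:
        return self._columns  # same list object the builder keeps using


class CopyingBuilder(LeakyBuilder):
    """Same builder, but build() returns an independent snapshot."""

    def build(self) -> list[str]:
        return list(self._columns)


leaky = LeakyBuilder().select("name")
first = leaky.build()
leaky.select("email")  # reusing the builder silently changes `first`

safe = CopyingBuilder().select("name")
snapshot = safe.build()
safe.select("email")  # `snapshot` is unaffected
```

Here first ends up as ["name", "email"] even though it was built with a single column, while snapshot stays ["name"].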
Validation in build()
Cross-field validation belongs in build(), not in individual setter methods, because constraints often span multiple fields. A report builder might require that if group_by is set, at least one aggregation column must also be present. You can only check that once all fields are accumulated. Checks that involve a single field (a negative limit, for example) can still fail fast in the setter.
def build(self) -> Report:
    if self._group_by and not self._aggregations:
        raise ValueError("group_by requires at least one aggregation")
    return Report(...)
Step builder for enforced ordering
Sometimes construction steps have a required order. A step builder uses different return types to guide the user through a sequence:
class ConnectionBuilder:
    def __init__(self):
        self._host: str | None = None
        self._port: int = 5432

    def host(self, h: str) -> "ConnectionBuilderWithHost":
        self._host = h
        return ConnectionBuilderWithHost(self)


class ConnectionBuilderWithHost:
    def __init__(self, parent: ConnectionBuilder):
        self._parent = parent

    def port(self, p: int) -> "ConnectionBuilderWithHost":
        self._parent._port = p
        return self

    def database(self, db: str) -> "ConnectionBuilderReady":
        return ConnectionBuilderReady(self._parent, db)


class ConnectionBuilderReady:
    def __init__(self, parent: ConnectionBuilder, db: str):
        self._parent = parent
        self._db = db

    def build(self) -> dict:
        return {
            "host": self._parent._host,
            "port": self._parent._port,
            "database": self._db,
        }
Type checkers like mypy will flag attempts to call .build() before .host() and .database() are called, because the earlier stages simply have no build() method. This turns what would be runtime errors into static analysis catches.
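A condensed variant of the same idea, with usage. The stage names NeedsHost, NeedsDatabase, and Ready are hypothetical, and each stage here carries its own state instead of pointing back at a parent, which is one possible design and not the only one:

```python
class NeedsHost:
    """First stage: the only thing you can do is supply a host."""

    def host(self, h: str) -> "NeedsDatabase":
        return NeedsDatabase(h)


class NeedsDatabase:
    """Second stage: port is optional, database moves you forward."""

    def __init__(self, host: str, port: int = 5432) -> None:
        self._host = host
        self._port = port

    def port(self, p: int) -> "NeedsDatabase":
        return NeedsDatabase(self._host, p)

    def database(self, db: str) -> "Ready":
        return Ready(self._host, self._port, db)


class Ready:
    """Final stage: the only one that exposes build()."""

    def __init__(self, host: str, port: int, db: str) -> None:
        self._host = host
        self._port = port
        self._db = db

    def build(self) -> dict[str, object]:
        return {"host": self._host, "port": self._port, "database": self._db}


config = NeedsHost().host("localhost").port(5433).database("app").build()

# Earlier stages have no build() at all, so skipping steps fails
# under a type checker and also at runtime (AttributeError).
first_stage_can_build = hasattr(NeedsHost(), "build")
```

The enforcement comes purely from which methods each class defines, so it works even where no type checker is run: first_stage_can_build is False.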
Builder with __init_subclass__
For frameworks that register builders dynamically:
class BaseBuilder:
    _registry: dict[str, type] = {}

    def __init_subclass__(cls, format_name: str = "", **kwargs):
        super().__init_subclass__(**kwargs)
        if format_name:
            BaseBuilder._registry[format_name] = cls

    @classmethod
    def for_format(cls, name: str) -> "BaseBuilder":
        return cls._registry[name]()


class CSVBuilder(BaseBuilder, format_name="csv"):
    def build(self) -> str:
        return "csv-output"


class JSONBuilder(BaseBuilder, format_name="json"):
    def build(self) -> str:
        return "json-output"


builder = BaseBuilder.for_format("csv")
This combines builder and factory registration cleanly.
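One gap in the registry as written is that an unknown format name surfaces as a bare KeyError. The sketch below repeats the registry and adds a friendlier lookup; the error message and the choice of ValueError are assumptions, not part of the original design:

```python
class BaseBuilder:
    _registry: dict[str, type] = {}

    def __init_subclass__(cls, format_name: str = "", **kwargs) -> None:
        super().__init_subclass__(**kwargs)
        if format_name:
            # Runs once per subclass, at class-definition time.
            BaseBuilder._registry[format_name] = cls

    @classmethod
    def for_format(cls, name: str) -> "BaseBuilder":
        try:
            return cls._registry[name]()
        except KeyError:
            known = ", ".join(sorted(cls._registry))
            raise ValueError(
                f"Unknown format {name!r}; known formats: {known}"
            ) from None

    def build(self) -> str:
        raise NotImplementedError


class CSVBuilder(BaseBuilder, format_name="csv"):
    def build(self) -> str:
        return "csv-output"


class JSONBuilder(BaseBuilder, format_name="json"):
    def build(self) -> str:
        return "json-output"


output = BaseBuilder.for_format("json").build()
```

Registration happens as a side effect of defining the subclass, so a builder is only available once its module has been imported.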
Real-world examples
Django’s QuerySet
Django’s ORM is one of the most famous builders in Python. Each method (.filter(), .exclude(), .order_by(), .values()) returns a new QuerySet rather than modifying the one it was called on, and nothing hits the database until the QuerySet is evaluated. This is a lazy builder pattern.
Requests library’s PreparedRequest
The requests library lets you build a request in stages: construct a Request(), call .prepare() to get a PreparedRequest, then send it through a Session. This separates configuration from execution.
SQLAlchemy’s select()
SQLAlchemy 2.0 uses a builder-style API: select(User).where(User.active == True).order_by(User.name).
Tradeoffs
Advantages:
- Readable construction of complex objects
- Static safety with step builders and type checkers (Python has no compile step, so the checks come from tools like mypy)
- Easy to add new optional parameters without breaking existing code
- Natural fit for test fixtures and configuration objects
Disadvantages:
- More classes and code than a simple constructor
- Mutable builder state can cause thread-safety issues if shared
- Over-engineering risk for simple objects — a dataclass with defaults is usually enough
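For comparison, this is roughly what the "dataclass with defaults" alternative looks like when the fields are independent and need no cross-field validation. RetryPolicy and its fields are hypothetical:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RetryPolicy:  # hypothetical configuration object
    attempts: int = 3
    delay_seconds: float = 0.5
    backoff: float = 2.0


default_policy = RetryPolicy()

# Keyword arguments already give named, optional parameters.
fast_policy = RetryPolicy(delay_seconds=0.1)

# replace() derives a variant without mutating the original.
patient_policy = replace(default_policy, attempts=10)
```

If this covers the use case, a separate builder class adds code without adding safety.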
When to refactor toward a builder
Watch for these signals:
- __init__ has more than 5 parameters
- Multiple boolean flags that interact with each other
- Test code repeatedly constructs the same object with small variations
- Code review comments like “what does the third argument mean?”
- Default values depend on other parameters (conditional defaults)
If two or more of these appear, a builder will likely improve clarity.
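Conditional defaults are a good illustration of the last signal. In this sketch, EndpointBuilder is hypothetical, and the rule that the default port follows the scheme is an assumption chosen for the example:

```python
class EndpointBuilder:  # hypothetical
    def __init__(self, host: str) -> None:
        self._host = host
        self._secure = False
        self._port: int | None = None  # None means "derive it at build time"

    def secure(self) -> "EndpointBuilder":
        self._secure = True
        return self

    def port(self, p: int) -> "EndpointBuilder":
        self._port = p
        return self

    def build(self) -> str:
        scheme = "https" if self._secure else "http"
        # The default depends on another field, so it is resolved here,
        # after all the calls have been made, not in __init__.
        default_port = 443 if self._secure else 80
        port = self._port if self._port is not None else default_port
        return f"{scheme}://{self._host}:{port}"


plain = EndpointBuilder("example.com").build()
tls = EndpointBuilder("example.com").secure().build()
custom = EndpointBuilder("example.com").secure().port(8443).build()
```

Expressing the same rule in a constructor signature usually means a sentinel default plus fix-up logic in __init__; the builder keeps that logic in one place.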
Testing with builders
Builders shine in test suites. A test builder lets you create objects with sensible defaults and override only what the specific test cares about:
class UserBuilder:
    def __init__(self):
        self._name = "Test User"
        self._email = "test@example.com"
        self._active = True

    def name(self, n: str) -> "UserBuilder":
        self._name = n
        return self

    def inactive(self) -> "UserBuilder":
        self._active = False
        return self

    def build(self) -> dict:
        return {"name": self._name, "email": self._email, "active": self._active}


# Tests read clearly
active_user = UserBuilder().build()
inactive_user = UserBuilder().inactive().build()
named_user = UserBuilder().name("Alice").build()
The one thing to remember: A well-designed builder separates the messy process of configuring a complex object from the clean, validated result — making both construction and the final product easier to trust.