Python API Design Principles — Deep Dive
API Surface Definition
By convention, your public API is every name importable without a leading underscore. Do not rely on convention alone; define the surface explicitly in the package's __init__.py:
# my_package/__init__.py
from .client import Client
from .config import Config
from .exceptions import APIError, NotFoundError
__all__ = ["Client", "Config", "APIError", "NotFoundError"]
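Everything not listed in __all__ is an implementation detail, so changing it can ship in a patch release instead of a major one. Keep in mind that __all__ only controls what `from my_package import *` exports and what linters and documentation tools treat as public; it does not stop anyone from importing other names, which is why the underscore convention still matters. This boundary is the foundation of sustainable API evolution.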
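One way to keep the declared surface honest is a small check that compares __all__ with the names a module actually exposes. The sketch below is only an illustration: `undeclared_public_names` is a hypothetical helper, and the package is built in memory with `types.ModuleType` purely so the example runs on its own.

```python
import types


def undeclared_public_names(module: types.ModuleType) -> set[str]:
    """Return non-underscore names the module exposes but __all__ omits.

    Hypothetical helper for illustration. Imported submodules are skipped
    because they are rarely meant to be part of the public surface.
    """
    declared = set(getattr(module, "__all__", ()))
    exposed = {
        name
        for name, value in vars(module).items()
        if not name.startswith("_") and not isinstance(value, types.ModuleType)
    }
    return exposed - declared


# Stand-in for my_package so the sketch is self-contained.
fake_package = types.ModuleType("my_package")
fake_package.Client = type("Client", (), {})
fake_package.Config = type("Config", (), {})
fake_package.helper = lambda: None   # leaked: public-looking name, not in __all__
fake_package._private = object()     # fine: underscore-prefixed
fake_package.__all__ = ["Client", "Config"]

leaked = undeclared_public_names(fake_package)
print(leaked)
```

Run against a real package in a test suite, a non-empty result suggests either adding the name to __all__ or renaming it with a leading underscore.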
The _internal Convention
For private modules that other internal modules need:
my_package/
├── __init__.py          # public surface
├── client.py            # public
├── config.py            # public
├── _serialization.py    # private module
└── _internal/
    ├── cache.py         # private
    └── connection.py    # private
Document this boundary in your contributing guide: “Anything in _internal/ or prefixed with _ may change without notice.”
Protocol-Based Extensibility
Designing for Duck Typing
Instead of requiring users to subclass your base class, define protocols:
from typing import Protocol, runtime_checkable

@runtime_checkable
class Storage(Protocol):
    def save(self, key: str, data: bytes) -> None: ...
    def load(self, key: str) -> bytes | None: ...
    def delete(self, key: str) -> bool: ...

class FileCache:
    """Works with any Storage implementation; no inheritance needed."""
    def __init__(self, storage: Storage):
        self._storage = storage
Users can implement Storage with any class that has the right methods. No import from your library required in their implementation — true loose coupling.
Why Protocols Beat ABCs for Public APIs
Abstract base classes create a hard dependency: users must import and subclass your ABC. Protocols use structural typing — if the methods match, the type matches. This is more Pythonic and easier for users who come from different codebases.
Reserve ABCs for when you need __init_subclass__ hooks, required attribute registration, or when you provide default method implementations.
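As a rough illustration of the cases where an ABC earns its keep, the sketch below combines an `__init_subclass__` registration hook with a default method implementation. `Exporter`, `CsvExporter`, and the `format_name` keyword are hypothetical names invented for this example.

```python
from abc import ABC, abstractmethod


class Exporter(ABC):
    """Hypothetical base class with a subclass hook and a shared default method."""

    registry: dict[str, type] = {}

    def __init_subclass__(cls, *, format_name: str, **kwargs) -> None:
        # Hook: every subclass registers itself under its format name.
        super().__init_subclass__(**kwargs)
        Exporter.registry[format_name] = cls

    @abstractmethod
    def render(self, rows: list[dict]) -> str:
        """Subclasses must implement this."""

    def export(self, rows: list[dict]) -> bytes:
        # Default implementation inherited by every subclass.
        return self.render(rows).encode("utf-8")


class CsvExporter(Exporter, format_name="csv"):
    def render(self, rows: list[dict]) -> str:
        return "\n".join(",".join(str(v) for v in row.values()) for row in rows)


# Look the class up by name, then use the inherited export() method.
payload = Exporter.registry["csv"]().export([{"id": 1, "name": "Alice"}])
print(payload)
```

The cost is the one described above: `CsvExporter` has to import and subclass `Exporter`, which a protocol would not require.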
Method Chaining and Fluent Interfaces
The Builder Pattern
For objects with many configuration options:
# Illustration only: real code should use parameterized queries instead of
# interpolating strings into SQL.
class QueryBuilder:
    def __init__(self, table: str):
        self._table = table
        self._conditions: list[str] = []
        self._limit: int | None = None
        self._order: str | None = None

    def where(self, condition: str) -> "QueryBuilder":
        self._conditions.append(condition)
        return self  # returning self is what makes chaining possible

    def limit(self, n: int) -> "QueryBuilder":
        self._limit = n
        return self

    def order_by(self, column: str) -> "QueryBuilder":
        self._order = column
        return self

    def build(self) -> str:
        query = f"SELECT * FROM {self._table}"
        if self._conditions:
            query += " WHERE " + " AND ".join(self._conditions)
        if self._order is not None:
            query += f" ORDER BY {self._order}"
        # Compare with None so that limit(0) is not silently dropped
        if self._limit is not None:
            query += f" LIMIT {self._limit}"
        return query

# Usage reads like a sentence
query = (
    QueryBuilder("users")
    .where("active = true")
    .where("role = 'admin'")
    .order_by("created_at")
    .limit(10)
    .build()
)
When Chaining Hurts
Method chaining makes debugging harder — you cannot set breakpoints mid-chain or inspect intermediate results easily. Use it for configuration (build-time) but not for operations with side effects (run-time).
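Because each builder method returns the same object, a chain can always be unrolled into separate statements when you need to debug it. The sketch below uses a trimmed-down stand-in for the QueryBuilder shown earlier (only `where` and `build`) to compare the two forms; `after_first_step` is a hypothetical variable used for inspection.

```python
class QueryBuilder:
    """Trimmed-down stand-in for the earlier builder, kept short for the example."""

    def __init__(self, table: str) -> None:
        self._table = table
        self._conditions: list[str] = []

    def where(self, condition: str) -> "QueryBuilder":
        self._conditions.append(condition)
        return self

    def build(self) -> str:
        query = f"SELECT * FROM {self._table}"
        if self._conditions:
            query += " WHERE " + " AND ".join(self._conditions)
        return query


# Chained form: compact, but no intermediate value has a name.
chained = QueryBuilder("users").where("active = true").where("role = 'admin'").build()

# Unrolled form: the same calls, one per statement, so a debugger can stop
# on any line and `builder` can be inspected after each step.
builder = QueryBuilder("users")
builder.where("active = true")
after_first_step = builder.build()
builder.where("role = 'admin'")
unrolled = builder.build()

assert chained == unrolled
```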
Error Hierarchy Design
Structure Your Exceptions
class MyPackageError(Exception):
    """Base exception for all my_package errors."""

class ConfigError(MyPackageError):
    """Invalid configuration."""

# Named NetworkError so it does not shadow Python's built-in ConnectionError
class NetworkError(MyPackageError):
    """Network connectivity issue."""

class APIError(MyPackageError):
    """Remote API returned an error."""
    def __init__(self, status_code: int, message: str):
        self.status_code = status_code
        self.message = message
        super().__init__(f"API error {status_code}: {message}")

class NotFoundError(APIError):
    """Resource not found (404)."""
    def __init__(self, resource: str):
        super().__init__(404, f"{resource} not found")
Users can catch broadly (except MyPackageError) or narrowly (except NotFoundError). The hierarchy lets them choose their error-handling granularity.
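A short sketch of what that choice looks like from the caller's side. The hierarchy is repeated in reduced form so the block is self-contained; `get_user` and `describe_failure` are hypothetical functions used only to exercise the handlers.

```python
class MyPackageError(Exception):
    """Base exception for all my_package errors."""


class APIError(MyPackageError):
    def __init__(self, status_code: int, message: str):
        self.status_code = status_code
        self.message = message
        super().__init__(f"API error {status_code}: {message}")


class NotFoundError(APIError):
    def __init__(self, resource: str):
        super().__init__(404, f"{resource} not found")


def get_user(user_id: int) -> dict:
    # Hypothetical library call that always fails, to exercise the handlers.
    raise NotFoundError(f"user {user_id}")


def describe_failure(user_id: int) -> str:
    try:
        get_user(user_id)
    except NotFoundError as exc:    # narrow: the one case we can recover from
        return f"missing ({exc.status_code})"
    except MyPackageError as exc:   # broad: anything else the library raises
        return f"library failure: {exc}"
    return "ok"


print(describe_failure(42))
```

The narrow handler has to come first; Python takes the first matching `except` clause, so listing the base class first would hide the more specific one.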
Error Messages as Documentation
Error messages are part of your API. They should include:
- What happened
- What was expected
- How to fix it
# Bad
raise ValueError("invalid input")

# Good
raise ValueError(
    f"Expected ISO date string (YYYY-MM-DD), got {value!r}. "
    "Example: '2026-03-28'"
)
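Placed inside a real function, the "good" message might look like the sketch below. `parse_iso_date` is a hypothetical validator; it uses `datetime.strptime` with the `%Y-%m-%d` format, which is one reasonable way to check the input, not the only one.

```python
from datetime import date, datetime


def parse_iso_date(value: str) -> date:
    """Hypothetical validator whose error says what happened, what was expected, and how to fix it."""
    if not isinstance(value, str):
        raise TypeError(f"Expected str, got {type(value).__name__}")
    try:
        return datetime.strptime(value, "%Y-%m-%d").date()
    except ValueError:
        # `from None` hides strptime's low-level message in favour of ours.
        raise ValueError(
            f"Expected ISO date string (YYYY-MM-DD), got {value!r}. "
            "Example: '2026-03-28'"
        ) from None


try:
    parse_iso_date("28/03/2026")
except ValueError as exc:
    message = str(exc)
    print(message)
```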
Backward-Compatible API Evolution
Adding Parameters
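Adding an optional parameter with a default is backward-compatible for callers, provided it is appended after the existing parameters (or made keyword-only) so that existing positional calls keep their meaning: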
# v1.0
def fetch(url: str, timeout: int = 30) -> Response: ...
# v1.1 — added retry, existing code still works
def fetch(url: str, timeout: int = 30, retry: int = 0) -> Response: ...
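A slightly more defensive variant makes the new parameter keyword-only. The sketch below is an assumption about how you might do that; the `fetch_v1_0` and `fetch_v1_1` names and the dict return values are stand-ins so both versions can sit side by side and run without a real `Response` type.

```python
def fetch_v1_0(url: str, timeout: int = 30) -> dict:
    # Stand-in for the v1.0 signature; returns its arguments for inspection.
    return {"url": url, "timeout": timeout}


# The bare * makes retry keyword-only, so it can never be filled by accident
# from a positional argument.
def fetch_v1_1(url: str, timeout: int = 30, *, retry: int = 0) -> dict:
    return {"url": url, "timeout": timeout, "retry": retry}


# Calls written against v1.0 behave the same under v1.1.
old_style = fetch_v1_1("https://api.example.com", 5)
new_style = fetch_v1_1("https://api.example.com", 5, retry=3)
print(old_style, new_style)
```

Keyword-only parameters also leave you free to reorder or insert further options later without changing the meaning of existing calls.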
Renaming Parameters
Use **kwargs temporarily to support both old and new names:
import warnings

def fetch(url: str, *, timeout: int = 30, **kwargs) -> Response:
    if "max_wait" in kwargs:
        warnings.warn(
            "max_wait is deprecated, use timeout instead",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller, not to fetch()
        )
        timeout = kwargs.pop("max_wait")
    if kwargs:
        # Reject anything else so that typos do not pass silently
        raise TypeError(f"Unexpected keyword arguments: {sorted(kwargs)}")
    ...
Remove max_wait support in the next major version.
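Until then, it is worth testing that the old name still works and still warns. The sketch below is one way to do that with the standard `warnings` module; this `fetch` is a stand-in that returns the effective timeout instead of a real response, which is an assumption made so the behavior can be checked.

```python
import warnings


def fetch(url: str, *, timeout: int = 30, **kwargs) -> int:
    # Stand-in for the real function: returns the timeout it would use.
    if "max_wait" in kwargs:
        warnings.warn(
            "max_wait is deprecated, use timeout instead",
            DeprecationWarning,
            stacklevel=2,
        )
        timeout = kwargs.pop("max_wait")
    if kwargs:
        raise TypeError(f"Unexpected keyword arguments: {sorted(kwargs)}")
    return timeout


with warnings.catch_warnings(record=True) as caught:
    # Make sure the warning is recorded even if a filter would hide it.
    warnings.simplefilter("always")
    effective = fetch("https://api.example.com", max_wait=5)

assert effective == 5
assert len(caught) == 1
assert issubclass(caught[0].category, DeprecationWarning)
```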
Changing Return Types
This is almost always breaking. Strategies:
- Extend, don’t replace: add attributes to the return object instead of changing its type
- New method: get_user_v2() returns the new type while get_user() continues working
- Feature flag: get_user(detailed=True) returns an extended type (but this can be confusing)
The safest approach: return a rich object from the start. A dataclass with a few attributes today can grow to many attributes without breaking callers who only use the original fields.
Designing for Testability
Dependency Injection
Accept dependencies as parameters instead of hardcoding them:
# Hard to test: tightly coupled to requests
import requests

class UserService:
    def get_user(self, user_id: int):
        response = requests.get(f"https://api.example.com/users/{user_id}")
        return response.json()

# Easy to test: inject the HTTP client
class UserService:
    def __init__(self, http_client: HttpClient):
        self._http = http_client

    def get_user(self, user_id: int):
        response = self._http.get(f"/users/{user_id}")
        return response.json()
Tests inject a mock HttpClient. Production injects the real one. The API surface stays clean.
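Here is one way such a test could look. The `HttpClient` protocol is an assumption about the shape of the injected client (the text does not define it), and `FakeHttpClient` and `FakeResponse` are hypothetical test doubles.

```python
from typing import Any, Protocol


class HttpClient(Protocol):
    # Assumed shape of the injected client: get(path) returns an object
    # that has a .json() method.
    def get(self, path: str) -> Any: ...


class UserService:
    def __init__(self, http_client: HttpClient):
        self._http = http_client

    def get_user(self, user_id: int):
        response = self._http.get(f"/users/{user_id}")
        return response.json()


class FakeResponse:
    def __init__(self, payload: dict) -> None:
        self._payload = payload

    def json(self) -> dict:
        return self._payload


class FakeHttpClient:
    """Test double: records requested paths and returns canned data."""

    def __init__(self) -> None:
        self.requested: list[str] = []

    def get(self, path: str) -> FakeResponse:
        self.requested.append(path)
        return FakeResponse({"id": 7, "name": "Alice"})


fake = FakeHttpClient()
user = UserService(fake).get_user(7)

assert user == {"id": 7, "name": "Alice"}
assert fake.requested == ["/users/7"]
```

No network access and no patching are involved; the test checks both the returned data and the path the service asked for.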
Avoiding Global State
Global state (module-level variables, singletons) makes APIs unpredictable in tests:
# Bad: global state shared across tests
_cache = {}

def get_cached(key):
    return _cache.get(key)

# Better: instance state, isolated per test
class Cache:
    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)
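The payoff shows up as soon as two tests touch the cache. In the sketch below each test builds its own instance, so neither depends on the order in which they run; the `set` method is a hypothetical addition so the example can populate the cache.

```python
class Cache:
    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value):
        # Hypothetical setter, added so the example can populate the cache.
        self._store[key] = value


def test_set_then_get():
    cache = Cache()
    cache.set("token", "abc")
    assert cache.get("token") == "abc"


def test_starts_empty():
    # Passes whether or not test_set_then_get ran first, because the
    # two tests never share a Cache instance.
    cache = Cache()
    assert cache.get("token") is None


test_set_then_get()
test_starts_empty()
```

With the module-level `_cache` dict, the second test would pass or fail depending on which test ran before it.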
API Usability Testing
The “Five-Minute Test”
Give a colleague your API without documentation. Can they accomplish a basic task in five minutes using only IDE autocomplete and type hints? If not, the API is too complex or poorly named.
The “Wrong Way” Test
Try to use your API incorrectly. What happens? Good APIs:
- Raise TypeError for wrong argument types (use type hints + runtime checks)
- Raise ValueError for invalid values with actionable messages
- Never silently produce wrong results
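These three rules can be sketched in a single function. `set_timeout` is a hypothetical example, and the exact checks are assumptions about what "wrong" means for a timeout value.

```python
def set_timeout(seconds: float) -> float:
    """Hypothetical setter that validates the type first, then the value."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(seconds, bool) or not isinstance(seconds, (int, float)):
        raise TypeError(
            f"seconds must be int or float, got {type(seconds).__name__}"
        )
    # `not seconds > 0` also rejects NaN, which `seconds <= 0` would let through.
    if not seconds > 0:
        raise ValueError(
            f"seconds must be greater than 0, got {seconds!r}. "
            "Example: set_timeout(2.5)"
        )
    return float(seconds)


print(set_timeout(2))
```

Passing the string "30" raises TypeError instead of being coerced, and 0 or NaN raise ValueError instead of producing a timeout that never fires or fires immediately.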
Documenting with Examples First
Write usage examples before implementing the API. This technique (README-driven development) surfaces awkward interfaces before you write the code:
# Draft the README examples first:
# client = Client("https://api.example.com")
# users = client.users.list(active=True)
# user = client.users.create(name="Alice", email="alice@example.com")
# client.users.delete(user.id)
If the examples look clean, implement the API to match. If they look awkward, redesign before coding.
Real-World Case Study: requests vs urllib
| Aspect | requests | urllib |
|---|---|---|
| Simple GET | requests.get(url) | urllib.request.urlopen(url).read() |
| JSON parsing | r.json() | json.loads(urllib.request.urlopen(url).read()) |
| POST with data | requests.post(url, json=data) | 6+ lines with Request, encoding, urlopen |
| Error handling | r.raise_for_status() | Check e.code in HTTPError handler |
| Auth | auth=("user", "pass") | HTTPBasicAuthHandler + build_opener |
requests became one of the most downloaded Python packages in large part because of its API design, which reflects the principles discussed here: progressive disclosure, consistency, fail-fast, sensible defaults, and clear naming.
Tradeoffs
- Explicitness vs brevity: keyword-only arguments are clearer but more verbose for simple calls
- Protocols vs ABCs: protocols are flexible but lose default implementations and __init_subclass__ hooks
- Rich return types vs primitives: domain objects are clearer but add types to learn
- Strict validation vs permissiveness: catching errors early is good, but overly strict APIs frustrate power users who know what they are doing
The one thing to remember: Design your API as if the user has never read your docs — make the right path obvious through naming, defaults, and type hints, and make the wrong path fail loudly with helpful error messages.