Database Migrations with Alembic — Core Concepts
Why database migrations exist
Application code evolves constantly — new features need new tables, columns get renamed, old data structures become obsolete. Without a structured migration system, database changes become a game of manual SQL scripts, Slack messages asking “did you run the ALTER TABLE?”, and production outages when someone forgets a step.
Alembic solves this by creating a linear (or branching) history of database schema changes that can be applied, rolled back, and tracked across every environment.
How Alembic works
The migration chain
Every Alembic migration file has a unique revision ID and a pointer to its parent:
revision: "a1b2c3d4"
down_revision: "e5f6g7h8"
This creates a linked list of changes. Alembic tracks which revision each database is currently at using a table called alembic_version. When you run alembic upgrade head, it finds the current revision, walks the chain forward, and applies each migration in order.
Auto-generation
Alembic’s killer feature is --autogenerate. It compares your SQLAlchemy models against the actual database schema and generates a migration file capturing the differences:
- New tables or columns →
CREATE TABLE/ADD COLUMN - Removed columns →
DROP COLUMN - Changed column types →
ALTER COLUMN - New indexes or constraints →
CREATE INDEX
It doesn’t catch everything — data migrations, renamed columns (it sees a drop + add), or changes to stored procedures require manual editing. But it handles 80% of structural changes automatically.
Upgrade and downgrade
Each migration file contains two functions:
- upgrade(): Apply the change (move forward)
- downgrade(): Reverse the change (move backward)
This bidirectional design means you can roll back a bad migration without restoring from backup. In practice, downgrades for production databases are tricky (you can’t un-delete data), but they’re invaluable in development.
Key concepts
The env.py file
Alembic’s env.py is the bridge between your project and the migration engine. It configures the database connection, imports your models (so autogenerate can compare them), and controls how migrations run. Most teams customize it once and rarely touch it again.
Branching and merging
When two developers create migrations from the same base revision, Alembic detects the branch and requires a merge migration before proceeding. This prevents conflicting changes from silently overwriting each other.
Offline mode
Alembic can generate SQL scripts without connecting to a database (alembic upgrade head --sql). This is essential for organizations where DBAs review and apply SQL manually, or where applications can’t connect directly to production databases.
Common misconception
“Alembic migrations are just for development — we apply changes manually in production.”
This approach falls apart at scale. Manual changes aren’t tracked, aren’t reversible, and can’t be reproduced. Companies like Yelp and Reddit use Alembic migrations in production deployment pipelines — migrations run automatically as part of the deploy process, with rollback procedures if they fail.
When Alembic shines vs. falls short
Alembic excels for projects using SQLAlchemy with PostgreSQL or MySQL. It integrates tightly with SQLAlchemy’s model definitions, making autogenerate genuinely useful.
It’s less ideal for projects using Django (which has its own migration system), NoSQL databases, or raw SQL without an ORM. It also doesn’t handle data migrations gracefully — moving data between columns or backfilling new fields requires writing manual Python code inside migration files.
One thing to remember: Alembic makes database changes reproducible, orderable, and reversible. Treat migration files like source code — review them, test them, and never edit them after they’ve run in production.
See Also
- Python Aioredis Understand Aioredis through a practical analogy so your Python decisions become faster and clearer.
- Python Alembic Understand Alembic through a practical analogy so your Python decisions become faster and clearer.
- Python Asyncpg Database asyncpg is the fastest way for Python to talk to PostgreSQL without making your program sit around waiting.
- Python Asyncpg Understand Asyncpg through a practical analogy so your Python decisions become faster and clearer.
- Python Cassandra Python Understand Cassandra Python through a practical analogy so your Python decisions become faster and clearer.