SQL Injection Prevention in Python — Deep Dive
Injection anatomy: beyond basic quotes
Simple string injection is well-understood, but attacks come in many forms:
Union-based injection extracts data from other tables:
-- Attacker input for a search field: ' UNION SELECT username, password FROM admin_users --
SELECT name, price FROM products WHERE name = '' UNION SELECT username, password FROM admin_users --'
The UNION combines results from the products table with the admin_users table. The -- comments out the rest of the original query.
Stacked queries execute entirely new statements:
-- Attacker input: '; DROP TABLE users; --
SELECT * FROM users WHERE id = ''; DROP TABLE users; --'
Not all databases/drivers support stacked queries (PostgreSQL does, MySQL with certain drivers does, SQLite does). Python’s psycopg2 supports them by default.
Blind injection extracts data without visible output, using boolean conditions or time delays:
-- Boolean-based: does the page change?
' AND (SELECT COUNT(*) FROM users WHERE username='admin' AND password LIKE 'a%') > 0 --
-- Time-based: does the response take longer?
' AND (SELECT CASE WHEN (1=1) THEN pg_sleep(5) ELSE pg_sleep(0) END) --
Parameterized queries across Python drivers
Each Python database library has its own parameterization syntax:
# psycopg2 (PostgreSQL)
cursor.execute(
"SELECT * FROM users WHERE email = %s AND status = %s",
(email, "active")
)
# psycopg2 with named parameters
cursor.execute(
"SELECT * FROM users WHERE email = %(email)s AND status = %(status)s",
{"email": email, "status": "active"}
)
# sqlite3
cursor.execute(
"SELECT * FROM users WHERE email = ? AND status = ?",
(email, "active")
)
# sqlite3 with named parameters
cursor.execute(
"SELECT * FROM users WHERE email = :email AND status = :status",
{"email": email, "status": "active"}
)
# asyncpg (PostgreSQL, async)
row = await conn.fetchrow(
"SELECT * FROM users WHERE email = $1 AND status = $2",
email, "active"
)
# mysql-connector-python
cursor.execute(
"SELECT * FROM users WHERE email = %s AND status = %s",
(email, "active")
)
The key principle is universal: the SQL structure is a static string; the data arrives through separate parameter channels.
ORM-level vulnerabilities
ORMs use parameterized queries internally, but careless code can introduce injection:
# SQLAlchemy — VULNERABLE uses of text()
from sqlalchemy import text
# DANGEROUS — f-string in text()
session.execute(text(f"SELECT * FROM users WHERE name = '{name}'"))
# SAFE — bound parameters with text()
session.execute(text("SELECT * FROM users WHERE name = :name"), {"name": name})
# Django — VULNERABLE raw SQL
User.objects.raw(f"SELECT * FROM auth_user WHERE username = '{username}'")
# SAFE — parameterized raw SQL
User.objects.raw("SELECT * FROM auth_user WHERE username = %s", [username])
# Django — VULNERABLE extra()
User.objects.extra(where=[f"username = '{username}'"])
# SAFE — parameterized extra()
User.objects.extra(where=["username = %s"], params=[username])
Django’s extra() is deprecated precisely because it’s easy to misuse. Use annotate(), filter(), or Subquery() instead.
Dynamic query building safely
Sometimes you need dynamic WHERE clauses, ORDER BY, or column selection. These require special handling:
# Dynamic WHERE clauses with SQLAlchemy
from sqlalchemy import and_, or_, column
from sqlalchemy.sql import select
def build_user_query(filters: dict):
stmt = select(users)
conditions = []
if "name" in filters:
conditions.append(users.c.name == filters["name"]) # Parameterized
if "min_age" in filters:
conditions.append(users.c.age >= filters["min_age"]) # Parameterized
if "status" in filters:
conditions.append(users.c.status.in_(filters["status"])) # Parameterized
if conditions:
stmt = stmt.where(and_(*conditions))
return stmt
For dynamic column names and sort orders:
ALLOWED_COLUMNS = {"name", "email", "created_at", "updated_at"}
ALLOWED_DIRECTIONS = {"asc", "desc"}
def safe_order_by(query, sort_col: str, direction: str = "asc"):
if sort_col not in ALLOWED_COLUMNS:
raise ValueError(f"Invalid column: {sort_col}")
if direction.lower() not in ALLOWED_DIRECTIONS:
raise ValueError(f"Invalid direction: {direction}")
# Use SQLAlchemy's column() for safe dynamic ordering
col = getattr(users.c, sort_col)
if direction == "desc":
col = col.desc()
return query.order_by(col)
Allowlists are mandatory for identifiers because parameterized queries only protect values.
Second-order injection
Second-order (stored) injection happens when malicious data is safely stored in the database but later concatenated into a query:
# Step 1: User registers with username "admin'--"
# Parameterized insert — safe, stored as-is
cursor.execute("INSERT INTO users (username) VALUES (%s)", ("admin'--",))
# Step 2: Later, a different part of the app uses the stored value unsafely
user = get_user_from_session() # Returns "admin'--"
cursor.execute(f"SELECT * FROM audit_log WHERE username = '{user.username}'")
# VULNERABLE — the stored value now causes injection
The fix: use parameterized queries everywhere, including internal queries that use data from the database. Treat stored data with the same caution as direct user input.
IN clause patterns
Dynamic IN clauses need special handling:
# psycopg2 — use tuple adaptation
user_ids = [1, 2, 3, 4, 5]
cursor.execute("SELECT * FROM users WHERE id IN %s", (tuple(user_ids),))
# SQLAlchemy
from sqlalchemy import select
stmt = select(users).where(users.c.id.in_(user_ids))
# sqlite3 — generate placeholders
placeholders = ",".join("?" * len(user_ids))
cursor.execute(f"SELECT * FROM users WHERE id IN ({placeholders})", user_ids)
# Django ORM — safe by default
User.objects.filter(id__in=user_ids)
Be cautious with very large IN lists (10,000+ items). They can cause performance issues and exceed query length limits. Consider using temporary tables or ANY with arrays in PostgreSQL.
Stored procedures: not a silver bullet
Stored procedures are sometimes cited as an injection defense. They help only if the procedure itself uses parameterized queries:
-- SAFE stored procedure
CREATE PROCEDURE get_user(p_username VARCHAR(100))
BEGIN
SELECT * FROM users WHERE username = p_username;
END;
-- VULNERABLE stored procedure (yes, this exists in the wild)
CREATE PROCEDURE search_users(p_search TEXT)
BEGIN
SET @query = CONCAT('SELECT * FROM users WHERE name LIKE ''%', p_search, '%''');
PREPARE stmt FROM @query;
EXECUTE stmt;
END;
Calling the first procedure from Python is safe. Calling the second is just as vulnerable as any string-concatenated query.
Automated detection tools
Bandit is a Python static analysis tool that catches common injection patterns:
pip install bandit
bandit -r myapp/ -t B608 # Check for SQL injection (hardcoded SQL expressions)
# Example findings:
# >> Issue: [B608:hardcoded_sql_expressions] Possible SQL injection
# >> Severity: Medium Confidence: Low
# >> Location: myapp/views.py:42
# >> 41 name = request.args.get('name')
# >> 42 cursor.execute(f"SELECT * FROM users WHERE name = '{name}'")
SQLMap tests running applications for injection vulnerabilities:
# Test a specific parameter
sqlmap -u "http://localhost:8000/search?q=test" --batch --level=3
# Test POST parameters
sqlmap -u "http://localhost:8000/login" --data="username=admin&password=test" --batch
Run SQLMap against staging environments only, never production. It sends thousands of probing requests.
Database-level defenses
Least privilege: Create dedicated database users for your application with minimal permissions:
-- PostgreSQL: app user can SELECT, INSERT, UPDATE on specific tables
CREATE ROLE webapp_user WITH LOGIN PASSWORD 'strong-password';
GRANT SELECT, INSERT, UPDATE ON users, orders, products TO webapp_user;
-- No DELETE, DROP, ALTER, GRANT permissions
Row-level security (PostgreSQL):
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
CREATE POLICY user_orders ON orders
USING (user_id = current_setting('app.current_user_id')::int);
This prevents one user from accessing another user’s orders at the database level, even if the application has a bug.
Monitoring and alerting
Log and alert on suspicious query patterns:
import logging
import re
logger = logging.getLogger("sql_monitor")
SUSPICIOUS_PATTERNS = [
r"UNION\s+SELECT",
r";\s*(DROP|DELETE|UPDATE|INSERT)",
r"--\s*$",
r"pg_sleep",
r"WAITFOR\s+DELAY",
r"BENCHMARK\s*\(",
]
def check_for_injection(param_value: str) -> bool:
for pattern in SUSPICIOUS_PATTERNS:
if re.search(pattern, param_value, re.IGNORECASE):
logger.warning(
"Potential SQL injection attempt detected",
extra={"value": param_value[:200], "pattern": pattern},
)
return True
return False
This monitoring layer doesn’t prevent injection (parameterized queries do that), but it alerts you to active attack attempts so you can investigate and potentially block the source.
The one thing to remember: Use parameterized queries for every database operation (including internal queries using stored data), validate identifiers against allowlists, run Bandit in CI to catch regressions, and configure database accounts with least privilege — because SQL injection remains the most consistently exploited web vulnerability.
See Also
- Python Api Key Management Why apps use special passwords called API keys, and how to keep them safe — explained with a library card analogy
- Python Attribute Based Access Control How apps make fine-grained permission decisions based on who you are, what you're accessing, and the circumstances — explained with an airport analogy
- Python Audit Logging Learn Audit Logging with a clear mental model so your Python code is easier to trust and maintain.
- Python Bandit Security Scanning Why Bandit Security Scanning helps Python teams catch painful mistakes early without slowing daily development.
- Python Clickjacking Prevention How invisible website layers trick you into clicking the wrong thing, and how Python apps stop it