SQL Injection Prevention in Python — Deep Dive

Injection anatomy: beyond basic quotes

Simple string injection is well-understood, but attacks come in many forms:

Union-based injection extracts data from other tables:

-- Attacker input for a search field: ' UNION SELECT username, password FROM admin_users --
SELECT name, price FROM products WHERE name = '' UNION SELECT username, password FROM admin_users --'

The UNION combines results from the products table with the admin_users table. The -- comments out the rest of the original query.

Stacked queries execute entirely new statements:

-- Attacker input: '; DROP TABLE users; --
SELECT * FROM users WHERE id = ''; DROP TABLE users; --'

Not all databases/drivers support stacked queries (PostgreSQL does, MySQL with certain drivers does, SQLite does). Python’s psycopg2 supports them by default.

Blind injection extracts data without visible output, using boolean conditions or time delays:

-- Boolean-based: does the page change?
' AND (SELECT COUNT(*) FROM users WHERE username='admin' AND password LIKE 'a%') > 0 --

-- Time-based: does the response take longer?
' AND (SELECT CASE WHEN (1=1) THEN pg_sleep(5) ELSE pg_sleep(0) END) --

Parameterized queries across Python drivers

Each Python database library has its own parameterization syntax:

# psycopg2 (PostgreSQL)
cursor.execute(
    "SELECT * FROM users WHERE email = %s AND status = %s",
    (email, "active")
)

# psycopg2 with named parameters
cursor.execute(
    "SELECT * FROM users WHERE email = %(email)s AND status = %(status)s",
    {"email": email, "status": "active"}
)

# sqlite3
cursor.execute(
    "SELECT * FROM users WHERE email = ? AND status = ?",
    (email, "active")
)

# sqlite3 with named parameters
cursor.execute(
    "SELECT * FROM users WHERE email = :email AND status = :status",
    {"email": email, "status": "active"}
)

# asyncpg (PostgreSQL, async)
row = await conn.fetchrow(
    "SELECT * FROM users WHERE email = $1 AND status = $2",
    email, "active"
)

# mysql-connector-python
cursor.execute(
    "SELECT * FROM users WHERE email = %s AND status = %s",
    (email, "active")
)

The key principle is universal: the SQL structure is a static string; the data arrives through separate parameter channels.

ORM-level vulnerabilities

ORMs use parameterized queries internally, but careless code can introduce injection:

# SQLAlchemy — VULNERABLE uses of text()
from sqlalchemy import text

# DANGEROUS — f-string in text()
session.execute(text(f"SELECT * FROM users WHERE name = '{name}'"))

# SAFE — bound parameters with text()
session.execute(text("SELECT * FROM users WHERE name = :name"), {"name": name})

# Django — VULNERABLE raw SQL
User.objects.raw(f"SELECT * FROM auth_user WHERE username = '{username}'")

# SAFE — parameterized raw SQL
User.objects.raw("SELECT * FROM auth_user WHERE username = %s", [username])

# Django — VULNERABLE extra()
User.objects.extra(where=[f"username = '{username}'"])

# SAFE — parameterized extra()
User.objects.extra(where=["username = %s"], params=[username])

Django’s extra() is deprecated precisely because it’s easy to misuse. Use annotate(), filter(), or Subquery() instead.

Dynamic query building safely

Sometimes you need dynamic WHERE clauses, ORDER BY, or column selection. These require special handling:

# Dynamic WHERE clauses with SQLAlchemy
from sqlalchemy import and_, or_, column
from sqlalchemy.sql import select

def build_user_query(filters: dict):
    stmt = select(users)
    conditions = []

    if "name" in filters:
        conditions.append(users.c.name == filters["name"])  # Parameterized
    if "min_age" in filters:
        conditions.append(users.c.age >= filters["min_age"])  # Parameterized
    if "status" in filters:
        conditions.append(users.c.status.in_(filters["status"]))  # Parameterized

    if conditions:
        stmt = stmt.where(and_(*conditions))

    return stmt

For dynamic column names and sort orders:

ALLOWED_COLUMNS = {"name", "email", "created_at", "updated_at"}
ALLOWED_DIRECTIONS = {"asc", "desc"}

def safe_order_by(query, sort_col: str, direction: str = "asc"):
    if sort_col not in ALLOWED_COLUMNS:
        raise ValueError(f"Invalid column: {sort_col}")
    if direction.lower() not in ALLOWED_DIRECTIONS:
        raise ValueError(f"Invalid direction: {direction}")

    # Use SQLAlchemy's column() for safe dynamic ordering
    col = getattr(users.c, sort_col)
    if direction == "desc":
        col = col.desc()
    return query.order_by(col)

Allowlists are mandatory for identifiers because parameterized queries only protect values.

Second-order injection

Second-order (stored) injection happens when malicious data is safely stored in the database but later concatenated into a query:

# Step 1: User registers with username "admin'--"
# Parameterized insert — safe, stored as-is
cursor.execute("INSERT INTO users (username) VALUES (%s)", ("admin'--",))

# Step 2: Later, a different part of the app uses the stored value unsafely
user = get_user_from_session()  # Returns "admin'--"
cursor.execute(f"SELECT * FROM audit_log WHERE username = '{user.username}'")
# VULNERABLE — the stored value now causes injection

The fix: use parameterized queries everywhere, including internal queries that use data from the database. Treat stored data with the same caution as direct user input.

IN clause patterns

Dynamic IN clauses need special handling:

# psycopg2 — use tuple adaptation
user_ids = [1, 2, 3, 4, 5]
cursor.execute("SELECT * FROM users WHERE id IN %s", (tuple(user_ids),))

# SQLAlchemy
from sqlalchemy import select
stmt = select(users).where(users.c.id.in_(user_ids))

# sqlite3 — generate placeholders
placeholders = ",".join("?" * len(user_ids))
cursor.execute(f"SELECT * FROM users WHERE id IN ({placeholders})", user_ids)

# Django ORM — safe by default
User.objects.filter(id__in=user_ids)

Be cautious with very large IN lists (10,000+ items). They can cause performance issues and exceed query length limits. Consider using temporary tables or ANY with arrays in PostgreSQL.

Stored procedures: not a silver bullet

Stored procedures are sometimes cited as an injection defense. They help only if the procedure itself uses parameterized queries:

-- SAFE stored procedure
CREATE PROCEDURE get_user(p_username VARCHAR(100))
BEGIN
    SELECT * FROM users WHERE username = p_username;
END;

-- VULNERABLE stored procedure (yes, this exists in the wild)
CREATE PROCEDURE search_users(p_search TEXT)
BEGIN
    SET @query = CONCAT('SELECT * FROM users WHERE name LIKE ''%', p_search, '%''');
    PREPARE stmt FROM @query;
    EXECUTE stmt;
END;

Calling the first procedure from Python is safe. Calling the second is just as vulnerable as any string-concatenated query.

Automated detection tools

Bandit is a Python static analysis tool that catches common injection patterns:

pip install bandit
bandit -r myapp/ -t B608  # Check for SQL injection (hardcoded SQL expressions)

# Example findings:
# >> Issue: [B608:hardcoded_sql_expressions] Possible SQL injection
# >>    Severity: Medium   Confidence: Low
# >>    Location: myapp/views.py:42
# >>    41    name = request.args.get('name')
# >>    42    cursor.execute(f"SELECT * FROM users WHERE name = '{name}'")

SQLMap tests running applications for injection vulnerabilities:

# Test a specific parameter
sqlmap -u "http://localhost:8000/search?q=test" --batch --level=3

# Test POST parameters
sqlmap -u "http://localhost:8000/login" --data="username=admin&password=test" --batch

Run SQLMap against staging environments only, never production. It sends thousands of probing requests.

Database-level defenses

Least privilege: Create dedicated database users for your application with minimal permissions:

-- PostgreSQL: app user can SELECT, INSERT, UPDATE on specific tables
CREATE ROLE webapp_user WITH LOGIN PASSWORD 'strong-password';
GRANT SELECT, INSERT, UPDATE ON users, orders, products TO webapp_user;
-- No DELETE, DROP, ALTER, GRANT permissions

Row-level security (PostgreSQL):

ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
CREATE POLICY user_orders ON orders
    USING (user_id = current_setting('app.current_user_id')::int);

This prevents one user from accessing another user’s orders at the database level, even if the application has a bug.

Monitoring and alerting

Log and alert on suspicious query patterns:

import logging
import re

logger = logging.getLogger("sql_monitor")

SUSPICIOUS_PATTERNS = [
    r"UNION\s+SELECT",
    r";\s*(DROP|DELETE|UPDATE|INSERT)",
    r"--\s*$",
    r"pg_sleep",
    r"WAITFOR\s+DELAY",
    r"BENCHMARK\s*\(",
]

def check_for_injection(param_value: str) -> bool:
    for pattern in SUSPICIOUS_PATTERNS:
        if re.search(pattern, param_value, re.IGNORECASE):
            logger.warning(
                "Potential SQL injection attempt detected",
                extra={"value": param_value[:200], "pattern": pattern},
            )
            return True
    return False

This monitoring layer doesn’t prevent injection (parameterized queries do that), but it alerts you to active attack attempts so you can investigate and potentially block the source.

The one thing to remember: Use parameterized queries for every database operation (including internal queries using stored data), validate identifiers against allowlists, run Bandit in CI to catch regressions, and configure database accounts with least privilege — because SQL injection remains the most consistently exploited web vulnerability.

pythonsecuritydatabasesweb

See Also