Portfolio Optimization with Python — Deep Dive
The Black-Litterman model
Standard mean-variance optimization requires expected returns as input, but historical averages are noisy estimators. The Black-Litterman model starts with market-implied equilibrium returns (derived from market-cap weights) and allows investors to overlay personal views with specified confidence levels.
Deriving equilibrium returns
import numpy as np
def implied_returns(
cov_matrix: np.ndarray,
market_weights: np.ndarray,
risk_aversion: float = 2.5,
) -> np.ndarray:
"""
Reverse-optimize: what expected returns would make the market portfolio optimal?
Pi = delta * Sigma * w_mkt
"""
return risk_aversion * cov_matrix @ market_weights
Incorporating investor views
def black_litterman(
cov_matrix: np.ndarray,
equilibrium_returns: np.ndarray,
P: np.ndarray, # K x N view matrix (which assets each view involves)
Q: np.ndarray, # K x 1 view returns
omega: np.ndarray, # K x K view uncertainty matrix
tau: float = 0.05,
) -> np.ndarray:
"""
Black-Litterman posterior expected returns.
P: picking matrix (rows = views, columns = assets)
Q: expected returns of each view
omega: diagonal matrix of view uncertainties
"""
tau_sigma = tau * cov_matrix
inv_tau_sigma = np.linalg.inv(tau_sigma)
inv_omega = np.linalg.inv(omega)
posterior_precision = inv_tau_sigma + P.T @ inv_omega @ P
posterior_cov = np.linalg.inv(posterior_precision)
posterior_mean = posterior_cov @ (inv_tau_sigma @ equilibrium_returns + P.T @ inv_omega @ Q)
return posterior_mean
Example: “I believe tech stocks will outperform bonds by 3% annually with moderate confidence.” This translates into a row in P (long tech, short bonds), a value in Q (0.03), and a diagonal entry in omega reflecting uncertainty.
The beauty of Black-Litterman is that with zero views, the portfolio defaults to market-cap weights — a sensible baseline. Each view tilts the portfolio proportionally to your confidence.
Robust covariance estimation
Ledoit-Wolf shrinkage
The sample covariance matrix is noisy, especially when the number of assets approaches the number of time periods. Shrinkage blends the sample covariance toward a structured target:
from sklearn.covariance import LedoitWolf
def shrunk_covariance(returns: np.ndarray) -> np.ndarray:
"""Ledoit-Wolf shrinkage covariance estimator."""
lw = LedoitWolf().fit(returns)
return lw.covariance_
The shrinkage intensity is estimated optimally — no manual tuning needed. For portfolios with 50+ assets, this produces dramatically more stable weights than the raw sample covariance.
Minimum Covariance Determinant
For datasets contaminated with outliers (earnings announcements, flash crashes), the MCD estimator is more robust:
from sklearn.covariance import MinCovDet
def robust_covariance(returns: np.ndarray) -> np.ndarray:
mcd = MinCovDet(support_fraction=0.75).fit(returns)
return mcd.covariance_
Convex optimization with cvxpy
For complex real-world constraints, scipy.optimize becomes unwieldy. cvxpy provides a declarative interface:
import cvxpy as cp
import numpy as np
def optimize_with_constraints(
expected_returns: np.ndarray,
cov_matrix: np.ndarray,
risk_aversion: float = 1.0,
sector_map: dict[str, list[int]] | None = None,
max_sector_weight: float = 0.40,
turnover_limit: float | None = None,
current_weights: np.ndarray | None = None,
) -> np.ndarray:
n = len(expected_returns)
w = cp.Variable(n)
# Objective: maximize return - risk_aversion * variance
ret = expected_returns @ w
risk = cp.quad_form(w, cov_matrix)
objective = cp.Maximize(ret - risk_aversion * risk)
constraints = [
cp.sum(w) == 1,
w >= 0, # long-only
w <= 0.10, # max 10% per asset
]
# Sector constraints
if sector_map:
for sector, indices in sector_map.items():
constraints.append(cp.sum(w[indices]) <= max_sector_weight)
# Turnover constraint (rebalancing cost control)
if turnover_limit is not None and current_weights is not None:
constraints.append(cp.norm(w - current_weights, 1) <= turnover_limit)
problem = cp.Problem(objective, constraints)
problem.solve(solver=cp.OSQP)
return w.value
Constraint catalog for institutional portfolios
Real institutional mandates include dozens of constraints:
- Regulatory: minimum bond allocation for insurance companies.
- Liquidity: no more than X% in assets with average daily volume below Y.
- ESG: exclude specific sectors (tobacco, weapons) or require minimum ESG scores.
- Tracking error: deviation from benchmark must stay below a limit.
- Factor exposure: target or limit exposure to value, momentum, size factors.
cvxpy handles all of these as linear or second-order cone constraints within the same optimization.
Risk parity — full implementation
The inverse-volatility approximation is a rough starting point. True risk parity equalizes the marginal risk contribution of each asset:
import numpy as np
from scipy.optimize import minimize
def risk_contribution(weights: np.ndarray, cov_matrix: np.ndarray) -> np.ndarray:
"""Marginal risk contribution of each asset."""
port_vol = np.sqrt(weights @ cov_matrix @ weights)
marginal = cov_matrix @ weights
return weights * marginal / port_vol
def risk_parity_optimize(cov_matrix: np.ndarray) -> np.ndarray:
"""Find weights where all assets contribute equal risk."""
n = len(cov_matrix)
target_risk = 1.0 / n
def objective(weights):
rc = risk_contribution(weights, cov_matrix)
rc_pct = rc / rc.sum()
return np.sum((rc_pct - target_risk) ** 2)
constraints = [{"type": "eq", "fun": lambda w: np.sum(w) - 1}]
bounds = [(0.01, 1.0) for _ in range(n)]
result = minimize(
objective,
x0=np.ones(n) / n,
method="SLSQP",
bounds=bounds,
constraints=constraints,
)
return result.x
Risk parity portfolios tend to overweight bonds (low volatility) and underweight equities (high volatility). Many implementations use leverage to scale the resulting portfolio to a target volatility.
Rebalancing strategies
Optimizing once is not enough — portfolios drift as prices move. Three common approaches:
Calendar rebalancing
Rebalance on a fixed schedule (monthly, quarterly). Simple but may trigger unnecessary trades during low-drift periods.
Threshold rebalancing
Rebalance only when any asset’s weight drifts beyond a band (e.g., ±5% from target). More efficient but requires daily monitoring.
Optimal rebalancing with transaction costs
def rebalance_with_costs(
current_weights: np.ndarray,
target_weights: np.ndarray,
transaction_cost_bps: float = 10.0,
threshold_bps: float = 50.0,
) -> np.ndarray:
"""
Rebalance toward target, but only trade positions
where the drift exceeds the cost-adjusted threshold.
"""
drift = target_weights - current_weights
cost_threshold = threshold_bps / 10_000
# Only trade where the benefit of rebalancing exceeds costs
trades = np.where(np.abs(drift) > cost_threshold, drift, 0)
# Normalize to ensure weights sum to 1
new_weights = current_weights + trades
new_weights /= new_weights.sum()
return new_weights
Tax-aware optimization
For taxable accounts, selling winners triggers capital gains tax. Tax-aware optimizers include tax cost in the objective:
def tax_cost(
current_weights: np.ndarray,
target_weights: np.ndarray,
unrealized_gains: np.ndarray,
tax_rate: float = 0.20,
) -> float:
"""Estimated tax cost of rebalancing."""
sells = np.maximum(current_weights - target_weights, 0)
# Proportional unrealized gain per unit sold
taxable = sells * unrealized_gains / np.maximum(current_weights, 1e-10)
return tax_rate * np.sum(taxable)
Direct indexing platforms like Wealthfront and Aperio use this approach to harvest tax losses while maintaining target factor exposures.
Performance attribution
After optimization and live trading, decompose returns to understand what drove performance:
def brinson_attribution(
portfolio_weights: np.ndarray,
benchmark_weights: np.ndarray,
portfolio_returns: np.ndarray,
benchmark_returns: np.ndarray,
) -> dict:
"""Brinson-Fachler attribution: allocation + selection + interaction."""
allocation = (portfolio_weights - benchmark_weights) * benchmark_returns
selection = benchmark_weights * (portfolio_returns - benchmark_returns)
interaction = (portfolio_weights - benchmark_weights) * (portfolio_returns - benchmark_returns)
return {
"allocation_effect": allocation.sum(),
"selection_effect": selection.sum(),
"interaction_effect": interaction.sum(),
"total_active": allocation.sum() + selection.sum() + interaction.sum(),
}
Backtesting the optimization process
The ultimate test: simulate the full optimize → trade → rebalance cycle over historical data. This captures estimation error, transaction costs, and regime changes that single-period optimization ignores.
def backtest_optimizer(
returns: pd.DataFrame,
lookback: int = 252,
rebalance_freq: int = 21, # monthly
optimize_fn=None,
) -> pd.Series:
"""Walk-forward portfolio optimization backtest."""
portfolio_returns = []
for t in range(lookback, len(returns), rebalance_freq):
hist = returns.iloc[t - lookback : t]
weights = optimize_fn(hist)
# Forward returns until next rebalance
fwd = returns.iloc[t : t + rebalance_freq]
port_ret = (fwd * weights).sum(axis=1)
portfolio_returns.append(port_ret)
return pd.concat(portfolio_returns)
The one thing to remember: Production portfolio optimization is an ongoing process — not a one-time calculation — requiring robust estimation, realistic constraints, tax awareness, and continuous rebalancing that accounts for transaction costs and regime changes.
See Also
- Python Risk Analysis Monte Carlo How rolling a virtual dice thousands of times helps investors understand what could go wrong with their money.
- Python Backtesting Trading Strategies Why traders use Python to test their ideas on old data before risking real money, in plain language.
- Python Fraud Detection Patterns How Python helps banks and companies catch cheaters and thieves before they get away with it.
- Python Quantitative Finance How Python helps people use math and data to make smarter money decisions, explained without any jargon.
- Python Technical Indicators What technical indicators are and how Python calculates them, explained like you have never seen a stock chart.