Scikit-Learn Grid Search — Core Concepts
Why hyperparameter tuning matters
Machine learning models have two types of parameters. Learned parameters (like neural network weights) are optimized during training. Hyperparameters (like the number of trees in a random forest or the regularization strength in logistic regression) are set before training begins and control how learning happens.
Bad hyperparameters can make a powerful model perform worse than a simple baseline. A decision tree with max depth 2 might underfit. The same tree with max depth 50 might memorize noise. Grid search is a systematic way to look for the sweet spot between the two.
How GridSearchCV works
GridSearchCV combines two ideas: exhaustive grid search and cross-validation.
You define a grid of hyperparameter values to explore. The tool trains and evaluates the model for every combination using cross-validation (typically 5-fold). Each combination is therefore scored on several train/validation splits and the scores are averaged, which gives a more stable estimate than a single split would.
The result: the best hyperparameter combination and a fitted model ready for predictions.
Key components:
- param_grid — a dictionary mapping parameter names to lists of values
- cv — the cross-validation strategy (integer for K-fold, or a custom splitter)
- scoring — the metric to optimize (accuracy, F1, ROC AUC, etc.)
- refit — whether to retrain the best model on the full training set (default: True)
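The components above fit together roughly as in the sketch below. The synthetic dataset, the choice of RandomForestClassifier, and the small grid values are assumptions made for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data stands in for a real dataset (an assumption for illustration).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

search = GridSearchCV(
    estimator=RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [20, 50], "max_depth": [3, None]},
    cv=5,                # 5-fold cross-validation
    scoring="accuracy",  # metric used to rank the combinations
    refit=True,          # retrain the best combination on all of X_train
)
search.fit(X_train, y_train)

print(search.best_params_)           # the winning combination
print(round(search.best_score_, 3))  # its mean cross-validated accuracy

# With refit=True, the search object delegates predict/score to the
# refitted best model, so it can be evaluated on held-out data directly.
print(round(search.score(X_test, y_test), 3))
```

The grid here has 2 × 2 = 4 combinations, so the search runs 20 cross-validation fits plus one final refit.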
Grid search vs. randomized search
GridSearchCV tries every combination. With 4 values for parameter A and 5 for parameter B, that’s 20 combinations × 5 CV folds = 100 model fits. This grows multiplicatively — 3 parameters with 10 values each means 5,000 fits.
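The arithmetic above can be checked with scikit-learn's ParameterGrid, which enumerates the same combinations GridSearchCV would try. The parameter names "A", "B", and "C" are placeholders, not real estimator parameters.

```python
from sklearn.model_selection import ParameterGrid

n_folds = 5

# 4 values for A and 5 for B, as in the example above.
small_grid = {"A": [1, 2, 3, 4], "B": [1, 2, 3, 4, 5]}
n_small = len(ParameterGrid(small_grid))
print(n_small, n_small * n_folds)  # 20 100

# 3 parameters with 10 values each.
large_grid = {"A": list(range(10)), "B": list(range(10)), "C": list(range(10))}
n_large = len(ParameterGrid(large_grid))
print(n_large, n_large * n_folds)  # 1000 5000
```

With refit=True, GridSearchCV performs one additional fit at the end to retrain the best combination on the full training set.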
RandomizedSearchCV samples a fixed number of random combinations from the parameter space. You control the budget: “try 50 random combinations” regardless of how large the space is. Bergstra and Bengio (2012) showed that randomized search often finds results comparable to grid search in far fewer iterations, especially when only a few parameters actually matter.
When to use which:
- Grid search: few parameters (2-3) with small value ranges, or when you need guaranteed coverage
- Randomized search: many parameters, continuous ranges, or limited compute budget
Setting up a parameter grid
For grid search:
param_grid = {
    'n_estimators': [100, 200, 500],
    'max_depth': [5, 10, 20, None],
    'min_samples_split': [2, 5, 10],
}
For randomized search, use distributions instead of lists:
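from scipy.stats import randint, uniform

# learning_rate assumes a boosting-style estimator that accepts it.
param_distributions = {
    'n_estimators': randint(50, 500),     # integers in [50, 500)
    'max_depth': randint(3, 30),          # integers in [3, 30)
    'min_samples_leaf': randint(1, 20),   # integers in [1, 20)
    'learning_rate': uniform(0.01, 0.3),  # uniform(loc, scale): floats in [0.01, 0.31]
}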
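A minimal sketch of passing such distributions to RandomizedSearchCV follows. GradientBoostingClassifier is an assumption here, chosen because it accepts all four parameter names. The ranges are narrowed and the data is synthetic so that the example runs quickly.

```python
from scipy.stats import randint, uniform
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic data stands in for a real dataset (an assumption for illustration).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

param_distributions = {
    "n_estimators": randint(20, 60),      # integers in [20, 60)
    "max_depth": randint(2, 5),           # integers in [2, 5)
    "min_samples_leaf": randint(1, 20),   # integers in [1, 20)
    "learning_rate": uniform(0.01, 0.3),  # floats in [0.01, 0.31]
}

search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,        # the budget: number of sampled combinations
    cv=3,
    random_state=0,  # makes the sampled combinations reproducible
)
search.fit(X, y)

print(search.best_params_)
print(len(search.cv_results_["params"]))  # one entry per sampled combination
```

The n_iter argument fixes the cost of the search: the number of cross-validation fits is n_iter × cv no matter how wide the distributions are.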
Reading results
After fitting, GridSearchCV stores detailed results:
- best_params_: the winning combination
- best_score_: the mean CV score of the best combination
- cv_results_: a dictionary with scores for every combination (convertible to DataFrame for analysis)
Inspecting cv_results_ reveals not just the winner but the landscape: are many combinations close in score (flat landscape, not sensitive to tuning) or is there a sharp peak (hyperparameters matter a lot)?
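One way to look at that landscape is to load cv_results_ into a pandas DataFrame and sort by rank. The decision tree, the grid values, and the synthetic data below are assumptions for illustration. The column names follow scikit-learn's convention of prefixing each searched parameter with "param_".

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic data stands in for a real dataset (an assumption for illustration).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [2, 5, 10, None], "min_samples_split": [2, 10]},
    cv=5,
)
search.fit(X, y)

results = pd.DataFrame(search.cv_results_)
columns = [
    "param_max_depth",
    "param_min_samples_split",
    "mean_test_score",
    "std_test_score",
    "rank_test_score",
]
table = results[columns].sort_values("rank_test_score")
print(table.to_string(index=False))

# A small gap between best and worst mean score suggests a flat landscape.
spread = table["mean_test_score"].max() - table["mean_test_score"].min()
print(f"best-to-worst spread: {spread:.3f}")
```

Comparing the spread with std_test_score gives a rough sense of whether the differences between combinations are larger than the fold-to-fold noise.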
Common misconception
Grid search doesn’t prevent overfitting on its own. If you use the same test set to evaluate many hyperparameter combinations, you’re effectively fitting to the test set. Always hold out a final test set that grid search never sees, or use nested cross-validation for unbiased estimates.
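Both safeguards can be sketched as follows. LogisticRegression, the grid over C, and the synthetic data are assumptions for illustration. Nested cross-validation is expressed here by passing a GridSearchCV object to cross_val_score, so the inner search is rerun inside every outer fold.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split

# Synthetic data stands in for a real dataset (an assumption for illustration).
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}

# Option 1: hold out a test set that the search never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=5)
search.fit(X_train, y_train)          # tuning uses the training portion only
test_score = search.score(X_test, y_test)
print(f"CV score during search: {search.best_score_:.3f}")
print(f"held-out test score:    {test_score:.3f}")

# Option 2: nested cross-validation.
# The inner loop (cv=3) tunes C; the outer loop (cv=5) scores the tuned model
# on folds that played no part in choosing C.
inner_search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=3)
nested_scores = cross_val_score(inner_search, X, y, cv=5)
print(f"nested CV score: {nested_scores.mean():.3f} +/- {nested_scores.std():.3f}")
```

The held-out score and the nested score are the numbers to report. best_score_ was used to pick the winner, so it tends to be somewhat optimistic.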
Practical tips
- Start with a coarse grid (few, widely-spaced values) to find the promising region, then zoom in with a fine grid
- Use n_jobs=-1 to parallelize across CPU cores
- Set verbose=2 to monitor progress on long searches
- For pipelines, prefix parameter names with step names: {'clf__n_estimators': [100, 200]}
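The pipeline naming rule can be sketched as below. The step names "scale" and "clf", the StandardScaler step, and the grid values are assumptions for illustration. The prefix before the double underscore must match the step name given to the Pipeline.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for a real dataset (an assumption for illustration).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", RandomForestClassifier(random_state=0)),
])

# "<step name>__<parameter name>" routes each value to the right step.
param_grid = {
    "clf__n_estimators": [10, 30],
    "clf__max_depth": [3, None],
}

search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)

# pipe.get_params() lists every name that is valid in param_grid.
print("clf__n_estimators" in pipe.get_params())
```

Searching over the whole pipeline also means the scaler is refitted on each training fold, so preprocessing never sees the validation fold.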
One thing to remember: Grid search is exhaustive but expensive. Start coarse, zoom in, and always keep a final test set untouched by the search process.