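In machine learning, grid search is an exhaustive search over a predefined set of candidate values for a model's parameters or hyperparameters: the model is trained and evaluated once per combination, and the best-scoring combination is kept. Typical examples are mtry for random forest; alpha, beta and lambda for GLM; or C, kernel and gamma for SVM.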
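As a rough illustration of the idea in the tag description, here is a minimal pure-Python sketch. The `grid_search` helper and the `toy_score` objective are hypothetical names invented for this example; in practice the objective would be something like a cross-validated model score.

```python
from itertools import product


def grid_search(objective, param_grid):
    """Evaluate objective on every combination in param_grid; return the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    # product() enumerates the Cartesian product of all candidate value lists
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = objective(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical objective standing in for a cross-validated score (higher is better)
def toy_score(C, gamma):
    return -((C - 10) ** 2) - (gamma - 0.1) ** 2


best_params, best_score = grid_search(
    toy_score, {"C": [1, 10, 100], "gamma": [0.01, 0.1, 1.0]}
)
```

The cost grows multiplicatively with each added parameter: 3 values for `C` times 3 values for `gamma` gives 9 evaluations here.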
Questions tagged [grid-search]
865 questions
6
votes
0 answers
Print from within Joblib Parallel function in Jupyter notebook
Is it possible to print things or debug when using Parallel in a Jupyter notebook?
Here is my code
import pandas as pd
from sklearn.model_selection import ParameterGrid
from joblib import Parallel, delayed
def my_func(a, b):
    print("hi")
…

blissweb
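For the Joblib question above: output printed inside worker processes may not reach the notebook cell, so one possible workaround is to return the debug message alongside the result, or to ask for threads instead of processes. The sketch below does both; the inputs are made up for illustration, and whether threads are acceptable depends on the workload.

```python
from joblib import Parallel, delayed


def my_func(a, b):
    # Return the debug message with the result instead of relying on print()
    return a + b, f"my_func called with a={a}, b={b}"


# prefer="threads" is a soft hint to run the tasks in threads of the current process
results = Parallel(n_jobs=2, prefer="threads")(
    delayed(my_func)(a, b) for a, b in [(1, 2), (3, 4)]
)

# Print the collected messages from the main process, where output is visible
for value, message in results:
    print(message, "->", value)
```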
6
votes
0 answers
GridSearchCV - How to limit memory usage
I am performing grid search with GridSearchCV (scikit-learn) on Spark and Linux. For this reason, I am running nohup ./spark_python_shell.sh > output.log & in my bash shell to start the Spark cluster, and I also get my Python script running (see…

Outcast
6
votes
1 answer
How to specify the positive label when using precision as scoring in GridSearchCV
model = sklearn.model_selection.GridSearchCV(
    estimator=est,
    param_grid=param_grid,
    scoring='precision',
    verbose=1,
    n_jobs=1,
    iid=True,
    cv=3)
In…

Hachiko
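For the positive-label question above, one common approach is to build the scorer with `make_scorer` and pass `pos_label` through to `precision_score`. This is a minimal sketch on synthetic data; the "spam"/"ham" labels, the estimator and the grid are assumptions made for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, precision_score
from sklearn.model_selection import GridSearchCV

# Synthetic binary problem with string labels (hypothetical data)
X, y_num = make_classification(n_samples=200, random_state=0)
y = np.where(y_num == 1, "spam", "ham")

# Extra keyword arguments to make_scorer are forwarded to precision_score
scorer = make_scorer(precision_score, pos_label="spam")

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    {"C": [0.1, 1.0, 10.0]},
    scoring=scorer,
    cv=3,
)
search.fit(X, y)
```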
6
votes
2 answers
How does GridSearchCV compute training scores?
I'm having a hard time figuring out parameter return_train_score in GridSearchCV. From the docs:
return_train_score : boolean, optional
If False, the cv_results_ attribute will not include training scores.
My question is: what are the…

Tonechas
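For the return_train_score question above: the training scores are obtained by scoring each fitted candidate on the training portion of each fold, the same data it was fitted on. A small sketch, with the dataset and grid chosen only for illustration, shows which keys appear in `cv_results_`:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
param_grid = {"C": [0.1, 1.0, 10.0]}

# With return_train_score=True, cv_results_ gains split*_train_score,
# mean_train_score and std_train_score entries
with_train = GridSearchCV(SVC(), param_grid, cv=3, return_train_score=True).fit(X, y)
without_train = GridSearchCV(SVC(), param_grid, cv=3, return_train_score=False).fit(X, y)

train_keys = sorted(k for k in with_train.cv_results_ if "train_score" in k)
```

Comparing `mean_train_score` with `mean_test_score` per candidate is a quick way to spot overfitting, at the cost of extra scoring time.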
6
votes
2 answers
GridSearchCV - access to predicted values across tests?
Is there a way to get access to the predicted values calculated within a GridSearchCV process?
I'd like to be able to plot the predicted y values against their actual values (from the test/validation set).
Once the grid search is complete, I can…

tmn103
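For the predicted-values question above: GridSearchCV does not store the per-fold predictions, as far as I know, so one option is to regenerate out-of-fold predictions for the winning configuration with `cross_val_predict`, reusing the same splitter. The dataset, estimator and grid below are assumptions for illustration.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, KFold, cross_val_predict

X, y = load_diabetes(return_X_y=True)

# A seeded splitter so the search and the prediction step use identical folds
cv = KFold(n_splits=5, shuffle=True, random_state=0)

search = GridSearchCV(Ridge(), {"alpha": [0.01, 0.1, 1.0]}, cv=cv).fit(X, y)

# Each sample is predicted by a model that did not see it during fitting
y_pred = cross_val_predict(search.best_estimator_, X, y, cv=cv)
```

`y_pred` can then be plotted against `y` to compare predicted and actual values.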
6
votes
1 answer
How many combinations will GridSearchCV run for this?
Using sklearn to run a grid search on a random forest classifier. This has been running for longer than I thought, and I am trying to estimate how much time is left for this process. I thought the total number of fits it would do would be 3*3*3*3*5…

user4446237
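For the question above about counting fits: the number of candidates is the product of the lengths of the value lists, and each candidate is fitted once per CV fold, plus one final refit when `refit=True`. The grid below is hypothetical and is not the asker's grid.

```python
from sklearn.model_selection import ParameterGrid

# Hypothetical random-forest grid, for illustration only
param_grid = {
    "n_estimators": [100, 200, 300],
    "max_depth": [5, 10, None],
    "min_samples_split": [2, 5],
}
n_folds = 5

n_candidates = len(ParameterGrid(param_grid))  # 3 * 3 * 2
n_cv_fits = n_candidates * n_folds             # one fit per candidate per fold
n_total_fits = n_cv_fits + 1                   # plus the final refit on all data
```

Setting `verbose=1` on GridSearchCV also reports the candidate and fit totals when the search starts.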
6
votes
2 answers
sample_weight parameter shape error in scikit-learn GridSearchCV
Passing the sample_weight parameter to GridSearchCV raises an error due to incorrect shape. My suspicion is that cross-validation is not capable of splitting sample_weight along with the dataset.
First part: Using sample_weight as…

Manuel Castejón Limas
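For the sample_weight question above: in recent scikit-learn versions, an array-like fit parameter whose length equals the number of samples is split across the CV folds together with X and y, so passing the weights to `fit` is one thing to try. This sketch assumes such a version; the data and weights are made up.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=150, random_state=0)

# Hypothetical weights: give the positive class twice the weight
weights = np.where(y == 1, 2.0, 1.0)

search = GridSearchCV(LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]}, cv=3)

# Pass the weights to fit(), not to the GridSearchCV constructor
search.fit(X, y, sample_weight=weights)
```

Note that, as far as I know, the default scoring does not use these weights; they only affect how each candidate is fitted.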
6
votes
0 answers
Nested GridSearchCV
For a given model type, I want to both 1) tune parameters for various model types and 2) find the best tuned model type. I would like to use GridSearchCV for this.
I was able to run the following, but I am also concerned that this is not working…

mgoldwasser
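For the question above about tuning several model types at once: one possible pattern is a single-step Pipeline whose step is itself a searched parameter, with one grid dict per model type. The step name `clf`, the models and the grids are assumptions for illustration; this is not the asker's code.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

X, y = load_iris(return_X_y=True)

# The initial estimator in the step is a placeholder; the grid replaces it
pipe = Pipeline([("clf", LogisticRegression(max_iter=1000))])

# A list of dicts: each dict is searched separately, so parameters never
# get combined with the wrong model type
param_grid = [
    {"clf": [LogisticRegression(max_iter=1000)], "clf__C": [0.1, 1.0, 10.0]},
    {"clf": [RandomForestClassifier(random_state=0)], "clf__n_estimators": [10, 50]},
]

search = GridSearchCV(pipe, param_grid, cv=3).fit(X, y)
best_model_type = type(search.best_estimator_.named_steps["clf"]).__name__
```

Because the same data selects both the model type and its parameters, `best_score_` is optimistic; an outer cross-validation loop would be needed for an unbiased estimate.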
6
votes
2 answers
Model help using Scikit-learn when using GridSearch
As part of the Enron project, I built the attached model. Below is a summary of the steps.
The model below gives near-perfect scores:
cv = StratifiedShuffleSplit(n_splits = 100, test_size = 0.2, random_state = 42)
gcv = GridSearchCV(pipe,…

naveenpitchai
6
votes
1 answer
"Parallel" pipeline to get best model using gridsearch
In sklearn, a serial pipeline can be defined to get the best combination of hyperparameters for all consecutive parts of the pipeline. A serial pipeline can be implemented as follows:
from sklearn.svm import SVC
from sklearn import decomposition,…

Oblomov
6
votes
1 answer
GridSearchCV does not give the same results as expected when compared to xgboost.cv
When comparing sklearn.GridSearchCV with xgboost.cv, I get different results. Below I explain what I would like to do:
1) import libraries
import numpy as np
from sklearn import datasets
import xgboost as xgb
from sklearn.model_selection import…

gabboshow
6
votes
1 answer
How to properly merge outputs from models in the ensemble?
I am trying to figure out how to properly create regression ensembles. I know there are various options. I use the following approach.
First I define models like Linear Regression, GBM, etc. Then I run GridSearchCV for each of these models to know…

Klausos Klausos
5
votes
1 answer
Can you get all estimators from an sklearn grid search (GridSearchCV)?
I recently tested many hyperparameter combinations using sklearn.model_selection.GridSearchCV. I want to know if there is a way to call all previous estimators that were trained in the process.
search = GridSearchCV(estimator=my_estimator,…

Arturo Sbr
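For the question above about recovering all estimators: to my knowledge GridSearchCV keeps only `best_estimator_`, so one workaround is to refit a fresh clone for every entry in `cv_results_["params"]`. The estimator and grid here are stand-ins, and refitting costs one extra fit per candidate.

```python
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
base = DecisionTreeClassifier(random_state=0)

search = GridSearchCV(base, {"max_depth": [1, 2, 3]}, cv=3).fit(X, y)

# Refit one unfitted clone per candidate on the full data; the list order
# matches the row order of cv_results_
refitted = [
    clone(base).set_params(**params).fit(X, y)
    for params in search.cv_results_["params"]
]
```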
5
votes
1 answer
How to determine best parameters and best score for each scoring metric in GridSearchCV
I am trying to evaluate multiple scoring metrics to determine the best parameters for model performance. i.e., to say:
To maximize F1, I should use these parameters. To maximize precision, I should use these parameters.
I am working off the…

artemis
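For the multi-metric question above: with several scorers, `cv_results_` holds a `rank_test_<metric>` and `mean_test_<metric>` column per metric, so the best parameters for each metric can be read off directly. A minimal sketch on synthetic data, with the estimator and grid assumed for illustration:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=200, random_state=0)
metrics = ["f1", "precision"]

# refit must name one metric when several scorers are used
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    {"C": [0.01, 0.1, 1.0, 10.0]},
    scoring=metrics,
    refit="f1",
    cv=3,
).fit(X, y)

results = search.cv_results_
best_per_metric = {}
for metric in metrics:
    # Rank 1 is the best candidate; argmin picks the first such row
    idx = int(np.argmin(results[f"rank_test_{metric}"]))
    best_per_metric[metric] = (
        results["params"][idx],
        results[f"mean_test_{metric}"][idx],
    )
```

`best_params_` and `best_score_` refer only to the metric named in `refit`; the dictionary above covers the others.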
5
votes
1 answer
LightGBM error : ValueError: For early stopping, at least one dataset and eval metric is required for evaluation
I am trying to train a LightGBM model with grid search, and I get the error below when I try to train the model.
ValueError: For early stopping, at least one dataset and eval metric is required for evaluation
I have provided validation dataset and evaluation…

deep