In machine learning, grid search refers to training and evaluating a model repeatedly over a grid of candidate values to find the optimal value of its parameter(s)/hyperparameter(s), e.g. mtry for random forest; alpha, beta and lambda for glm; or C, kernel and gamma for SVM.
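As a minimal sketch of the idea, assuming scikit-learn and its bundled iris dataset (the grid values are illustrative, not recommendations), a search over C, kernel and gamma for an SVM could look like this:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Illustrative grid: 3 * 2 * 2 = 12 candidate combinations.
param_grid = {
    "C": [0.1, 1, 10],
    "kernel": ["linear", "rbf"],
    "gamma": ["scale", 0.1],
}

# Every combination is fitted and scored with 5-fold cross-validation.
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)
```

The cost grows multiplicatively with each added parameter, which is why several of the questions below are about speed, parallelism and progress reporting.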
Questions tagged [grid-search]
865 questions
5
votes
1 answer
Scikit Pipeline Parameters - fit() got an unexpected keyword argument 'gamma'
Minimum viable example included ;)
What I want to do is simply use the parameters from GridSearchCV in a Pipeline.
# I want to create an SVM using a Pipeline, and validate the model (measure the accuracy)
#import libraries
from sklearn.svm…
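The excerpt is cut off, so the actual cause is not visible here. One common source of this kind of error is passing estimator parameters to a Pipeline without the step prefix. A minimal sketch, assuming a pipeline whose steps are named scaler and svc (hypothetical names) and the iris dataset as stand-in data:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Step names ("scaler", "svc") are assumptions made for this sketch.
pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])

# Parameters of a pipeline step are addressed as <step name>__<parameter>.
param_grid = {"svc__C": [1, 10], "svc__gamma": [0.01, 0.1]}

search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X, y)

# best_params_ keeps the prefixed names, so it can be handed to set_params
# on a fresh pipeline with the same step names.
final_model = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])
final_model.set_params(**search.best_params_)
final_model.fit(X, y)

print(search.best_params_)
```

Passing the parameters through set_params (rather than as keyword arguments to fit) keeps them out of fit's signature entirely.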

Federico Dorato
- 710
- 9
- 27
5
votes
1 answer
Fitting in nested cross-validation with cross_val_score with pipeline and GridSearch
I am working in scikit and I am trying to tune my XGBoost.
I made an attempt to use a nested cross-validation using the pipeline for the rescaling of the training folds (to avoid data leakage and overfitting) and in parallel with GridSearchCV for…
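A rough sketch of the nested setup described above: the scaler lives inside the pipeline so it is refitted on each training fold, GridSearchCV forms the inner loop, and cross_val_score forms the outer loop. GradientBoostingClassifier is swapped in here for XGBoost to keep the example dependency-free; the dataset, step names and grid values are assumptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Stand-in model: GradientBoostingClassifier instead of XGBoost.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("model", GradientBoostingClassifier(random_state=0)),
])
param_grid = {"model__n_estimators": [20, 40], "model__max_depth": [2, 3]}

inner_cv = KFold(n_splits=3, shuffle=True, random_state=0)  # tunes hyperparameters
outer_cv = KFold(n_splits=3, shuffle=True, random_state=1)  # estimates generalization

# The search object is itself an estimator, so the outer loop refits the
# whole search (scaling included) on each outer training fold.
search = GridSearchCV(pipe, param_grid, cv=inner_cv)
nested_scores = cross_val_score(search, X, y, cv=outer_cv)

print(nested_scores.mean())
```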

inatos
- 83
- 1
- 4
5
votes
1 answer
Custom scoring function for grid search classification
I would like to perform a GridSearchCV for a RandomForestClassifier in scikit-learn, and I have a custom scoring function that I would like to use.
The scoring function will only work if probabilities are supplied (e.g. rfc.predict_proba(...) must…
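One way to get probabilities into a custom score is to pass GridSearchCV a callable with the signature scorer(estimator, X, y), which can call predict_proba itself. The sketch below assumes a binary problem and uses a negated Brier score as a placeholder metric; the function name neg_brier_scorer and the data are hypothetical.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_breast_cancer(return_X_y=True)


def neg_brier_scorer(estimator, X, y):
    """Hypothetical scorer: negated mean squared error of P(class 1)."""
    proba = estimator.predict_proba(X)[:, 1]
    # GridSearchCV maximizes the score, so the loss is negated.
    return -np.mean((proba - y) ** 2)


search = GridSearchCV(
    RandomForestClassifier(n_estimators=25, random_state=0),
    {"max_depth": [2, 4]},
    scoring=neg_brier_scorer,
    cv=3,
)
search.fit(X, y)

print(search.best_params_, search.best_score_)
```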

mgoldwasser
- 14,558
- 15
- 79
- 103
5
votes
2 answers
Early stopping with GridSearchCV - use hold-out set of CV for validation
I want to employ the early-stopping option in scikit-learn's GridSearchCV method. An example of this is shown in this SO thread:
import xgboost as xgb
from sklearn.model_selection import GridSearchCV
trainX= [[1], [2], [3], [4], [5]]
trainY = [1, 2,…

N08
- 1,265
- 13
- 23
5
votes
4 answers
Python and HyperOpt: How to make multi-process grid searching?
I am trying to tune some params and the search space is very large. I have 5 dimensions so far and it will probably increase to about 10. The issue is that I think I can get a significant speedup if I can figure out how to multi-process it, but I…

user1367204
- 4,549
- 10
- 49
- 78
5
votes
0 answers
KerasClassifier issue when using custom scoring with GridSearchCV for NN with multiclass output
Using custom scoring with multiclass outputs from a Keras model returns the same error for cross_val_score or GridSearchCV, as shown below (it's on Iris, so you can run it directly to test):
import numpy as np
from sklearn import datasets
from…

Dan Brice
- 51
- 4
5
votes
2 answers
Can I keyboard interrupt GridSearchCV somehow and still have the best parameters gathered until that point?
So I have the following code-
params = {'n_estimators': [1000, 2000], 'max_depth': [10, 20], 'min_samples_split': [2, 3],
'learning_rate': [0.1, 0.05, 0.01], 'loss': ('ls', 'huber', 'lad', 'quantile'), 'verbose': [1]}
gbr =…

Ravaal
- 3,233
- 6
- 39
- 66
5
votes
1 answer
Using GridsearchCV () with holdout validation
GridSearchCV() has a cv argument whose default value of 3 means 3-fold cross-validation. Is there any way to use GridSearchCV() with a holdout validation scheme, for example an 80-20% split?
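A possible approach, sketched under the assumption that a single random 80/20 split is acceptable, is to pass a ShuffleSplit with n_splits=1 as the cv argument. The dataset and grid are placeholders.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, ShuffleSplit
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# One split only: 80% of the rows for training, 20% held out for validation.
holdout = ShuffleSplit(n_splits=1, test_size=0.2, random_state=0)

search = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=holdout)
search.fit(X, y)

print(search.n_splits_)
print(search.best_params_)
```

For classification with imbalanced classes, StratifiedShuffleSplit accepts the same arguments and keeps class proportions in both parts.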

Khan
- 81
- 2
- 7
5
votes
1 answer
Keras KerasClassifier gridsearch TypeError: can't pickle _thread.lock objects
The following code is throwing an error:
TypeError: can't pickle _thread.lock objects
I can see that it likely has to do with passing the previous method in as a function in def fit(self, c_m). But I think this is correct according to the documentation:…

Isaac
- 215
- 2
- 15
5
votes
1 answer
GridseachCV - ValueError: Found input variables with inconsistent numbers of samples: [33 1]
I am trying to use GridSearchCV with my Keras model, but seem to have run into an error which I am not sure how to interpret.
Traceback (most recent call last):
File "keras_cnn_phoneme_generator_fit.py", line 229, in
…

Fixining_ranges
- 223
- 1
- 13
5
votes
2 answers
How to use `log_loss` in `GridSearchCV` with multi-class labels in Scikit-Learn (sklearn)?
I'm trying to use the log_loss argument in the scoring parameter of GridSearchCV to tune this multi-class (6 classes) classifier. I don't understand how to give it a label parameter. Even if I gave it sklearn.metrics.log_loss, it would change for…
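For reference, scikit-learn exposes log loss as the built-in scorer string "neg_log_loss", which calls predict_proba on the fitted estimator. A minimal sketch, assuming a 3-class stand-in problem (iris) and LogisticRegression in place of the asker's classifier:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# "neg_log_loss" is negated so that higher is better, as GridSearchCV expects.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    {"C": [0.1, 1, 10]},
    scoring="neg_log_loss",
    cv=5,
)
search.fit(X, y)

print(search.best_params_, search.best_score_)
```

This relies on every class appearing in each training fold, which the default stratified splitting for classifiers normally provides when each class has enough samples.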

O.rka
- 29,847
- 68
- 194
- 309
5
votes
1 answer
RandomizedSearchCV gives different results using the same random_state
I am using a pipeline to perform feature selection and hyperparameter optimization using RandomizedSearchCV. Here is a summary of the code:
from sklearn.cross_validation import train_test_split
from sklearn.ensemble import…

MhFarahani
- 960
- 2
- 9
- 19
5
votes
3 answers
Parallel error with GridSearchCV, works fine with other methods
I am encountering the following problem using GridSearchCV: it gives me a parallel error when using n_jobs > 1. At the same time, n_jobs > 1 works fine with single models like RandomForestClassifier.
Below is a simple working example showing…

Alessandro
- 845
- 11
- 21
5
votes
1 answer
What is _passthrough_scorer and How Can I Change Scorers in GridsearchCV (sklearn)?
http://scikit-learn.org/stable/modules/generated/sklearn.grid_search.GridSearchCV.html (for reference)
x = [[2], [1], [3], [1] ... ] # about 1000 data points
grid = GridSearchCV(KernelDensity(), {'bandwidth': np.linspace(0.1, 1.0, 10)},…
user5790923
5
votes
1 answer
Getting progress updates from GridSearchCV with scikit-learn
I am currently implementing a Support Vector Regression in Python, where I am estimating the parameters C and gamma through the GridSearchCV. I am initially searching from approximately 400 combinations of C and gamma. This is a very exhaustive…
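The simplest built-in option is the verbose parameter of GridSearchCV, which prints messages as fits are run. A small sketch, assuming SVR on the diabetes dataset with a deliberately tiny grid (the values are placeholders); the exact output format depends on the scikit-learn version.

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

X, y = load_diabetes(return_X_y=True)

# Tiny illustrative grid: 2 * 2 = 4 combinations.
param_grid = {"C": [1, 10], "gamma": [0.01, 0.1]}

# Higher verbose values print more detail about each fit.
search = GridSearchCV(SVR(), param_grid, cv=3, verbose=2)
search.fit(X, y)

print(search.best_params_)
```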

No_Socks
- 65
- 1
- 5