Questions tagged [grid-search]

In machine learning, grid search refers to training and evaluating a model once for every combination of candidate values on a predefined grid in order to find the best-performing parameter(s)/hyperparameter(s), e.g. mtry for random forest, alpha, beta and lambda for glm, or C, kernel and gamma for SVM.
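The idea in the tag description can be sketched in plain Python. This is only an illustration: `grid_search` and `toy_score` are hypothetical names, and `toy_score` stands in for whatever cross-validated model score would be used in practice.

```python
from itertools import product


def grid_search(score_fn, param_grid):
    """Score every combination in param_grid and return the best one."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(C, gamma):
    # Hypothetical objective (higher is better), peaking at C=10, gamma=0.1
    return -((C - 10) ** 2) - (gamma - 0.1) ** 2


best_params, best_score = grid_search(
    toy_score, {"C": [1, 10, 100], "gamma": [0.01, 0.1, 1.0]}
)
print(best_params, best_score)
```

The cost grows with the product of the list lengths (here 3 x 3 = 9 evaluations), which is why large grids get expensive quickly.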

865 questions
5
votes
1 answer

Scikit Pipeline Parameters - fit() got an unexpected keyword argument 'gamma'

Minimum viable example included ;) What I want to do is simply to use the parameters from GridSearchCV in a Pipeline. #I want to create a SVM using a Pipeline, and validate the model (measure the accuracy) #import libraries from sklearn.svm…
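The excerpt is truncated, so the actual cause of the error is not visible here. One common situation that produces this kind of message is addressing a step's parameter without its step prefix. A minimal sketch, assuming an SVC inside a two-step pipeline on the Iris data (the step names `scale` and `svm` are made up for this example):

```python
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Step names are arbitrary; they become the prefix of each parameter name
pipe = Pipeline([("scale", StandardScaler()), ("svm", SVC())])

# Parameters of a pipeline step are addressed as "<step>__<param>"
param_grid = {"svm__C": [0.1, 1, 10], "svm__gamma": [0.01, 0.1, 1]}

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)

# best_params_ keeps the prefixed names, so it goes to set_params(), not fit()
tuned = clone(pipe).set_params(**search.best_params_)
tuned.fit(X, y)
print(search.best_params_)
```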
5
votes
1 answer

Fitting in nested cross-validation with cross_val_score with pipeline and GridSearch

I am working in scikit and I am trying to tune my XGBoost. I made an attempt to use a nested cross-validation using the pipeline for the rescaling of the training folds (to avoid data leakage and overfitting) and in parallel with GridSearchCV for…
5
votes
1 answer

Custom scoring function for grid search classification

I would like to perform a GridSearchCV for a RandomForestClassifier in scikit-learn, and I have a custom scoring function that I would like to use. The scoring function will only work if probabilities are supplied (e.g. rfc.predict_proba(...) must…
mgoldwasser
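One way to score on probabilities, sketched here under assumptions, is to pass a callable with the signature `(estimator, X, y)` as `scoring`; the callable can then call `predict_proba` itself. The metric below (`mean_true_class_proba`) is a made-up stand-in, not the asker's scoring function, and the data set and grid are placeholders.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV


def mean_true_class_proba(estimator, X, y):
    """Hypothetical metric: average probability given to the true class."""
    proba = estimator.predict_proba(X)
    # classes_ is sorted, so searchsorted maps each label to its column
    cols = np.searchsorted(estimator.classes_, y)
    return float(np.mean(proba[np.arange(len(y)), cols]))


X, y = load_iris(return_X_y=True)

search = GridSearchCV(
    RandomForestClassifier(n_estimators=25, random_state=0),
    {"max_depth": [2, 4]},
    scoring=mean_true_class_proba,  # called as scorer(estimator, X_val, y_val)
    cv=3,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```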
5
votes
2 answers

Early stopping with GridSearchCV - use hold-out set of CV for validation

I want to employ the early-stopping option in scikit-learn's GridSearchCV method. An example of this is shown in this SO thread: import xgboost as xgb from sklearn.model_selection import GridSearchCV trainX= [[1], [2], [3], [4], [5]] trainY = [1, 2,…
N08
5
votes
4 answers

Python and HyperOpt: How to make multi-process grid searching?

I am trying to tune some params and the search space is very large. I have 5 dimensions so far and it will probably increase to about 10. The issue is that I think I can get a significant speedup if I can figure out how to multi-process it, but I…
user1367204
5
votes
0 answers

KerasClassifier issue when using custom scoring with GridSearchCV for NN with multiclass output

Using custom scoring with Multiclass outputs from Keras model returns the same error for cross_val_score or GridSearchCV as below (it's on Iris, so you can run it directly to test): import numpy as np from sklearn import datasets from…
5
votes
2 answers

Can I keyboard interrupt GridSearchCV somehow and still have the best parameters gathered until that point?

So I have the following code- params = {'n_estimators': [1000, 2000], 'max_depth': [10, 20], 'min_samples_split': [2, 3], 'learning_rate': [0.1, 0.05, 0.01], 'loss': ('ls', 'huber', 'lad', 'quantile'), 'verbose': [1]} gbr =…
Ravaal
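A possible workaround, sketched under assumptions rather than taken from the answers, is to run the loop by hand over `ParameterGrid` and catch `KeyboardInterrupt`, so the best result found so far survives. The helper name `interruptible_search`, the SVC estimator and the small grid are placeholders for this illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import ParameterGrid, cross_val_score
from sklearn.svm import SVC


def interruptible_search(make_estimator, grid, X, y, cv=3):
    """Manual grid search that keeps the best result if interrupted."""
    best = {"params": None, "score": float("-inf")}
    try:
        for params in ParameterGrid(grid):
            score = cross_val_score(make_estimator(**params), X, y, cv=cv).mean()
            if score > best["score"]:
                best = {"params": params, "score": score}
    except KeyboardInterrupt:
        # Ctrl+C stops the loop; everything evaluated so far is kept
        pass
    return best


X, y = load_iris(return_X_y=True)
grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1]}
best = interruptible_search(SVC, grid, X, y)
print(best)
```

Unlike GridSearchCV, this sketch runs serially and does not refit on the full data at the end.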
5
votes
1 answer

Using GridSearchCV() with holdout validation

GridSearchCV() has an argument cv whose default value is 3, meaning 3-fold cross-validation. Is there any way to use GridSearchCV() with a holdout validation scheme, for example an 80-20% split?
Khan
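One way to get a single 80-20 holdout, sketched under assumptions, is to pass a splitter that yields exactly one split as `cv`, for instance `ShuffleSplit(n_splits=1, test_size=0.2)`. The estimator, grid and data set below are placeholders. Note that the default mentioned in the question applies to older releases; since scikit-learn 0.22 the default is 5-fold.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, ShuffleSplit
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# A single random 80/20 split instead of k folds
holdout = ShuffleSplit(n_splits=1, test_size=0.2, random_state=0)

search = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=holdout)
search.fit(X, y)

train_idx, test_idx = next(holdout.split(X))
print(len(train_idx), len(test_idx), search.best_params_)
```

For a fixed, hand-picked validation set rather than a random one, `PredefinedSplit` from the same module serves the same purpose.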
5
votes
1 answer

Keras KerasClassifier gridsearch TypeError: can't pickle _thread.lock objects

The following code is throwing an error: TypeError: can't pickle _thread.lock objects. I can see that it likely has to do with passing the previous method in as a function in def fit(self, c_m). But I think this is correct per the documentation:…
5
votes
1 answer

GridSearchCV - ValueError: Found input variables with inconsistent numbers of samples: [33 1]

I am trying to use GridSearchCV on my Keras model, but seem to have run into an error which I am not sure how to interpret. Traceback (most recent call last): File "keras_cnn_phoneme_generator_fit.py", line 229, in
Fixining_ranges
5
votes
2 answers

How to use `log_loss` in `GridSearchCV` with multi-class labels in Scikit-Learn (sklearn)?

I'm trying to use the log_loss argument in the scoring parameter of GridSearchCV to tune this multi-class (6 classes) classifier. I don't understand how to give it a label parameter. Even if I gave it sklearn.metrics.log_loss, it would change for…
O.rka
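For reference, scikit-learn ships a built-in scorer string, `"neg_log_loss"`, that can be passed to `scoring` directly. A minimal sketch on a 3-class toy problem (the question has 6 classes; the estimator and grid are placeholders). It assumes every class appears in every fold, which the default stratified splitting for classifiers provides on this data:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    {"C": [0.1, 1, 10]},
    scoring="neg_log_loss",  # negated so that higher is better
    cv=5,
)
search.fit(X, y)

# best_score_ is the negative log loss, so it is at most 0
print(search.best_params_, search.best_score_)
```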
5
votes
1 answer

RandomizedSearchCV gives different results using the same random_state

I am using a pipeline to perform feature selection and hyperparameter optimization using RandomizedSearchCV. Here is a summary of the code: from sklearn.cross_validation import train_test_split from sklearn.ensemble import…
5
votes
3 answers

Parallel error with GridSearchCV, works fine with other methods

I am encountering the following problem using GridSearchCV: it gives me a parallel error while using n_jobs > 1. At the same time n_jobs > 1 works fine with single models like RandomForestClassifier. Below is a simple working example showing…
Alessandro
5
votes
1 answer

What is _passthrough_scorer and How Can I Change Scorers in GridSearchCV (sklearn)?

http://scikit-learn.org/stable/modules/generated/sklearn.grid_search.GridSearchCV.html (for reference) x = [[2], [1], [3], [1] ... ] # about 1000 data grid = GridSearchCV(KernelDensity(), {'bandwidth': np.linspace(0.1, 1.0, 10)},…
user5790923
5
votes
1 answer

Getting progress updates from GridSearchCV with scikit-learn

I am currently implementing a Support Vector Regression in Python, where I am estimating the parameters C and gamma through the GridSearchCV. I am initially searching from approximately 400 combinations of C and gamma. This is a very exhaustive…
No_Socks