Questions tagged [grid-search]

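In machine learning, grid search is an exhaustive search over a predefined set of candidate parameter/hyperparameter values: the model is trained and evaluated once per combination, and the best-scoring combination is kept. Typical examples are mtry for random forests; alpha, beta, and lambda for GLMs; and C, kernel, and gamma for SVMs.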
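As a rough illustration of the idea in the tag description, here is a minimal, library-free sketch. The `grid_search` helper and the `toy_score` function are hypothetical names made up for this example; `toy_score` merely stands in for "train the model and return a validation score".

```python
from itertools import product


def grid_search(score_fn, param_grid):
    """Evaluate score_fn on every combination in param_grid; return the best."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    # product() enumerates the full Cartesian grid of candidate values.
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical stand-in for fitting a model and scoring it on held-out data.
def toy_score(C, gamma):
    return -((C - 10) ** 2) - (gamma - 0.1) ** 2


param_grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1, 1.0]}
best_params, best_score = grid_search(toy_score, param_grid)
```

The cost grows multiplicatively with each added hyperparameter (here 3 × 3 = 9 evaluations), which is why randomized search is often used for larger grids.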

865 questions
8 votes, 2 answers

How can I plot validation curves using the results from GridSearchCV?

I am training a model with GridSearchCV in order to find the best parameters. Code: grid_params = { 'n_estimators': [100, 200, 300, 400], 'criterion': ['gini', 'entropy'], 'max_features': ['auto', 'sqrt', 'log2'] } gs = GridSearchCV( …
Tlaloc-ES
8 votes, 3 answers

How to get decision function in randomforest in sklearn

I am using the following code to get the optimised parameters for randomforest using gridsearchcv. x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=0) rfc = RandomForestClassifier(random_state=42, class_weight =…
EmJ
8 votes, 2 answers

Scoring in Gridsearch CV

I just started using GridSearchCV in Python, but I am confused about what scoring means here. Somewhere I have seen scorers = { 'precision_score': make_scorer(precision_score), 'recall_score': make_scorer(recall_score), 'accuracy_score':…
KMittal
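For readers unsure what such a `scorers` dictionary does, the following is a minimal sketch rather than the asker's actual code. The dataset, the `LogisticRegression` estimator, and the `C` grid are assumptions chosen for illustration. With several metrics, `refit` must name the one used to select the best parameters.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, make_scorer, precision_score, recall_score
from sklearn.model_selection import GridSearchCV

# Synthetic binary classification data (assumption, for illustration only).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Each key becomes a suffix in cv_results_, e.g. "mean_test_recall_score".
scorers = {
    "precision_score": make_scorer(precision_score),
    "recall_score": make_scorer(recall_score),
    "accuracy_score": make_scorer(accuracy_score),
}

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.1, 1.0, 10.0]},
    scoring=scorers,
    refit="accuracy_score",  # metric used to pick best_params_
    cv=5,
)
search.fit(X, y)
```

Every metric is computed for every candidate, so `search.cv_results_` holds one `mean_test_<name>` column per scorer, while `search.best_params_` reflects only the metric named in `refit`.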
8 votes, 1 answer

GridSearchCV.best_score not same as cross_val_score(GridSearchCV.best_estimator_)

Consider the following grid search: grid = GridSearchCV(clf, parameters, n_jobs=-1, iid=True, cv=5) grid_fit = grid.fit(X_train1, y_train1) According to sklearn's resource, grid_fit.best_score_ returns the mean cross-validated score of the…
Eric F
8 votes, 0 answers

Using GridSearchCV with a set of multiple scorers errors out

I am trying to use GridSearchCV to optimize an analysis I am doing. I have read that it supports multiple scoring methods, and I have found an example of this method elsewhere, but when I attempt to run a GridSearchCV with multiple…
Alex
8 votes, 2 answers

Python - LightGBM with GridSearchCV, is running forever

Recently, I have been running multiple experiments to compare XGBoost and LightGBM in Python. LightGBM seems to be a newer algorithm that people say works better than XGBoost in both speed and accuracy. This is LightGBM GitHub. This is LightGBM…
Cherry Wu
8 votes, 3 answers

What is the meaning of 'mean_test_score' in cv_result?

Hello, I'm running a GridSearchCV and printing the result with the .cv_results_ attribute from scikit-learn. My problem is that when I compute by hand the mean over all the test score splits, I obtain a different number compared to what it is…
Dipe
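One way to check this by hand is sketched below, assuming a recent scikit-learn release and an illustrative setup (iris data, an `SVC`, a three-value `C` grid) that is not taken from the question. In such releases `mean_test_score` is the plain average of the per-split scores. Older releases accepted an `iid` option that weighted each fold by its test-set size, which is one possible source of a mismatch.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

n_splits = 5
search = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=n_splits)
search.fit(X, y)
results = search.cv_results_

# Stack the per-fold scores: one row per split, one column per candidate.
split_scores = np.array(
    [results[f"split{i}_test_score"] for i in range(n_splits)]
)
manual_mean = split_scores.mean(axis=0)

# Compare the hand-computed mean with what GridSearchCV reports.
matches = np.allclose(manual_mean, results["mean_test_score"])
```

If `matches` is false on your own data, the installed version and any fold weighting are the first things worth checking.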
8 votes, 1 answer

GridSearchCV.best_score_ meaning when scoring set to 'accuracy' and CV

I'm trying to find the best Neural Network model for classifying breast cancer samples on the well-known Wisconsin Cancer dataset (569 samples, 31 features + target). I'm using sklearn 0.18.1. I'm not using normalization so…
Taka
8 votes, 1 answer

using best params from gridsearchcv

I don't know if this is the right question to ask here, but I will ask anyway. If it is not allowed, please let me know. I have used GridSearchCV to tune parameters to find the best accuracy. This is what I have done: from sklearn.grid_search import…
Cybercop
8 votes, 1 answer

Combining Recursive Feature Elimination and Grid Search in scikit-learn

I am trying to combine recursive feature elimination and grid search in scikit-learn. As you can see from the code below (which works), I am able to get the best estimator from a grid search and then pass that estimator to RFECV. However, I would…
Mark Conway
7 votes, 4 answers

GridSearchCV - FitFailedWarning: Estimator fit failed

I am running this: # Hyperparameter tuning - Random Forest # # Hyperparameters' grid parameters = {'n_estimators': list(range(100, 250, 25)), 'criterion': ['gini', 'entropy'], 'max_depth': list(range(2, 11, 2)), 'max_features': [0.1,…
Outcast
7 votes, 0 answers

Why scikit-learn switches to SequentialBackend?

I am trying to run the following code on a machine with 16 available CPUs: def tokenizer(text): return text.split() param_grid = [{'vect__stop_words': [None, stop], 'vect__binary': [True, False]}] bow =…
7 votes, 2 answers

Are the k-fold cross-validation scores from scikit-learn's `cross_val_score` and `GridsearchCV` biased if we include transformers in the pipeline?

Data pre-processors such as StandardScaler should be used to fit_transform the train set and only transform (not fit) the test set. I expect the same fit/transform process to apply to cross-validation when tuning the model. However, I found…
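The fit-on-train, transform-on-test pattern described above can be sketched as follows when the scaler sits inside a `Pipeline` that is passed to `GridSearchCV`. The synthetic dataset, the step names `scaler` and `clf`, and the `C` grid are assumptions for illustration. During each cross-validation split the whole pipeline is cloned and fitted on the training folds only, so the scaler never sees the held-out fold when it is fitted.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption, for illustration only).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# The scaler is a pipeline step, so it is re-fitted inside every CV split.
pipe = Pipeline(
    [
        ("scaler", StandardScaler()),
        ("clf", LogisticRegression(max_iter=1000)),
    ]
)

# Parameters of a step are addressed as "<step name>__<parameter>".
param_grid = {"clf__C": [0.1, 1.0, 10.0]}

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)
```

By contrast, calling `StandardScaler().fit_transform(X)` on the full data before the search would let statistics from the validation folds leak into training.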
7 votes, 1 answer

Grid Search for Keras with multiple inputs

I am trying to do a grid search over my hyperparameters for tuning a deep learning architecture. I have multiple input options to the model and I am trying to use sklearn's grid search API. The problem is that the grid search API only takes a single array as…
Biswadip Mandal
7 votes, 3 answers

Random Forest tuning with RandomizedSearchCV

I have a few questions concerning Randomized grid search in a Random Forest Regression Model. My parameter grid looks like this: random_grid = {'bootstrap': [True, False], 'max_depth': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110,…
raffa_sa