Questions tagged [grid-search]

In machine learning, grid search refers to running a model repeatedly over a grid of candidate parameter/hyperparameter values to find the combination that performs best, e.g. mtry for random forests; alpha, beta, and lambda for glm; or C, kernel, and gamma for SVM.

865 questions
7 votes, 2 answers

Fitting sklearn GridSearchCV model

I am trying to solve a regression problem on the Boston dataset with the help of a random forest regressor. I was using GridSearchCV to select the best hyperparameters. Problem 1: Should I fit the GridSearchCV on some X_train, y_train and then get the best…
Rookie_123 • 1,975 • 3 • 15 • 33
7 votes, 1 answer

What does rank_test_score stand for from the model.cv_results_?

After I have built a model with GridSearchCV, I get the cross-validation results with model.cv_results_. But among the results, one entry is confusing to me. What does rank_test_score stand for in this? mean_fit_time …
JimminyCricket • 371 • 3 • 14
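
In short, rank_test_score ranks the parameter settings by their mean cross-validated test score, with rank 1 for the best setting (ties share a rank). A quick way to see it on a toy grid:

    import pandas as pd
    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    search = GridSearchCV(KNeighborsClassifier(), {"n_neighbors": [1, 5, 15]}, cv=5)
    search.fit(X, y)

    # rank 1 corresponds to the highest mean_test_score
    df = pd.DataFrame(search.cv_results_)
    print(df[["params", "mean_test_score", "rank_test_score"]]
          .sort_values("rank_test_score"))
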
7 votes, 1 answer

How to get all the models (one for each set of parameters) using GridSearchCV?

From my understanding: best_estimator_ provides the estimator with the highest score; best_score_ provides the score of the selected estimator; cv_results_ may be exploited to get the scores of all estimators. However, it is not clear to me how to get…
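
GridSearchCV itself only retains the refitted best estimator, so getting one model per setting means refitting each candidate yourself. One way, assuming you iterate the same grid with ParameterGrid:

    from sklearn.base import clone
    from sklearn.datasets import load_iris
    from sklearn.model_selection import ParameterGrid
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    base = DecisionTreeClassifier(random_state=0)

    # one fitted model per parameter setting, mirroring what the search tried
    models = []
    for params in ParameterGrid({"max_depth": [2, 4, 8]}):
        est = clone(base).set_params(**params).fit(X, y)
        models.append((params, est))
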
7 votes, 1 answer

GridsearchCV: can't pickle function error when trying to pass lambda in parameter

I have looked quite extensively on stackoverflow and elsewhere and I can't seem to find an answer to the problem below. I am trying to modify a parameter of a function that is itself a parameter inside the GridSearchCV function of sklearn. More…
Eric F • 327 • 2 • 11
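
The usual culprit is that plain pickle, which parallel grid search relies on, cannot serialize lambdas; a module-level function, optionally bound with functools.partial, is the standard workaround. A sketch with an illustrative tokenizer (names and data are made up):

    from functools import partial
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline

    def my_tokenizer(text, lowercase=True):     # picklable, unlike a lambda
        return text.lower().split() if lowercase else text.split()

    pipe = Pipeline([
        ("tfidf", TfidfVectorizer(tokenizer=partial(my_tokenizer, lowercase=True))),
        ("clf", LogisticRegression()),
    ])
    search = GridSearchCV(pipe, {"clf__C": [0.1, 1.0]}, cv=2, n_jobs=2)
    search.fit(["spam spam", "ham ham", "spam eggs", "ham eggs"], [0, 1, 0, 1])
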
7 votes, 1 answer

Pipeline and GridSearch for Doc2Vec

I currently have the following script that helps to find the best model for a doc2vec model. It works like this: first train a few models based on the given parameters and then test them against a classifier. Finally, it outputs the best model and classifier (I…
Christopher • 2,120 • 7 • 31 • 58
7 votes, 1 answer

Get standard deviation for a GridSearchCV

Before scikit-learn 0.20 we could use result.grid_scores_[result.best_index_] to get the standard deviation. (It returned, for example: mean: 0.76172, std: 0.05225, params: {'n_neighbors': 21}) What's the best way in scikit-learn 0.20 to get the…
Neabfi • 4,411 • 3 • 32 • 42
7 votes, 4 answers

Python: Gridsearch Without Machine Learning?

I want to optimize an algorithm that has several variable parameters as input. For machine learning tasks, sklearn offers hyperparameter optimization with its grid search functionality. Is there a standardized way / library in Python that…
user9098929
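
No dedicated library is required: scikit-learn's ParameterGrid iterates any grid of keyword arguments, and itertools.product does the same with the standard library alone. A sketch with a made-up objective function:

    from sklearn.model_selection import ParameterGrid

    def objective(a, b):                 # hypothetical black-box function to tune
        return (a - 3) ** 2 + (b + 1) ** 2

    grid = ParameterGrid({"a": [1, 2, 3, 4], "b": [-2, -1, 0]})
    best = min(grid, key=lambda p: objective(**p))
    print(best)                          # {'a': 3, 'b': -1}
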
7 votes, 2 answers

Hyperparameter in Voting classifier

So, I have a classifier which looks like clf = VotingClassifier(estimators=[ ('nn', MLPClassifier()), ('gboost', GradientBoostingClassifier()), ('lr', LogisticRegression()), ], voting='soft') And I want to…
frazman • 32,081 • 75 • 184 • 269
7 votes, 1 answer

How does the 'max_samples' keyword for a Bagging classifier affect the number of samples being used for each of the base estimators?

I want to understand how the max_samples value for a Bagging classifier affects the number of samples used for each of the base estimators. This is the GridSearch output: GridSearchCV(cv=5, error_score='raise', …
hkhare • 225 • 3 • 10
7 votes, 1 answer

Grid search with f1 as scoring function, several pages of error message

I want to use grid search to find the best parameters and use f1 as the scoring metric. If I remove the scoring function, all works well and I get no errors. Here is my code: from sklearn import grid_search parameters =…
hmmmbob • 1,167 • 5 • 19 • 33
7 votes, 3 answers

Sklearn: Evaluate performance of each classifier of OneVsRestClassifier inside GridSearchCV

I am dealing with multi-label classification with OneVsRestClassifier and SVC, from sklearn.datasets import make_multilabel_classification from sklearn.multiclass import OneVsRestClassifier from sklearn.svm import SVC from sklearn.grid_search…
Francis • 6,416 • 5 • 24 • 32
7 votes, 3 answers

GridSearchCV scoring parameter: using scoring='f1' or scoring=None (by default uses accuracy) gives the same result

I'm using an example extracted from the book "Mastering Machine Learning with scikit-learn". It uses a decision tree to predict whether each of the images on a web page is an advertisement or article content. Images that are classified as being…
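
The scoring argument changes both which parameters win and what best_score_ means (classifiers default to accuracy). Running the same grid twice makes the difference, or its absence on a particular dataset, visible:

    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import GridSearchCV
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)   # binary, mildly imbalanced
    grid = {"max_depth": [1, 3, 5, None]}
    for scoring in (None, "f1"):                 # None falls back to accuracy
        s = GridSearchCV(DecisionTreeClassifier(random_state=0), grid,
                         scoring=scoring, cv=5).fit(X, y)
        print(scoring, s.best_params_, round(s.best_score_, 4))
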
6 votes, 1 answer

Why GridSearchCV model results are different than the model I manually tuned?

This is my first question ever here, so I hope I am doing this right. I was working on the Titanic dataset, which is popular on Kaggle (see this tutorial if you want to check: A Data Science Framework: To Achieve 99% Accuracy, part 5.2). It teaches how to…
6 votes, 1 answer

Combination of GridSearchCV's refit and scorer unclear

I use GridSearchCV to find the best parameters in the inner loop of my nested cross-validation. The 'inner winner' is found using GridSearchCV(scoring='balanced_accuracy'), so, as I understand the documentation, the model with the highest balanced…
Johannes Wiesner • 1,006 • 12 • 33
6 votes, 2 answers

Ridge Regression Grid Search with Pipeline

I am trying to optimize hyperparameters for ridge regression, but also add polynomial features. The pipeline looks okay, but I get an error when I try to run GridSearchCV. Here: # Importing the Libraries import numpy as np import pandas as pd import…
cepel • 93 • 1 • 9