Problem
The following warning is raised:
UserWarning: X has feature names, but LogisticRegression was fitted without feature names
The goal is to use RandomizedSearchCV to tune the parameters, then fit two models, one each for OneVsOneClassifier and OneVsRestClassifier, and compare the accuracy of each model using the tuned parameters from RandomizedSearchCV. Both models are fitted on the MNIST digit recognition dataset for multi-class classification.
I set up the hyperparameter tuning with RandomizedSearchCV using a simple grid, 'estimator__C': [0.1, 1, 100, 200], for LogisticRegression. For auditing, I print the search objects with their grid parameters. I then pass a scaled X_train object to the search and run the fit.
The code was running on a Kaggle P100 GPU. When I execute ovr_grid_param.fit() and ovo_grid_param.fit(), they finish running, and the next step produces the error shown below. By adding verbose=1 and error_score="raise" to the RandomizedSearchCV classifier, I determined that I needed to replace StandardScaler with MinMaxScaler.
Error
The error is that LogisticRegression was fitted without feature names.
/Users/matthew/opt/anaconda3/lib/python3.9/site-packages/sklearn/base.py:443:
UserWarning: X has feature names,
but LogisticRegression was fitted without feature names
Code
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# X_train and y_train come from the MNIST train split (not shown)
# First attempt: standardize the features (overwritten by MinMaxScaler below)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train.astype(np.float64))

ovr_model = OneVsRestClassifier(LogisticRegression())
ovo_model = OneVsOneClassifier(LogisticRegression())

param_grid = {
    'estimator__C': [0.1, 1, 100, 200]
}

ovr_grid_param = RandomizedSearchCV(ovr_model, param_grid, cv=5, n_jobs=8)
ovo_grid_param = RandomizedSearchCV(ovo_model, param_grid, cv=5, n_jobs=8)

# The searches are not fitted yet, so these print the search configuration
print("OneVsRestClassifier best params: ", ovr_grid_param)
print("OneVsOneClassifier best params: ", ovo_grid_param)

# Replacement scaling; fit_transform returns a NumPy array without column names
min_max_scaler = MinMaxScaler()
X_train_scaled = min_max_scaler.fit_transform(X_train)

### below code is the problem area **
ovr_grid_param.fit(X_train_scaled, y_train)
ovo_grid_param.fit(X_train_scaled, y_train)
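One possible way to keep the input type consistent between fit and predict is to put the scaler inside a Pipeline, so the search receives the same kind of object at every step. The following is only a sketch under assumptions: it uses scikit-learn's small 8x8 load_digits set as a stand-in for MNIST, a reduced sample size and cv=3 to keep it quick, and the step names scale and clf, which are made up for illustration.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Assumption: the 8x8 digits set stands in for MNIST, loaded as a DataFrame
X, y = load_digits(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=600, test_size=200, random_state=0, stratify=y
)

# The parameter path goes through the pipeline step, then the wrapper
param_distributions = {"clf__estimator__C": [0.1, 1, 100, 200]}

results = {}
for name, wrapper in [("ovr", OneVsRestClassifier), ("ovo", OneVsOneClassifier)]:
    pipe = Pipeline([
        ("scale", MinMaxScaler()),
        ("clf", wrapper(LogisticRegression(max_iter=1000))),
    ])
    search = RandomizedSearchCV(
        pipe, param_distributions, n_iter=4, cv=3, random_state=0
    )
    # DataFrames go in at fit and at score, so the scaler sees the same columns
    search.fit(X_train, y_train)
    results[name] = (search.best_params_, search.score(X_test, y_test))
    print(name, results[name])
```

With this layout the scaler is refitted on each cross-validation training fold, and the test data is scaled by the pipeline itself rather than by a separate transform call.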
Data
The digit recognition MNIST dataset.
X_train scaled data
array([[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]])
y_train data
33092 5
30563 2
17064 4
16679 9
30712 0
30177 0
11735 3
1785 8
4382 3
21702 7
37516 3
9476 6
4893 5
22117 0
12646 8
RandomizedSearchCV execution results
OneVsRestClassifier best params: RandomizedSearchCV(cv=5, error_score='raise',
estimator=OneVsRestClassifier(estimator=LogisticRegression()),
n_jobs=3,
param_distributions={'estimator__C': [0.1, 1, 100, 200],
'estimator__max_iter': [2500, 4500,
6500, 9500,
14000]})
OneVsOneClassifier best params: RandomizedSearchCV(cv=5, error_score='raise',
estimator=OneVsOneClassifier(estimator=LogisticRegression()),
n_jobs=3,
param_distributions={'estimator__C': [0.1, 1, 100, 200],
'estimator__max_iter': [2500, 4500,
6500, 9500,
14000]})
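The output above is the printed representation of the search objects themselves, not the tuned values. After fit, the selected parameters are available through the best_params_ and best_score_ attributes. A small sketch, assuming the 8x8 load_digits set as a stand-in for MNIST and a reduced sample to keep it fast:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV
from sklearn.multiclass import OneVsRestClassifier

# Assumption: a 300-sample slice of the 8x8 digits set stands in for MNIST
X, y = load_digits(return_X_y=True)
X, y = X[:300] / 16.0, y[:300]  # pixel values in this set range from 0 to 16

search = RandomizedSearchCV(
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),
    {"estimator__C": [0.1, 1, 100, 200]},
    n_iter=4,
    cv=3,
    random_state=0,
)

# Before fit there is nothing to report yet
had_params_before_fit = hasattr(search, "best_params_")
print("best_params_ available before fit:", had_params_before_fit)

search.fit(X, y)

# After fit, the chosen value of C and its mean cross-validated accuracy
print("best params:", search.best_params_)
print("best CV score:", search.best_score_)
```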