
Let's assume that I have data with 1000 features. I want to apply SVM-RFE to this data, removing 10% of the features at each step. How can I get the accuracy at every level of the elimination? For example, I want the performance with 1000 features, 900 features, 800 features, ..., 2 features, and 1 feature.

Also, I want to keep track of the features selected at each level.

Venkatachalam
  • Please edit your question to include some code. What have you tried so far, and how did you start? – PV8 Nov 19 '19 at 09:32

1 Answer


The current scikit-learn implementation of RFE doesn't score the model or store the feature set at each iteration.

You may be able to get the per-step scores through the private _fit method, which accepts a scoring callback and is intended for use by the RFECV class.

>>> from sklearn.datasets import make_friedman1
>>> from sklearn.feature_selection import RFE
>>> from sklearn.metrics import check_scoring
>>> from sklearn.svm import SVR
>>> # Private helper; its signature may differ between scikit-learn versions.
>>> from sklearn.model_selection._validation import _score
>>> X, y = make_friedman1(n_samples=50, n_features=10, random_state=0)
>>> estimator = SVR(kernel="linear")
>>> selector = RFE(estimator, n_features_to_select=5, step=1)
>>> scorer = check_scoring(estimator, scoring='r2')
>>> # _fit is private too; the callback scores the model on the surviving features at each step.
>>> selector._fit(
...         X, y, lambda estimator, features:
...         _score(estimator, X[:, features], y, scorer))
RFE(estimator=SVR(C=1.0, cache_size=200, coef0=0.0, degree=3, epsilon=0.1,
                  gamma='scale', kernel='linear', max_iter=-1, shrinking=True,
                  tol=0.001, verbose=False),
    n_features_to_select=5, step=1, verbose=0)

>>> selector.scores_
[0.6752027280057595, 0.6750531506827873, 0.6722333425078437, 0.6684835939207456, 0.6669024507875724, 0.6751247326304468]
>>> selector.ranking_
array([1, 1, 1, 1, 1, 6, 4, 3, 2, 5])

If you also want to retrieve the feature set at each level/iteration, you would need to edit RFE's fit method.

Another option is to iterate on top of RFE yourself and store the feature set and performance at each step.
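
A minimal sketch of such a manual loop, assuming a binary classification task, a linear SVC, and 3-fold cross-validated accuracy as the performance measure. The helper name rfe_with_history, its drop_fraction and cv parameters, and the synthetic data are illustrative choices, not part of scikit-learn. It removes 10% of the remaining features (at least one) per level; to get the 1000, 900, 800, ... schedule instead, n_drop could be fixed to 10% of the original feature count.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC


def rfe_with_history(X, y, drop_fraction=0.1, cv=3):
    """Run SVM-RFE one level at a time, recording (feature indices, CV accuracy)."""
    remaining = np.arange(X.shape[1])  # original indices of features still in play
    history = []
    while True:
        estimator = SVC(kernel="linear")
        # Score the current feature subset with cross-validated accuracy.
        acc = cross_val_score(
            estimator, X[:, remaining], y, cv=cv, scoring="accuracy"
        ).mean()
        history.append((remaining.copy(), acc))
        if remaining.size == 1:
            break
        # Drop a fraction of the remaining features, but always at least one.
        n_drop = max(1, int(drop_fraction * remaining.size))
        # A single RFE elimination step: fit once, discard the n_drop weakest.
        selector = RFE(
            estimator,
            n_features_to_select=remaining.size - n_drop,
            step=n_drop,
        )
        selector.fit(X[:, remaining], y)
        remaining = remaining[selector.support_]
    return history


# Illustrative synthetic data (assumption: stands in for the real 1000-feature set).
X, y = make_classification(
    n_samples=60, n_features=20, n_informative=5, random_state=0
)
history = rfe_with_history(X, y)
for features, acc in history:
    print(len(features), round(acc, 3))
```

Each entry of history holds the original column indices that survived to that level together with the mean accuracy, so both the feature sets and the performance curve are available afterwards. Note that scoring on the same data used for elimination gives an optimistic estimate; nesting the whole loop inside an outer cross-validation would be more rigorous.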

Venkatachalam