I tried to reproduce your case using the iris dataset, which has 4 features; you can then extend the experiment to the diabetes dataset, which has 10 features and is closer to your data (a sketch of that extension follows the iris example below):
#Read and load the sample dataset
from sklearn.datasets import load_iris
iris = load_iris()
X, y = iris.data, iris.target
#Data Set Characteristics
#print(iris.DESCR)
#Fit SVC to data and extract results of feature selection
from sklearn.feature_selection import SelectFromModel
from sklearn.svm import SVC
from time import time
tic = time()
selector = SelectFromModel(estimator=SVC(kernel='linear')).fit(X, y)
toc = time()
#Plot the results
import matplotlib.pyplot as plt
import numpy as np
importance = np.abs(selector.estimator_.coef_)
feature_names = np.array(iris.feature_names)
print(f"Features selected by SelectFromModel: {feature_names[selector.get_support()]}")
#-->Features selected by SelectFromModel: ['petal length (cm)' 'petal width (cm)']
print(f"Done in {toc - tic:.3f}s") #-->Done in 0.002s
#print(X.shape) #-->(150, 4)
#print(selector.transform(X).shape) #-->(150, 2)
plt.bar(height=importance[1], x=feature_names) #plots one row (one pairwise classifier) of the |coef_| matrix
plt.xticks(rotation=45, ha='right')
plt.title("Feature importances via coefficients")
plt.show()
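As a rough sketch of the extension mentioned at the top: diabetes is a regression dataset, so SVC does not apply there; this example substitutes LassoCV (my choice, not part of the original setup) as the estimator inside SelectFromModel:

#Sketch, not part of the original answer: same idea on the diabetes dataset (10 features)
#diabetes has a regression target, so LassoCV is substituted for SVC here
from sklearn.datasets import load_diabetes
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LassoCV
import numpy as np

diabetes = load_diabetes()
X_d, y_d = diabetes.data, diabetes.target
selector_d = SelectFromModel(estimator=LassoCV()).fit(X_d, y_d)
feature_names_d = np.array(diabetes.feature_names)
print("Selected features:", feature_names_d[selector_d.get_support()])
print("Importances (|coef_|):", np.abs(selector_d.estimator_.coef_))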
This post inspired me to reflect the results in a plot and to inspect the coefficients:

print(selector.threshold_)
#-->2.1645593987132914
print(selector.estimator_.coef_)
#-->[[-0.04625854  0.5211828  -1.00304462 -0.46412978]
#    [-0.00722313  0.17894121 -0.53836459 -0.29239263]
#    [ 0.59549776  0.9739003  -2.03099958 -2.00630267]]
print(selector.estimator_.coef_.shape) #-->(3, 4)
Why does sklearn SelectFromModel estimator_.coef_ return a 2d-array?
If you check the input and the output of the selector model, both are 2-dimensional arrays. The same holds for selector.estimator_.coef_: with a linear-kernel SVC on a 3-class problem, coef_ holds one row of weights per pairwise (one-vs-one) classifier, and SelectFromModel, as a meta-transformer, aggregates these importance weights per feature (column) before selecting.
print(X.ndim) #-->2
print(selector.estimator_.coef_.ndim) #-->2
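If it helps, here is how I understand the way SelectFromModel collapses that 2d coef_ into one importance per feature; to the best of my understanding this mirrors the sklearn source (a column-wise L1 norm over coef_, and a "mean" threshold for an estimator without an L1 penalty), though the internals may differ between versions:

#Sketch of the aggregation: coef_ has one row per one-vs-one pairwise classifier
#(3 classes -> 3*(3-1)/2 = 3 rows) and one column per feature (4 columns)
importances = np.linalg.norm(selector.estimator_.coef_, axis=0, ord=1)
print(importances)                          #one importance value per feature
print(importances.mean())                   #-->2.1645..., matches selector.threshold_ above
print(importances >= selector.threshold_)   #-->[False False True True], matches get_support()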
Also, based on this post, you can check whether the coefficient matrix contains a zero vector, or simply use get_support()
to return a boolean array marking the selection of each feature:
X_new = selector.transform(X)
print(X_new.shape) #-->(150, 2)
print(selector.get_support()) #-->[False False True True]
By the way, what's the difference between SelectKBest and SelectFromModel for feature selection in sklearn?
Based on the documentation,
SelectKBest() selects features according to the k
highest scores: it "removes all but the k
highest scoring features" (Reference). It is commonly used with chi2() to "compute chi-squared stats between each non-negative feature and class". Here is a good starting point for the comparison; a minimal side-by-side sketch follows below.
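For a quick side-by-side on the same iris data, here is a minimal sketch using SelectKBest with chi2 (k=2 is my arbitrary choice so the result is comparable to the two features kept above):

#Sketch: univariate selection with SelectKBest + chi2 on the same data
from sklearn.feature_selection import SelectKBest, chi2

kbest = SelectKBest(score_func=chi2, k=2).fit(X, y)
print("chi2 scores:", kbest.scores_)
print("SelectKBest picks:", feature_names[kbest.get_support()])
#SelectKBest scores each feature on its own (univariate), while SelectFromModel
#relies on the fitted estimator's coef_/feature_importances_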
Edit:
I also found a workaround here that uses regularisation to remove non-important features from the dataset, so I played with coef_
and updated the post:
X_selected = selector.transform(X)
#print(X_selected)
features = np.array(iris.feature_names)
print("All features:", features)
#All features: ['sepal length (cm)' 'sepal width (cm)' 'petal length (cm)' 'petal width (cm)']
status = selector.get_support()
print("Selected features:", features[status])
#Selected features: ['petal length (cm)' 'petal width (cm)']
# note here the absolute transformation before the mean
print("absolute transformation before the mean:", abs(selector.estimator_.coef_).mean()*1.25)
#absolute transformation before the mean: 1.3025301767884683
print('features with coefficients shrank to zero: {}'.format(np.sum(selector.estimator_.coef_ == 0)))
#features with coefficients shrank to zero: 0
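To actually see coefficients being shrunk to zero, an L1-penalised linear model is needed; the sketch below swaps in LinearSVC with penalty='l1' (my substitution, since the SVC(kernel='linear') used above applies an L2 penalty that keeps all coefficients non-zero), and C=0.05 is just an illustrative value:

#Sketch: with an L1 penalty, unimportant coefficients are driven exactly to zero
from sklearn.svm import LinearSVC

l1_selector = SelectFromModel(
    estimator=LinearSVC(C=0.05, penalty='l1', dual=False, max_iter=10000)
).fit(X, y)
print("Selected features:", feature_names[l1_selector.get_support()])
print("Coefficients shrunk to zero:", np.sum(l1_selector.estimator_.coef_ == 0))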