Are there any feature selection methods in Scikit-Learn (or algorithms in general) that give weights for an attribute's ability/predictive capacity/importance to predict a specific target? For example, using the iris dataset (`from sklearn.datasets import load_iris`), ranking the weights of each of the 4 attributes for predicting each of the 3 iris species separately, but for much more complex datasets with ~1k-10k attributes.
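To make the desired output concrete, here is a minimal sketch using one-vs-rest random forests on iris; the one-vs-rest forests are only a stand-in for whatever the proper method would be:

```python
# Sketch only: one binary "this species vs. the rest" forest per target,
# each yielding its own per-attribute weights. Not necessarily the right
# approach, just the shape of output I'm after.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
X, y = iris.data, iris.target

for class_idx, species in enumerate(iris.target_names):
    y_binary = (y == class_idx).astype(int)  # this species vs. everything else
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y_binary)
    print(species, dict(zip(iris.feature_names, clf.feature_importances_)))
```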
I'm looking for something analogous to the `feature_importances_` attribute of `RandomForestClassifier`. However, `RandomForestClassifier` gives weights to each attribute for the entire prediction process. The weights do not need to add up to one, but I want to find a way to relate a specific subset of attributes to a specific target.
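For reference, this is the global importance vector I am contrasting against; a plain `RandomForestClassifier` gives one weight per attribute for the whole 3-class problem, with no per-species breakdown:

```python
# One importance per attribute for the full multiclass problem.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(iris.data, iris.target)
for name, importance in zip(iris.feature_names, clf.feature_importances_):
    print(f"{name}: {importance:.3f}")
```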
First, I tried "overfitting" the models to enrich for a specific target, but the results didn't seem to change much between targets. Second, I went the ordination route and looked for the attributes with the greatest variation, but that doesn't translate directly into predictive capacity. Third, I tried sparse models, but ran into the same problem as with `feature_importances_`.
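For completeness, my sparse-model attempt looked roughly like this (the L1-penalized `LogisticRegression` and the `C` value are representative choices, not the exact code I ran):

```python
# One L1 (sparse) binary model per species; the nonzero coefficients
# are the attributes "selected" for that target.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

iris = load_iris()
X = StandardScaler().fit_transform(iris.data)

for class_idx, species in enumerate(iris.target_names):
    y_binary = (iris.target == class_idx).astype(int)
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y_binary)
    print(species, dict(zip(iris.feature_names, clf.coef_[0])))
```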
A link to an example or tutorial that does exactly this would be sufficient. Possibly a tutorial on how to traverse the decision trees in a random forest and store the nodes that are predictive of specific targets.
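In case it helps frame an answer, here is how far I got traversing the fitted trees through their `tree_` attributes; the "majority-class leaf" heuristic is my own guess at what "predictive of a specific target" should mean:

```python
# Walk every tree in a fitted forest and count, per target class, the
# features tested on root-to-leaf paths whose leaf's majority class is
# that target. The attribution heuristic is an assumption on my part.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
forest = RandomForestClassifier(n_estimators=10, random_state=0).fit(iris.data, iris.target)

def features_predicting(tree, class_idx, node=0, path=()):
    """Yield the feature-index sets of paths ending in leaves whose
    majority class is class_idx."""
    left, right = tree.children_left[node], tree.children_right[node]
    if left == -1:  # scikit-learn marks leaves with children == -1
        if np.argmax(tree.value[node][0]) == class_idx:
            yield set(path)
        return
    path = path + (tree.feature[node],)  # feature tested at this node
    yield from features_predicting(tree, class_idx, left, path)
    yield from features_predicting(tree, class_idx, right, path)

for class_idx, species in enumerate(iris.target_names):
    counts = np.zeros(iris.data.shape[1])
    for estimator in forest.estimators_:
        for feats in features_predicting(estimator.tree_, class_idx):
            for f in feats:
                counts[f] += 1
    print(species, dict(zip(iris.feature_names, counts)))
```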