I would like to get a DataFrame of important features. With the code below I obtained the shap_values, but I am not sure what the values mean. My df has 142 features and 67 experiments, but I got an array with ca. 2500 values.
import shap
explainer = shap.TreeExplainer(rf)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test, plot_type="bar")
I have tried to store them in a df:
rf_resultX = pd.DataFrame(shap_values, columns = ['shap_values'])
but got: ValueError: Shape of passed values is (18, 142), indices imply (18, 1)
142 is the number of features. I have no idea where the 18 comes from.
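To make the shape problem concrete, here is a minimal sketch that uses random numbers in place of the real shap_values, with the shape taken from the error message. The names feature_0 ... feature_141 are made up for illustration. Since 18 x 142 = 2556 matches the "ca. 2500 values", the 18 might simply be the number of rows in X_test, but that is an assumption I have not confirmed.

```python
import numpy as np
import pandas as pd

# Stand-in data with the shape from the error message (not real SHAP output).
n_rows, n_features = 18, 142
rng = np.random.default_rng(0)
feature_names = [f"feature_{i}" for i in range(n_features)]  # hypothetical names
X_test = pd.DataFrame(rng.normal(size=(n_rows, n_features)), columns=feature_names)
shap_values = rng.normal(size=(n_rows, n_features))  # placeholder for explainer.shap_values(X_test)

# A 2-D array needs one column name per feature, not a single 'shap_values' label.
shap_df = pd.DataFrame(shap_values, columns=X_test.columns, index=X_test.index)

print(shap_df.shape)     # (18, 142)
print(shap_values.size)  # 2556
```

With 142 column names the constructor no longer raises the ValueError, and each row then holds the values for one row of X_test.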
I believe it works as follows:
- shap_values need to be averaged.
- and paired with the feature names: pd.DataFrame(feature_names, columns = ['feature_names'])
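A minimal sketch of these two steps, again with random numbers standing in for the real shap_values and made-up feature names. I am assuming that "averaged" should mean the mean of the absolute values per feature, because a plain mean would let positive and negative values cancel out. The column name mean_abs_shap is my own choice.

```python
import numpy as np
import pandas as pd

# Stand-in data (not real SHAP output): 18 rows, 142 features.
n_rows, n_features = 18, 142
rng = np.random.default_rng(0)
feature_names = [f"feature_{i}" for i in range(n_features)]  # hypothetical names
shap_values = rng.normal(size=(n_rows, n_features))  # placeholder for explainer.shap_values(X_test)

# Step 1: average over the rows, using absolute values so signs do not cancel.
mean_abs_shap = np.abs(shap_values).mean(axis=0)

# Step 2: pair each average with its feature name and sort by importance.
importance_df = (
    pd.DataFrame({"feature_names": feature_names, "mean_abs_shap": mean_abs_shap})
    .sort_values("mean_abs_shap", ascending=False)
    .reset_index(drop=True)
)

print(importance_df.head())
```

With the real data, feature_names would presumably be X_test.columns, and the result would have one row per feature (142 rows).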
Does anybody have experience with interpreting shap_values? At first I thought that the number of values would be the number of features x the number of rows.