I want to explain NDVI trends at global scale and then map which feature was the most important at the pixel level. I use SHAP to compute the feature importances and can easily visualize the global ranking. Is there a way to disaggregate this global feature importance to the local (per-pixel) scale? I can see that there is shap.image_plot(shap_values), so it should be possible, but I am struggling to implement it for my example:
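import numpy as np
import shap
import xgboost as xgb
from sklearn.model_selection import train_test_split
# df holds one row per pixel: the NDVI target plus the predictor columns
Y = df['NDVI']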
X = df[['X1', 'X2', ...]]
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3, random_state=42)
# parameters previously identified from random search
params = {'subsample': 0.5, 'reg_lambda': 100, 'n_estimators': 2129, 'min_child_weight': 1.0, 'max_depth': 11, 'learning_rate': 0.03375, 'gamma': 0.0, 'colsample_bytree': 0.4, 'colsample_bylevel': 0.5}
xgb_opt = xgb.XGBRegressor(**params).fit(X_train, y_train)
explainer = shap.TreeExplainer(xgb_opt)
shap_values = explainer.shap_values(X_test)
top_features = X_test.columns[np.argsort(np.abs(shap_values).mean(0))][::-1][0:10]
shap.summary_plot(shap_values, X_test, feature_names=X_test.columns, show=False, plot_type="bar", alpha=0.25, max_display=15)
What I want in the end is just an additional column in my original dataframe df that associates each row (pixel) with its most important feature (X1 or X2 or ...). Can anyone help with that?
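To make the goal concrete, here is a minimal sketch of the per-row step as I imagine it, not a confirmed solution. It assumes that "most important feature" means the feature with the largest absolute SHAP value in each row, and that shap_values is a 2-D array of shape (rows, features), which is what TreeExplainer returns for a regressor. The helper name dominant_feature and the column name top_feature are hypothetical, and the small demo matrix is made up purely for illustration.

```python
import numpy as np
import pandas as pd


def dominant_feature(shap_values, feature_names, index=None):
    """Return, for each row, the name of the feature with the largest |SHAP| value.

    Hypothetical helper: assumes shap_values has shape (n_rows, n_features)
    and feature_names has length n_features.
    """
    shap_values = np.asarray(shap_values)
    # Column position of the largest absolute contribution in every row
    top_idx = np.abs(shap_values).argmax(axis=1)
    # Translate column positions into feature names
    names = np.asarray(feature_names)[top_idx]
    return pd.Series(names, index=index, name="top_feature")


# Made-up SHAP matrix (3 pixels x 2 features), for illustration only
demo_shap = np.array([[0.2, -0.5],
                      [-0.9, 0.1],
                      [0.0, 0.3]])
demo = dominant_feature(demo_shap, ["X1", "X2"])
print(demo.tolist())  # ['X2', 'X1', 'X2']

# With the objects from my code above, this would presumably become
# (SHAP values computed for all of X, not only X_test, so every pixel is covered):
# df["top_feature"] = dominant_feature(explainer.shap_values(X), X.columns, index=X.index)
```

Passing index=X.index should keep the new column aligned with the rows of df. Note that the absolute value discards the sign, so this only says which feature contributed most per pixel, not whether it pushed NDVI up or down.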