I have several datasets, each with 4 features and between 100 and 300 observations. I would like to use them for a classification task in which the target variable has 3 possible labels. I have trained a Random Forest and, because interpreting the result and selecting features matter more to me than the predictive performance itself, I have also calculated SHAP values.
I felt comfortable using them, but I fear that the model is too simple for such an advanced XAI method. Since I am still a beginner with ML, I would like to ask for your opinion. Would you suggest a different model, or a different approach to explain the model and select the most important features? Thanks a lot in advance.
EDIT: Here are some more details about my problem. I applied a cluster analysis and identified three clusters in the data. The dataset also has other features, but I performed the cluster analysis on only two numerical features. It is important that only these two features are used, because they lead to a result that is easily understood by the people who will use this analysis. Now I want to figure out why these three classes exist. I have therefore fitted a random forest in which the cluster assignment is the dependent variable and the remaining features are the independent variables. By looking at the predictive ability of the random forest and at the SHAP values, I hope to explain which variables are important in predicting the class, and thus why the three classes exist. Do you think this approach is reasonable?
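To make the workflow concrete, here is a minimal sketch of what I mean. It runs on synthetic data, not my real data. Everything in it is an assumption made for illustration: the variable names, KMeans as the clustering method, four remaining features, and the way the first of them is tied to the groups. To keep the example dependent on scikit-learn only, it uses permutation importance as a stand-in for the explanation step. SHAP values (for example via `shap.TreeExplainer`) would be computed on the same fitted model.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(0)
n = 200  # hypothetical sample size, within the 100-300 range

# Two numerical features used ONLY for clustering (synthetic, hypothetical).
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
true_group = rng.integers(0, 3, size=n)
X_cluster = centers[true_group] + rng.normal(scale=0.7, size=(n, 2))

# Four remaining features (assumed count). By construction, only the first
# one is related to the groups; the other three are pure noise.
X_rest = rng.normal(size=(n, 4))
X_rest[:, 0] += 1.5 * true_group

# Step 1: cluster on the two features to obtain three class labels.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_cluster)

# Step 2: predict the cluster label from the remaining features only.
rf = RandomForestClassifier(n_estimators=300, random_state=0)
cv_scores = cross_val_score(rf, X_rest, labels, cv=5)
print("5-fold CV accuracy:", cv_scores.mean())

# Step 3: explain the fitted model on held-out data.
X_train, X_test, y_train, y_test = train_test_split(
    X_rest, labels, test_size=0.3, random_state=0, stratify=labels
)
rf.fit(X_train, y_train)
perm = permutation_importance(rf, X_test, y_test, n_repeats=30, random_state=0)

# Feature indices ordered from most to least important.
ranking = np.argsort(perm.importances_mean)[::-1]
print("Importance ranking (feature indices):", ranking)
```

In this toy setup the importances are only meaningful because the forest predicts the cluster labels better than chance. I assume the same would hold for SHAP values on my real data.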