I am currently reading this scikit-learn example, https://scikit-learn.org/stable/auto_examples/inspection/plot_permutation_importance_multicollinear.html#handling-multicollinear-features, in order to handle multicollinearity in a dataset. It says: "Next, we manually pick a threshold by visual inspection of the dendrogram to group our features into clusters and choose a feature from each cluster to keep, select those features from our dataset, and train a new random forest." I am not sure how to pick this threshold for a different dataset. Is there a default value that should always work, or do I need to learn how to interpret the dendrogram? Or is there a Python implementation that chooses the threshold automatically?
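For context, here is a minimal sketch of the step quoted above, as I understand it: cluster the features hierarchically on their Spearman rank correlations, cut the tree at a distance threshold, and keep one feature per cluster. The function name `select_one_feature_per_cluster`, the synthetic data, and the threshold value `0.5` are my own assumptions for illustration, not taken from the documentation. The `threshold` argument is exactly the number I do not know how to choose.

```python
from collections import defaultdict

import numpy as np
from scipy.cluster import hierarchy
from scipy.spatial.distance import squareform
from scipy.stats import spearmanr


def select_one_feature_per_cluster(X, threshold):
    """Return sorted column indices, one per cluster of correlated features.

    Hypothetical helper: `threshold` is the dendrogram height at which
    the tree is cut, i.e. the value normally picked by eye.
    """
    # Spearman rank correlation between all pairs of columns.
    corr = spearmanr(X).correlation
    # Force exact symmetry and a unit diagonal before converting to distances.
    corr = (corr + corr.T) / 2
    np.fill_diagonal(corr, 1)
    # Strongly correlated features (positive or negative) get a small distance.
    distance_matrix = 1 - np.abs(corr)
    # Ward linkage on the condensed form of the distance matrix.
    linkage = hierarchy.ward(squareform(distance_matrix))
    # Cut the tree: merges above `threshold` are not applied.
    cluster_ids = hierarchy.fcluster(linkage, threshold, criterion="distance")

    # Group column indices by cluster id.
    clusters = defaultdict(list)
    for feature_idx, cluster_id in enumerate(cluster_ids):
        clusters[cluster_id].append(feature_idx)
    # Keep the first (lowest-index) feature of each cluster.
    return sorted(members[0] for members in clusters.values())


# Synthetic data (an assumption for illustration): three independent signals,
# each followed by a near-duplicate column.
rng = np.random.default_rng(0)
base = rng.normal(size=(500, 3))
noisy_copies = base + 0.01 * rng.normal(size=(500, 3))
X = np.column_stack(
    [base[:, 0], noisy_copies[:, 0],
     base[:, 1], noisy_copies[:, 1],
     base[:, 2], noisy_copies[:, 2]]
)

selected = select_one_feature_per_cluster(X, threshold=0.5)
print(selected)
```

On toy data like this the clusters are obvious, so almost any threshold between the within-pair and between-pair distances works. On a real dataset the gap is much less clear, which is why I am asking.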
Thank you!