
I am trying to estimate a cluster dendrogram in R for a structural topic model I produced with 98 topics.

I first ran the following, which worked well:

res.hc <- eclust(scale(out_corr$cor), "hclust", nboot = 500)

I then attempted to visualize the dendrogram using the following syntax:

fviz_dend(res.hc, rect = TRUE)

Here, I received the following error: Error in .rect_dendrogram(dend, k = k, palette = rect_border, rect_fill = rect_fill, : k must be between 2 and 97

Is this because the number of topics in my model is 98? If so, is there a way to still visualize the dendrogram without reducing my topics to 97?

Thank you!

Tal Galili
md_14
  • UPDATE: I tried running the same cluster analysis with 97 topics, and now the error says k must be between 2 and 96 [Error in .rect_dendrogram(dend, k = k, palette = rect_border, rect_fill = rect_fill, : k must be between 2 and 96]. Does anyone know the maximum number of topics that can be clustered and/or how to get around this? – md_14 Apr 02 '22 at 02:50
  • Could you please try to recreate this with dendextend::rect.dendrogram and see which errors you are getting? If you can create a self contained example of the error that would be super helpful in helping to debug this. – Tal Galili Apr 02 '22 at 20:37
  • I will give it a try and see whether it clarifies things. Thank you! – md_14 Apr 05 '22 at 21:49
  • It worked. Thank you so much! I will post the solution shortly. – md_14 Apr 05 '22 at 22:50

1 Answer


The following steps helped to resolve the issue:

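  1. Estimate the cluster dendrogram (eclust() comes from the factoextra package):
res.hc <- eclust(scale(out_corr$cor), "hclust", nboot = 500)
  2. Install and load dendextend:
install.packages("dendextend")
library(dendextend)
  3. Install and load dplyr (used here for the %>% pipe):
install.packages("dplyr")
library(dplyr)
  4. Convert the cluster estimate to a dendrogram:
dend <- as.dendrogram(res.hc)
  5. Color the labels and branches by cluster, with an explicit k = 4:
par(mar = c(1, 1, 1, 7))
dend %>%
  set("labels_col", value = c("skyblue", "red", "grey", "blue"), k = 4) %>%
  set("branches_k_color", value = c("skyblue", "red", "grey", "blue"), k = 4) %>%
  plot(horiz = FALSE, axes = FALSE)
# Dashed reference line. With horiz = FALSE the x-axis indexes the leaves,
# so a cut at a given height would be drawn with abline(h = ...) instead.
abline(v = 350, lty = 2)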
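
On the original question of whether 98 topics is too many: as far as I can tell, the upper bound in the error message simply tracks the data. A tree built from n items has n - 1 merge heights, which would explain why 98 topics gave "2 and 97" and 97 topics gave "2 and 96". So the message seems to describe the allowed range for k rather than a cap on the number of topics. Below is a minimal, self-contained sketch of the same workflow with an explicit k. The random matrix x is a hypothetical stand-in for scale(out_corr$cor), plain hclust() replaces eclust(), and k = 4 is chosen only for illustration.

```r
library(dendextend)

# Hypothetical stand-in data: 98 random rows in place of the 98 topics
set.seed(1)
x <- matrix(rnorm(98 * 5), nrow = 98)

# Hierarchical clustering on scaled data (base R, not eclust)
hc <- hclust(dist(scale(x)), method = "ward.D2")
dend <- as.dendrogram(hc)

# An explicit cluster count inside the valid range 2..(n - 1)
k <- 4

# Color the branches by cluster and draw rectangles around the k clusters
dend <- color_branches(dend, k = k)
plot(dend)
rect.dendrogram(dend, k = k, border = "grey40")
```

fviz_dend() also takes a k argument, so passing a value explicitly there (for example k = 4 together with rect = TRUE) may avoid the error as well. I have not verified that against the original model.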
md_14