I'm trying to customize a clustering plot using both base R functions and the package "dendextend". First, I generate a cluster with the usual hclust() function. Then I use "dendextend" to color the branches of the k groups, and plot(), points() and text() for the final customizations. To use the set() function of "dendextend" to color the branches, the cluster has to be converted from class "hclust" to class "dendrogram" with as.dendrogram(). When I call plot() on the dendrogram, the plot area of the figure seems to shrink, mainly below the x-axis. I've tried changing many par() arguments, but nothing works. Although "dendextend" provides a lot of customizations, it doesn't seem to allow rotating the labels of the cluster objects. The code and the figure follow. I'm not a native English speaker, so please ignore grammatical mistakes.

I'm using the dataset "env" from "Doubs.RData".

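library(dendextend)  # set(), %>%
library(scales)      # alpha(); assumed source, the original post loads no packages

# Standardize each variable, then cluster with Ward's method
env_scaled <- apply(env, 2, scale)
dist_env <- dist(env_scaled)
cluster_env <- hclust(dist_env, method = "ward.D2")

# color1 was not defined in the original post; assumed here to be one color per group
color1 <- c("blue4", "forestgreen", "firebrick")

cluster <- as.dendrogram(cluster_env) %>%
  set("branches_k_color", value = color1, k = 3) %>%
  set("labels", "") %>%
  set("hang_leaves", -0.1) %>%
  set("branches_lwd", 1.5)

color_cluster <- c(rep("blue4", 4), rep("forestgreen", 17), rep("firebrick", 9))

plot(cluster, ylab = "Euclidean distance", xlab = "", sub = "", main = "Cluster (Ward)")
points(1:30, rep(0, 30), col = alpha(color_cluster, 0.8),
       pch = c(rep(19, 4), rep(15, 17), rep(17, 9)), cex = 1.5)
text(1:30, rep(-0.5, 30), labels = cluster_env$order, cex = 0.8,
     col = color_cluster, font = 2)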

Customized cluster

The built-in USArrests dataset can be used to reproduce the example above.

dend <- USArrests %>%
  dist() %>%
  hclust(method = "ave") %>%
  as.dendrogram() %>% 
  set("labels", "")
d2 <- color_branches(dend, 5)
plot(d2)
text(seq(1:nrow(USArrests)), rep(-5, nrow(USArrests)), labels= 1:nrow(USArrests), cex=0.8, font=2)

Note that when the labels are placed at y = -5 with the text() function, they are not displayed in full.

  • I’m voting to close this question because it is about how to use R without a reproducible example. – gung - Reinstate Monica Feb 17 '22 at 01:23
  • Hi, the cluster graph above can be generated with any matrix (row x col) with the variables in the columns. But you can download the same dataset (env_Doubs) in the "Numerical Ecology With R" site (http://adn.biol.umontreal.ca/~numericalecology/numecolR/NEwR-2ed_code_data.zip) –  Feb 17 '22 at 13:34
  • Read the info at the linked thread, & then edit your Q to make it a reproducible example, then this would be an admissible Q. – gung - Reinstate Monica Feb 17 '22 at 13:45
  • It's unclear why this was migrated here, as Stack Overflow has just as strong MCVE requirements. – pppery Feb 19 '22 at 05:18

1 Answer

You need to use the mar argument in par.

E.g.:

library(dendextend)
# ?color_branches
par(mar=c(5 + 5,4,4,2) + 0.1)

dend <- USArrests %>%
  dist() %>%
  hclust(method = "ave") %>%
  as.dendrogram()
d2 <- color_branches(dend, 5)
plot(d2)

Adding 5 to the first element of mar (the bottom margin) gives the bottom part of the plot more space.
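If the labels are drawn manually with text() below the x-axis, a larger margin alone may not be enough, because base graphics clip drawing to the plot region by default. A minimal sketch, assuming the default labels are removed as in the question and that printing the row indices in leaf order is the intended labeling (the y position -5 is taken from the question, not tuned):

```r
library(dendextend)

# Built-in dataset, as in the example above
dend <- USArrests %>%
  dist() %>%
  hclust(method = "ave") %>%
  as.dendrogram() %>%
  set("labels", "")                 # drop the default (vertical) labels
d2 <- color_branches(dend, k = 5)

n <- nrow(USArrests)
leaf_order <- order.dendrogram(d2)  # row indices in left-to-right leaf order

old_par <- par(mar = c(5 + 5, 4, 4, 2) + 0.1)  # enlarge the bottom margin
plot(d2)
# xpd = TRUE allows text() to draw in the margin instead of being clipped
text(x = seq_len(n), y = -5, labels = leaf_order,
     cex = 0.8, font = 2, xpd = TRUE)
par(old_par)                        # restore the previous settings
```

Passing xpd = TRUE to text() only affects that call; setting par(xpd = TRUE) would instead apply to everything drawn afterwards.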

Tal Galili
  • Hi Tal, in the example above I presented a case in which setting the labels horizontally (las=1) would be a nice choice. So I removed the original labels with set("labels", "") before adding new labels with text(). I will edit the post to use the USArrests dataset, and you will see that adjusting par(mar) doesn't work. Thanks for dendextend, it is a powerful tool for customizing clusters; it would be nice if we could also set labels horizontally. – Vitorleo.ar Feb 22 '22 at 18:41