Initially I was trying to add a horizontal color side bar to the dendrogram plot (NOT to the whole heat map) using colored_bars from dendextend.

The code below (THANK YOU for your help, Tal!) works pretty well. The only remaining issue is how to control the distance of the bar from the leaf labels, and the bar width.

Here is an example with data and code.
Data (4 variables, 5 cases)

df <- read.table(header=T, text="group class v1 v2 
1          A         1          3.98         23.2  
2          A         2          5.37         18.5  
3          C         1          4.73         22.1  
4          B         1          4.17         22.3  
5          C         2          4.47         22.4  
") 

library(dendextend)

car_type <- factor(df[, 1])               # group codes (A, B, C)
cols_4 <- heat.colors(nlevels(car_type))  # one color per group
col_car_type <- cols_4[car_type]          # one color per case, in data order
mat <- data.matrix(df[, c(3, 4)])         # numeric variables v1 and v2
rownames(mat) <- df[, 2]                  # label the cases by class
# hclust() alone returns an "hclust" object, which makes order.dendrogram() fail with
# "'order.dendrogram' requires a dendrogram", so convert it with as.dendrogram()
dend <- as.dendrogram(hclust(dist(mat)))
labels_colors(dend) <- col_car_type[order.dendrogram(dend)]  # color the leaf labels by group
plot(dend)
colored_bars(col_car_type, dend)
AussieAndy
  • Hi AussieAndy, thank you for the post. I see now that there is an issue with the function, which I will fix in the next week. In the meantime, could you please update your question with a small self-contained example of a dendrogram and groups, so that I can demonstrate the function for you once it works? Thanks. – Tal Galili Dec 31 '15 at 06:47
  • Thank you in advance! I will try to keep it simple and use a parallel example, as my data set has 700+ variables and 200+ cases. Let's use the mtcars data. What I am trying to do is create a dendrogram (NOT the heat map) of mpg, qsec, cyl, gear for all the cars (particular models, e.g. Merc 280C, Merc 240D, etc.) and then have a horizontal bar below with different colors for Mercedes, Mazda, Toyota, etc., so you can quickly see that e.g. all Mercedes cars cluster together. – AussieAndy Dec 31 '15 at 07:34
  • I know that I can create a column with the labels Mercedes, Mazda, Toyota and apply it to the dendrogram, but that will not allow a quick assessment (keep in mind I have over 200 cases ("cars")). My apologies if this is not exactly what you expected! – AussieAndy Dec 31 '15 at 07:38
  • I fixed the function to work better. Please let me know if it helps :) – Tal Galili Jan 01 '16 at 19:05
  • Thank you for your help! As I have a separate variable with the codes for the groups I would like to use in the side bar, I was hoping for something like this: – AussieAndy Jan 04 '16 at 01:38
  • `car_type <- factor(data[,c(1)])` # group codes (Japan, Germany, USA, Italy, Sweden) are already in variable/column 1, no need to create them. `n_car_types <- length(unique(car_type))`; `cols_4 <- heat.colors`; `col_car_type <- cols_4[car_type]` # I am getting an error here: "Error in cols_4[car_type] : object of type 'closure' is not subsettable", and of course cannot go further. I thought the rest would be: `labels_colors(dend) <- col_car_type[order.dendrogram(dend)]`; `plot(dend)`; `colored_bars(col_car_type, dend)` – AussieAndy Jan 04 '16 at 01:39
  • You are not talking about mtcars (that data doesn't have the column you speak of). You are probably talking about the cars data from here: http://www.stat.berkeley.edu/~s133/Cluster2a.html But their link to the data is broken, so I can't reproduce your example. If you want me to help you debug, I must have a fully reproducible example (please include it in your original question and not in the comments). – Tal Galili Jan 04 '16 at 09:02
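
The "object of type 'closure' is not subsettable" error quoted in the comments most likely comes from assigning the function `heat.colors` itself rather than calling it. A minimal sketch of the intended step, using made-up group codes in place of the asker's data (the values of `car_type` here are hypothetical):

```r
# Hypothetical group codes standing in for column 1 of the asker's data
car_type <- factor(c("Japan", "Germany", "USA", "Japan"))
n_car_types <- nlevels(car_type)

# heat.colors is a function: call it to get a palette, one color per group
cols <- heat.colors(n_car_types)

# Indexing the palette by the factor gives one color per case
col_car_type <- cols[car_type]
col_car_type
```

Indexing with a factor uses its integer level codes, so cases in the same group always receive the same color.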

1 Answer

This is possible to do using dendextend.

First, to install the latest dendextend version, you can use:

install.packages("dendextend")

Here is an example using mtcars:

## mtcars example
library(dendextend)

# Create the dend:
dend <- as.dendrogram(hclust(dist(mtcars)))

# Create a vector giving each car a color according to the company it belongs to
car_type <- rep("Other", length(rownames(mtcars)))
is_x <- grepl("Merc", rownames(mtcars))
car_type[is_x] <- "Mercedes"
is_x <- grepl("Mazda", rownames(mtcars))
car_type[is_x] <- "Mazda"
is_x <- grepl("Toyota", rownames(mtcars))
car_type[is_x] <- "Toyota"
car_type <- factor(car_type)
n_car_types <- length(unique(car_type))
cols_4 <- colorspace::rainbow_hcl(n_car_types, c = 70, l  = 50)
col_car_type <- cols_4[car_type]

# extra: cluster memberships when cutting the tree into 2, 3 and 4 clusters
k234 <- cutree(dend, k = 2:4)

# color labels by car company:
labels_colors(dend) <- col_car_type[order.dendrogram(dend)]
# color branches based on cutting the tree into 4 clusters:
dend <- color_branches(dend, k = 4)

### plots
par(mar = c(12,4,1,1))
plot(dend)
colored_bars(cbind(k234[,3:1], col_car_type), dend, rowLabels = c(paste0("k = ", 4:2), "Car Type"))

# horiz version:
par(mar = c(4,1,1,12))
plot(dend, horiz = TRUE)
colored_bars(cbind(k234[,3:1], col_car_type), dend, rowLabels = c(paste0("k = ", 4:2), "Car Type"), horiz = TRUE)
legend("topleft", legend = levels(car_type), fill = cols_4)
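
On the original question of how far the bar sits from the leaf labels and how wide it is: `colored_bars` takes `y_shift` and `y_scale` arguments for this. The sketch below is only an illustration. The numeric values are guesses for the mtcars tree, expressed in the units of the height axis, and will need tuning for other data; the two-color `bar_cols` vector is made up for the example.

```r
library(dendextend)

dend <- as.dendrogram(hclust(dist(mtcars)))

# One color per car, in the original row order of mtcars (illustrative colors)
bar_cols <- ifelse(grepl("Merc", rownames(mtcars)), "blue", "grey80")

par(mar = c(12, 4, 1, 1))
plot(dend)

# y_shift: where the bar is positioned on the y axis (more negative = further below the tree)
# y_scale: how much the bar is stretched along the y axis (its thickness)
# Both values below are illustrative guesses, not recommended defaults.
colored_bars(bar_cols, dend, rowLabels = "Mercedes",
             y_shift = -150, y_scale = 30)
```

With `horiz = TRUE` the same two arguments should act along the horizontal height axis instead, so expect to re-tune them after rotating the plot.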


Tal Galili
  • I know this is sort of an old question now... But how do I assign a particular color to a particular class (e.g. Mazda-yellow, Mercedes-blue, Other-green, Toyota-red)? I might be having a CRAFT moment, but I don't know how to do it... Thank you in advance! – AussieAndy Jan 28 '17 at 02:39
  • @AussieAndy you could first create a named vector with the color codes, `col_code=c("Mazda"="yellow", "Mercedes"="blue", "Other"="green", "Toyota"="red")`, and then map the car types to these color codes with `col_vec = col_code[as.character(car_type)]` (converting the factor to character makes the lookup go by name rather than by level position) – Alf Pascu Nov 26 '20 at 15:19
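
A small sketch of the name-based lookup described in the last comment, with made-up `car_type` values. It also shows why a factor is best converted to character first: indexing a named vector directly with a factor uses the factor's integer level codes, which only gives the right colors when the vector happens to be in the same order as the levels.

```r
# Hypothetical car types; the levels sort alphabetically: Mazda, Mercedes, Other, Toyota
car_type <- factor(c("Mazda", "Toyota", "Other", "Mercedes", "Mazda"))

# Named color codes, deliberately NOT in level order
col_code <- c(Toyota = "red", Mazda = "yellow", Other = "green", Mercedes = "blue")

# Lookup by name: convert the factor to character first
col_vec <- unname(col_code[as.character(car_type)])
col_vec
# "yellow" "red" "green" "blue" "yellow"

# Lookup by factor: uses level positions 1, 4, 3, 2, 1 and picks the wrong colors here
wrong_vec <- unname(col_code[car_type])
wrong_vec
# "red" "blue" "green" "yellow" "red"
```

`col_vec` can then be passed to `colored_bars()` or used for `labels_colors()` in the same way as `col_car_type` in the answer.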