I am trying to make a heatmap showing gene expression across 4 different groups, and I would like to cluster within each group. My samples are sorted by group across the columns. Using cluster_cols = TRUE clusters across all groups, mixing up the order of samples from each group. How can clustering be done only within each group with pheatmap?
any updates for this question? – Oct 29 '20 at 09:33
1 Answer
I had a similar question recently. Since recent pheatmap versions have no built-in option for this, my current solution is:
1. Compute a sample order from PC1 (the first right singular vector of the row-scaled data):
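# data_heatmap is the numeric matrix (genes x samples) passed to pheatmap
# Despite its name, `eigenvalues` holds the first right singular vector: one PC1 score per sample
eigenvalues <- svd(t(scale(t(data_heatmap))), nu = 1, nv = 1)$v[, 1]
scaledExpr <- scale(t(data_heatmap))
# Average scaled expression per sample
averExpr <- rowMeans(scaledExpr, na.rm = TRUE)
# Flip the sign so that PC1 increases with average expression
if (cor(averExpr, eigenvalues) < 0) {
  eigenvalues <- -eigenvalues
}
index_eigen <- order(eigenvalues)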
2. Cluster within each group and align the result with the PC1 order:
# s2c_f is the sample data frame with a "Group" column; its rows are assumed
# to be in the same order as the columns of data_heatmap
index_reorder <- c()
index_pre <- seq_along(s2c_f$Group)
for (eachgroup in unique(s2c_f$Group)) {
  # Samples of this group, in PC1 order
  index_tempEigen <- index_eigen[index_eigen %in% index_pre[s2c_f$Group == eachgroup]]
  # hclust needs at least two samples per group
  sampleDist <- dist(t(data_heatmap[, index_tempEigen]), method = "euclidean")
  sampleClust <- hclust(sampleDist, method = "complete")
  index_clust <- sampleClust$order
  # Reverse the dendrogram order if it runs against the PC1 order
  if (cor(index_clust, seq_along(index_tempEigen)) < 0) {
    index_clust <- rev(index_clust)
  }
  index_reorder <- c(index_reorder, index_tempEigen[index_clust])
}
3. Pass the reordered data to pheatmap with cluster_cols = FALSE:
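library(pheatmap)
# s2c_f is assumed to also have a "Sample" column (the column names of
# data_heatmap) and a "Color" column (one color per group)
s2c_f <- s2c_f[index_reorder, ]
data_heatmap <- data_heatmap[, s2c_f$Sample]
ann_colors <- list(Group = unique(s2c_f$Color))
names(ann_colors[[1]]) <- unique(s2c_f$Group)
df <- as.data.frame(s2c_f[, "Group", drop = FALSE])
# annotation_col rows are matched to heatmap columns by row name
rownames(df) <- as.character(s2c_f$Sample)
pheatmap(data_heatmap,
         scale = "row",
         color = colorRampPalette(c("navy", "white", "firebrick3"))(50),
         show_rownames = TRUE,
         cluster_cols = FALSE,
         cluster_rows = TRUE,
         annotation_colors = ann_colors,
         annotation_col = df,
         gaps_row = NULL, gaps_col = NULL,
         # silent = TRUE suppresses drawing; drop it to display the plot
         silent = TRUE)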
I think all of the above could easily be wrapped in a function. The example only shows how to cluster columns within groups, where my columns are samples.
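As a rough idea of what such a function might look like, here is a minimal sketch. The name order_within_groups and its arguments are hypothetical (not part of pheatmap), the toy matrix is made up, and for brevity it leaves out the PC1-based alignment step and only clusters within each group:

```r
# Hypothetical helper (not part of pheatmap): returns a column order in which
# samples stay grouped and are hierarchically clustered within each group.
order_within_groups <- function(mat, groups,
                                dist_method = "euclidean",
                                hclust_method = "complete") {
  stopifnot(ncol(mat) == length(groups))
  ordered <- integer(0)
  for (g in unique(groups)) {
    idx <- which(groups == g)
    if (length(idx) > 1) {
      # Cluster only the samples (columns) that belong to this group
      hc <- hclust(dist(t(mat[, idx, drop = FALSE]), method = dist_method),
                   method = hclust_method)
      idx <- idx[hc$order]
    }
    # Groups with a single sample are appended unchanged
    ordered <- c(ordered, idx)
  }
  ordered
}

# Toy data: 20 genes x 9 samples in three groups (made up for illustration)
set.seed(1)
mat <- matrix(rnorm(20 * 9), nrow = 20,
              dimnames = list(paste0("gene", 1:20), paste0("s", 1:9)))
groups <- rep(c("A", "B", "C"), each = 3)

ord <- order_within_groups(mat, groups)
groups[ord]  # groups stay contiguous; only the order inside each group changes
```

The reordered matrix mat[, ord] can then be passed to pheatmap with cluster_cols = FALSE as in step 3. Setting gaps_col at the group boundaries, for example head(cumsum(rle(groups[ord])$lengths), -1), should also separate the groups visually, since gaps_col is only used when columns are not clustered.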
Another potential solution is the ComplexHeatmap package.
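I have not worked this route out in full, but a minimal sketch with ComplexHeatmap could look like the following. It assumes the Bioconductor package ComplexHeatmap is installed, uses a made-up toy matrix, and relies on the column_split and cluster_column_slices arguments of Heatmap() to cluster inside each group while keeping the groups in a fixed order:

```r
library(ComplexHeatmap)

# Toy data: 20 genes x 9 samples in three groups (made up for illustration)
set.seed(1)
mat <- matrix(rnorm(20 * 9), nrow = 20,
              dimnames = list(paste0("gene", 1:20), paste0("s", 1:9)))
groups <- factor(rep(c("A", "B", "C"), each = 3), levels = c("A", "B", "C"))

ht <- Heatmap(
  t(scale(t(mat))),               # row-scaled, similar to scale = "row" in pheatmap
  name = "z-score",
  column_split = groups,          # one column slice per group
  cluster_columns = TRUE,         # cluster samples inside each slice
  cluster_column_slices = FALSE,  # keep slices in factor-level order
  cluster_rows = TRUE
)
draw(ht)
```

Each slice gets its own column dendrogram here, so no manual reordering of the matrix should be needed.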

Raymond