Clustering within groups with pheatmap

Question

I am trying to make a heatmap showing gene expression across 4 different groups, and I would like to cluster within each group. My samples are sorted by group across the columns. Using cluster_cols = TRUE clusters across all groups, which mixes up the order of samples from different groups. How can clustering be done only within each group with pheatmap?

score 0 · Answer 1 · answered Oct 27 '22 at 18:48

I ran into the same question recently. As far as I can tell, recent pheatmap versions have no built-in option for this, so my current workaround is:

1. Generate a sample order based on the PC1 scores (first right singular vector):

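# data_heatmap is the numeric matrix passed to pheatmap (rows = genes, columns = samples).
# Rows with zero variance must be removed first, otherwise scale() yields NaN and svd() fails.
# First right singular vector of the row-scaled matrix: one PC1 score per sample
pc1 <- svd(t(scale(t(data_heatmap))), nu = 1, nv = 1)$v[, 1]
# Flip the sign so that PC1 correlates positively with the average scaled expression per sample
scaledExpr <- scale(t(data_heatmap))
averExpr <- rowMeans(scaledExpr, na.rm = TRUE)
if (cor(averExpr, pc1) < 0) {
  pc1 <- -pc1
}
index_eigen <- order(pc1)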

2. Cluster within each group, and align each group's order with the PC1 order:

# s2c_f is the sample data frame, with rows in the same order as the columns of data_heatmap
# and one column called "Group" holding the group info.
index_reorder <- integer(0)
index_pre <- seq_along(s2c_f$Group)
for (eachgroup in unique(s2c_f$Group)) {
  # Samples of this group, already sorted by PC1
  index_tempEigen <- index_eigen[index_eigen %in% index_pre[s2c_f$Group == eachgroup]]
  # Note: hclust() needs at least two samples per group
  sampleDist <- dist(t(data_heatmap[, index_tempEigen]), method = "euclidean")
  sampleClust <- hclust(sampleDist, method = "complete")
  index_clust <- sampleClust$order
  # Reverse the dendrogram order if it runs against the PC1 order
  if (cor(index_clust, seq_along(index_tempEigen)) < 0) {
    index_clust <- rev(index_clust)
  }
  index_reorder <- c(index_reorder, index_tempEigen[index_clust])
}

3. Pass the reordered data to pheatmap with cluster_cols = FALSE:

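library(pheatmap)

# s2c_f also needs a "Sample" column matching colnames(data_heatmap)
# and a "Color" column holding one color per group.
s2c_f <- s2c_f[index_reorder, ]
data_heatmap <- data_heatmap[, s2c_f$Sample]

# Named vector of annotation colors, one entry per group
first_in_group <- !duplicated(s2c_f$Group)
ann_colors <- list(Group = setNames(as.character(s2c_f$Color[first_in_group]),
                                    s2c_f$Group[first_in_group]))
df <- as.data.frame(s2c_f[, "Group", drop = FALSE])
# pheatmap matches annotation rows to matrix columns by name
rownames(df) <- s2c_f$Sample

pheatmap(data_heatmap,
         scale = "row",
         color = colorRampPalette(c("navy", "white", "firebrick3"))(50),
         show_rownames = TRUE,
         cluster_cols = FALSE,
         cluster_rows = TRUE,
         annotation_colors = ann_colors,
         annotation_col = df,
         gaps_row = NULL, gaps_col = NULL,
         silent = TRUE)  # silent = TRUE returns the plot object without drawing it; use FALSE to draw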
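
If you want the group boundaries to be visible, gaps_col can take the column indices after which a gap is drawn, instead of NULL. A minimal sketch, assuming the columns are already ordered group by group; the toy `group` vector here is made up for illustration:

```r
# Hypothetical group labels, already in heatmap column order
group <- rep(c("A", "B", "C", "D"), times = c(3, 4, 2, 3))

# Cumulative group sizes give the last column index of each group;
# drop the final one so that no gap is drawn after the last column
group_sizes <- table(factor(group, levels = unique(group)))
gaps <- unname(head(cumsum(group_sizes), -1))
gaps  # 3 7 9

# Then pass it on, e.g.: pheatmap(data_heatmap, cluster_cols = FALSE, gaps_col = gaps)
```

gaps_col only has an effect when the columns are not clustered, which is the case here.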

I think all of the above could easily be wrapped in a function. The example only covers clustering columns within groups, where the columns are samples; the same idea should work for rows.
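
As a rough sketch of such a wrapper, using only base R: the function name `order_within_groups` and its arguments are hypothetical, and it assumes `group` is given in the same order as the columns of the matrix. Unlike the loop above, it leaves single-sample groups untouched instead of calling hclust on them.

```r
# Sketch: return a column order that clusters samples within each group.
# `order_within_groups`, `mat` and `group` are hypothetical names.
order_within_groups <- function(mat, group) {
  stopifnot(ncol(mat) == length(group))
  # PC1 score per sample from the row-scaled matrix (rows must not be constant)
  pc1 <- svd(t(scale(t(mat))), nu = 1, nv = 1)$v[, 1]
  avg <- rowMeans(scale(t(mat)), na.rm = TRUE)
  if (cor(avg, pc1) < 0) pc1 <- -pc1
  pc1_order <- order(pc1)

  out <- integer(0)
  for (g in unique(group)) {
    idx <- pc1_order[group[pc1_order] == g]  # members of this group, sorted by PC1
    if (length(idx) > 1) {
      ord <- hclust(dist(t(mat[, idx, drop = FALSE])), method = "complete")$order
      # Reverse the dendrogram order if it runs against the PC1 order
      if (cor(ord, seq_along(idx)) < 0) ord <- rev(ord)
      idx <- idx[ord]
    }
    out <- c(out, idx)
  }
  out
}

# Toy data: 20 genes x 12 samples in 4 groups of 3
set.seed(1)
mat <- matrix(rnorm(20 * 12), nrow = 20,
              dimnames = list(paste0("gene", 1:20), paste0("s", 1:12)))
group <- rep(c("A", "B", "C", "D"), each = 3)

col_order <- order_within_groups(mat, group)
group[col_order]  # groups stay contiguous; only the order inside each group changes
```

The result can then be used as `mat[, col_order]` together with cluster_cols = FALSE.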

Another potential solution for this is ComplexHeatmap.
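
A minimal sketch of that route, assuming the ComplexHeatmap package is installed and using made-up toy data: splitting the columns by group makes Heatmap() cluster each slice separately, and cluster_column_slices = FALSE keeps the slices in the order of the factor levels.

```r
# Toy data (hypothetical): 20 genes x 12 samples in 4 groups of 3
set.seed(1)
mat <- matrix(rnorm(20 * 12), nrow = 20,
              dimnames = list(paste0("gene", 1:20), paste0("s", 1:12)))
group <- factor(rep(c("A", "B", "C", "D"), each = 3),
                levels = c("A", "B", "C", "D"))

if (requireNamespace("ComplexHeatmap", quietly = TRUE)) {
  ht <- ComplexHeatmap::Heatmap(
    t(scale(t(mat))),               # row-scale, similar to scale = "row" in pheatmap
    name = "z-score",
    column_split = group,           # one slice per group, each clustered on its own
    cluster_column_slices = FALSE,  # keep slices in factor-level order
    cluster_columns = TRUE,
    cluster_rows = TRUE
  )
  ComplexHeatmap::draw(ht)
}
```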
