
I'm using pheatmap with a large data set. My goal is to cluster the rows and columns and then analyze the main clusters. I load the data table and draw the heatmap as follows:

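 library(pheatmap)
 # The header line has one field fewer than the data rows,
 # so read.table() uses the first column as row names.
 data <- read.table("example.txt", header = TRUE)
 pheatmap(data)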

This gives me the heatmap of my data. My example.txt looks like this:

    a     b     c     d     e     f
a   1     0.1   0.9   0.5   0.65  0.9
b   0.1   1     0.39  0.83  0.47  0.63
c   0.9   0.39  1     0.42  0.56  0.84
d   0.5   0.83  0.42  1     0.95  0.43
e   0.65  0.47  0.56  0.95  1     0.14
f   0.9   0.63  0.84  0.43  0.14  1

This may be a very basic question, but I'll ask it anyway. After running pheatmap(data), how can I get the elements that belong to each cluster? Do I have to save the results in a specific way and analyze them with other R packages?

Has QUIT--Anony-Mousse
Gabelins
  • Please provide a reproducible example (= ready to copy & paste) on Stackoverflow - this will get you more help. No reader has got "data.txt" and also `library(pheatmap)` is missing. – lukeA Jan 07 '15 at 13:28
  • @lukeA I have updated the question with the example. – Gabelins Jan 07 '15 at 13:47

1 Answer


Grab the result of pheatmap and use cutree on the row dendrogram it returns. For example, to extract 10 clusters you could do:

library(pheatmap)
res <- pheatmap(mtcars)
mtcars.clust <- cbind(mtcars, 
                      cluster = cutree(res$tree_row, 
                                       k = 10))
head(mtcars.clust)
#                     mpg cyl disp  hp drat    wt  qsec vs am gear carb cluster
# Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4       1
# Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4       1
# Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1       2
# Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1       3
# Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2       4
# Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1       3

See ?pheatmap, ?hclust and ?cutree for the help.
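
The same approach can be sketched on the matrix from the question. This is only an illustration under a few assumptions: the matrix is typed in directly instead of being read from example.txt, `k = 2` is an arbitrary choice rather than a recommended number of clusters, and the names `row_clusters`, `col_clusters` and `members` are made up for this example. `res$tree_col` holds the column dendrogram in the same way that `res$tree_row` holds the row dendrogram.

```r
library(pheatmap)

# The matrix from the question, typed in so the example does not
# depend on example.txt being present.
ids <- c("a", "b", "c", "d", "e", "f")
data <- matrix(
  c(1.00, 0.10, 0.90, 0.50, 0.65, 0.90,
    0.10, 1.00, 0.39, 0.83, 0.47, 0.63,
    0.90, 0.39, 1.00, 0.42, 0.56, 0.84,
    0.50, 0.83, 0.42, 1.00, 0.95, 0.43,
    0.65, 0.47, 0.56, 0.95, 1.00, 0.14,
    0.90, 0.63, 0.84, 0.43, 0.14, 1.00),
  nrow = 6, byrow = TRUE, dimnames = list(ids, ids)
)

# silent = TRUE skips drawing; the returned object still carries the trees.
res <- pheatmap(data, silent = TRUE)

k <- 2  # arbitrary number of clusters, chosen only for illustration

# Named integer vectors: element name -> cluster id.
row_clusters <- cutree(res$tree_row, k = k)
col_clusters <- cutree(res$tree_col, k = k)

# List the row elements that fall into each cluster.
members <- split(names(row_clusters), row_clusters)
print(members)
```

Because `cutree` returns a plain named vector, no extra package is needed to inspect the clusters; `split`, `table` or `which(row_clusters == 1)` are enough.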

lukeA
  • Is there a way to save only the first (row name) and last (cluster) columns of the output? – Gabelins Jan 07 '15 at 13:51
  • That would be `cutree(res$tree_row, k = 10)` alone or e.g. `data.frame(cluster = cutree(res$tree_row, k = 10))`. – lukeA Jan 07 '15 at 13:53
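
To actually save those two columns to disk, something along these lines should work. It is a minimal sketch: the file name `clusters.txt`, the temporary directory and the column name `name` are assumptions made for the example, not part of the original answer.

```r
library(pheatmap)

# Cluster mtcars as in the answer; silent = TRUE skips drawing the heatmap.
res <- pheatmap(mtcars, silent = TRUE)
clusters <- cutree(res$tree_row, k = 10)

# Keep only the row names and the cluster assignment.
out <- data.frame(name = names(clusters), cluster = unname(clusters))

# Hypothetical output location; replace with any path you like.
out_file <- file.path(tempdir(), "clusters.txt")
write.table(out, out_file, sep = "\t", quote = FALSE, row.names = FALSE)

head(out)
```

The file can be read back with `read.table(out_file, header = TRUE, sep = "\t")`; the tab separator matters here because the car names contain spaces.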