I recently produced an unsupervised hierarchical clustering heat map from 30 RNA-seq samples. The x axis is labelled with the sample names, and the y axis shows the 100 most variable genes as mouse Ensembl IDs (e.g. ENSMUSG00000020573).

Is there a way to replace the Ensembl IDs with gene names (e.g. Pik3cg) before I pass the table to the pheatmap() function?

My input table is:

                        WT_Animal1      WT_Animal2     WT_Animal3
ENSMUSG00000094652      0.03463869       0.7333992     -0.29986091
ENSMUSG00000006356     -0.64264559      -0.5609578     -0.06037522
ENSMUSG00000019897      0.09159506      -0.1133322     -0.12974861
ENSMUSG00000027790     -0.25124228       1.2871582     -0.92491260
ENSMUSG00000054999     -0.58618795       1.2079283     -0.89929279
ENSMUSG00000072573      0.16812802       0.1058453     -0.16593449 

I changed the column names manually using colnames(mat) <- c(), but I want to know how to change the row names (Ensembl IDs) with a function so that I can reproduce it in further plots.

I have tried to read up on biomaRt and other packages but can't work out how to do it.

Any help would be much appreciated!

Phil

1 Answer

If you want to use biomaRt, here is how you could do it:

library(biomaRt)

# your example matrix
mat <- structure(list(
    WT_Animal1 = c(0.03463869, -0.64264559, 0.09159506, -0.25124228, -0.58618795, 0.16812802),
    WT_Animal2 = c(0.7333992, -0.5609578, -0.1133322, 1.2871582, 1.2079283, 0.1058453),
    WT_Animal3 = c(-0.29986091, -0.06037522, -0.12974861, -0.9249126, -0.89929279, -0.16593449)),
    class = "data.frame", 
    row.names = c("ENSMUSG00000094652", "ENSMUSG00000006356", "ENSMUSG00000019897", 
                  "ENSMUSG00000027790", "ENSMUSG00000054999",  "ENSMUSG00000072573"))

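# connect to the Ensembl BioMart and select the mouse gene dataset
mart <- useMart(biomart = "ensembl", dataset = "mmusculus_gene_ensembl")
# fetch the MGI symbol for each Ensembl gene ID used as a row name
symb <- getBM(attributes = c("ensembl_gene_id","mgi_symbol"),
             filters = "ensembl_gene_id", values = rownames(mat),
             mart = mart)
# getBM() does not preserve input order, so realign symbols to the rows of mat
# (IDs without a match become NA)
symbs <- symb$mgi_symbol[match(rownames(mat), symb$ensembl_gene_id, nomatch = NA)]
# plot with gene symbols as row labels; the row names of mat stay unchanged
heatmap(as.matrix(mat), labRow = symbs, margins = c(12, 10))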

Created on 2020-07-07 by the reprex package (v0.3.0)
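
Since you want to reuse this in further plots, the matching step could be wrapped in a small function. The following is only a sketch: ensembl_to_symbol() is a made-up helper name, not part of biomaRt, and it assumes a lookup table with the two columns requested from getBM() above. It also falls back to the Ensembl ID when no symbol is found and makes repeated symbols unique, because row names cannot contain duplicates. The toy lookup table uses the Pik3cg pairing from the question plus two placeholder IDs so that it runs without a network connection.

```r
# Hypothetical helper (not part of biomaRt): swap Ensembl IDs for gene symbols.
# `lookup` is assumed to have the columns `ensembl_gene_id` and `mgi_symbol`,
# like the `symb` table returned by getBM().
ensembl_to_symbol <- function(ids, lookup) {
  # Align the lookup to the order of `ids`; unmatched IDs become NA.
  symbols <- as.character(lookup$mgi_symbol)[match(ids, lookup$ensembl_gene_id)]
  # Fall back to the Ensembl ID where the symbol is missing or empty.
  missing <- is.na(symbols) | symbols == ""
  symbols[missing] <- ids[missing]
  # Row names must be unique, so disambiguate repeated symbols.
  make.unique(symbols)
}

# Toy stand-in for a getBM() result so the sketch runs offline.
# The two ID_* entries are placeholders, not real Ensembl IDs.
lookup <- data.frame(
  ensembl_gene_id = c("ENSMUSG00000020573", "ID_WITHOUT_SYMBOL"),
  mgi_symbol      = c("Pik3cg", ""),
  stringsAsFactors = FALSE
)
ids <- c("ID_WITHOUT_SYMBOL", "ENSMUSG00000020573", "ID_NOT_IN_LOOKUP")
labels <- ensembl_to_symbol(ids, lookup)
print(labels)
```

With the real data you would call rownames(mat) <- ensembl_to_symbol(rownames(mat), symb) and then plot mat as before. If you would rather keep the Ensembl IDs as row names, pheatmap() also accepts the labels through its labels_row argument.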

user12728748