Below is a small fraction of my code:

library(biomaRt)

ensembl_hsapiens <- useMart("ensembl", 
                        dataset = "hsapiens_gene_ensembl")

hsapien_PC_genes <- getBM(attributes = c("ensembl_gene_id", "external_gene_name"), 
                      filters = "biotype", 
                      values = "protein_coding", 
                      mart = ensembl_hsapiens)

paralogues[["hsapiens"]] <- getBM(attributes = c("external_gene_name",
                                                 "hsapiens_paralog_associated_gene_name"), 
                                  filters = "ensembl_gene_id", 
                                  values = c(ensembl_gene_ID) , mart = ensembl_hsapiens)

This bit of code only lets me extract the paralogues for hsapiens. Is there a way to easily get the same information for mmusculus (mouse) and ggallus (chicken) without rewriting the code, using something like tapply? My code is much longer than the snippet provided; all I would need to do is swap the word hsapiens for mmusculus and ggallus.

Jack Dean
  • Not tested, maybe use *paste* to create dataset names? `x <- "mmusculus"; myMart <- useMart("ensembl", dataset = paste0(x, "_gene_ensembl"))` – zx8754 Apr 25 '18 at 13:19
  • I was hoping to have a vector containing all of the species names, e.g. all_species <- c("hsapiens", "mmusculus", "ggallus"), and then get R to automatically replace the species in a command like: ensembl[[hsapiens]] <- useMart("ensembl",dataset = ensembl_hsapiens), generating separate vectors for each species – Jack Dean Apr 25 '18 at 15:27

1 Answer

The easy way is to wrap it all in a for loop:

library(biomaRt)

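species <- c("hsapiens", "mmusculus", "ggallus")

# Create the result lists before the loop so each species gets its own entry
PC_genes <- list()
paralogues <- list()

for (s in species) {
  # Dataset names follow the pattern "<species>_gene_ensembl"
  tmp <- useMart("ensembl", dataset = paste0(s, "_gene_ensembl"))
  PC_genes[[s]] <- getBM(attributes = c("ensembl_gene_id", "external_gene_name"),
                         filters = "biotype",
                         values = "protein_coding",
                         mart = tmp)
  # The paralogue attribute name is species-specific, so build it from s.
  # ensembl_gene_ID comes from the rest of the asker's code; it needs to hold
  # gene IDs that belong to species s.
  paralogues[[s]] <- getBM(attributes = c("external_gene_name",
                                          paste0(s, "_paralog_associated_gene_name")),
                           filters = "ensembl_gene_id",
                           values = c(ensembl_gene_ID), mart = tmp)
}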

This should work, but I haven't tested it because I don't have those packages installed. I've renamed some variables so they make more sense (e.g. tmp).
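
If you would rather avoid an explicit loop, as the question suggests with tapply, the same idea can be sketched with sapply/lapply over a vector of species names. This is an untested sketch: dataset_name, paralog_attr and get_paralogues are hypothetical helpers (not part of biomaRt), and it assumes ensembl_gene_ID is a named list with one vector of gene IDs per species, which the question does not show.

```r
all_species <- c("hsapiens", "mmusculus", "ggallus")

# Hypothetical helpers: build the species-specific dataset and attribute names
dataset_name <- function(s) paste0(s, "_gene_ensembl")
paralog_attr <- function(s) paste0(s, "_paralog_associated_gene_name")

# Named character vector of dataset names, one entry per species
datasets <- sapply(all_species, dataset_name)

# Hypothetical wrapper: fetch the paralogues for a single species.
# biomaRt is called via :: so the package is only needed when this runs.
get_paralogues <- function(s, gene_ids) {
  mart <- biomaRt::useMart("ensembl", dataset = dataset_name(s))
  biomaRt::getBM(attributes = c("external_gene_name", paralog_attr(s)),
                 filters = "ensembl_gene_id",
                 values = gene_ids,
                 mart = mart)
}

# Not run here (needs biomaRt and network access).
# Assumes ensembl_gene_ID is a named list of ID vectors, one per species:
# paralogues <- lapply(setNames(all_species, all_species),
#                      function(s) get_paralogues(s, ensembl_gene_ID[[s]]))
```

Passing setNames(all_species, all_species) to lapply means the result is a list named by species, so paralogues[["mmusculus"]] would work the same way as in the loop version.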

Amar