
I usually use biomaRt to convert gene IDs to symbols. However, this time the Ensembl IDs I have (for dog) do not match the Ensembl IDs in the biomaRt dataset "clfamiliaris_gene_ensembl".

I also tried the Ensembl web portal, where the dog dataset is called ROS_Cfam_1.0. It looks like my genes do not match the genes in that dataset either. My genes look like this:

"ENSCAFG00000045440" "ENSCAFG00000000001" "ENSCAFG00000000002" "ENSCAFG00000041462" "ENSCAFG00000000005"

Here is my biomaRt code:

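library(biomaRt)

# Connect to the Ensembl mart and select the dog gene dataset
ensembl <- useMart("ensembl")
ensembl <- useDataset("clfamiliaris_gene_ensembl", mart = ensembl)
# Look up gene symbols for the Ensembl gene IDs stored as row names
gene_id <- getBM(attributes = c('ensembl_gene_id', 'external_gene_name'),
                 filters = 'ensembl_gene_id',
                 values = rownames(mydata),
                 mart = ensembl)
gene_id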
[1] ensembl_gene_id    external_gene_name
<0 rows> (or 0-length row.names)

It doesn't find my values. Should I use a different dataset for dogs?

Yulia Kentieva

2 Answers

3

These IDs are from the Boxer dog genome assembly: https://www.ensembl.org/Canis_lupus_familiaris/Info/Strains?db=core

However, BioMart is not available for dog breeds (as well as other species and strains): https://www.ensembl.info/2021/01/20/important-changes-of-data-availability-in-ensembl-gene-trees-and-biomart/

Instead, you can use the POST lookup/id REST API endpoint to retrieve the gene symbols for a list of gene IDs from any species: http://rest.ensembl.org/documentation/info/lookup_post
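For anyone who wants to try the REST route from R, here is a minimal sketch using the httr package. The two function names are hypothetical helpers written for this post. The sketch assumes that the gene symbol is returned in the display_name field of each entry and that unknown IDs come back as null, so please check both against the endpoint documentation linked above.

```r
# Sketch only: the helper names below are hypothetical, not part of any package.
# Requires the httr package (called via httr:: so nothing is attached).

# Pull the gene symbol (assumed to be "display_name") out of a parsed
# lookup/id response. Unknown IDs and genes without a symbol become NA.
extract_symbols <- function(parsed) {
  vapply(parsed, function(entry) {
    if (is.null(entry) || is.null(entry$display_name)) {
      NA_character_
    } else {
      as.character(entry$display_name)
    }
  }, character(1))
}

# POST a character vector of Ensembl gene IDs to the lookup/id endpoint
# and return a named character vector of symbols.
lookup_symbols <- function(ids, server = "https://rest.ensembl.org") {
  r <- httr::POST(
    paste0(server, "/lookup/id"),
    httr::content_type("application/json"),
    httr::accept("application/json"),
    body = list(ids = as.list(ids)),  # as.list() keeps a single ID as a JSON array
    encode = "json"
  )
  httr::stop_for_status(r)
  extract_symbols(httr::content(r))
}
```

Something like lookup_symbols(rownames(mydata)) would then return the symbols named by gene ID. The endpoint limits how many IDs one POST may contain, so a long list probably needs to be split into batches; the documentation page states the limit.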

0

I had the same issue with my code. As mentioned in a previous update, the BioMart Ensembl IDs for dog have changed.

A temporary fix for me was the following: I checked the various versions you can access with

library('biomaRt')

listEnsemblArchives()

Pick the latest archive that still contains the needed IDs (may2021 in my case)

(e.g. "ENSCAFG00000000001" instead of "ENSCAFG00845000008" for the ENPP1 gene)

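# Connect to the May 2021 archive, which still contains the needed IDs
ensembl2use <- useMart('ensembl', dataset = 'clfamiliaris_gene_ensembl', host = 'https://may2021.archive.ensembl.org')

# Retrieve all gene IDs in this dataset
SPECIE_ANNOTATION <- getBM(attributes = c('ensembl_gene_id'), mart = ensembl2use)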
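To get symbols for a specific set of IDs rather than the whole gene list, the same archive can be queried with a filter, as in the question. The following is a rough sketch, not a tested pipeline: fetch_symbols and align_symbols are hypothetical helper names, and the handling of empty strings assumes that genes without a symbol come back with an empty external_gene_name.

```r
# Sketch only: function names are hypothetical.
# Requires the Bioconductor package biomaRt (called via biomaRt::).

# Query the May 2021 archive for the symbols of the given gene IDs.
fetch_symbols <- function(ids, host = "https://may2021.archive.ensembl.org") {
  mart <- biomaRt::useMart("ensembl",
                           dataset = "clfamiliaris_gene_ensembl",
                           host = host)
  biomaRt::getBM(attributes = c("ensembl_gene_id", "external_gene_name"),
                 filters = "ensembl_gene_id",
                 values = ids,
                 mart = mart)
}

# The result is not guaranteed to follow the input order, and IDs that are
# not found are simply absent, so map it back onto the original vector
# (NA where nothing matched or where the symbol is empty).
align_symbols <- function(ids, bm_table) {
  symbols <- bm_table$external_gene_name[match(ids, bm_table$ensembl_gene_id)]
  symbols[!is.na(symbols) & symbols == ""] <- NA
  stats::setNames(symbols, ids)
}
```

With the data from the question, align_symbols(rownames(mydata), fetch_symbols(rownames(mydata))) should give one symbol (or NA) per row, in row order.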
Henrik