I have extracted information about genes and chromosomes from a set of texts in order to classify a database of texts.
My results are missing some information: some texts contain only the gene name and the location,
but I want to get the OMIM number, the gene symbol, the gene name, and the chromosome location.
This is part of my results (using R code):
  OMIM      GENES_SYMBOL    GENES  CHROMOSOME
1 (NA)      (arlts1)        (NA)   (NA)
2 (NA)      (mtr)           (NA)   (NA)
3 (NA)      (hla.g)         (NA)   (NA)
4 (NA)      (nat2, t341c)   (NA)   (NA)
5 (222300)  (wfs1)          (NA)   (X4p16)
I want to get rid of the NAs by replacing each one with the corresponding name or code; for example, something that takes arlts1
and finds the corresponding OMIM number, gene name, and chromosome location.
I searched a lot but I couldn't find an exhaustive database that contains all of this information.
Maybe I can do that with BioMart? I don't even know what it is.
Could someone help me with some solutions to my problem?