I am new to Stack Overflow. I am trying to automate a search process using Biopython. I have two lists: one with protein GI numbers and one with the corresponding nucleotide GI numbers. For example:
protein_GI=[588489721,788136950,409084506]
nucleo_GI=[588489708,788136846,409084493]
The second list was created using ELink. However, the nucleotide GIs correspond to whole genomes. I need to retrieve the particular CDS from each genome that matches the protein GI. I tried ELink again with different link names ("protein_nucleotide_cds", "protein_nuccore"), but all I get are ID numbers for whole genomes. Should I try some other link names? I also tried the following EFetch code:
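from Bio import Entrez
Entrez.email = "your.name@example.com"  # placeholder; NCBI asks for a contact address
# Fetch the nucleotide and protein records together as FASTA text
handle = Entrez.efetch(db="sequences", id="588489708,588489721", rettype="fasta", retmode="text")
print(handle.read())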
This gives me the nucleotide and protein sequences in FASTA format, but the nucleotide sequence is the whole genome.
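One idea I have not verified is to download each genome as a full GenBank record and pick out the CDS feature whose `db_xref` qualifier names the protein GI. Below is a minimal sketch of that idea. It assumes the records carry `GI:` cross-references on their CDS features (if they do not, the `protein_id` qualifier would have to be matched instead). The helper names `find_cds_for_protein` and `fetch_cds_sequence` are my own, not part of Biopython:

```python
def find_cds_for_protein(features, protein_gi):
    """Return the first CDS feature whose db_xref lists the protein GI, else None."""
    wanted = f"GI:{protein_gi}"  # assumed cross-reference format in the record
    for feature in features:
        if feature.type != "CDS":
            continue
        # Qualifier values are lists of strings in Biopython's SeqFeature
        if wanted in feature.qualifiers.get("db_xref", []):
            return feature
    return None


def fetch_cds_sequence(nucleotide_gi, protein_gi, email):
    """Download one genome record and return the matching CDS sequence, or None."""
    # Imported here so the matching helper above can be tried without network access
    from Bio import Entrez, SeqIO

    Entrez.email = email
    # "gbwithparts" requests the GenBank record including its sequence
    handle = Entrez.efetch(db="nuccore", id=str(nucleotide_gi),
                           rettype="gbwithparts", retmode="text")
    record = SeqIO.read(handle, "genbank")
    handle.close()

    feature = find_cds_for_protein(record.features, protein_gi)
    return None if feature is None else feature.extract(record.seq)
```

With my lists this would be called as `fetch_cds_sequence(588489708, 588489721, "your.name@example.com")` for each pair, but it downloads every whole genome first, so I am hoping there is a more direct way.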
I would be very grateful if somebody could help me. Thank you in advance!