I am using Biopython to run BLAST searches on bacterial strains. However, a normal BLAST search returns all strains, and I only want the well-annotated ones (not the whole data dump that comes with a normal query).

Normal solution: the browser version of BLAST lets you exclude "Uncultured/environmental sample sequences" and limit the search to "Sequences from type material" by ticking two checkboxes.

Is there a way to implement these options using Biopython?

Basically, I was wondering if there are additional arguments I may be able to use that would give me the effect the browser version gives using checkboxes.

# What I am doing (this works)
from Bio.Blast import NCBIWWW
my_query = NCBIWWW.qblast("blastn", "nt", query_sequence)

# What I want (the last two arguments are hypothetical, not real qblast parameters)
from Bio.Blast import NCBIWWW
my_query = NCBIWWW.qblast("blastn", "nt", query_sequence, exclude='uncultured', limit_seqs_from_type_material=True)
Barry
  • In short, no, and the NCBI cloud query API does not appear to support it either: http://ncbi.github.io/blast-cloud/dev/api.html. You may have to build your own custom BLAST database for this. – Chris_Rands Jul 22 '19 at 14:27
  • Thank you for your response! That is what I was afraid of. I have the Targeted Loci Project (TLP) database for 16S as well as the Greengenes database. What BLAST-like tool would you suggest I plug these into? – Barry Jul 23 '19 at 17:33
  • I'm not familiar with those databases, but you can build a local BLAST database from any collection of FASTA sequences: https://www.ncbi.nlm.nih.gov/books/NBK279688/ – Chris_Rands Jul 24 '19 at 07:43
  • Thank you for the direction! I will give this a shot later this week and let you know how it works out. – Barry Jul 24 '19 at 13:38
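
The local-database route suggested in the comments could look roughly like the sketch below. It assumes the BLAST+ command-line tools (makeblastdb and blastn) are installed and on PATH. The function names, the database name curated_16S, and the file paths are hypothetical placeholders, not anything from the thread.

```python
import shutil
import subprocess


def makeblastdb_cmd(fasta_path, db_name):
    """Build the command that turns a FASTA file into a nucleotide BLAST database."""
    return ["makeblastdb", "-in", fasta_path, "-dbtype", "nucl", "-out", db_name]


def blastn_cmd(query_path, db_name, out_path):
    """Build the command that searches the local database and writes XML (outfmt 5)."""
    return ["blastn", "-query", query_path, "-db", db_name,
            "-outfmt", "5", "-out", out_path]


def run_local_blast(fasta_path, query_path,
                    db_name="curated_16S", out_path="results.xml"):
    """Create the database from a curated FASTA file, then search it with blastn."""
    # Fail early with a clear message if BLAST+ is not installed.
    for tool in ("makeblastdb", "blastn"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH; install BLAST+ first")
    subprocess.run(makeblastdb_cmd(fasta_path, db_name), check=True)
    subprocess.run(blastn_cmd(query_path, db_name, out_path), check=True)
    return out_path
```

Because the output is BLAST XML, the resulting file should be parseable with Bio.Blast.NCBIXML.parse in the same way as a qblast result, so downstream Biopython code would need little or no change. Restricting the search to well-annotated strains then comes down to which sequences go into the FASTA file used to build the database.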
