I wanted to do the sequence search 'CCTTCATTCTTCTGTATTGGAGACTTACAGTTGGCACAAGGCTTGGAGTT' against the pig nucleotide genome sequences and see if I can find the perdect match in the alignment. I used the biopython to access the ncbi blast and fetch the result, which is a _io.StringIO object. I wanted to read that xml file, however is looks different than what I see in the actual ncbi blast tool in the web. Could you please help me with this?
The script I used does not give hits however, has alignment hits in the ncbi blast tool.
seq2 = 'CCTTCATTCTTCTGTATTGGAGACTTACAGTTGGCACAAGGCTTGGAGTT'
from Bio.Blast import NCBIWWW as ncbi
result1 = ncbi.qblast("blastn", "nr", seq2,entrez_query = 'pig (taxid:9823)')
#print(result1) gives <_io.StringIO object at 0x7f89f80a1790>
#tried to open the file #I used the script below from Biopython:overview of blast, however I see no output, there is no hit in #the alignment
with open('results.xml', 'w') as save_file:
blast_results = result1.read()
save_file.write(blast_results)
from Bio.Blast import NCBIXML
E_VALUE_THRESH = 1e-20
for record in NCBIXML.parse(open("results.xml")):
if record.alignments:
print("\n")
print("query: %s" % record.query[:100])
for align in record.alignments:
for hsp in align.hsps:
if hsp.expect < E_VALUE_THRESH:
print("match: %s " % align.title[:100])
#I tried using the script below form the stack overflow as well
result_handle = ncbi.qblast("blastn", "nr", seq2, entrez_query= "pig (taxid:9823)")
records = NCBIXML.parse(result_handle)
for i, record in enumerate(records):
if record.alignments:
for align in record.alignments:
print(align.hit_id)
else:
print("There is no BLAST result for", i)
#I used this script from Biopython:overview of blast, however I see no output, there is no hit in #the alignment
#using the blast tool in ncbi gives some sequence alignments though.
type here