
I am trying to read a file containing a genome sequence using the Seq and SeqIO objects in Biopython, and I cannot use the open command. The program should accept a command-line argument giving the name of the FASTA file that contains the input genome.

The program creates the output file, but the file is empty. What am I missing?

This is what I have:

    from Bio.Seq import Seq
    from Bio import SeqIO
    from Bio.SeqRecord import SeqRecord
    from Bio.Alphabet import IUPAC

    recordlist = []

    for SeqRecord in SeqIO.parse('bacterium_genome.fna', 'fasta'):
        myseq = SeqRecord.seq
        myseq.alphabet = IUPAC.unambiguous_dna
        recordlist.append(SeqRecord)


    SeqIO.write(recordlist, 'bacterium_genome.gb', 'gb')
Chris_Rands

1 Answer


What you're doing should work, assuming the input FASTA file is valid and non-empty, but it is not very elegant: the Seq and SeqRecord imports are unnecessary, and the loop variable SeqRecord shadows the imported class of the same name. You could instead set the alphabet directly on each record and write it to the output file handle on each iteration:

    from Bio import SeqIO
    from Bio.Alphabet import IUPAC

    with open('bacterium_genome.gb', 'w') as out_f:
        for record in SeqIO.parse('bacterium_genome.fna', 'fasta'):
            # The GenBank writer needs to know the sequence type
            record.seq.alphabet = IUPAC.unambiguous_dna
            # Append this record to the already-open output handle
            SeqIO.write(record, out_f, 'genbank')
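
If you cannot call open yourself and need to take the input name from the command line, the same idea can be sketched as below. This is a minimal sketch under stated assumptions, not a definitive implementation: it assumes Biopython 1.78 or later, where Bio.Alphabet was removed and the GenBank writer reads the molecule type from record.annotations instead. The function name fasta_to_genbank and the choice to derive the output name by swapping the extension for .gb are illustrative, not part of the original code. Passing filenames to SeqIO.parse and SeqIO.write lets Biopython open and close the files for you.

```python
import sys
from pathlib import Path

from Bio import SeqIO


def fasta_to_genbank(fasta_path, genbank_path):
    """Convert a FASTA file to GenBank; return the number of records written.

    fasta_to_genbank is a hypothetical helper name used for illustration.
    """
    records = []
    for record in SeqIO.parse(fasta_path, "fasta"):
        # Assumption: Biopython >= 1.78, where the GenBank writer expects
        # the molecule type in annotations rather than a Bio.Alphabet object.
        record.annotations["molecule_type"] = "DNA"
        records.append(record)
    # Passing a filename (not a handle) means open() is never called here.
    return SeqIO.write(records, genbank_path, "genbank")


if __name__ == "__main__":
    if len(sys.argv) == 2:
        fasta_name = sys.argv[1]
        # Illustrative choice: same name as the input, with a .gb extension.
        genbank_name = str(Path(fasta_name).with_suffix(".gb"))
        count = fasta_to_genbank(fasta_name, genbank_name)
        print(f"Wrote {count} record(s) to {genbank_name}")
    else:
        print("usage: python script.py genome.fna", file=sys.stderr)
```

SeqIO.write returns the number of records it wrote, so a count of 0 here would suggest that the parser found no records in the input, which would also leave the output file empty.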
Chris_Rands