
QIIME's documentation says the following about the FASTA files it receives as input:

The file is a FASTA file, with sequences in the single line format. That is, sequences are not broken up into multiple lines of a particular length, but instead the entire sequence occupies a single line.

Bio.SeqIO.write follows the usual FASTA conventions and wraps each sequence across multiple lines (every 60 characters by default). I could write my own writer to produce these "single-line" FASTA files, but my question is whether there is a way I have missed to make SeqIO do it.
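For reference, the "write my own writer" fallback mentioned above might look roughly like the sketch below. It is plain Python with no Biopython dependency; the function name `write_single_line_fasta` and the `(title, sequence)` tuple input are hypothetical choices made for illustration, not part of any library.

```python
import io


def write_single_line_fasta(records, handle):
    """Write (title, sequence) pairs as two-line FASTA; return the record count.

    Hypothetical helper: 'records' is assumed to be an iterable of
    (title, sequence) pairs, e.g. (record.id, str(record.seq)).
    """
    count = 0
    for title, sequence in records:
        # Drop any whitespace or line breaks embedded in the sequence
        flat = "".join(str(sequence).split())
        # One header line, then the whole sequence on a single line
        handle.write(f">{title}\n{flat}\n")
        count += 1
    return count


# Usage: write two records to an in-memory handle
buffer = io.StringIO()
n_written = write_single_line_fasta(
    [("seq1 example", "ACGT" * 30), ("seq2", "GGCC\nTTAA")],
    buffer,
)
output = buffer.getvalue()
```

Any open text-mode file handle could be passed in place of the `StringIO` buffer.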

Korem

3 Answers

7

Biopython's SeqIO module uses the FastaIO submodule to read and write FASTA files.

The FastaIO.FastaWriter class can output a different number of characters per line but this part of the interface is not exposed via SeqIO. You would need to use FastaIO directly.

So instead of using:

from Bio import SeqIO
SeqIO.write(data, handle, format)

use:

from Bio.SeqIO import FastaIO
fasta_out = FastaIO.FastaWriter(handle, wrap=None)
fasta_out.write_file(data)

or

for record in data:
    fasta_out.write_record(record)
unode
2

@unode answered the question. I just want to add that write_file() and write_record() in FastaIO are marked OBSOLETE in the Biopython documentation at the time of writing. An alternative is the as_fasta_2line() function, which converts a record into a plain two-line FASTA string.

from Bio.SeqIO import FastaIO
records_list = [FastaIO.as_fasta_2line(record) for record in records]
handle.writelines(records_list)
HongboZhu
1

Both @unode and @HongboZhu provided working answers, but they rely on FastaIO directly.
SeqIO itself (in current Biopython releases, on Python 3) can write the format you want through the 'fasta-2line' format name:

from Bio import SeqIO
SeqIO.write(data, handle, 'fasta-2line')
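
A self-contained sketch of the same call, assuming a Biopython release that includes the 'fasta-2line' format. The record IDs and sequences are made up for illustration, and the in-memory handles stand in for real files so the wrapped and single-line outputs can be compared:

```python
import io

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Made-up example records; description="" keeps the header to just the ID
records = [
    SeqRecord(Seq("ACGT" * 30), id="seq1", description=""),
    SeqRecord(Seq("GGCCTTAA" * 20), id="seq2", description=""),
]

# Default "fasta" output: long sequences are wrapped over several lines
wrapped = io.StringIO()
SeqIO.write(records, wrapped, "fasta")

# "fasta-2line" output: one header line plus one sequence line per record
single = io.StringIO()
SeqIO.write(records, single, "fasta-2line")

wrapped_lines = wrapped.getvalue().splitlines()
single_lines = single.getvalue().splitlines()
```

With real data, the handles would be files opened in text mode.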
Benjamin