I have to check whether a file is FASTA, FASTQ, or neither. For the FASTA check I used the SeqIO module from Bio:
    from Bio import SeqIO

    def is_fasta(filename):
        with open(filename, "r") as handle:
            fasta = SeqIO.parse(handle, "fasta")
            # True if the parser yields at least one record
            return any(fasta)
This returns True if the file is FASTA and False if it isn't. But when I use the FASTQ version of this function:
    def is_fastq(filename):
        with open(filename, "r") as handle:
            fastq = SeqIO.parse(handle, "fastq")
            return any(fastq)
I get an error message:
      File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/Bio/SeqIO/Interfaces.py", line 74, in next
        return next(self.records)
      File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/Bio/SeqIO/QualityIO.py", line 1085, in iterate
        for title_line, seq_string, quality_string in FastqGeneralIterator(handle):
      File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/Bio/SeqIO/QualityIO.py", line 932, in FastqGeneralIterator
        "Records in Fastq files should start with '@' character"
    ValueError: Records in Fastq files should start with '@' character
Can someone help me understand why this doesn't work the same way for FASTA and FASTQ? And how can I check whether a file is really FASTQ?
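For reference, a rough structural check that does not use Biopython at all might look like the sketch below. It is only a heuristic, not a validator: it inspects just the first record, and it assumes four-line records (no wrapped sequence lines) with no blank lines before the first record. The name looks_like_fastq is hypothetical and not part of any library.

```python
def looks_like_fastq(filename):
    """Heuristic: True if the first record has the four-line FASTQ shape."""
    with open(filename, "r") as handle:
        # readline() returns "" at end of file, so short or empty files
        # simply produce empty strings here instead of raising.
        lines = [handle.readline().rstrip("\r\n") for _ in range(4)]
    title, seq, plus, qual = lines
    return (
        title.startswith("@")        # title line
        and plus.startswith("+")     # separator line
        and len(seq) > 0             # non-empty sequence
        and len(seq) == len(qual)    # one quality character per base
    )
```

Unlike the is_fastq function above, this returns False for a non-FASTQ file instead of raising, but it says nothing about whether the rest of the file is well formed.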