I wrote a script that uses Biopython to extract the sequences of given genes from the sequence section of a GenBank (GBK) file; my colleagues use it for their work.
We recently ran into problems with a new data set, and it turned out that the downloaded GBK file did not contain a sequence (which can easily happen when you download from the GenBank website at NCBI). Instead of throwing an error, Biopython returns a long sequence of Ns when using record.seq[start:end].
What is the easiest way to catch this problem right at the start and stop the script with an error message?
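
For illustration, here is a minimal sketch of the kind of up-front check I have in mind. The function names has_real_sequence and require_sequence are hypothetical helpers, not part of Biopython. The sketch assumes that a record without sequence data either yields only Ns when converted to a string (what I see in my version) or raises an exception on conversion (which I understand newer Biopython releases may do).

```python
import sys


def has_real_sequence(record):
    """Return True if record.seq holds actual residues rather than a placeholder.

    `record` is anything with a `.seq` attribute, e.g. a Biopython SeqRecord.
    """
    try:
        text = str(record.seq)
    except Exception:
        # Deliberately broad: what an undefined sequence raises on conversion
        # is assumed to depend on the Biopython version.
        return False
    # A placeholder sequence shows up as nothing at all or as a run of Ns.
    return len(text) > 0 and set(text.upper()) != {"N"}


def require_sequence(record):
    """Stop the script with an error message if the record has no usable sequence."""
    if not has_real_sequence(record):
        record_id = getattr(record, "id", "<unknown>")
        sys.exit(
            f"Error: record {record_id} contains no sequence data; "
            "re-download the GenBank file with the full sequence included."
        )


# Intended use (the file name is a placeholder):
#   from Bio import SeqIO
#   record = SeqIO.read("genome.gbk", "genbank")
#   require_sequence(record)
```

One caveat with this approach: a record whose real sequence consists entirely of Ns would also be rejected, and converting a very large sequence to a string just for the check costs some memory.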