I wrote the following code to define blocks of 4 lines in a text file and output the block if the 2nd line of the block is composed of only one type of character. It is assumed (and previously verified) that the 2nd line is always composed of a string of 36 characters.
# filter out homogeneous reads
import sys
import collections
from collections import Counter
filename1 = sys.argv[1] # file to process
with open(filename1,'r') as input_file:
for line1 in input_file:
line2, line3, line4 = [next(input_file) for line in xrange(3)]
c = Counter(line2).values() # count characters in line2
c.sort(reverse=True) # sort values in descending order
if c[0] < 36:
print line1 + line2 + line3 + line4.rstrip()
However, I am getting a StopIteration error as follows. I would appreciate if someone could tell me why.
$ python code.py test.file > testout.file
Traceback (most recent call last):
File "code.py", line 11, in <module>
line2, line3, line4 = [next(input_file) for line in xrange(3)]
StopIteration
Any help would be appreciated, especially of the kind that explains what is wrong with my specific code and how to fix it. Here is an example of input:
@1:1:1323:1032:Y
AGCAGCATTGTACAGGGCTATCATGGAATTCTCGGG
+1:1:1323:1032:Y
HHHBHHBHBHGBGGGH8HHHGGGGFHBHHHHBHHHH
@1:1:1610:1033:Y
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
+1:1:1610:1033:Y
HHEHHHHHHHHHHHBGGD>GGD@G8GGGGDHBHH4C
@1:1:1679:1032:Y
CGGTGGATCACTCGGCTCGTGCGTCGATGAAGAACG