I have a document with this structure (it's large, more than 20,000 lines):
@A00627:308:H227VDSX3:1:1201:30734:26349 2:N:0:TGGCAGTA+GTACAGTG
CCCAGGAGCACCAGGAAGGGCAAGAGCACCCTGGCCTAGGGGATCATCTGGCCCAGGGTAGGGTAGGAACAGCCTCATGGTCTTCAGAGTTTGCCCCTTCCTGAGGGAAAGACATTTTAATATTTTTGGGTTGGCTGGACCAATCTCATT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFF:FFFFFF:F:FFFFFFFFFFFF
@A00627:308:H227VDSX3:1:1257:18828:34695 2:N:0:TGGCAGTA+GTACAGTG
CTGGCCTAGGGGATCATCTGGCCCAGGGTAGGGTAGGAACAGCCTCATGGTCTTCAGAGTTTGCCCCTTCCTGAGGGAAAGACATTTTAATATTTTTGGGTTGGCTGGACCAATCTCATTAAGAGAAGAGAAGAAACGCCCACGCCAGGA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFFFFFF:FFFFFFFF,FFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00627:308:H227VDSX3:1:1266:28809:10300 2:N:0:TGGCAGTA+GTACAGTG
CTGGCCCAGGGTAGGGTAGGAACAGCCTCATGGTCTTCAGAGTTTGCCCCTTCCTGAGGGAAAGACATTTTAATATTTTTGGGTTGGCTGGACCAATCTCATTAAGAGAAGAGAAGAAACGCCCACGCCAGGAAACCCACTGGGTGCCCG
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:,FFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFF:FFF:FFFFFFFFFFFFFFFFFFFFFFFF,FFFFF:,F:FFFFFFF
@A00627:308:H227VDSX3:1:1447:29315:13745 2:N:0:TGGCAGTA+GTACAGTG
CCCAGGAGCACCAGGAAGGGCAAGAGCACCCTGGCCTAGGGGATCATCTGGCCCAGGGTAGGGTAGGAACAGCCTCATGGTCTTCAGAGTTTGCCCCTTCCTGAGGGAAAGACATTTTAATATTTTTGGGTTGGCTGGACCAATCTCATT
+
I want to keep only two lines from each record: the line starting with @ and the one right after it. Like this:
@A00627:308:H227VDSX3:1:1201:30734:26349 2:N:0:TGGCAGTA+GTACAGTG
CCCAGGAGCACCAGGAAGGGCAAGAGCACCCTGGCCTAGGGGATCATCTGGCCCAGGGTAGGGTAGGAACAGCCTCATGGTCTTCAGAGTTTGCCCCTTCCTGAGGGAAAGACATTTTAATATTTTTGGGTTGGCTGGACCAATCTCATT
@A00627:308:H227VDSX3:1:1257:18828:34695 2:N:0:TGGCAGTA+GTACAGTG
CTGGCCTAGGGGATCATCTGGCCCAGGGTAGGGTAGGAACAGCCTCATGGTCTTCAGAGTTTGCCCCTTCCTGAGGGAAAGACATTTTAATATTTTTGGGTTGGCTGGACCAATCTCATTAAGAGAAGAGAAGAAACGCCCACGCCAGGA
@A00627:308:H227VDSX3:1:1266:28809:10300 2:N:0:TGGCAGTA+GTACAGTG
CTGGCCCAGGGTAGGGTAGGAACAGCCTCATGGTCTTCAGAGTTTGCCCCTTCCTGAGGGAAAGACATTTTAATATTTTTGGGTTGGCTGGACCAATCTCATTAAGAGAAGAGAAGAAACGCCCACGCCAGGAAACCCACTGGGTGCCCG
@A00627:308:H227VDSX3:1:1447:29315:13745 2:N:0:TGGCAGTA+GTACAGTG
CCCAGGAGCACCAGGAAGGGCAAGAGCACCCTGGCCTAGGGGATCATCTGGCCCAGGGTAGGGTAGGAACAGCCTCATGGTCTTCAGAGTTTGCCCCTTCCTGAGGGAAAGACATTTTAATATTTTTGGGTTGGCTGGACCAATCTCATT
I have tried this code:
import fileinput
from collections import deque

output_file = 'cola1_fasta.txt'
buscado = '@'
contexto = deque([], 3)  # sliding window holding the last 3 lines read

with open(output_file, "w") as f_out:
    for line in fileinput.input(files=["cola1.txt"]):
        contexto.append(line)
        if len(contexto) < 3:
            continue
        # if the middle line of the window contains '@', write the whole window
        if buscado in contexto[1]:
            f_out.writelines(contexto)
But I can't obtain this output. Do you have any suggestions? Many thanks!
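
One direction that may work is to rely on the four-line record layout instead of searching for '@', because a quality line can itself contain (or even begin with) '@'. The following is only a minimal sketch, assuming every record is exactly four lines (header, sequence, '+', quality) with at most a truncated final record; the function names keep_header_and_sequence and filter_file are hypothetical, not from any library:

```python
from itertools import islice


def keep_header_and_sequence(lines):
    """Yield the first two lines (header, sequence) of each 4-line record.

    Assumption: the input is laid out strictly as 4 lines per record.
    """
    it = iter(lines)
    while True:
        record = list(islice(it, 4))  # read the next record (up to 4 lines)
        if not record:
            break
        # record[0] is the '@' header and record[1] the sequence;
        # the '+' separator and the quality string are dropped.
        yield from record[:2]


def filter_file(in_path, out_path):
    """Stream in_path record by record and write the kept lines to out_path."""
    with open(in_path) as f_in, open(out_path, "w") as f_out:
        f_out.writelines(keep_header_and_sequence(f_in))
```

With the file names from the attempt above, this would be called as filter_file("cola1.txt", "cola1_fasta.txt"). It reads the file lazily, so the size of the input should not matter.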