I have a plain text file with the following contents:
@M00964: XXXXX
YYY
+
ZZZZ
@M00964: XXXXX
YYY
+
ZZZZ
@M00964: XXXXX
YYY
+
ZZZZ
and I would like to read this into a list, split into items according to the ID code @M00964, i.e.:
['@M00964: XXXXX
YYY
+
ZZZZ',
'@M00964: XXXXX
YYY
+
ZZZZ',
'@M00964: XXXXX
YYY
+
ZZZZ']
I have tried using
with open(fileName, "r") as in_file:
    sequences = in_file.read().split('@M00964')[1:]
but this removes the ID sequence @M00964 from each item. Is there any way to keep this ID sequence in?
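To make the goal concrete, here is a minimal sketch of one possible approach: since split() drops the separator, the ID can simply be prepended to each chunk again. The sample text is hard-coded in place of the file, the names delimiter and text are only illustrative, and the sketch assumes the string @M00964 never appears anywhere except at the start of a record.

```python
# The ID that marks the start of every record (taken from the sample above).
delimiter = "@M00964"

# Stand-in for in_file.read(): three records, each ending with a newline.
text = "@M00964: XXXXX\nYYY\n+\nZZZZ\n" * 3

# str.split() removes the delimiter, so prepend it to every chunk.
# The [1:] slice skips the empty string that precedes the first ID.
sequences = [delimiter + chunk for chunk in text.split(delimiter)[1:]]

for item in sequences:
    print(repr(item))
```

Because nothing else is stripped, joining the items back together with "".join(sequences) gives the original text.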
As an additional question, is there any way of maintaining whitespace in a list (rather than having \n symbols)?
My overall aim is to read in this set of items, take the first 2, for example, and write them back to a text file maintaining all of the original formatting.
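The overall aim could be sketched roughly as follows. This is only an illustration under some assumptions: copy_first_records, reads.txt and first_two.txt are hypothetical names, the sample file is created in a temporary directory so the snippet runs on its own, and the ID string is assumed to occur only at the start of each record. Note that the \n seen when printing a list is just how Python displays a line break inside a string; the line break itself is stored, so writing the strings back out reproduces the original layout.

```python
import os
import tempfile


def copy_first_records(in_path, out_path, n, delimiter="@M00964"):
    """Copy the first n records from in_path to out_path, formatting unchanged."""
    with open(in_path, "r") as in_file:
        text = in_file.read()
    # Re-attach the delimiter that str.split() strips from each chunk.
    records = [delimiter + chunk for chunk in text.split(delimiter)[1:]]
    with open(out_path, "w") as out_file:
        # Each record keeps its own trailing newline, so no separator is added.
        out_file.write("".join(records[:n]))
    return records[:n]


# Build a small sample file matching the contents shown in the question.
record = "@M00964: XXXXX\nYYY\n+\nZZZZ\n"
tmp_dir = tempfile.mkdtemp()
in_path = os.path.join(tmp_dir, "reads.txt")
out_path = os.path.join(tmp_dir, "first_two.txt")
with open(in_path, "w") as sample_file:
    sample_file.write(record * 3)

first_two = copy_first_records(in_path, out_path, 2)

print(repr(first_two[0]))    # displays the line breaks as \n escapes
print(first_two[0], end="")  # displays the same string with real line breaks
```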