I have data in a format like:
ATOM 124 N GLU B 12
ATOM 125 O GLU B 12
ATOM 126 OE1 GLU B 12
ATOM 127 C GLU B 12
ATOM 128 O GLU B 14
ATOM 129 N GLU B 14
ATOM 130 OE1 GLU B 14
ATOM 131 OE2 GLU B 14
ATOM 132 CA GLU B 14
ATOM 133 C GLU B 15
ATOM 134 CA GLU B 15
ATOM 135 OE2 GLU B 15
ATOM 136 O GLU B 15
.....100+ lines
From here, I want to filter this data based on col[5] (counting columns from 0) and col[2]. For each value of col[5], if only one of OE1 or OE2 is present, all rows for that value should be discarded. If both OE1 and OE2 are present for a given value of col[5], all rows for that value should be kept.
The desired data after filtering:
ATOM 128 O GLU B 14
ATOM 129 N GLU B 14
ATOM 130 OE1 GLU B 14
ATOM 131 OE2 GLU B 14
ATOM 132 CA GLU B 14
I have tried using search strings like:
for item in stored_list:
    search_str_a = 'OE1'+item[3]+item[4]+item[5]
    search_str_b = 'OE2'+item[3]+item[4]+item[5]
    target_str = item[2]+item[3]+item[4]+item[5]
This helps keep the rest of the columns the same while searching for OE1 or OE2, but it does not help filter out the rows for a value of col[5] where one of them (or both of them) is missing.
Any ideas would be really nice here.
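To make the rule concrete, here is a minimal sketch of one way it could be done, not a definitive solution. It assumes every line splits on whitespace into six fields, and it groups rows by col[3], col[4] and col[5] together (as in the search strings above) rather than by col[5] alone; for the sample data the two give the same result. The names filter_residues, REQUIRED and sample are made up for illustration.

```python
from collections import defaultdict

# Atom names (col[2]) that must both appear within a group for it to be kept.
REQUIRED = {"OE1", "OE2"}


def filter_residues(lines):
    """Keep only rows whose (col[3], col[4], col[5]) group has both OE1 and OE2."""
    # Assumption: fields are whitespace-separated and every row has six of them.
    rows = [line.split() for line in lines if line.strip()]

    # First pass: collect the set of atom names seen for each group key.
    atoms_by_group = defaultdict(set)
    for row in rows:
        atoms_by_group[tuple(row[3:6])].add(row[2])

    # Second pass: keep rows whose group contains every required atom name.
    return [
        " ".join(row)
        for row in rows
        if REQUIRED <= atoms_by_group[tuple(row[3:6])]
    ]


sample = """\
ATOM 124 N GLU B 12
ATOM 125 O GLU B 12
ATOM 126 OE1 GLU B 12
ATOM 127 C GLU B 12
ATOM 128 O GLU B 14
ATOM 129 N GLU B 14
ATOM 130 OE1 GLU B 14
ATOM 131 OE2 GLU B 14
ATOM 132 CA GLU B 14
ATOM 133 C GLU B 15
ATOM 134 CA GLU B 15
ATOM 135 OE2 GLU B 15
ATOM 136 O GLU B 15
""".splitlines()

for line in filter_residues(sample):
    print(line)
```

With the sample above, only the five rows for residue 14 are printed, because 12 lacks OE2 and 15 lacks OE1. Two passes are used because whether a row is kept depends on rows that may come after it in the file.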