I have a long list of sub-strings (close to 16000), and I want to find where the repeating cycle starts and stops. I have come up with this code as a starting point:
strings= ['1100100100000010',
'1001001000000110',
'0010010000001100',
'0100100000011011',
'1001000000110110',
'0010000001101101',
'1100100100000010',
'1001001000000110',
'0010010000001100',
'0100100000011011',]
pat = [ '1100100100000010',
'1001001000000110',
'0010010000001100',]
# Slide a window of len(pat) items over strings and report every
# index at which the whole known pattern occurs as a contiguous run.
for i in range(len(strings) - len(pat) + 1):
    if strings[i:i + len(pat)] == pat:
        print('match at index %d' % i)
The problem with this method is that you have to know what pat is in order to search for it. I would like to start with the first n sub-strings (3 in this case) and search for them in the rest of the list; if there is no match, move down one sub-string to the next 3, and so on, until it has either gone through the entire list or found the repeat. I believe that if n is high enough (maybe 10), it will find the repeat without being too time-consuming.
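To make the idea concrete, here is a rough sketch of one way it could work. It is a variation on what I described, not exactly the same: instead of rescanning the list for each candidate window, it makes a single pass and remembers where each window of n sub-strings was first seen in a dict. The function name find_repeat and its parameter n are made up for illustration.

```python
def find_repeat(strings, n=3):
    """Return (start, repeat), where the n-item window that begins at
    index `start` is seen again at index `repeat`; None if no window repeats."""
    first_seen = {}  # window (as a tuple) -> index where it first appeared
    for i in range(len(strings) - n + 1):
        window = tuple(strings[i:i + n])  # tuples are hashable, lists are not
        if window in first_seen:
            return first_seen[window], i
        first_seen[window] = i
    return None


strings = ['1100100100000010',
           '1001001000000110',
           '0010010000001100',
           '0100100000011011',
           '1001000000110110',
           '0010000001101101',
           '1100100100000010',
           '1001001000000110',
           '0010010000001100',
           '0100100000011011']

print(find_repeat(strings, 3))  # (0, 6)
```

For the sample list this reports that the window starting at index 0 shows up again at index 6, so the cycle would be 6 sub-strings long (repeat minus start). A larger n makes accidental matches less likely, at the cost of hashing longer tuples.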