I am trying to find the longest common substring (LCS) of two DNA sequences. I want to output the matrix as well as the string containing the longest common substring. However, when I return both the matrix and the string in my code, I get the following error: IndexError: string index out of range
If I remove the code that involves the variables temp and highestcount, my code outputs the matrix correctly. I am trying to build the string with code similar to what I use for the matrix. Is there a way to avoid this error? For the sequences AGCTGGTCAG and TACGCTGGTGGCAT, the longest common substring should be GCTGGT.
def lcs(x,y):
    c = len(x)
    d = len(y)
    plot = []
    temp = ''
    highestcount = ''
    for i in range(c):
        plot.append([])
        temp.join('')
        for j in range(d):
            if x[i] == y[j]:
                plot[i].append(plot[i-1][j-1] + 1)
                temp.join(temp[i-1][j-1])
            else:
                plot[i].append(0)
                temp = ''
                if temp > highestcount:
                    highestcount = temp
    return plot, temp
x = "AGCTGGTCAG"
y = "TACGCTGGTGGCAT"
test = lcs(x,y)
print test
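
For reference, the error most likely comes from temp[i-1][j-1]: temp is an empty string, so indexing it raises IndexError. Also, str.join returns a new string and does not change temp in place. Below is a minimal sketch of one way to get both the matrix and the substring, not necessarily the only fix. It assumes the goal is the longest contiguous match, and the names longest_common_substring, best_len and best_end are made up for this example. Unlike the code above, the matrix here has one extra row and column of zeros so that [i-1][j-1] never wraps around to the end of the list.

```python
def longest_common_substring(x, y):
    """Return (matrix, substring) for the longest common substring of x and y."""
    rows, cols = len(x), len(y)
    # Extra zero row and column so that [i-1][j-1] is always a valid index.
    plot = [[0] * (cols + 1) for _ in range(rows + 1)]
    best_len = 0  # length of the longest matching run found so far
    best_end = 0  # index in x just past the end of that run
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            if x[i - 1] == y[j - 1]:
                # Extend the run that ended at the previous diagonal cell.
                plot[i][j] = plot[i - 1][j - 1] + 1
                if plot[i][j] > best_len:
                    best_len = plot[i][j]
                    best_end = i
            # Otherwise the cell keeps its initial value of 0.
    # Slice the substring out of x instead of building it character by character.
    return plot, x[best_end - best_len:best_end]


x = "AGCTGGTCAG"
y = "TACGCTGGTGGCAT"
matrix, substring = longest_common_substring(x, y)
print(substring)  # prints GCTGGT
```

Tracking only the best length and end position avoids keeping a second table of partial strings; the substring is recovered with a single slice at the end.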