-1

I'm trying to implement the Needleman-Wunsch algorithm to get the minimum score in the global alignment function, but instead of getting the minimum score of 0 when both the sequences are equal I get 8.

What is the problem with this code?

alphabet = ["A", "C", "G", "T"] 
score = [[0, 4, 2, 4, 8], \
     [4, 0, 4, 2, 8], \
     [2, 4, 0, 4, 8], \
     [4, 2, 4, 0, 8], \
     [8, 8, 8, 8, 8]]

def globalAlignment(x, y):
#Dynamic version very fast
    D = []
    for i in range(len(x)+1):
        D.append([0]* (len(y)+1))

    for i in range(1, len(x)+1):
        D[i][0] = D[i-1][0] + score[alphabet.index(x[i-1])][-1]
    for i in range(len(y)+1):
        D[0][i] = D[0][i-1]+ score[-1][alphabet.index(y[i-1])]

    for i in range(1, len(x)+1):
        for j in range(1, len(y)+1):
            distHor = D[i][j-1]+ score[-1][alphabet.index(y[j-1])]
            distVer = D[i-1][j]+ score[-1][alphabet.index(x[i-1])]
            if x[i-1] == y[j-1]:
                distDiag = D[i-1][j-1]
            else:
                distDiag = D[i-1][j-1] + score[alphabet.index(x[i-1])][alphabet.index(y[j-1])]

            D[i][j] = min(distHor, distVer, distDiag)

    return D[-1][-1]

x = "ACGTGATGCTAGCAT"
y = "ACGTGATGCTAGCAT"
print(globalAlignment(x, y))
merv
  • 67,214
  • 13
  • 180
  • 245
Maloki
  • 19
  • 5
  • Welcome to StackOverflow. Please read and follow the posting guidelines in the help documentation, as suggested when you created this account. [On topic](http://stackoverflow.com/help/on-topic), [how to ask](http://stackoverflow.com/help/how-to-ask), and [... the perfect question](https://codeblog.jonskeet.uk/2010/08/29/writing-the-perfect-question/) apply here. – Prune Dec 14 '18 at 17:53
  • Specifically, you have failed to "make it easy for others to help you". You have some sort of "magic" distance computation, which you are still hiding from us. One-letter variable names, sequences of eye-straining subscripts -- with no documentation, explanation, or debugging trace. *Why* do you expect 0 from these computations? *How* did it get to `8` instead? What are the intermediate steps that failed? – Prune Dec 14 '18 at 17:55
  • See this lovely [debug](https://ericlippert.com/2014/03/05/how-to-debug-small-programs/) blog for help. – Prune Dec 14 '18 at 17:55
  • I've got this code from the course of Algorithms of DNA sequencing brought to us by Ben Langmead in coursera. the code ran normally in his machine but I can't get the same result in my own machine. – Maloki Dec 14 '18 at 18:19

3 Answers3

1

just change -> for i in range(len(y)+1): to for i in range(1, len(y) + 1): and -> distVer = D[i-1][j]+ score[-1][alphabet.index(x[i-1])] to

distVer = D[i - 1][j] + score[alphabet.index(x[i - 1])][-1]
0

At least

distHor = D[i][j-1]+ score[-1][alphabet.index(y[j-1])]
distVer = D[i-1][j]+ score[-1][alphabet.index(x[i-1])]

is suspect, as you didn't use the same place for the [-1] in the initialisations, and both distances are not likely to use the same direction in the weights...
I guess it should be

score[alphabet.index(x[i-1])][-1]

but it may not be the only error...

B. Go
  • 1,436
  • 4
  • 15
  • 22
  • Hi, I fixed the problem by putting 0 instead of 8 in the last list of socre score = [[0, 4, 2, 4, 8], \ [4, 0, 4, 2, 8], \ [2, 4, 0, 4, 8], \ [4, 2, 4, 0, 8], \ [0, 0, 0, 0, 0]] – Maloki Dec 14 '18 at 19:10
0

I fixed the problem by putting 0 instead of 8 in the last list of score;

alphabet = ["A", "C", "G", "T"] 
score = [[0, 4, 2, 4, 8], \
     [4, 0, 4, 2, 8], \
     [2, 4, 0, 4, 8], \
     [4, 2, 4, 0, 8], \
     [0, 0, 0, 0, 0]]

def globalAlignment(x, y):
#Dynamic version very fast
D = []
for i in range(len(x)+1):
    D.append([0]* (len(y)+1))

for i in range(1, len(x)+1):
    D[i][0] = D[i-1][0] + score[alphabet.index(x[i-1])][-1]
for i in range(len(y)+1):
    D[0][i] = D[0][i-1]+ score[-1][alphabet.index(y[i-1])]

for i in range(1, len(x)+1):
    for j in range(1, len(y)+1):
        distHor = D[i][j-1]+ score[-1][alphabet.index(y[j-1])]
        distVer = D[i-1][j]+ score[alphabet.index(x[i-1])][-1]
        if x[i-1] == y[j-1]:
            distDiag = D[i-1][j-1]
        else:
            distDiag = D[i-1][j-1] + score[alphabet.index(x[i-1])][alphabet.index(y[j-1])]

        D[i][j] = min(distHor, distVer, distDiag)

return D[-1][-1]    
Maloki
  • 19
  • 5