0

When I try to run the following code (I'm using Sublime Text 2), I get an "Invalid Syntax" error on line 18. I found the code here and it should apparently work, so I don't understand why it fails. Any tips? Here is the code:

def damerau_levenshtein_distance(word1, word2):
    distances = {}
    len_word1 = len(word1)
    len_word2 = len(word2)
    for i in xrange(-1, (len_word1 + 1)):
        distances[(i,-1)] = i + 1
    for j in xrange(-1, (len_word2 + 1)):
        distances[(-1,j)] = j + 1

    for i in xrange(len_word1):
        if word1[i] == word2[j]:
            distance_total = 0
        else:
            distance_total = 1
        distances[(i, j)] = min(
            distances[(i-1,j)] + 1, # deletion
            distances[(i,j-1)] + 1 # insertion
            distances[(i-1,j-1)] + distance_total #substitution
            )
        if i and j and word1[i] == word2[j-1] and word1[i-1] == word2[j]:
            distances[(i,j)] = min(distances[(i,j)], distances[i-2,j-2] + distance_total) # transposition

    return distances[len_word1-1,len_word2-1]    
Martijn Pieters

2 Answers

3

There is a syntax error: the comma after the insertion term is missing. That line should be:

    distances[(i,j-1)] + 1, # insertion
tk.
  • I corrected this error and it mostly works now, but whenever I run the code and call the function (for example, damerau_levenshtein_distance('hellp', 'hello')), it throws "IndexError: string index out of range" on line 11, "if word1[i] == word2[j]:" – missmayhem13 Oct 25 '13 at 23:15
  • Perhaps a loop is missing: for j in xrange(len_word2): – tk. Oct 25 '13 at 23:17
  • @missmayhem13: You didn't copy the code from the blog post correctly; there is a loop missing `for j in xrange(len_word2):`. – Martijn Pieters Oct 25 '13 at 23:24
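
Putting the two fixes from this thread together (the missing comma and the missing inner loop), the function might look like the sketch below. It assumes Python 3, so it uses range where the question used xrange; the variable name cost is a rename of distance_total for brevity. Without the inner loop, j is left over from the earlier initialization loop and equals len_word2, which is why word2[j] raised the IndexError.

```python
def damerau_levenshtein_distance(word1, word2):
    """Edit distance with adjacent transpositions, as in the question's code.

    Sketch only: assumes Python 3 (range instead of xrange).
    """
    distances = {}
    len_word1 = len(word1)
    len_word2 = len(word2)

    # Border cells: index -1 stands for the empty prefix.
    for i in range(-1, len_word1 + 1):
        distances[(i, -1)] = i + 1
    for j in range(-1, len_word2 + 1):
        distances[(-1, j)] = j + 1

    for i in range(len_word1):
        for j in range(len_word2):  # the loop missing from the question
            cost = 0 if word1[i] == word2[j] else 1
            distances[(i, j)] = min(
                distances[(i - 1, j)] + 1,         # deletion
                distances[(i, j - 1)] + 1,         # insertion (comma restored)
                distances[(i - 1, j - 1)] + cost,  # substitution
            )
            # Transposition of two adjacent characters.
            if i and j and word1[i] == word2[j - 1] and word1[i - 1] == word2[j]:
                distances[(i, j)] = min(
                    distances[(i, j)],
                    distances[(i - 2, j - 2)] + cost,
                )

    return distances[(len_word1 - 1, len_word2 - 1)]


print(damerau_levenshtein_distance('hellp', 'hello'))  # 1
```

The call from the comment above now returns 1, since a single substitution turns 'hellp' into 'hello'.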
1

Looks like you've fixed this issue, but if you don't want to implement all of these yourself, you can use the jellyfish package on PyPI: https://pypi.python.org/pypi/jellyfish. I've used it with great success in the past.

It contains several distance functions, including Damerau-Levenshtein distances.
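
A minimal usage sketch, assuming jellyfish has been installed (for example with pip install jellyfish). The wrapper name typo_distance is hypothetical and only there so the snippet degrades gracefully when the package is missing; the call it wraps is jellyfish.damerau_levenshtein_distance.

```python
try:
    import jellyfish  # third-party package; assumed to be installed
except ImportError:
    jellyfish = None


def typo_distance(a, b):
    """Hypothetical helper: returns None if jellyfish is not available."""
    if jellyfish is None:
        return None
    return jellyfish.damerau_levenshtein_distance(a, b)


print(typo_distance('hellp', 'hello'))
```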

VooDooNOFX