This is a minimal Levenshtein (edit distance) implementation using Clojure with recursion:

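;; i1 and i2 are prefix lengths: the distance is computed between the
;; first i1 characters of s1 and the first i2 characters of s2.
(defn levenshtein [s1 i1 s2 i2]
  (cond
    (= 0 i1) i2                                      ; s1 prefix empty: insert i2 chars
    (= 0 i2) i1                                      ; s2 prefix empty: delete i1 chars
    :else
    (min (+ (levenshtein s1 (- i1 1) s2 i2) 1)       ; deletion
         (+ (levenshtein s1 i1 s2 (- i2 1)) 1)       ; insertion
         (+ (levenshtein s1 (- i1 1) s2 (- i2 1))    ; match or substitution
            (if (= (nth s1 (- i1 1)) (nth s2 (- i2 1))) 0 1)))))

;; Start from the full lengths so the first character of each string is
;; compared too (starting at (count s) - 1 would skip it).
(defn levenshteinSimple [s1 s2]
  (levenshtein s1 (count s1) s2 (count s2)))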

Which can be used like this:

(println (levenshteinSimple "hello", "hilloo"))
(println (levenshteinSimple "hello", "hilloo"))
(println (levenshteinSimple "bananas", "bananas"))
(println (levenshteinSimple "ananas", "bananas"))

And prints this:

2
2
0
1

How can I add memoization to this implementation to improve performance?

Please note: I am a Clojure beginner. These are my first lines in Clojure.

Carcigenicate
gil.fernandes

1 Answer

The simplest way is to just make use of the memoize function. It takes a function, and returns a memoized function:

(let [mem-lev (memoize levenshteinSimple)]
  (println (mem-lev "hello", "hilloo"))
  (println (mem-lev "hello", "hilloo"))
  (println (mem-lev "bananas", "bananas"))
  (println (mem-lev "ananas", "bananas")))

mem-lev remembers each set of arguments you give it together with the result your function returned, and returns the cached result if it has already seen those arguments.
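To see the caching happen, here is a small sketch that counts how often the wrapped function body actually runs. The names calls, counted-square and mem-square are made up for this illustration and are not part of the question's code.

```clojure
;; Hypothetical helper for illustration: counts how often the body runs.
(def calls (atom 0))

(defn counted-square [x]
  (swap! calls inc)
  (* x x))

(def mem-square (memoize counted-square))

(mem-square 4) ; computed, calls becomes 1
(mem-square 4) ; same argument, served from the cache
(mem-square 5) ; new argument, computed, calls becomes 2

(println @calls) ; prints 2
```

The second call with 4 never reaches counted-square, which is why the counter stops at 2.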

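Note that this will not cause the recursive calls to become memoized: the recursion inside levenshtein still calls the plain, unmemoized function. Only repeated top-level calls with the same arguments are sped up, so a single call is no faster, even though the recursion recomputes the same subproblems many times.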

This will also not cause your original function to become memoized. In this example, only mem-lev will be memoized. If you really wanted to have your global function memoized, you could change your definition to something like:

(def levenshteinSimple
  (memoize
    (fn [s1, s2]
      ...

But I wouldn't recommend doing this. This causes the function itself to hold state, which isn't ideal. It'll also hold onto that state for the length of the program, which could cause memory issues if abused.
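If the goal is to speed up a single call, one option is to memoize the recursive step itself while keeping the cache local to each top-level call, so no state outlives the call. The following is only a sketch under a few assumptions: levenshtein-memo is a made-up name, and i1 and i2 are treated as prefix lengths (how many leading characters of each string are being compared), not as indices.

```clojure
;; Hypothetical name: levenshtein-memo does not appear in the original post.
(defn levenshtein-memo [s1 s2]
  ;; The atom only exists so the memoized fn can call itself; the cache
  ;; lives inside that fn and can be garbage collected after this call.
  (let [lev (atom nil)]
    (reset! lev
            (memoize
             (fn [i1 i2]
               ;; i1 and i2 are prefix lengths (assumption of this sketch)
               (cond
                 (zero? i1) i2
                 (zero? i2) i1
                 :else
                 (min (inc (@lev (dec i1) i2))          ; deletion
                      (inc (@lev i1 (dec i2)))          ; insertion
                      (+ (@lev (dec i1) (dec i2))       ; match or substitution
                         (if (= (nth s1 (dec i1)) (nth s2 (dec i2))) 0 1)))))))
    (@lev (count s1) (count s2))))

(println (levenshtein-memo "hello" "hilloo")) ; prints 2
```

Because each (i1, i2) pair is computed at most once, the work grows with the product of the string lengths instead of exponentially. The recursion is still not tail-recursive, so very long strings could overflow the stack.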

(As a great exercise, try writing your own version of memoize. I learned a lot by doing that).
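For comparison after trying the exercise yourself, a home-grown version might look roughly like the sketch below. my-memoize and slow-square are hypothetical names; the idea is simply a map from argument lists to results, kept in an atom.

```clojure
;; Hypothetical name: my-memoize is an illustration, not a core function.
(defn my-memoize [f]
  (let [cache (atom {})]                    ; maps argument lists to results
    (fn [& args]
      ;; find returns the map entry, so cached nil results are also hits
      (if-let [entry (find @cache args)]
        (val entry)
        (let [result (apply f args)]
          (swap! cache assoc args result)
          result)))))

(def slow-square
  (my-memoize (fn [x] (println "computing" x) (* x x))))

(slow-square 4) ; prints "computing 4" and returns 16
(slow-square 4) ; returns 16 from the cache, prints nothing
```

Using find instead of a plain lookup is a deliberate choice in this sketch: a function that legitimately returns nil would otherwise be recomputed on every call.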

Carcigenicate
  • Many thanks for this. I will look into creating my own version. For the time being I am still struggling with the syntax. – gil.fernandes Dec 07 '17 at 16:56
  • @gil.fernandes lisps are difficult at first because their syntax is so different than most other languages. Stick with it though and you'll begin to appreciate the simplicity of it all. Make sure you're using a program to manage your braces for you. I use IntelliJ for my IDE, and use the Cursive plugin so I can use it with Clojure. Cursive has "par-infer" capabilities, so I don't have to manually manage all my close braces. I just indent my code like I would in Python, and it uses that to close all the braces for me. And the exercise suggestion should probably wait a bit. Give it a month or 2. – Carcigenicate Dec 07 '17 at 17:45