I'm trying to implement Levenshtein distance in Prolog.

The implementation is pretty straightforward:

levenshtein(W1, W2, D) :-
    atom_length(W1, L1),
    atom_length(W2, L2),
    lev(W1, W2, L1, L2, D),
    !.

lev(_, _, L1, 0, D) :- D is L1, !.
lev(_, _, 0, L2, D) :- D is L2, !.
lev(W1, W2, L1, L2, D) :-
  lev(W1, W2, L1 - 1, L2, D1),
  lev(W1, W2, L1, L2 - 1, D2),
  lev(W1, W2, L1 - 1, L2 - 1, D3),
  charAt(W1, L1, C1),
  charAt(W2, L2, C2),
  ( C1 = C2 -> T is 0; T is 1 ),
  min(D1, D2, D3 + T, D).

% Returns the character at position N in the atom A
% The position is 1-based
% A: The atom
% N: The position at which to extract the character
% C: The character of A at position N
charAt(A, N, C) :- P is N - 1, sub_atom(A, P, 1, _, C).

% min(...): These rules compute the minimum of the given integer values
% I1, I2, I3: Integer values
% M:          The minimum over the values
min(I1, I2, M) :- integer(I1), integer(I2), ( I1 =< I2 -> M is I1; M is I2).
min(I1, I2, I3, M) :- min(I1, I2, A), min(I2, I3, B), min(A, B, M).

However, this code fails with this error:

?- levenshtein("poka", "po", X).
ERROR: Out of local stack

I'm using the SWI-Prolog implementation on macOS Sierra.

Guy Coder
s1ddok
    I know nothing about Prolog but your code is obviously recursive, and the recursion is probably going unbounded and blowing the stack. I would double check the terminating conditions. – Jonathon Reinhart Nov 28 '16 at 15:17

2 Answers

Your program fails for a good reason: the recursive calls lead to an infinite loop.

This is caused by these lines:

lev(W1, W2, L1 - 1, L2, D1),
lev(W1, W2, L1, L2 - 1, D2),
lev(W1, W2, L1 - 1, L2 - 1, D3),
min(D1, D2, D3 + T, D)

In Prolog, things like L1 - 1 are terms that are not evaluated to numbers unless they are passed to is/2 or to an arithmetic comparison. Therefore your code recursively calls lev with the third argument as L1 - 1, then L1 - 1 - 1, and so on, which never matches your terminating rules.
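To make the difference concrete, here is a small sketch contrasting unification with evaluation. The predicate names are hypothetical and exist only for this illustration.

```prolog
% Hypothetical helper names, used only to contrast =/2 with is/2.
as_term(X)   :- X = 4 - 1.    % binds X to the compound term 4-1, i.e. -(4, 1)
as_number(X) :- X is 4 - 1.   % evaluates the expression and binds X to 3

% A decremented term never unifies with 0, so a base case on 0 is never hit.
term_hits_zero   :- X = 1 - 1,  X = 0.   % fails: the term 1-1 is not the integer 0
number_hits_zero :- X is 1 - 1, X = 0.   % succeeds: X was evaluated to 0
```

This is why the base cases written with a literal 0 can never be selected once an unevaluated L1 - 1 is passed down.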

To fix this, evaluate expressions such as L1 - 1 into temporary variables with is/2 before the recursive calls.

This fixes it:

lev(W1, W2, L1, L2, D) :-
    L11 is L1 - 1,
    L22 is L2 - 1,
    lev(W1, W2, L11, L2, D1),
    lev(W1, W2, L1, L22, D2),
    lev(W1, W2, L11, L22, D3),
    charAt(W1, L1, C1),
    charAt(W2, L2, C2),
    ( C1 = C2 -> T is 0; T is 1 ),
    D4 is D3 + T,
    min(D1, D2, D4, D).

With this change, the query terminates:

?- levenshtein("poka","po",X).
X = 0.

This is probably not the result you want, but at least it no longer raises an error. I will leave it to you to fix your predicate.
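For reference, the result is 0 most likely because the deletion and insertion branches (D1 and D2) are never charged a cost of 1, which the usual Levenshtein recurrence requires. The following is only a sketch of one possible repair, assuming that standard recurrence; levenshtein_fixed/3 and lev_fixed/5 are hypothetical names chosen so they do not clash with the code above, and the arithmetic function min is used in place of the min/4 helper.

```prolog
% Hypothetical names; a sketch assuming the standard Levenshtein recurrence.
levenshtein_fixed(W1, W2, D) :-
    atom_length(W1, L1),
    atom_length(W2, L2),
    lev_fixed(W1, W2, L1, L2, D).

% Base cases: one prefix is empty, so the distance is the other length.
lev_fixed(_, _, L1, 0, D) :- !, D = L1.
lev_fixed(_, _, 0, L2, D) :- !, D = L2.
lev_fixed(W1, W2, L1, L2, D) :-
    L11 is L1 - 1,
    L22 is L2 - 1,
    lev_fixed(W1, W2, L11, L2, D1),      % delete last character of W1
    lev_fixed(W1, W2, L1, L22, D2),      % insert last character of W2
    lev_fixed(W1, W2, L11, L22, D3),     % keep or substitute
    sub_atom(W1, L11, 1, _, C1),         % sub_atom/5 offsets are 0-based
    sub_atom(W2, L22, 1, _, C2),
    ( C1 == C2 -> T = 0 ; T = 1 ),
    D is min(D1 + 1, min(D2 + 1, D3 + T)).

% ?- levenshtein_fixed(poka, po, X).
% X = 2.
```

Note that this naive recursion is exponential in the word lengths, so it is only suitable for short words.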

Fatalize

There are several problems with your program.

The loop

@Fatalize already gave you a reason. Here is a general method for localizing such problems: use a failure-slice, in which some false goals are inserted into your program. If the remaining program loops, then the original version loops too:

?- levenshtein("poka","po",X), false.

levenshtein(W1, W2, D) :-
    atom_length(W1, L1),
    atom_length(W2, L2),
    lev(W1, W2, L1, L2, D), false,
    !.

lev(_, _, L1, 0, D) :- D is L1, !.
lev(_, _, 0, L2, D) :- D is L2, !.
lev(W1, W2, L1, L2, D) :-
  lev(W1, W2, L1 - 1, L2, D1), false,
  lev(W1, W2, L1, L2 - 1, D2),
  lev(W1, W2, L1 - 1, L2 - 1, D3),
  charAt(W1, L1, C1),
  charAt(W2, L2, C2),
  ( C1 = C2 -> T is 0; T is 1 ),
  min(D1, D2, D3 + T, D).

You have to modify something in the remaining part, that is, the goals that are still reached before each false. Otherwise, this problem will persist.

Use lists!

Instead of using atoms or strings, it is better to use lists to represent words. It is best to add the following to your .swiplrc or .sicstusrc:

:- set_prolog_flag(double_quotes, chars).

In this manner, the following holds:

?- "abc" = [a,b,c].
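As a rough illustration of what the list representation buys you, here is a sketch of the same naive recurrence over character lists. lev_list/3 is a hypothetical name, and the sketch assumes the standard recurrence with a cost of 1 per edit. The three clauses are told apart by pattern matching on the lists alone, so no cuts and no position arithmetic are needed.

```prolog
% lev_list/3 is a hypothetical name; words are lists of characters.
lev_list([], Ys, D) :- length(Ys, D).
lev_list([X|Xs], [], D) :- length([X|Xs], D).
lev_list([X|Xs], [Y|Ys], D) :-
    lev_list(Xs, [Y|Ys], D1),            % delete X
    lev_list([X|Xs], Ys, D2),            % insert Y
    lev_list(Xs, Ys, D3),                % keep or substitute
    ( X == Y -> T = 0 ; T = 1 ),
    D is min(D1 + 1, min(D2 + 1, D3 + T)).

% ?- lev_list([p,o,k,a], [p,o], D).
% D = 2.
```

With the double_quotes flag set to chars, the same query can be written with "poka" and "po".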

Avoid cuts

Cuts sometimes work, somehow, but programs that rely on them are hard to debug, particularly for beginners. Therefore, avoid them at all costs.

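Use clean arithmetic

You are using the "olde" arithmetic of Prolog, which is highly moded: is/2 only works when its right-hand side is fully instantiated. Instead, load the library with :- use_module(library(clpfd)). and use its constraints such as #=/2 to get purer code.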
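A minimal sketch of what "purer" means here, using a hypothetical helper predecessor/2: the same definition works whichever argument is known, whereas the is/2 version would raise an instantiation error in the second query.

```prolog
:- use_module(library(clpfd)).

% predecessor/2 is a hypothetical helper: P is one less than N.
predecessor(N, P) :- P #= N - 1.

% ?- predecessor(4, P).
% P = 3.
% ?- predecessor(N, 3).
% N = 4.
```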

false