I want to implement a modification of the basic edit distance algorithm: weighted edit distance. (Context: handling spelling errors while building a search engine.)

For example, substituting s with a should cost less than substituting s with, say, p.

Implementing this with DP requires only a simple change to the recurrence:

d[i, j] := minimum(d[i-1, j] + 1,                   // deletion
                   d[i, j-1] + 1,                   // insertion
                   d[i-1, j-1] + substitutionCost)  // substitution
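As a rough illustration, the recurrence above could be sketched in C++ as follows. The names weightedEditDistance, subCost and exampleCost are hypothetical, and the 0.5 cost for the s/a pair is a made-up value used only to show the effect of a cheaper substitution; insertions and deletions keep a cost of 1, as in the recurrence.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Weighted edit distance: insertions and deletions cost 1, while the cost of
// substituting one character for another comes from a caller-supplied
// function (subCost is a hypothetical name, not part of any library).
double weightedEditDistance(const std::string& a, const std::string& b,
                            const std::function<double(char, char)>& subCost) {
    const std::size_t n = a.size();
    const std::size_t m = b.size();
    std::vector<std::vector<double>> d(n + 1, std::vector<double>(m + 1, 0.0));

    // Base cases: turning a prefix into the empty string (or the reverse)
    // takes one deletion (or insertion) per character.
    for (std::size_t i = 1; i <= n; ++i) d[i][0] = static_cast<double>(i);
    for (std::size_t j = 1; j <= m; ++j) d[0][j] = static_cast<double>(j);

    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            // Identical characters cost nothing; otherwise ask subCost.
            const double sub =
                (a[i - 1] == b[j - 1]) ? 0.0 : subCost(a[i - 1], b[j - 1]);
            d[i][j] = std::min({d[i - 1][j] + 1.0,        // deletion
                                d[i][j - 1] + 1.0,        // insertion
                                d[i - 1][j - 1] + sub});  // substitution
        }
    }
    return d[n][m];
}

// Illustrative cost only (assumed values): s <-> a is cheap, every other
// substitution costs 1.
double exampleCost(char x, char y) {
    if ((x == 's' && y == 'a') || (x == 'a' && y == 's')) return 0.5;
    return 1.0;
}
```

With these assumed costs, weightedEditDistance("sat", "aat", exampleCost) comes out lower than weightedEditDistance("sat", "pat", exampleCost), which is the behaviour described above. A keyboard-based cost function could be passed in place of exampleCost.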

I looked but could not find a matrix anywhere that gives the appropriate substitutionCost for every pair of letters. I want the costs to be based on the distance between the letters on the keyboard. Has nobody explicitly defined such a matrix yet?

Mallika

1 Answer

I have written C++ code that should work. I have assumed that the keys are placed symmetrically, i.e. on a regular grid with the rows aligned:

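#include <cstdlib>
#include <iostream>
#include <string>

using namespace std;

// Letter keys of each keyboard row (punctuation keys are left out).
string s[3];
// mat[x][y] = distance between letters x and y, indexed by (letter - 'a').
int mat[26][26];

int main() {
    s[0] = "qwertyuiop";
    s[1] = "asdfghjkl";
    s[2] = "zxcvbnm";

    // For every pair of keys, (row j, column i) and (row l, column k),
    // store the row difference plus the column difference.
    for (int j = 0; j < 3; j++) {
        for (int i = 0; i < (int)s[j].size(); i++) {
            for (int l = 0; l < 3; l++) {
                for (int k = 0; k < (int)s[l].size(); k++) {
                    int st1 = s[j][i] - 'a';
                    int st2 = s[l][k] - 'a';
                    mat[st1][st2] = abs(j - l) + abs(i - k);
                }
            }
        }
    }

    // Print one line per ordered pair: "<letter> <letter> <distance>".
    for (int i = 0; i < 26; i++) {
        for (int j = 0; j < 26; j++) {
            cout << (char)(i + 'a') << " " << (char)(j + 'a') << " " << mat[i][j] << '\n';
        }
    }

    return 0;
}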

Output on Ideone: http://ideone.com/xq7kKp

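Here mat[i][j] contains the distance between the keys for letters i and j (0 = a, ..., 25 = z), measured as the number of rows plus the number of columns separating them.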
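One thing to watch when plugging these distances into the recurrence from the question: insertion and deletion cost 1 there, so a raw distance of 2 or more would never be cheaper than a deletion followed by an insertion. A possible way around this, sketched below under assumptions of my own, is to scale the distance into the range 0 to 1. The names KeyPos, findKey and substitutionCost are hypothetical, and dividing by the largest distance on this grid (11, between z and p) is just one simple choice of scaling, not an established standard.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Row and column of a key on the unstaggered three-row grid.
struct KeyPos {
    int row;
    int col;
};

// Look up a lowercase letter on the grid. Returns false for any character
// that is not one of the 26 letter keys.
bool findKey(char c, KeyPos& pos) {
    static const std::string rows[3] = {"qwertyuiop", "asdfghjkl", "zxcvbnm"};
    for (int r = 0; r < 3; ++r) {
        const std::size_t col = rows[r].find(c);
        if (col != std::string::npos) {
            pos.row = r;
            pos.col = static_cast<int>(col);
            return true;
        }
    }
    return false;
}

// Assumed scaling: row difference plus column difference, divided by the
// largest such distance on this grid, so the result lies in [0, 1].
double substitutionCost(char a, char b) {
    if (a == b) return 0.0;
    KeyPos pa, pb;
    // Characters that are not letter keys get the full cost of 1.
    if (!findKey(a, pa) || !findKey(b, pb)) return 1.0;
    const int dist = std::abs(pa.row - pb.row) + std::abs(pa.col - pb.col);
    const double maxDist = 11.0;  // z (row 2, col 0) to p (row 0, col 9)
    return dist / maxDist;
}
```

With this scaling, substitutionCost('s', 'a') is 1/11 while substitutionCost('s', 'p') is 9/11, so neighbouring keys are much cheaper to swap than distant ones, and no substitution costs more than a single insertion or deletion.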

uSeemSurprised