Questions tagged [edit-distance]

A string metric that quantifies how different two strings are. More specifically, it is the minimum number of operations needed to transform one string into the other. Operations include the insertion, deletion, substitution, or transposition of a character in the string. Variants differ in which combination of operations they allow, and each operation may have a different cost.
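As a rough illustration of the definition above, here is a minimal Python sketch of the Levenshtein variant (insertion, deletion, and substitution only, each with an assumed cost of 1). The function name `levenshtein` is hypothetical, and transpositions are not handled.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `a` into `b` (unit costs assumed)."""
    m, n = len(a), len(b)
    # dist[i][j] = distance between the first i chars of a and the first j chars of b
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dist[i][0] = i  # delete all i characters
    for j in range(n + 1):
        dist[0][j] = j  # insert all j characters
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # match or substitution
            )
    return dist[m][n]
```

For example, `levenshtein("kitten", "sitting")` evaluates to 3 (two substitutions and one insertion).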

References

Edit distance (Wikipedia)

256 questions
1
vote
1 answer

How do you make a string dictionary function in Lua?

Is there a way to replace a string with an entry from a table when the string is close to that entry? Like a spellcheck function that searches through a table and, if the input is close to one of the entries, fixes it, so the one in the table…
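One possible shape for such a lookup, sketched in Python rather than Lua: compute the edit distance from the input to every table entry and return the nearest one if it is close enough. The names `edit_distance` and `closest_entry` and the cutoff of 2 edits are assumptions for illustration, not part of the question.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (unit costs), keeping only one previous row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # match or substitution
        prev = cur
    return prev[-1]


def closest_entry(word, entries, max_distance=2):
    """Return the entry nearest to `word`, or `word` itself when nothing
    is within `max_distance` edits (the cutoff is an assumed parameter)."""
    best = min(entries, key=lambda e: edit_distance(word, e), default=None)
    if best is not None and edit_distance(word, best) <= max_distance:
        return best
    return word
```

With the entries `["apple", "banana", "cherry"]`, the input `"aple"` would be replaced by `"apple"`, while an input far from every entry is returned unchanged.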
1
vote
1 answer

Distance edit array output

I am computing an edit distance on user input. I store my values in an array, and the edit distance then compares the user input with my array of strings. I have a loop so that if the edit distance is more than 2 it will display invalid, else…
1
vote
2 answers

Maintaining headers in edit distance

I am running edit distance using stringdist. The output replaces the input with a numbered list instead of the actual string being compared. This is currently what I have: library(stringdist) a <- c("foo", "bar", "bear", "boat", method =…
El David
  • 375
  • 2
  • 3
  • 11
1
vote
0 answers

Spelling Assistance using Levenshtein Algorithm edge case debugging

Greetings fellow developers! BACKGROUND I have implemented a spelling assistance algorithm that corrects user inputs to an edit distance of 1, using a dictionary file. I primarily use maps in C++ for the implementation i.e. to store the dictionary…
J. Doe
  • 17
  • 4
1
vote
1 answer

TypeError: unhashable type: 'list'

I am building a program that compares each promo code (which might contain OCR errors) in a list against all the promo codes in another list (the list of correct promo codes). The expected output is the edit distance and the promo code with the least edit distance to the one…
1
vote
4 answers

Perl module to check whether a string was created by inserting text into another string

Problem I have two strings $base and $succ and want to check whether $succ can be created from $base by inserting arbitrary strings at arbitrary positions. Some examples, written as isSucc($base, $succ): isSucc("abc", "abcX") is true, since we can…
Socowi
  • 25,550
  • 3
  • 32
  • 54
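If only insertions are allowed, the check described in this question amounts to asking whether $base is a subsequence of $succ. A minimal Python sketch of that idea follows (the question asks for Perl; `is_succ` is a hypothetical name mirroring the question's isSucc):

```python
def is_succ(base: str, succ: str) -> bool:
    """True if `succ` can be built from `base` using insertions only,
    i.e. if `base` is a subsequence of `succ`."""
    remaining = iter(succ)
    # `ch in remaining` advances the iterator until it finds `ch`,
    # so the characters of `base` must appear in `succ` in order.
    return all(ch in remaining for ch in base)
```

When the check succeeds, the number of inserted characters is simply `len(succ) - len(base)`.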
1
vote
1 answer

Select enum value by string similarity

I have an enum with six distinct values (One, Two, Three, Four, Five, Six) which is filled from a config file (i.e. from a string). Let's say someone writes into the config file any of the values "On", "one", "onbe", or other common misspellings/typos; I want to set the…
Alexander
  • 19,906
  • 19
  • 75
  • 162
1
vote
1 answer

Edit distance solution with O(n) space issue

I found a few different solutions while debugging and am especially interested in the solution below, which requires only O(n) space rather than storing an M*N matrix. But I am confused about the logical meaning of cur[i]. If anyone has any comments, it…
Lin Ma
  • 9,739
  • 32
  • 105
  • 175
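The asker's code is not shown in this excerpt, so the following is only a sketch of how a single-row version is commonly written in Python. Here `cur[j]` holds the distance between the prefix of `a` processed so far and the first `j` characters of `b`; the names `cur`, `diag`, and `above` are assumptions and may not match the original solution's indexing.

```python
def edit_distance_one_row(a: str, b: str) -> int:
    """Levenshtein distance (unit costs) with O(len(b)) extra space."""
    # Before any character of `a` is processed: cur[j] = distance("" -> b[:j]) = j
    cur = list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        diag = cur[0]  # value of row i-1, column j-1
        cur[0] = i     # distance(a[:i] -> "") is i deletions
        for j in range(1, len(b) + 1):
            above = cur[j]  # cur[j] still holds row i-1, column j
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(above + 1,       # deletion
                         cur[j - 1] + 1,  # insertion
                         diag + cost)     # match or substitution
            diag = above  # becomes the diagonal for column j+1
    return cur[len(b)]
```

The design choice is that each cell of the full matrix depends only on its left, upper, and upper-left neighbours, so one row plus one saved diagonal value is enough.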
1
vote
0 answers

Calculating the edit-distance between two lists of integers in R

I would like to use R to compare many lists of integers using edit distance. Ex: list1[231, 3883, 21099, 12, 2] and list2[433, 3883, 12, 919, 2]. I'd like to get just the distance between those two lists. Ex. with the lists above, the distance would…
1
vote
0 answers

Solr spellcheck's top suggestion is unexpected

I'm using the Solr 4.6.1 spellcheck component for spelling suggestions. I configured it to use DirectSolrSpellChecker with the default distance function and comparator, which, as I understand, means the suggestions are ranked by edit distance (primary key),…
redoc
  • 146
  • 1
  • 3
1
vote
1 answer

Calculating Levenshtein distance within a list in Python

I have a list of strings and I want to filter out the strings that are too similar based on Levenshtein distance. So if lev(list[0], list[10]) < 50; then del list[10]. Is there any way I can calculate such distance between every pair of strings in…
Blue482
  • 2,926
  • 5
  • 29
  • 40
1
vote
2 answers

Variation of Edit distance algorithm that only tracks substitutions and insertions

Does anyone know of an edit-distance algorithm that only counts substitutions and insertions? So basically, it would be the Levenshtein distance algorithm without deletions.
kchoi
  • 1,205
  • 5
  • 18
  • 32
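One way to read this question is to drop the deletion case from the usual Levenshtein recurrence. The Python sketch below does that under assumed unit costs; the function name `distance_without_deletions` and the choice to return `None` when no deletion-free edit script exists are assumptions for illustration.

```python
import math


def distance_without_deletions(a: str, b: str):
    """Cheapest way to turn `a` into `b` using only substitutions and
    insertions; returns None when no such edit script exists."""
    m, n = len(a), len(b)
    if m > n:
        return None  # shrinking a string would require a deletion
    # d[i][j] = cost of turning a[:i] into b[:j] without deletions
    d = [[math.inf] * (n + 1) for _ in range(m + 1)]
    for j in range(n + 1):
        d[0][j] = j  # build b[:j] from "" with j insertions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # match or substitution
    return d[m][n]
```

Note that the result can exceed the ordinary Levenshtein distance: turning "abc" into "bcd" takes 2 edits with a deletion available, but 3 substitutions without one.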
1
vote
1 answer

Separately counting the number of deletions in the Levenshtein distance algorithm

So I'm aware that the Levenshtein distance algorithm takes into account the minimum number of deletions, insertions and substitutions required to change a String A into String B. But, I was wondering how you can separately keep track of the number of…
kchoi
  • 1,205
  • 5
  • 18
  • 32
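A common approach to this kind of bookkeeping is to fill the full distance matrix and then walk back from the bottom-right corner, tallying which operation each step corresponds to. The Python sketch below assumes unit costs; the name `count_edit_operations` is hypothetical. Optimal edit scripts are not always unique, so the split between operations can depend on the tie-breaking order used in the walk.

```python
def count_edit_operations(a: str, b: str) -> dict:
    """Count deletions, insertions and substitutions along one optimal
    Levenshtein edit script from `a` to `b`."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)

    counts = {"deletions": 0, "insertions": 0, "substitutions": 0}
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if a[i - 1] == b[j - 1] else 1
            if d[i][j] == d[i - 1][j - 1] + cost:
                counts["substitutions"] += cost  # a match adds nothing
                i, j = i - 1, j - 1
                continue
        if i > 0 and d[i][j] == d[i - 1][j] + 1:
            counts["deletions"] += 1
            i -= 1
        else:
            counts["insertions"] += 1
            j -= 1
    return counts
```

The three counts always sum to the Levenshtein distance, since every step of the walk follows a cell that produced the minimum.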
1
vote
1 answer

Edit Distance solution for Large Strings

I'm trying to solve the edit distance problem. The code I've been using is below. public static int minDistance(String word1, String word2) { int len1 = word1.length(); int len2 = word2.length(); // len1+1, len2+1, because finally…
prime
  • 14,464
  • 14
  • 99
  • 131
1
vote
1 answer

Scalding: Compare strings pairwise?

With Scalding I need to: (1) group string fields by their first 3 chars; (2) compare strings in all pairs in every group using the edit-distance metric ( http://en.wikipedia.org/wiki/Edit_distance); (3) write results to a CSV file where record is string; string;…
DarqMoth
  • 603
  • 1
  • 13
  • 31