Questions tagged [edit-distance]

A string metric describing the differences between two strings. More specifically, it is the minimum number of operations needed to transform one string into another. Operations include the insertion, deletion, substitution, or transposition of a character in the string. Variants of the metric allow different subsets of these operations and may assign them different costs.
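As a rough illustration of the definition above, the sketch below computes one common variant, the Levenshtein distance, assuming unit cost for insertion, deletion, and substitution and no transposition. The function name `levenshtein` is an illustrative choice, not something taken from a particular library.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of unit-cost insertions, deletions, and
    substitutions needed to turn string a into string b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Turning a[:i] into the empty string takes i deletions.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

Only two rows of the dynamic-programming table are kept at a time, so memory use grows with the length of `b` rather than with the product of both lengths. Supporting transpositions or weighted operations would require changing the recurrence.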

References

Edit distance (Wikipedia)

256 questions
1 vote, 0 answers

What is an idiomatic way to generate a sequence of graphs from networkx.optimal_edit_paths?

I recently learned how to interpret the output of networkx.optimal_edit_paths, and I am interested in generating a sequence of graphs based on the sequences of edits. Let us suppose the example from the NetworkX documentation: import networkx as…
Galen
1 vote, 0 answers

Edit distance for a four-digit sequential ranking in R? (stringdist)

Right now, I am trying to create scale scores for participants who ranked four job candidates (A, B, C, and D) for a role from best fit to worst fit. The correct order is A, D, C, B. As far as my dataframe goes, the correct sequence for columns A, B,…
xenotharm
1 vote, 1 answer

How can I improve error reporting using PlUnit?

I'm writing tests using SWI-Prolog's PlUnit and would like to provide a better error message, perhaps by diffing what I got against what I expected. The following minimal working example (MWE) shows what I'm after: :- module(mwe,…
Bruno Kim
1 vote, 2 answers

Edit distance (LeetCode)

I am working on the Edit Distance problem, and before moving to the DP approach I am trying to solve it recursively, but I am running into a logic error. Please help. Here is my code: class Solution { public int minDistance(String…
1 vote, 1 answer

The difference in application between SequenceMatcher in edit distance and that in difflib?

I know the implementation of the edit distance algorithm. Using dynamic programming, we first fill the first column and first row, and then the entries immediately to the right of and below the filled entries, by comparing three paths from the left, above, and…
Lerner Zhang
1 vote, 0 answers

How to compare two pieces of code with ANTLR?

I have two pieces of code and I would like to identify the syntax they have in common using ANTLR: String s1 = "public class Test { public static int add (int i, int j) { int g = i + j; return g;}}"; String s2 = "public class Essai { public static int add (int…
1 vote, 1 answer

How to implement a filter that takes into account the user's typo?

I have the following pseudocode for a filter implementation; it works by matching items against what the user entered in the input field. When the user enters "tes", they see the item "test"; if they enter "tesz", they see no items. How can…
Ksenia
1 vote, 1 answer

Worst-case time complexity of edit distance?

I am trying to calculate the worst-case time complexity of finding the edit distance from T test words to D dictionary words, where all words have a length of MAX_LEN.
blurred42
1 vote, 1 answer

Is it always true that the edit distance of two strings is equal to the edit distance of their substrings?

Suppose we have two strings: ccttgg and gacgct. The edit distance of these two strings is 6. Possible substrings are cctt-- and gacg--. Their edit distance is 4. The remaining parts needed to equal the original two strings are ----gg and ----ct, and their edit…
John
1 vote, 3 answers

Is there a way to perform edit distance (Levenshtein) character by character between two string columns?

I have two datasets, dataset1 and dataset2, which have a common column called SAX which is a string object.

dataset1 =
         SAX
0    gangsyu
1    zicobgm
2    eerptow
3    cqbsynt
4    zvmqben
..       ...
475  rfikekw
476  bnbzvqx
477  rsuhgax
478…
udkr
1 vote, 0 answers

Algorithm to Check if Strings are Equal Allowing Rotations and up to K Replacements

Consider the problem of determining whether one string "STR1" is a rotated version of another string "STR2". This problem is simple and just requires searching for either string in the other string concatenated with itself. However, how would you solve this…
1 vote, 2 answers

Python Edit distance algorithm with dynamic programming and 2D Array - Output is not optimal

I have encountered the edit distance (Levenshtein distance) problem. I have looked at other similar Stack Overflow questions and am certain that my question is distinct from them, either in the language used or in the approach. I have used a 2D…
1 vote, 0 answers

How does Lucene proximity search work with multiple out-of-order words?

I'm working in Elasticsearch and have a sentence such as this: "Vet caring dog license cat bird". If I want to search for "bird dog vet", I would need to use at least ~7 as the proximity parameter ("bird dog vet"~7). Why is it ~7?
Fanylion
1 vote, 1 answer

Finding which error(s) are detected by Damerau-Levenshtein edit distance algorithm

I'm creating a spelling correction tool and want to implement a noisy channel model using Bayes' theorem. To do so, I need to calculate the probability P(X|W), where X is the given (misspelled) word and W is the possible correction. The…
1 vote, 2 answers

Discriminate edit distance

The Levenshtein edit distance considers only how many edits are made, not what exactly they are, so the following two pairs will have the same edit distance. ("A P Moller - Maersk A", "A.P. Moller - Maersk A/S Class A") ("A P Moller - Maersk…
user9231304