Questions tagged [edit-distance]

A string metric that quantifies the difference between two strings. More specifically, it is the minimum number of operations needed to transform one string into the other. Operations include the insertion, deletion, substitution, or transposition of a single character. Variants differ in which operations they allow, and operations may carry different costs.

References

Edit distance (Wikipedia)
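
As an illustration of the definition above, here is a minimal sketch of the Levenshtein variant, assuming unit cost for insertion, deletion, and substitution and no transpositions. The function name `levenshtein` is chosen for this example and is not taken from any particular library.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn `a` into `b` (unit costs assumed)."""
    # prev[j] is the distance between the prefix of `a` processed so far
    # (initially empty) and the first j characters of `b`.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Turning the first i characters of `a` into "" takes i deletions.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb, or keep a match
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

Only two rows of the dynamic-programming table are kept at a time, so memory use is proportional to the length of `b` rather than to the product of both lengths.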

256 questions
1
vote
1 answer

Random error core dump: Error in `./a.out': free(): invalid next size (fast): 0x00000000010e8d70 *** Aborted (core dumped)

#include #include #include #include using std::string; int edit_distance(const string &str1, const string &str2) { std::vector> strMat(str1.length()+1,std::vector(str2.length(),0)); …
RajGM
  • 37
  • 6
1
vote
1 answer

Levenshtein distance with substitution, deletion and insertion count

There's a great blog post here https://davedelong.com/blog/2015/12/01/edit-distance-and-edit-steps/ on Levenshtein distance. I'm trying to implement this to also include counts of subs, dels and ins when returning the Levenshtein distance. Just…
user12314098
1
vote
0 answers

How to find edit distance between two s-expressions?

Would such an edit distance be a good measure of expression similarity? What if we wanted a semantic difference of two s-expressions? Can such an edit distance be used for s-expression compression?
X10D
  • 600
  • 2
  • 13
1
vote
3 answers

An algorithm for computing the edit-distance between two words

I am trying to write Python code that takes a word as input (e.g. book) and outputs the most similar word along with a similarity score. I have tried different off-the-shelf edit-distance algorithms like cosine, Levenshtein and others, but these cannot…
1
vote
0 answers

How to find the edit distance / Levenshtein distance between a string and a language?

The edit distance of a string x from a language S is the edit distance of x from the 'closest' string y ∈ S. Given a string x ∈ {0,1,...,9,(,),+,-,*,/}*, I want to find an efficient algorithm that calculates the edit distance between x and the language…
1
vote
2 answers

How to check if a similar substring has appeared in a string with a customised tolerance level

How can I check whether a substring is inside a string within a specific edit distance tolerance? For example: str = 'Python is a multi-paradigm, dynamically typed, multipurpose programming language, designed to be quick (to learn, to use, and to…
R.yan
  • 2,214
  • 1
  • 16
  • 33
1
vote
1 answer

Something wrong with char type in my code for EditDistance Recursive

I'm reading "The Algorithm Design Manual (2nd Edition)". C++ is new to me. I am trying to use the author's example, string_compare(), and wrote only main() myself. The output is wrong. I guess my main() has a problem with char s[] and pointers. Can anyone help…
user1879108
1
vote
1 answer

Variation of finding edit distance with only insertions and deletions?

I need to find the edit distance between a word and its sorted word (ex: apple and aelpp), using only insertions and deletions recursively. I have found some sources that used insertions, deletions, and substitutions, but I am not sure how to only…
Altaaf Ackbar
  • 99
  • 1
  • 9
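
For the insertion/deletion-only variant asked about above, a minimal recursive sketch follows. It assumes unit costs and uses memoization to avoid recomputing subproblems. The name `indel_distance` and the helper `go` are hypothetical, chosen for this example.

```python
from functools import lru_cache


def indel_distance(a: str, b: str) -> int:
    """Edit distance from `a` to `b` using only insertions and deletions."""

    @lru_cache(maxsize=None)
    def go(i: int, j: int) -> int:
        # Distance between the suffixes a[i:] and b[j:].
        if i == len(a):
            return len(b) - j  # insert everything left in b
        if j == len(b):
            return len(a) - i  # delete everything left in a
        if a[i] == b[j]:
            return go(i + 1, j + 1)  # matching characters cost nothing
        # No substitution allowed: either delete a[i] or insert b[j].
        return 1 + min(go(i + 1, j), go(i, j + 1))

    return go(0, 0)


if __name__ == "__main__":
    word = "apple"
    print(indel_distance(word, "".join(sorted(word))))
```

The result equals len(a) + len(b) minus twice the length of the longest common subsequence. Because the recursion depth grows with the combined string length, very long inputs may need an iterative table instead.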
1
vote
1 answer

Python: Efficient way to find Levenshtein edit distance in a matrix

I would like to identify the similarity between two lists; after that, I want to cluster the descriptions. L2D1 L2D2 L2D2 .........L2Dn L1D1 0 0.3 0.8............0.5 L1D2 0.2 0.7 …
PCH
  • 43
  • 5
1
vote
1 answer

Levenshtein edit distance and different sets of edits

I was just going over some questions but I got stuck at a Levenshtein edit distance question. So the first part of the question was: What is the Levenshtein edit distance between the strings STRONGEST and TRAINERS? Which I calculated as 6. But the…
Min
  • 528
  • 1
  • 7
  • 26
1
vote
1 answer

Edit Distance between all the columns of a pandas dataframe

I am interested in calculating the edit distances across all the columns of a given pandas DataFrame. Let's say we have a 3*5 DataFrame - I want to output something like this with the distance scores - (column*column matrix) col1 col2 col3 col4…
1
vote
0 answers

Similarity between two ordered lists of numbers with fuzziness

I have ordered lists of numbers (like barcode positions, spectral lines) that I am trying to compare for similarity. Ideally, I would like to compare two lists to get a value from 1.0 (match) degrading gracefully to 0. The lists could be offset by…
1
vote
0 answers

Count numbers with Hamming distance less than or equal to a given k from a given set of integers

I admit that this problem is a (small) part of a school programming assignment, but I wasn't able to find much of a hint online and my solutions are very slow so far. Here is my problem defined more precisely: Given a vector of integers v and…
Karel Křesťan
  • 399
  • 2
  • 17
1
vote
0 answers

Setting an upper bound for a Levenshtein (edit) distance in Python

I need to calculate the edit distance between DNA sequences in order to group similar sequences together. If sequences have a distance that is larger than some threshold they are considered different and I don't care what the actual value is, so…
kenissur
  • 171
  • 1
  • 2
  • 7
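
The thresholding idea described above can be sketched as a Levenshtein computation that gives up as soon as the distance is known to exceed the bound. This is a minimal illustration assuming unit costs; `bounded_levenshtein` and its `max_dist` parameter are names invented for this example, not an existing library API.

```python
from typing import Optional


def bounded_levenshtein(a: str, b: str, max_dist: int) -> Optional[int]:
    """Return the Levenshtein distance if it is <= max_dist, else None."""
    # The length difference is a lower bound on the distance.
    if abs(len(a) - len(b)) > max_dist:
        return None
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute or match
            ))
        # Row minima never decrease from one row to the next, so once a
        # whole row exceeds the bound the final distance must exceed it too.
        if min(curr) > max_dist:
            return None
        prev = curr
    return prev[-1] if prev[-1] <= max_dist else None


if __name__ == "__main__":
    print(bounded_levenshtein("ACGT", "ACGA", 1))
    print(bounded_levenshtein("ACGT", "TTTT", 2))
```

This version still fills every cell of each row it visits. Restricting each row to a band of width 2 * max_dist + 1 around the diagonal would reduce the work further, at the cost of more index bookkeeping.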
1
vote
1 answer

Java error "main" java.lang.OutOfMemoryError: Java heap space

So I have written the following main function in Java to compute the edit distance of 1000 randomly generated pairs of length 10, 20, 50 and 100. It runs fine for lengths 10 and 20, but for length 50 it gives this error: "Exception in thread…
Usman Malik
  • 11
  • 2
  • 7