Questions tagged [levenshtein-distance]

A metric for measuring the amount of difference between two sequences. The Levenshtein distance allows deletion, insertion and substitution.

In information theory and computer science, the Levenshtein distance is a metric for measuring the amount of difference between two sequences. The Levenshtein distance between two strings is defined as the minimum number of single-character edits (insertions, deletions, or substitutions) needed to transform one string into the other. It is named after Vladimir Levenshtein, who considered this distance in 1965.

Levenshtein distance is one specific member of the broader family of edit distance metrics.
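As an illustration of the definition above, here is a minimal sketch of the usual dynamic-programming approach in Python. The function name `levenshtein` and the two-row layout are choices made for this example, not part of any particular library; each edit is assumed to cost 1.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the Levenshtein distance between a and b (illustrative sketch)."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] is the distance between the processed prefix of a
    # and the first j characters of b; row 0 is the empty prefix of a.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning i characters into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match, cost 0)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

This sketch runs in O(N*M) time and keeps only two rows in memory at once. Variants with different edit costs, or ones that also recover the edit sequence, need a different setup.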


967 questions

votes

2 answers

Levenshtein Distance of two words from text file with Python

I have a small 30-line text file with two similar words on each line. I need to calculate the Levenshtein distance between the two words on each line. I also need to use a memoize function while calculating the distance. I am pretty new to Python…

python algorithm levenshtein-distance

asked Oct 09 '12 at 15:50

Ty Bailey

2,392
11
46
79

votes

1 answer

How do I compare 1 word with many and output a list of Levenshtein scores

I have a form where I can input two words then compare the levenshtein score, that works fine. I want to be able to compare 1 word with a string of words delimited by ", ". The whole lot then needs to echo out. Here's what I have so far: Levenstien…

arrays list levenshtein-distance

asked Oct 04 '12 at 20:12

user1721230

votes

1 answer

using levenshtein distance ratio to compare 2 records

I've created the MySQL user function using the Levenshtein distance and ratio source code. I am comparing 2 records, and based on a 75% match I want to select the record. An order comes into the table paypal_ipn_orders with an ITEM title. A query executes…

mysql distance levenshtein-distance

asked Aug 06 '12 at 21:47

user1542036

votes

1 answer

Comparing 2 strings to find if they contain the same words with java

I am using Levenshtein distance, which is a string metric for measuring the amount of difference between two sequences, to find the percent difference between two strings. I want to use a better method to declare that the strings are similar using words…

java search levenshtein-distance

asked Jul 23 '12 at 16:26

PrettyGirl

votes

1 answer

Any known javascript/php dictionaries like 'word1', 'word2'?

Just recently I was reading up on the Levenshtein algorithm, and after searching for an hour I couldn't find a JavaScript file like: var dictionary = [ 'coke', 'cokeman', 'cokeney' ] Is there a faster way to do this? I…

php javascript dictionary levenshtein-distance

asked Jul 14 '12 at 07:18

keji

5,947
3
31
47

votes

1 answer

Levenshtein distance with items in list in Python

I have two lists, below, and I want to check whether words are similar, i.e. have a Levenshtein distance of less than 2. I have a function to find the Levenshtein distance; however, it takes the two words as parameters. I can find which words are not in the…

python algorithm distance levenshtein-distance

asked Jul 11 '12 at 16:25

jacobLoz

votes

4 answers

MySQL showing as array

I'm trying to get this code to work and for the life of me cannot get it going... I want a search that shows a "Did you mean". With the code I have, all I get is "Did you mean: Array l:6". What is wrong with what I have here? $my_word =…

php mysql arrays levenshtein-distance

asked Jul 04 '12 at 14:05

David Morin

votes

2 answers

Is there any modified Minimum Edit Distance (Levenshtein Distance) for incomplete strings?

I have sequences built from 0's and 1's. I want to somehow measure their distance from a target string, but the target string is incomplete. Example of the data I have, where x is the target string and [0] means the occurrence of at least one '0' : x…

algorithm levenshtein-distance

asked Apr 21 '12 at 11:36

Qbik

5,885
14
62
93

-1

votes

1 answer

Can't install pandas-dedupe on Windows Python 3.9

Running pip install pandas-dedupe, I get the following error: I tried manually installing python-Levenshtein first and got the same problem with the addition . What can I do?

python pandas pip duplicates levenshtein-distance

asked Sep 23 '22 at 12:04

Corram

-1

votes

1 answer

Speeding up fuzzy match on large list

I am working on a project that uses fuzzy logic on a list of names that could reach about 100,000 unique records. In the recent screening that we conducted, the functions that we use take 2.20 seconds on average to process a single name. This…

python pandas levenshtein-distance

asked Sep 06 '22 at 04:04

jsv

-1

votes

2 answers

Replace values in a column with similar values in another column with different size - Python

I have a dataframe with different values in a column (about 6,000 rows), which I need to replace with similar (but different) values found in another dataframe, which has fewer rows. Store Values to replace Store A 05/15/21 Store…

python levenshtein-distance

asked Sep 01 '22 at 01:42

Eduardo

-1

votes

1 answer

Delete "almost duplicates" rows of string based on fuzzy matching with a lot of lines (>50 000)

I have 50 000 words like: add to add chicken a chicken eat the chicken to eat ... I want to drop the lines that have a high fuzzy similarity with other lines. The output should then be: add to eat chicken ... I can't calculate every fuzzy…

python duplicates nearest-neighbor levenshtein-distance fuzzy-search

asked Jan 16 '22 at 17:35

Arnaud Hureaux

-1

votes

1 answer

Similarity between lists of floats

I have a list of floats that I want to compare to other lists and get the similarity ratio in Python. The list that I want to compare: [0.0000,0.0003,-0.0001,0.0002, 0.0001,0.0003,0.0000,0.0000, -0.0002,0.0002,-0.0002,0.0002,…

python python-3.x levenshtein-distance difflib

asked May 31 '21 at 15:10

Elyes Lounissi

-1

votes

2 answers

SQL Left Fuzzy Join with Levenshtein Distance

I have two data sets from two different systems being merged together within SQL, however, there is a slight difference within the naming conventions on the two systems. The change in convention is not consistent across the larger data sample but…

sql left-join ssms levenshtein-distance fuzzyjoin

asked Mar 31 '21 at 17:30

tg00222

-1

votes

2 answers

Does the Levenshtein distance algorithm perform better than the Needleman-Wunsch algorithm?

I know that both Levenshtein and Needleman-Wunsch have a time complexity of O(N*M), but I was curious to know which one performs better than the other, and why.

performance time-complexity levenshtein-distance needleman-wunsch

asked Feb 10 '21 at 04:55

Azher Ahmed Efat
