Questions tagged [fuzzy-search]

A search mechanism where the objective is to find all approximate, relevant or possibly relevant results for the search-key rather than finding an exact match.

Fuzzy search is a search mechanism where the objective is to find all approximate, relevant or possibly relevant results for keywords rather than finding an exact match. This allows for matches even where the keywords are misspelled or only hint at a concept.
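
As a rough illustration of the idea, the sketch below uses Python's standard-library `difflib.get_close_matches` to return candidates that approximately match a misspelled keyword. The helper name `fuzzy_find`, the sample word list, and the 0.6 similarity cutoff are assumptions chosen for this example and are not tied to any particular search engine.

```python
from difflib import get_close_matches


def fuzzy_find(query, candidates, limit=3, cutoff=0.6):
    """Return up to `limit` candidates similar to `query`, best match first.

    `cutoff` is a similarity ratio in [0, 1]; candidates scoring below it
    are dropped. Both defaults are illustrative assumptions.
    """
    return get_close_matches(query, candidates, n=limit, cutoff=cutoff)


# Hypothetical vocabulary; "appel" is a deliberate misspelling of "apple".
words = ["ape", "apple", "peach", "puppy"]
print(fuzzy_find("appel", words))
```

Raising `cutoff` toward 1.0 makes matching stricter, while lowering it admits looser matches. This sketch scans every candidate, so it only suits small lists.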



954 questions
12
votes
1 answer

Apply fuzzy matching across a dataframe column and save results in a new column

I have two data frames, each with a different number of rows. Below are a couple of rows from each data set: df1 = Company City State ZIP FREDDIE LEES AMERICAN GOURMET SAUCE St. Louis MO…
Jstuff
  • 1,266
  • 2
  • 16
  • 27
12
votes
2 answers

Fuzziness settings in ElasticSearch

Need a way for my search engine to handle small typos in search strings and still return the right results. According to the ElasticSearch docs, there are three values that are relevant to fuzzy matching in text queries: fuzziness, max_expansions,…
Clay Wardell
  • 14,846
  • 13
  • 44
  • 65
11
votes
1 answer

How to group words whose Levenshtein distance is more than 80 percent in Python

Suppose I have a list: person_name = ['zakesh', 'oldman LLC', 'bikash', 'goldman LLC', 'zikash','rakesh'] I am trying to group the list in such a way that the Levenshtein distance between two strings is maximum. For finding out the ratio between two…
python
  • 4,403
  • 13
  • 56
  • 103
11
votes
3 answers

Python Fuzzy Matching (FuzzyWuzzy) - Keep only Best Match

I'm trying to fuzzy match two csv files, each containing one column of names, that are similar but not the same. My code so far is as follows: import pandas as pd from pandas import DataFrame from fuzzywuzzy import process import csv save_file =…
Kvothe
  • 1,341
  • 7
  • 20
  • 33
11
votes
1 answer

Fuzzy Bit Matching

I have a very long sequence of bits, called A, and a shorter sequence of bits, x. Two bit sequences of the same length are fuzzy-matched when after aligning them, there are k or fewer mismatched bits. I want to find all such fuzzy occurrences of x…
darksky
  • 1,955
  • 16
  • 28
11
votes
4 answers

Lucene fuzzy search on a phrase (FuzzyQuery + SpanQuery)

I am looking for a way of coding a Lucene fuzzy query that searches all the documents that are relevant to an exact phrase. If I search "mosa employee appreciata", a document containing "most employees appreciate" should be returned as the result.…
user2660171
  • 139
  • 1
  • 1
  • 8
10
votes
2 answers

Fuzzy Matching with threshold filter C#

I need to implement something like this: string textToSearch = "Extreme Golf: The Showdown"; string textToSearchFor = "Golf Extreme Showdown"; int fuzzyMatchScoreThreshold = 80; // On a 0 to 100 scale bool searchSuccessful =…
Nazar Grynko
  • 101
  • 1
  • 3
10
votes
1 answer

fuzzy searching with query_string Elasticsearch

I have a record saved in Elasticsearch which contains a string exactly equal to Clash of clans. Now I want to search for this string with Elasticsearch, and I am using this: { "query_string" : { "query" : "clash" } } It's working perfectly…
maq
  • 1,175
  • 3
  • 17
  • 34
10
votes
4 answers

Lucene query: bla~* (match words that start with something fuzzy), how?

In the Lucene query syntax I'd like to combine * and ~ in a valid query similar to: bla~* //invalid query Meaning: Please match words that begin with "bla" or something similar to "bla". Update: What I do now, which works for small input, is use the…
10
votes
0 answers

Can I combine fuzzy and proximity searches in Apache Lucene's standard syntax?

I am searching a codebase indexed by OpenGrok; the -a option has been enabled to allow the first character of a search term to be a wildcard. I would like to find all occurrences of method foo that take some string parameter (foo("") with one or…
MilesHampson
  • 2,069
  • 24
  • 43
9
votes
3 answers

Is it possible to use fzf (command line fuzzy finder) with windows 10 git-bash?

I downloaded the .exe file and placed it into my PATH variable. fzf seems to work in command prompt. But I would like to use it in git-bash. When I use fzf in git-bash it seems to start but nothing happens. Any advice would be helpful. I'm trying…
warnerm06
  • 654
  • 1
  • 9
  • 20
9
votes
2 answers

Mongodb partial matching

How can I get all documents in MongoDB within a Levenshtein distance of one? I have a collection for football teams. { name: 'Real Madrir', nicknames: ['Real', 'Madrid', 'Real Madrir' ... ] } And a user searched for Real Madid or Maddrid or something…
Gor
  • 2,808
  • 6
  • 25
  • 46
9
votes
2 answers

Advice on how to improve a current fuzzy search implementation

I'm currently working on implementing a fuzzy search for a terminology web service and I'm looking for suggestions on how I might improve the current implementation. It's too much code to share, but I think an explanation might suffice to prompt…
AHungerArtist
  • 9,332
  • 17
  • 73
  • 109
9
votes
1 answer

How do I fuzzy match word to a full word (and only full word) in a sentence?

Most commonly misspelled English words are within two or three typographic errors (a combination of substitutions s, insertions i, or letter deletions d) from their correct form. I.e. errors in the word pair absence - absense can be summarized as…
zelusp
  • 3,500
  • 3
  • 31
  • 65
9
votes
1 answer

Fuzzy text search in Python

I am wondering if there is a Python library that can conduct fuzzy text search. For example: I have three keywords "letter", "stamp", and "mail". I would like to have a function to check if those three words are within the same paragraph (or certain…
TTT
  • 4,354
  • 13
  • 73
  • 123