Questions tagged [fuzzy]

DO NOT USE - ambiguous: see fuzzy-search, fuzzy-logic, or image-processing for more appropriate tags.

363 questions
7
votes
2 answers

Fuzzy date/time management library for .NET

I am searching for a .NET library that can store and manage fuzzy (i.e. uncertain) dates/times, that is, temporal expressions that do not follow the usual precise pattern of day, month, year, hour, minute, second. I need something that can handle…
CesarGon
  • 15,099
  • 6
  • 57
  • 85
7
votes
2 answers

Very Fast string fuzzy matching in R

I have a set of 40,000 rows x 4 columns and I need to compare each column to itself in order to find the closest result, i.e. the minimum Levenshtein distance. The idea is to get an "almost duplicate" for every row. I have calculated with "adist"…
ecp
  • 319
  • 1
  • 6
  • 18
6
votes
1 answer

How to limit fuzzy join only returning one match

I am trying to create a program in R to replace city names or airport names with the three digit airport code. I want to do fuzzy matching to allow more flexibility since the data with the city/airport names I am trying to replace is coming in from…
sarahbarnes
  • 103
  • 2
  • 7
6
votes
4 answers

How to spot and analyse similar patterns like Excel does?

You know the functionality in Excel where you type 3 rows with a certain pattern and drag the column all the way down, and Excel tries to continue the pattern for you. For example, type test-1, test-2, test-3 and Excel will continue it…
dr. evil
  • 26,944
  • 33
  • 131
  • 201
5
votes
2 answers

Order-independent fuzzy matching of "Firstname Lastname"/"Lastname Firstname" in R?

I have two lists of names for the same set of students which have been collected separately. There are numerous typographical errors and I have been using fuzzy matching to link the two lists. I am 99+% there with agrep and similar, but am stuck on…
Jonathan Burley
  • 771
  • 1
  • 6
  • 8
5
votes
2 answers

Do stemming and fuzzy search work together in Apache Solr?

I am using the Porter filter factory for a field which has 3 to 4 words in it, e.g. "ABC BLOSSOM COMPANY". I expect to fetch the above document when I search for ABC BLOSSOMING COMPANY as well. When I query this: name:ABC AND name:BLOSSOMING AND…
Bhavana67
  • 116
  • 8
5
votes
0 answers

Potential Bug in Apache's Jaro Winkler implementation?

We have been using the Jaro Winkler fuzzy matching algorithm implementation from Apache Commons Text and whilst studying the code we found a potential flaw. It seems that this implementation is based on the very comprehensible Wikipedia article…
gil.fernandes
  • 12,978
  • 5
  • 63
  • 76
4
votes
1 answer

Returning fuzzy match percentage in Solr query result

I've implemented Solr/Lucene fuzzy matching for my system and it's working perfectly. I have a requirement to display the fuzzy match percentage after the query sends its response back. As an example, if my index data is "rushikupadhyay" and if my query is…
Rushik
  • 1,121
  • 1
  • 11
  • 34
4
votes
2 answers

Matching strings with abbreviations; fuzzy matching

I am having trouble matching character strings. Most of the difficulty centers on abbreviations. I have two character vectors. I am trying to match words in vector A (typos) to the closest match in vector B. vec.a <- c("ce", "amer", "principl") vec.b…
YouLocalRUser
  • 309
  • 1
  • 9
4
votes
1 answer

How can I use fuzzy search with synonyms?

Fuzziness stopped working after I added a synonym file to the index. It seems like it's not possible to use them at the same time. My query: "query": { "dis_max": { "queries": [{ "multi_match": { …
Rashida
  • 41
  • 5
4
votes
1 answer

How to check if two unstructured street address strings are the same?

I need to compare two unstructured addresses and be able to identify if they are the same (or similar enough). Scenario: the address is supplied by the end user in plain text. There is nothing to help the user write in a more identifiable manner (no…
Minduca
  • 1,121
  • 9
  • 19
4
votes
1 answer

Levenshtein Generalization for Graphs?

Is there a generalization of the Levenshtein distance for searching for structures in graphs?
234523458
  • 151
  • 1
  • 3
3
votes
2 answers

Search word in string and Split concatenated string based on another string

How can I do the below in PHP? I have two inputs, $bankdata and $databasedata. Problem: split a word in a string if it matches a word in the other string. The string which has more spaces will be treated as the base string; in the case below, $databasedata would be treated…
user10655999
3
votes
2 answers

Fuzzy Match in Conditional Karate API Testing Tool

We recently started using Karate as our integrated testing tool on the project we're currently developing, and I've recently faced an issue whose cause I'd like to understand. Let's go through this: One of the tests we do in all our APIs is…
3
votes
2 answers

Fuzzy matching in SQL

Given two tables with client information: one holds sales data, the other is an enrichment mapping. A client name field is present in both tables, as are the country of residence and the city of residence. The latter two are clean…
Peter Blazsik
  • 77
  • 3
  • 11