Questions tagged [fuzzy]

DO NOT USE - ambiguous: see fuzzy-search, fuzzy-logic, or image-processing for more appropriate tags.

363 questions
2
votes
2 answers

Matching IDs to a varied set of names

I have a dataset containing a list of company names, and a respective ID for them. There are multiple instances of each company, with some appearing differently. There is at least one instance of each company name that has an ID, but not all of them…
mkh
  • 85
  • 3
2
votes
0 answers

Are there fuzzy logic type extensions for Drools 7.x?

Drools-chances was an initial implementation for fuzzy logic. Discussion in 2013 on http://blog.athico.com/2009/05/take-your-chance.html indicated that there might be progress for the brave. Since I plan on implementing a fuzzy rule system it…
datadriven
  • 21
  • 2
2
votes
3 answers

How can I use ~ to fuzzy match two fields of a table?

I'm trying to perform a join over two tables that contain info about the same companies, but sometimes the companies are stored with slightly different names (e.g. table 1: Company X -> Table 2: Company X and Friends). My idea was to full join each…
2
votes
1 answer

Jess and FuzzyJ assistance

I'm trying to learn Jess and FuzzyJ but am having problems getting a simple program to run. I have looked at it for hours and am not quite sure why it doesn't run. If someone could point me in the right direction it would be very much…
Roy McAvoy
  • 21
  • 2
2
votes
0 answers

Fuzzy string matching with group_by

I have to identify payments that are not always transferred with the same combination of NAME - IBAN. Let's say I have a table called "payments" that looks like this: IBAN NAME ABCD James Dito ABCD James D. ABCD J Dito ABCD …
AllanLC
  • 167
  • 2
  • 11
2
votes
1 answer

fuzzy string matching with agrep()

I'm matching a list of company names against itself with R and agrep() because the data was stored incorrectly in a legacy system: no 4th normal form, companies were recorded on the same level as customers, which means a new company entry for every new…
Salfii
  • 87
  • 1
  • 1
  • 9
2
votes
1 answer

Using FuzzyFinder in vim (+MiniBuffer), open file in current buffer

I'm using FuzzyFinder in vim together with MiniBufExplorer (with this setting in my .vimrc: g:miniBufExplorerMoreThanOne = 1). I'm using FuzzyFinder in coverage-file mode (where it works pretty much like command-t, from what I understand). The…
Edan Maor
  • 9,772
  • 17
  • 62
  • 92
2
votes
3 answers

Damerau–Levenshtein distance for language specific quirks

To Dutch speaking people the two characters "ij" are considered to be a single letter that is easily exchanged with "y". For a project I'm working on I would like to have a variant of the Damerau–Levenshtein distance that calculates the distance…
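
One possible starting point for the question above, offered as a rough sketch rather than an answer from the thread: fold the digraph "ij" into "y" before running a restricted Damerau–Levenshtein (optimal string alignment) distance. The function names `osa_distance` and `dutch_distance` are made up for illustration, and folding every "ij" makes the exchange cost zero, which is an assumption; a weighted substitution cost would need a modified recurrence instead.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def dutch_distance(a: str, b: str) -> int:
    """Hypothetical helper: treat "ij" and "y" as the same letter (assumption)."""
    def fold(s: str) -> str:
        return s.lower().replace("ij", "y")
    return osa_distance(fold(a), fold(b))


print(osa_distance("ijs", "ys"))    # plain distance counts the digraph as two edits
print(dutch_distance("ijs", "ys"))  # folded distance treats them as equal
```

Note that the fold is purely textual, so it also rewrites "ij" in loanwords where the two letters are not a digraph.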
2
votes
3 answers

Fuzzy sort implementation in PHP

I am dealing with a specific request from my users that I am unable to "break". Situation: we work with historic data that sometimes has unknown date values. We know, for example, that something happened in the year 1943, but we do not know when…
Radek
  • 519
  • 5
  • 16
2
votes
1 answer

Fuzzy mapping in R

I am trying to use the agrep command for fuzzy matching. I have a data frame in which one column contains the audience response, and another data frame in which segment and subsegment are listed. The column audience response contains the words that are…
Shaz
  • 25
  • 4
2
votes
1 answer

Multiple columns similarity comparison with trigram similarity operator %

I need to perform fuzzy match filtering (in a WHERE clause) in PostgreSQL using the trigram similarity operator %. For comparing a field pair it is simply table1.field1 % table2.field2, and GIN or GiST indexes can be used to dramatically increase…
zlatko
  • 596
  • 1
  • 6
  • 23
2
votes
2 answers

Fuzzy matching two strings using R

I have two vectors, each of which includes a series of strings. For example, V1 = c("pen", "document folder", "warn") and V2 = c("pens", "copy folder", "warning"). I need to find which pairs match best. I directly use the Levenshtein distance. But it…
Feng Chen
  • 2,139
  • 4
  • 33
  • 62
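
For readers unfamiliar with the approach the asker mentions, here is a minimal sketch of pairing each string in one list with its nearest neighbour in another by Levenshtein distance. It is written in Python rather than R, the helper names `levenshtein` and `best_matches` are hypothetical, and it does not address whatever problem the truncated excerpt goes on to describe.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def best_matches(left, right):
    """Map each string in `left` to its closest string in `right`."""
    return {s: min(right, key=lambda t: levenshtein(s, t)) for s in left}


v1 = ["pen", "document folder", "warn"]
v2 = ["pens", "copy folder", "warning"]
for source, match in best_matches(v1, v2).items():
    print(source, "->", match)
```

Raw edit distance favours short strings, so a length-normalised score may be a better fit when the candidates vary widely in length.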
2
votes
0 answers

FuzzyWuzzy using two pandas dataframes in Python

I want to find the fuzz.ratio of strings that are in two dataframes. Let's say I have two dataframes: df with columns A, B and bt_df with columns A1, B1. I want to compare the columns df['B'] and bt_df['B1'] and return the best matching score and its…
User1090
  • 859
  • 6
  • 13
  • 19
2
votes
1 answer

Teradata SQL to Extract Records Based on Approximate String Matching

We are on version TD 14 and I come from a Netezza / Postgres (Redshift) background. I have been asked to extract login data from audit logs to find records/transactions where the same IP is submitting similar-looking usernames with small changes.…
Samir Parmar
  • 21
  • 1
  • 3
2
votes
2 answers

Golang Sort package - Fuzzy sorting error

I tried to modify the standard sorting approach and add some randomness to the Less method of the sort interface. With if (u[i] - u[j]) <= 0 or if u[i] < u[j] it works as expected, but the condition if (u[i] - u[j]) <= rv produces a panic after several…
Y01rY5Ogfl
  • 41
  • 1
  • 1
  • 8