Questions tagged [fuzzy]

DO NOT USE - ambiguous: see fuzzy-search, fuzzy-logic, or image-processing for more appropriate tags.

363 questions
3
votes
3 answers

Damerau–Levenshtein distance algorithm, disable counting of delete

How can I disable the counting of deletions in this implementation of the Damerau-Levenshtein distance algorithm? If another algorithm that already does this exists, please point me to it. Example (deletion counting disabled): string1: how are you? string2:…
croisharp
  • 1,926
  • 5
  • 25
  • 40
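For the deletion question above, one possible approach is to make each edit cost a parameter and set the deletion cost to zero. The sketch below is not the implementation the asker refers to (which is not shown on this page). It uses the restricted "optimal string alignment" variant of Damerau-Levenshtein, and the function name `osa_distance`, its parameters, and the second example string are assumptions made for illustration.

```python
def osa_distance(source, target, delete_cost=1, insert_cost=1,
                 substitute_cost=1, transpose_cost=1):
    """Optimal string alignment distance with configurable edit costs.

    Hypothetical helper: a deletion removes a character from `source`,
    an insertion adds a character taken from `target`.
    """
    rows, cols = len(source) + 1, len(target) + 1
    dist = [[0] * cols for _ in range(rows)]

    # Turning a prefix of source into the empty string needs only deletions.
    for i in range(1, rows):
        dist[i][0] = dist[i - 1][0] + delete_cost
    # Turning the empty string into a prefix of target needs only insertions.
    for j in range(1, cols):
        dist[0][j] = dist[0][j - 1] + insert_cost

    for i in range(1, rows):
        for j in range(1, cols):
            same = source[i - 1] == target[j - 1]
            best = min(
                dist[i - 1][j] + delete_cost,
                dist[i][j - 1] + insert_cost,
                dist[i - 1][j - 1] + (0 if same else substitute_cost),
            )
            # Swap of two adjacent characters.
            if (i > 1 and j > 1
                    and source[i - 1] == target[j - 2]
                    and source[i - 2] == target[j - 1]):
                best = min(best, dist[i - 2][j - 2] + transpose_cost)
            dist[i][j] = best
    return dist[-1][-1]


# "how you?" is a made-up second string, not the one from the question.
print(osa_distance("how are you?", "how you?"))                 # 4
print(osa_distance("how are you?", "how you?", delete_cost=0))  # 0
```

With `delete_cost=0` the distance becomes asymmetric: dropping characters from the first string is free, while adding characters to it still costs one per insertion.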
3
votes
7 answers

Identifying if 2 HTML pages are similar

I'm trying to identify differences between a base case and a supplied case. I'm looking for a library that reports similarity as a percentage or something like that. For example: I have 10 different HTML pages. * All of them are 404 responses with only one 2…
Dev Dona
2
votes
1 answer

How can I detect if a sentence is contained in a page (fuzzy)?

I've been searching for a while now, but have found nothing that suits my needs so far. (This was helpful, but not convincing.) From two different sources, I get two different strings. I want to check whether the shorter one is contained within the longer one.…
Dan Soap
  • 10,114
  • 1
  • 40
  • 49
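One simple way to approach the containment question above is to slide a window the length of the shorter string across the longer one and keep the most similar window. This is only a sketch using the standard library's `difflib.SequenceMatcher`. The function name `best_fuzzy_window` and the sample strings are assumptions, and comparing every window is slow for large pages.

```python
from difflib import SequenceMatcher


def best_fuzzy_window(needle, haystack):
    """Return (start_index, ratio) of the haystack window most similar to needle.

    Hypothetical helper: windows have the same length as needle, and the
    first window with the highest ratio wins.
    """
    width = len(needle)
    if width == 0 or width > len(haystack):
        # Nothing to slide over, so compare the two strings directly.
        return 0, SequenceMatcher(None, needle, haystack).ratio()

    best_start, best_ratio = 0, -1.0
    for start in range(len(haystack) - width + 1):
        window = haystack[start:start + width]
        ratio = SequenceMatcher(None, needle, window).ratio()
        if ratio > best_ratio:
            best_start, best_ratio = start, ratio
    return best_start, best_ratio


# Made-up sample data: the needle has two letters swapped.
start, ratio = best_fuzzy_window("brwon", "the quick brown fox")
print(start, ratio)
```

A caller would then treat the shorter string as "contained" when the returned ratio exceeds a threshold chosen for the data at hand.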
2
votes
1 answer

Fuzzy K-modes clustering how to find the cluster centers

I'm trying to understand the fuzzy k-modes algorithm (look mainly at page 3) in order to implement it. I'm stuck at the calculation of the cluster centers, as shown in the picture. I need to know whether the following is true or false, and please…
2
votes
1 answer

Fuzzy c-means categorical data

Can fuzzy c-means be applied to non-numerical data sets, i.e. categorical or mixed numerical and categorical data? If yes (I hope so :( ), how do we calculate the cluster centers? If no, what is the alternative, and how can these data be fuzzy clustered? I…
AWRAM
  • 333
  • 2
  • 16
2
votes
2 answers

How to fuzzy match two character vectors in R

Context: I have a df, where id refers to a different person and fruits_eat refers to the fruit that person eats. I also have a vector fruits_list storing a list of fruits. Question: I want to generate a new variable fruits_in_list to indicate…
zhiwei li
  • 1,635
  • 8
  • 26
2
votes
2 answers

Fuzzy matching and grouping

I am trying to do fuzzy matching and grouping on multiple fields using Python. I want to compare each column using a different fuzzy threshold. I searched on Google but could not find any solution that can do deduplication and then…
sadashiv
  • 43
  • 6
2
votes
1 answer

Rank the row based on the similar text using python?

How can I rank a data frame based on row values? That is, I have a row that contains text data and want to assign a rank based on similarity. Expected output: I have tried the Levenshtein distance, but I am not sure how to do this for the whole…
Kum_R
  • 368
  • 2
  • 19
2
votes
0 answers

JavaScript function or library to fuzzy search a string in a larger string and get the fuzzy-matched string's index, size, and matching ratio

I have a string s1 = 'abcd' and s2 = 'this is demo abc'. Is there a library or a way to do the following fuzzy matching in JavaScript? I want the index of the fuzzy-matched string, its size, and the ratio. Example output: index - 13, fuzzy matched string size - 3, ratio -…
2
votes
1 answer

Relabeling categorical values in pandas data frame using fuzzy

I have a large data frame with 371 unique categorical entries. However, some of the entries are similar, and in some cases I want to merge certain categories that may have been separated. For example, I have 3 categories that I know…
2
votes
1 answer

Fuzzy regex match on million rows Pandas df

I am trying to check for fuzzy matches between a string column and a reference list. The string series contains over 1 million rows and the reference list contains over 10 thousand entries. For example: df['NAMES'] = pd.Series(['ALEXANDERS', 'NOVA XANDER', 'SALA…
StarMunch
  • 33
  • 4
2
votes
0 answers

Cannot install fuzzy python 3.8 on windows 10

I get this error when I try to install Fuzzy in Python 3.8 on Windows 10. I have installed Visual C++ 14.0 and the Windows 10 SDK. Any help, please? The command: pip install Fuzzy The error: ERROR: Command errored out with exit status 1: command:…
2
votes
1 answer

Python FuzzyWuzzy ratio: how does it work?

Inside the FuzzyWuzzy ratio description it says: The FuzzyWuzzy ratio raw score is a measure of the strings similarity as an int in the range [0, 100]. For two strings X and Y, the score is defined by int(round((2.0 * M / T) * 100)) where T is the…
s900n
  • 3,115
  • 5
  • 27
  • 35
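The formula quoted in the excerpt above has the same 2.0 * M / T form that Python's standard-library `difflib` documents for `SequenceMatcher.ratio()`, where T is the total number of elements in both sequences and M is the number of matches. Assuming the FuzzyWuzzy score follows that definition (an assumption, since the excerpt is truncated and the library may use a different backend), the calculation can be sketched as follows. `simple_ratio` is a hypothetical name, not a FuzzyWuzzy function.

```python
from difflib import SequenceMatcher


def simple_ratio(x, y):
    """Score two strings as an int in [0, 100] using 2.0 * M / T.

    M is the number of matched characters found by SequenceMatcher and
    T is the combined length of both strings.
    """
    matcher = SequenceMatcher(None, x, y)
    # The final matching block is a zero-size sentinel, so it adds nothing.
    matches = sum(block.size for block in matcher.get_matching_blocks())
    total = len(x) + len(y)
    if total == 0:
        return 100  # Two empty strings are treated as identical here.
    return int(round(2.0 * matches / total * 100))


# "bcd" is the only matching block: M = 3, T = 8, so 2 * 3 / 8 = 0.75.
print(simple_ratio("abcd", "bcde"))  # 75
```

Because M counts only characters inside matching blocks found in order, two strings with the same letters in a different order can still score well below 100.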
2
votes
2 answers

Remove duplicate approximate word matching using fuzzy python

I would like to ask how to remove duplicate approximate word matches using fuzzy matching in Python, or ANY METHOD that is feasible. I have an Excel file that contains approximately similar names. At this point, I would like to remove the name that contains high…
Edison Toh
  • 87
  • 1
  • 11
2
votes
1 answer

Fuzzy (text / string) matching with AI (for handling common abbreviations)

Hi, I have the following task: 1) I have a list A of 700,000 train/bus stations with names 2) I have a list B of 300,000 train/bus stations with names (spelled slightly differently, of course) 3) for, let's say, 150,000 elements of B I know the exact match…