Questions tagged [fuzzywuzzy]

FuzzyWuzzy is a Python package to perform fuzzy string matching.

522 questions
0
votes
3 answers

Applying function to every cell in a Dataframe based on index and col

I have a pandas DataFrame in exactly the format shown in this question, and I'm trying to achieve the same result. In my case, I am calculating the fuzz ratio between each row's index and its corresponding column. If I try this code (based on the…
mcansado
  • 2,026
  • 4
  • 25
  • 39
0
votes
1 answer

Unable to detect gibberish names using Python

I am trying to build a Python model that can classify account names as either legitimate or gibberish. Capitalization is not important in this particular case, as some legitimate account names could be composed of all upper-case or all lower-case…
Stanleyrr
  • 858
  • 3
  • 12
  • 31
0
votes
1 answer

TypeError: 'NoneType' object is not subscriptable when creating a pd.Dataframe dictionary

The error I am receiving is: TypeError: 'NoneType' object is not subscriptable. In this method I am trying to do string matching across two files (test and master). The master file contains correctly spelled product names, while the test file contains…
Tim
  • 161
  • 7
  • 24
0
votes
3 answers

fuzzywuzzy to normalize string in pandas column

I have a dataframe like this. Now I want to normalize the strings in the 'comments' column for the word 'election'. I tried using fuzzywuzzy but wasn't able to apply it to a pandas dataframe to partially match the word 'election'. The output…
nOObda
  • 123
  • 1
  • 2
  • 9
0
votes
0 answers

Python - FuzzyWuzzy - Loop Issue

I am using FuzzyWuzzy for some word matching in Python. When I try to run a loop, I simply get no output. Am I doing something wrong? I've recreated my problem with the code below: from fuzzywuzzy import fuzz, process choices =…
user5847481
  • 55
  • 1
  • 6
0
votes
1 answer

Multiple Spelling Results in a Dataframe 1

I have some data containing spelling errors. I'm correcting them and scoring how close the spelling is using the following code: import pandas as pd import difflib Li_A = ["potato", "tomato", "squash", "apple", "pear"] Q = {'one' :…
R. Cox
  • 819
  • 8
  • 25
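
The approach described in this excerpt (correct a misspelled word against a reference list and score how close it was) can be sketched with the standard library alone. This is a minimal illustration, not the asker's code: the helper name `correct_spelling`, the cutoff of 0.6, and the misspelled inputs are assumptions. Only the reference list comes from the excerpt.

```python
import difflib

# Reference spellings taken from the question excerpt.
known_words = ["potato", "tomato", "squash", "apple", "pear"]


def correct_spelling(word, choices, cutoff=0.6):
    """Return (best_match, score), or (None, 0.0) if nothing clears the cutoff.

    The score is difflib's similarity ratio, a float between 0.0 and 1.0.
    """
    matches = difflib.get_close_matches(word, choices, n=1, cutoff=cutoff)
    if not matches:
        return None, 0.0
    best = matches[0]
    score = difflib.SequenceMatcher(None, word, best).ratio()
    return best, score


# Hypothetical misspellings, used only for illustration.
for misspelled in ["potatoe", "tomatto", "aple"]:
    print(misspelled, correct_spelling(misspelled, known_words))
```

Raising `n` in `get_close_matches` would return several candidate spellings per word instead of one, which may be closer to what the title asks for.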
0
votes
1 answer

String matching in Python referring to same entity

I'm working on an entity matching problem where I have to check whether records refer to the same business entity. Look at the two records below, separated by pipes. The words on both sides of the pipes refer to the same entity. 1st record…
min2bro
  • 4,509
  • 5
  • 29
  • 55
0
votes
1 answer

Fuzzy Match List with Column in a data frame

I have a list of strings that I am trying to match to values in a column. If it is a low match (below 95), I want to return the current column value; if it is above 95, I want to return the best fuzzy match from the list. I am trying to put all…
EEPBAH
  • 113
  • 12
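
The thresholding rule in this excerpt (keep the original value unless the best match scores high enough) might look like the sketch below. It uses `difflib` from the standard library as a stand-in for a fuzzywuzzy scorer, so the scores will not be identical to `fuzz.ratio`. The names `similarity` and `match_or_keep` and the sample data are hypothetical, and treating a score of exactly 95 as a match is an assumption, since the excerpt does not say.

```python
import difflib


def similarity(a, b):
    """Similarity on a 0-100 scale (a stand-in for a fuzzywuzzy scorer)."""
    return round(100 * difflib.SequenceMatcher(None, a, b).ratio())


def match_or_keep(value, choices, threshold=95):
    """Return the best-scoring choice if it reaches the threshold, else value."""
    if not choices:
        return value
    best = max(choices, key=lambda c: similarity(value, c))
    # Assumption: a score equal to the threshold counts as a match.
    return best if similarity(value, best) >= threshold else value


# Hypothetical data for illustration only.
choices = ["International Business Machines", "Microsoft Corporation"]
column = ["International Business Machine", "Microsft Corp", "Apple"]

result = [match_or_keep(v, choices) for v in column]
print(result)
```

With pandas, the same function could be applied to a column via `Series.apply`, passing `choices` as an extra argument.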
0
votes
1 answer

Matching similar strings with common significant words

I want to match similar strings that share the same significant word. Problem: I have two files, one master and one input file. I have to iterate through the input file and find similar records in the master. Currently I have indexed the master file in…
The6thSense
  • 8,103
  • 8
  • 31
  • 65
0
votes
1 answer

Python FuzzyWuzzy Score on Row in Pandas Dataframe

I want to iterate through a Pandas dataframe and get the fuzz.ratio score only for each row pair (not for all combinations). My dataframe looks like this: Acct_Owner, Address, Address2 0, Name1, NaN, 33 Liberty Street 1, Name2, 330 N Wabash Ave…
mwhee
  • 652
  • 2
  • 6
  • 17
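
Scoring each row's own pair of values, rather than every combination, amounts to calling the scorer once per row and skipping rows where either value is missing. The sketch below uses plain tuples and `difflib` in place of a DataFrame and `fuzz.ratio`, so it runs without third-party packages. The helper names and the second row's `Address2` value are hypothetical, since the excerpt is truncated there.

```python
import difflib
import math


def is_missing(value):
    """True for None or a float NaN (how pandas marks missing cells)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def pair_score(a, b):
    """0-100 similarity for one row's pair, or None if either side is missing."""
    if is_missing(a) or is_missing(b):
        return None
    return round(100 * difflib.SequenceMatcher(None, str(a), str(b)).ratio())


# Rows as (Acct_Owner, Address, Address2); the second Address2 is made up.
rows = [
    ("Name1", float("nan"), "33 Liberty Street"),
    ("Name2", "330 N Wabash Ave", "330 North Wabash Avenue"),
]

# One score per row: compare Address with Address2 of the same row only.
scores = [pair_score(address, address2) for _, address, address2 in rows]
print(scores)
```

In pandas, the equivalent would be passing a function like `pair_score` to `DataFrame.apply` with `axis=1`.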
0
votes
0 answers

Pandas DataFrame fuzzy/closest match merge

I have a pandas DataFrame 1 (snippet below). I have generated another DataFrame 2 with some similar headings. I need to match up and merge the rows in both DataFrames on the columns "Date" (datetime object), "Latitude" and "Longitude" (integers). I…
deanpwr
  • 191
  • 9
0
votes
1 answer

How to run PySpark with 3rd party Jars e.g. fuzzywuzzy?

I tried the --jars option, --driver-class-jars, etc., but still got a 'no module fuzzywuzzy' found error.
user3610141
  • 305
  • 1
  • 4
  • 14
0
votes
0 answers

Cleaning semi redundant text

First, a little background (link to attachments): I have a lot of text generated by some voice-to-text application (I honestly do not know the name of the application, since I do not have physical access; however, I have access to the live output). I am…
TobyFP
  • 1
  • 3
0
votes
1 answer

Appropriate matching of the string as per scoring using fuzzywuzzy and python3.6

I am trying to match strings using the fuzzy string matching library fuzzywuzzy in my Python application. I found that fuzzywuzzy does not give appropriate results: even when the scores are equal, it is listing the wrong result in the first…
Jaffer Wilson
  • 7,029
  • 10
  • 62
  • 139
0
votes
2 answers

Django. How to return QuerySet order_by result of a fuzzy wuzzy method?

Here is my model: class Item(models.Model): status = models.IntegerField(choices=STATUS_CHOICES, default=3) def __str__(self): return 'Item: {0}'.format(self.id) class Name(models.Model): name = models.CharField(,…
gerpaick
  • 801
  • 2
  • 13
  • 36