Questions tagged [fuzzywuzzy]

FuzzyWuzzy is a Python package to perform fuzzy string matching.

522 questions
0
votes
1 answer

Program fails on AWS EMR with hadoop (OK on local machine)

I am trying to use Python's fuzzywuzzy package in a mapper program to compute edit distance. My program runs fine on my local machine but fails on an AWS EMR cluster. I tried the two approaches below (on both the local machine and the AWS EMR cluster): 1.…
Chandra
  • 526
  • 2
  • 9
  • 26
0
votes
1 answer

Error in installing fuzzywuzzy in IPython Notebook

I really don't know how to install a library with code. I've tried to install fuzzywuzzy in IPython Notebook with the module pip, but I get an error message: In [45]: import pip $ pip install fuzzywuzzy==0.3.1 File…
CreamStat
  • 2,155
  • 6
  • 27
  • 43
0
votes
2 answers

python highest fuzzy ratio to print line from list

I have a list consisting of some lines. I want to print the line matching the word 'good' with the highest fuzzy ratio. Problem: It's only printing the word instead of the line in the list. Code: from fuzzywuzzy import fuzz c = ['I am python', 'python is good',…
asey raam
  • 23
  • 1
  • 3
0
votes
1 answer

Python missing module v 2.7.3 and Windows 7: Installed fuzzywuzzy, imports in powershell, not in IDLE

I'm betting there's a simple solution to this problem that I don't know, and from googling and stackoverflowing around it seems to have something to do with setting a path. I have anaconda installed on my computer and it seems to use python 2.7.4. …
Knut Knutson
  • 163
  • 2
  • 8
-1
votes
1 answer

Fuzzy calculation?

I would like to know how to enable the fuzzy evaluation/calculation. I found that scikit-fuzzy might be useful. But I can't find the consistent fuzzy matrix function. I assume that there will be some data platform or python code that can implement…
James
  • 165
  • 5
-1
votes
1 answer

Getting indexes from the results of a Fuzzy Matching stored in a Dictionary using Process.Extract

Using the below code, I was able to get the fuzzy-ly matched results from a dictionary, find_desc_dict, and store it inside another dictionary called complete_dict. for i, a in enumerate(recognized_keywords_search_desc): complete_dict[i+1] =…
f4tihkurt
  • 1
  • 2
-1
votes
1 answer

Fuzzy Wuzzy to match names

I am trying to match names against a list of names: text_to_match = "sa" print(process.extract(text_to_match, ['sachin','saurabh','Amol'],scorer=fuzz.WRatio)) The results I got are as below: [('sachin', 90), ('saurabh', 90), ('Amol', 33)] However I was…
Gupta
  • 314
  • 4
  • 17
-1
votes
3 answers

fuzzywuzzy with dictionary python

The code below works fine for an array: g = ['hello how are you', 'how are you guys','what is your name'] s = ['how','guys'] MIN_MATCH_SCORE = 38 guessed_word = [word for word in g if fuzz.token_set_ratio(s, word) …
Titan
  • 244
  • 1
  • 4
  • 18
-1
votes
1 answer

Removing partial duplicates within the same column, while retaining the longer text?

I'm new to Python and I was looking to remove partially similar entries within the same column. For example, these are the entries in one of the columns in a dataframe: Row 1 - "I have your Body Wash and I wonder if it contains animal ingredients.…
Shrumo
  • 47
  • 7
-1
votes
1 answer

standardize company names in pandas dynamically

I have a dataframe with company names. df: company_name abc Inc abc Inc Bolingbrook enterprise badh Shah enterprise Financial enterprise Financial Shah bass Dance bass School of Dance david Warner david Warner Real Estate…
user15330211
-1
votes
1 answer

How to eliminate the duplicate string from the list based on similarity score calculated with fuzzywuzzy ratio?

Let's say there are 4 lists: 1) [12b, shanti vihar, 12b shanti bihar, 201 Anupam residency, 401 enclaves] 2) [12b, shanti vihar, 12b shanti bihar, 12b shanti bihar, 401 enclaves] 3) [12b, shanti vihar, 12b shanti vihar, 12b shanti bihar, 12b shanti…
peeps
  • 43
  • 7
-1
votes
2 answers

How to combine data columns with similar column names Pandas

I have data with many similar column names (basically misspelled words), for example: apple grapes apples bana apyles grayes graph banana Here, I want to combine the columns 'apple, apples, apyles', then 'grapes, grayes,…
Math Avengers
  • 762
  • 4
  • 15
-1
votes
1 answer

Fuzzy matching with pyspark or python

I'm trying to do fuzzy matching using pyspark or python, where I have 2 lists. i. cities standard values list Clarksburg Fremont San Leandro Albuquerque Columbus San Jose Martinez New York Alhambra Unknown Las Vegas Dublin Niagara Falls ii.…
-1
votes
2 answers

How do I get rid of attribute Error when running fuzzywuzzy?

I'm trying to compare 2 lists and get a distance ratio for each item on the list. My code below returned an attribute error: 'Series' object has no attribute 'fuzz'. How do I fix this? 'differences' is a result from my earlier code for a list of…
-1
votes
1 answer

How to compare strings of two pandas column using fuzzywuzzy module

I have a dataframe with multiple columns, and I want to compare two columns to each other. I tried to use the fuzzywuzzy module, then create a function and then apply it to the columns: import pandas as pd import itertools import re import pymorphy2 import…
Naglyj.Spamer
  • 71
  • 2
  • 9