Questions tagged [fuzzywuzzy]

FuzzyWuzzy is a Python package to perform fuzzy string matching.

522 questions
1
vote
1 answer

fuzzywuzzy process.extract() not returning a list

I'm relatively new to programming and while doing my university assignment, I've been running into problems with the process.extract() function from the fuzzywuzzy package. The documentation says the function should return a list, however, my code…
Adreto
  • 11
  • 5
1
vote
0 answers

Fuzzy String Matching two lists of characters in R

Hello, I need some advice on string matching. I have a large dataset of investors and the various deals they have been involved with. Some of the investors and their deal data are not relevant to my investigation, and I have a master list of approved…
R Wd
  • 59
  • 5
1
vote
1 answer

delete row from dataframe, df.drop not removing rows

I am making a table in which the indices of similar products are recorded, then data from those rows is pasted into a single row that agglomerates all the data. After this, the row is deleted. The code is below: matchedproducts_df =…
Judoo123
  • 49
  • 6
1
vote
0 answers

Fuzzy Matching Names Using State (Location) as a Conditional

I am trying to fuzzy match company names from two CSV files (each has the company name in one column and the state where the company is located in another), but I want the matching to be conditional on state (e.g., if a company from list A…
Alex
  • 11
  • 2
1
vote
1 answer

Nested loops for comparing and grouping strings using fuzzywuzzy python

I am struggling to write faster code to group similar product names (column "prep") within the same "person_id" and the same "TNVED". A sample of my dataframe looks like this: sample_of_dataframe So I built a dictionary on IIN_BINs and the keys to this…
1
vote
1 answer

Is it possible with Python to reconstruct a jumbled sentence to match a full sentence?

I have a CSV of sentences and another CSV where the same sentences are broken and jumbled up. For example, one CSV has: The quick brown fox jumps over the lazy dog. And the other CSV has: jumps over the The quick brown fox lazy dog. Each CSV has…
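One way to approach the kind of matching described in this excerpt, assuming the jumbled sentences contain exactly the same words as the originals and were only reordered: compare sentences by their multiset of tokens. The helper name `token_key` and the sample lists are hypothetical, and this sketch uses only the standard library. If words were also altered, a fuzzy scorer would be needed instead of an exact lookup.

```python
from collections import Counter

def token_key(sentence):
    # Hypothetical helper: an order-independent, hashable summary of the
    # whitespace-separated tokens in a sentence (token -> count pairs).
    return frozenset(Counter(sentence.split()).items())

# Illustrative data standing in for the two CSV columns.
full_sentences = [
    "The quick brown fox jumps over the lazy dog.",
    "Pack my box with five dozen liquor jugs.",
]
jumbled_sentences = [
    "jumps over the The quick brown fox lazy dog.",
]

# Index the full sentences by token multiset, then look up each jumbled one.
lookup = {token_key(s): s for s in full_sentences}
matches = {j: lookup.get(token_key(j)) for j in jumbled_sentences}
```

Sentences with no exact token match map to `None` here, which makes it easy to hand only those leftovers to a slower fuzzy comparison.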
1
vote
2 answers

Calculate highest score in fuzzy string matching

I am looking to find the highest accuracy percentage between two column values using fuzzy string matching. I have two dataframes, and I am trying to fuzzy match the values of a specific column in each. Let's say df1 has 5 rows…
NKJ
  • 457
  • 1
  • 4
  • 11
1
vote
0 answers

Why is token_sort_ratio() not working?

I have two strings as follows: key_up = "DATE OF DISCHARGE" key_low = "date of discharge" t1 = "blah blah blah blah DATE OF DISCHARGE more blah blah" Now I am calculating the fuzz ratio of t1 with key_up and key_low as: s9 =…
Mohammad Amir
  • 133
  • 1
  • 6
1
vote
0 answers

What's the best approach to finding a specific city name in a given address list?

The dataset I have consists of manually filled addresses. The dataset is large and has a LOT of variations. The address column contains the full address, from apartment number and street name to neighborhood name and city. Since it's manually…
1
vote
3 answers

Fuzzy matching and iteration through DataFrame

I have these two DataFrames: I want to fuzzy match the Surname strings to the corresponding Names dico = {'Name': ['Arthur','Henri','Lisiane','Patrice'], "Age": ["20","18","62","73"], "Studies":…
Arthur Langlois
  • 137
  • 1
  • 9
1
vote
1 answer

match_most_similar in Python string_grouper returning original strings

I have a list of strings that are messy, and I want to find, for each one, its best match from a list of cleanly-formatted strings, which also contains metadata about each. The strings in the messy list are repeated randomly through the list…
Tom Clark
  • 11
  • 3
1
vote
1 answer

fuzzy duplicated with pandas

I have one DataFrame containing two columns of string data. I need to compare the columns 'NameTest' and 'Name': each name in 'NameTest' should be compared to all names in 'Name', and if they match by more than 80%, the closest matching name should be printed. *My…
1
vote
1 answer

How to match using Fuzz with array input

I want to get the matching values from the two arrays I input. I use Fuzzy for that, but I still can't get the value, maybe because the input is an array. Please help me :) Thank you, and I'm sorry if my question is not clear. a…
1
vote
0 answers

fuzzy wuzzy token sort vs difflib Sequence matcher

I am trying to figure out the difference between the two. I get the same results (similarity scores) using the two for the same strings. Can somebody please explain the difference between the two using the formula for each of them? Any idea if one…
Samit Saxena
  • 99
  • 1
  • 9
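A rough illustration of the comparison asked about here, assuming fuzzywuzzy's token sort scorer works as commonly described: lowercase each string, sort its tokens, then score the rejoined strings with a sequence-matcher ratio. The function names below are hypothetical and only the standard library's difflib is used, so this is a sketch of the idea rather than fuzzywuzzy's actual code (which also strips punctuation and may use a different matcher backend).

```python
from difflib import SequenceMatcher

def plain_ratio(a, b):
    # Similarity of the raw strings on a 0-100 scale, order-sensitive.
    return round(100 * SequenceMatcher(None, a, b).ratio())

def token_sort_sketch(a, b):
    # Assumed preprocessing: lowercase, split on whitespace, sort the
    # tokens, and rejoin, so that word order no longer affects the score.
    def normalize(s):
        return " ".join(sorted(s.lower().split()))
    return plain_ratio(normalize(a), normalize(b))

s1 = "fuzzy wuzzy was a bear"
s2 = "wuzzy fuzzy was a bear"

raw_score = plain_ratio(s1, s2)           # penalized by the swapped words
sorted_score = token_sort_sketch(s1, s2)  # identical after sorting tokens
```

Under these assumptions the two approaches agree whenever the inputs already share the same word order and case, which may be why identical scores show up for some test strings; they diverge once the words are reordered.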
1
vote
2 answers

Using process.extract in fuzzywuzzy to get all of the most similar choices

I have the following input: query = 'Total replenishment lead time (in workdays)' choices = ['PLANNING_TIME_FENCE_CODE', 'BUILD_IN_WIP_FLAG', 'Lead_time_planning', 'Total replenishment lead time 1', 'Total replenishment lead time …