
I have a dataframe that lists objects and their qualities in columns. One of those columns holds each object's color. My goal is to write a function that creates a new column listing the fuzzy partial_ratio scores between the color of a single object (e.g., 'orange') and the colors of all the other objects (e.g., "navy blue", "white", "red-orange").

First, I have a function, findcolor(object), that looks up an object's color in the dataframe. Calling findcolor(pumpkin) looks at the row 'pumpkin' and the column 'Color' and returns the string "orange" (the value in that cell). I call this function inside another function, below, which compares the colors of two objects in the dataframe.
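
For reference, here is a minimal sketch of what findcolor could look like, assuming the dataframe is indexed by object name. The sample rows other than pumpkin, the object names attached to them, and the `frame` parameter are made up for illustration; the real dataframe has more columns.

```python
import pandas as pd

# Hypothetical sample data: only the 'Color' column is shown, and the
# object names other than "pumpkin" are invented for this example.
df = pd.DataFrame(
    {"Color": ["orange", "navy blue", "white", "red-orange"]},
    index=["pumpkin", "jeans", "snow", "sunset"],
)


def findcolor(obj, frame=df):
    """Return the value in the 'Color' column for the row labelled `obj`."""
    return frame.loc[obj, "Color"]


print(findcolor("pumpkin"))  # orange
```

If the object names live in an ordinary column rather than the index, the lookup would need a boolean filter on that column instead of `.loc` with a row label.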

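from fuzzywuzzy import fuzz  # assumed source of fuzz.partial_ratio

def getsingleFuzzyScore(object1, object2):
    # Look up both colors in the dataframe.
    b = findcolor(object1)
    x = findcolor(object2)
    # Either color containing the other counts as a perfect match.
    if b in x or x in b:
        return 100
    # Otherwise fall back to the partial fuzzy ratio.
    return fuzz.partial_ratio(b, x)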

So, if object1's color is contained in object2's color, or vice versa, the score is 100; otherwise, it is the partial fuzzy ratio of the two colors. I want this score, my version of a fuzzy score, as a new column in the dataframe.

Where I am struggling is passing in the 'Color' column from the dataframe. This question (Comparing a single string to an array of strings in C) describes what I am trying to do, but in Python: I would like to pass in the column, i.e., 'Color', and have the color of pumpkin compared to each of the other objects' colors in that column.

In conclusion, the output should be a column of numbers: the result of getsingleFuzzyScore for each object, compared against the color of the object I pass in.
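
One possible way to get that column is sketched below; it is not a definitive implementation. It assumes the dataframe is indexed by object name and that every cell in 'Color' is a string. The names `color_score`, `add_color_scores`, the sample rows other than pumpkin, and the difflib fallback are all made up for illustration. The fallback is only a rough stand-in so the sketch runs without the fuzzy-matching package installed, and its numbers will not match the library's exactly.

```python
import pandas as pd

try:
    from thefuzz import fuzz  # formerly published as fuzzywuzzy, same call
    partial_ratio = fuzz.partial_ratio
except ImportError:
    from difflib import SequenceMatcher

    def partial_ratio(a, b):
        # Rough stdlib stand-in, NOT identical to the library's scoring:
        # best SequenceMatcher ratio of the shorter string against every
        # equally long window of the longer one, scaled to 0-100.
        short, long_ = sorted((a, b), key=len)
        if not short:
            return 0
        windows = (long_[i:i + len(short)]
                   for i in range(len(long_) - len(short) + 1))
        return round(100 * max(SequenceMatcher(None, short, w).ratio()
                               for w in windows))


def color_score(target, other):
    """Return 100 if either color contains the other, else the partial ratio."""
    if target in other or other in target:
        return 100
    return partial_ratio(target, other)


def add_color_scores(df, obj, column="Color", new_column="FuzzScores"):
    """Return a copy of df with obj's color scored against every row's color."""
    target = df.loc[obj, column]  # look the reference color up once
    out = df.copy()
    out[new_column] = out[column].apply(lambda c: color_score(target, c))
    return out


# Hypothetical sample data; object names other than "pumpkin" are invented.
df = pd.DataFrame(
    {"Color": ["orange", "navy blue", "white", "red-orange"]},
    index=["pumpkin", "jeans", "snow", "sunset"],
)
scored = add_color_scores(df, "pumpkin")
print(scored)
```

The key point is that `apply` is called on the existing 'Color' column and receives a function that takes one color at a time; the reference color is looked up once beforehand rather than on every row. The row for pumpkin itself scores 100, so it may need to be dropped afterwards if only the other objects are wanted.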

  • Please provide an example and what you have tried so far. https://stackoverflow.com/help/mcve – kosnik Jul 03 '18 at 16:05
  • def listfuzzscoreColor(object):
        a = findcolor(object)
        colorarray = df['Color']
        strcolorarray = str(colorarray)
        return strcolorarray
        for x in strcolorarray:
            if a in x:
                return 100
            elif x in a:
                return 100
            else:
                return 11#(fuzz.partial_ratio(b,x))
    And then I put this in to create a column in the df.
    #df['FuzzScores'].apply(listfuzzscoreColor(pumpkin))
    – Ella B. Jul 04 '18 at 20:33

0 Answers