Suppose I have a list of Person objects:
class Person:
    def __init__(self, name, items):
        self.name = name
        self.items = items
I want to remove duplicates in the following way. If two persons have names that are similar enough, as evaluated by this function:
def have_similar_names(person1, person2):
    ...
(suppose this function is already implemented and uses the edit distance from the fuzzywuzzy package; for example, it would return True for persons named "Tomas" and "Tomàs", and False for "Catherine" and "Cathleen"), then combine the two persons using:
def combine_persons(person1, person2):
    return Person(max([person1.name, person2.name], key=len), person1.items + person2.items)
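To make the examples reproducible without the real implementation, here is a minimal stand-in sketch of have_similar_names. It is an assumption, not the actual function: it swaps fuzzywuzzy for the standard library's difflib.SequenceMatcher, and the helper name_similarity and the 0.75 threshold are hypothetical, picked only so that the two example pairs come out as described.

```python
from difflib import SequenceMatcher

# Hypothetical cutoff, chosen only to reproduce the two example pairs.
SIMILARITY_THRESHOLD = 0.75


def name_similarity(name1, name2):
    """Return a similarity score in [0, 1]; 1.0 means identical names."""
    # Stand-in for fuzzywuzzy: difflib compares the lower-cased names.
    return SequenceMatcher(None, name1.lower(), name2.lower()).ratio()


def have_similar_names(person1, person2, threshold=SIMILARITY_THRESHOLD):
    """Stand-in: True when the two persons' names score at or above threshold."""
    return name_similarity(person1.name, person2.name) >= threshold
```

With this cutoff, "Tomas" and "Tomàs" count as similar, while "Catherine" and "Cathleen" do not. The real threshold and scoring would depend on the fuzzywuzzy-based implementation.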
My question is how to write a function that takes a list containing fuzzy duplicates as input and returns a list of combined persons.
I could do it with loops, but is there a more efficient and more Pythonic way to achieve this?
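For reference, the loop-based version could look roughly like the sketch below. It is only a baseline under stated assumptions: merge_fuzzy_duplicates is a hypothetical name, have_similar_names is replaced by a difflib-based stand-in with an arbitrary 0.75 threshold, and each person is greedily folded into the first already-merged entry with a similar name. Since fuzzy similarity is not transitive, the result can depend on input order, and the number of comparisons is quadratic in the worst case.

```python
from difflib import SequenceMatcher


class Person:
    def __init__(self, name, items):
        self.name = name
        self.items = items


def combine_persons(person1, person2):
    return Person(max([person1.name, person2.name], key=len),
                  person1.items + person2.items)


def have_similar_names(person1, person2):
    # Stand-in only: the real function is assumed to use fuzzywuzzy instead.
    return SequenceMatcher(None, person1.name, person2.name).ratio() >= 0.75


def merge_fuzzy_duplicates(persons, similar=have_similar_names,
                           combine=combine_persons):
    """Greedy single pass: fold each person into the first similar merged entry."""
    merged = []
    for person in persons:
        for index, existing in enumerate(merged):
            if similar(existing, person):
                merged[index] = combine(existing, person)
                break
        else:
            # No merged entry was similar enough, so this person starts a new group.
            merged.append(person)
    return merged


people = [
    Person("Tomas", ["book"]),
    Person("Catherine", ["pen"]),
    Person("Tomàs", ["lamp"]),
    Person("Cathleen", ["cup"]),
]
result = merge_fuzzy_duplicates(people)
print([(p.name, p.items) for p in result])
```

With this sample input, the two Tomas entries are merged into one person holding both item lists, while Catherine and Cathleen stay separate.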