Returning list of similar persons

Question

I have a dictionary that maps each person to a list of interests. From the declarations below, I want David and Charles to be returned as one list of similar persons, based on their common interest(s) (in this case data mining), and Ramesh and Suresh as a second list of similar persons (genetics is common to both). How can I accomplish this? (A result without a function is fine.)

dataset={
'David':['Artificial Intelligence','Machine learning', 'Neural networks', 'data mining'],
'Charles':['embedded computing','data mining','digital filters','signal processing','virtual reality','augmented reality'],
'Ramesh':['molecular biology','genetics','neuro surgery','oncology','ophthalmology'],
'Suresh':['genetics','neurology','ENT','bioinformatics','gene processing','radiology','pharmacology']
}

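def commoninterest(personi,personj):
    similar_persons=[]
    # check every interest of the first person against the second person's list
    for interest in dataset[personi]:
        if interest in dataset[personj]:
            # list.append takes exactly one argument, so store the pair as a tuple
            similar_persons.append((personi,personj))
    return similar_persons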

Save the duplicate interests in a list. Then iterate through the list for the keys in the dictionary. Possible duplicate: http://stackoverflow.com/questions/40985281/python-comparing-values-in-the-same-dictionary — Aly Abdelaziz, Apr 16 '17 at 14:04

score 0 · Answer 1 · answered Apr 16 '17 at 20:54

The problem is not defined exactly. The example suggests that one common attribute is enough to make two persons similar. In that case you should create as many lists of persons as there are topics. (You may want to eliminate the empty lists.)
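As a rough sketch of the "one list per topic" idea, the dictionary can be inverted so that each topic maps to the persons who listed it. The function name `group_by_interest` is hypothetical, and dropping topics held by only one person is an assumption about what "similar" should mean here, not something the question specifies.

```python
from collections import defaultdict

dataset = {
    'David': ['Artificial Intelligence', 'Machine learning', 'Neural networks', 'data mining'],
    'Charles': ['embedded computing', 'data mining', 'digital filters', 'signal processing',
                'virtual reality', 'augmented reality'],
    'Ramesh': ['molecular biology', 'genetics', 'neuro surgery', 'oncology', 'ophthalmology'],
    'Suresh': ['genetics', 'neurology', 'ENT', 'bioinformatics', 'gene processing',
               'radiology', 'pharmacology'],
}

def group_by_interest(people):
    """Map each topic to the sorted list of persons who have it (hypothetical helper)."""
    groups = defaultdict(list)
    for person, interests in people.items():
        for interest in interests:
            groups[interest].append(person)
    # Assumption: a topic only counts as "shared" if at least two persons list it.
    return {topic: sorted(names) for topic, names in groups.items() if len(names) > 1}

groups = group_by_interest(dataset)
for topic, names in groups.items():
    print(topic, names)
```

With the data from the question, only 'data mining' and 'genetics' are listed by more than one person, so those are the two groups that remain.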

If you would like a more sophisticated measure, you should define a metric between persons based on the number of common interests. In that case I advise using sets for interests instead of lists, because

it guarantees that the elements remain unique,
the order of attributes does not matter (as I see from the example),
you can use the set intersection operation to compute the common attributes, and
it makes your code faster, since membership tests on a set take constant time on average.
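A minimal sketch of such a metric, assuming the interests are stored as sets. The helper names `common_count` and `jaccard` are hypothetical. The Jaccard index (shared interests divided by all distinct interests of the two persons) is only one possible normalisation and is not something the answer above prescribes.

```python
dataset = {
    'David': {'Artificial Intelligence', 'Machine learning', 'Neural networks', 'data mining'},
    'Charles': {'embedded computing', 'data mining', 'digital filters', 'signal processing',
                'virtual reality', 'augmented reality'},
    'Ramesh': {'molecular biology', 'genetics', 'neuro surgery', 'oncology', 'ophthalmology'},
    'Suresh': {'genetics', 'neurology', 'ENT', 'bioinformatics', 'gene processing',
               'radiology', 'pharmacology'},
}

def common_count(people, p1, p2):
    """Number of interests the two persons share (hypothetical helper)."""
    return len(people[p1] & people[p2])

def jaccard(people, p1, p2):
    """Shared interests divided by all distinct interests of both persons."""
    union = people[p1] | people[p2]
    if not union:
        # Assumption: two persons without any interests are treated as not similar.
        return 0.0
    return len(people[p1] & people[p2]) / float(len(union))

print(common_count(dataset, 'David', 'Charles'))
print(jaccard(dataset, 'David', 'Charles'))
```

David and Charles share one interest out of nine distinct ones, so their raw count is 1 while the normalised score is small. A threshold on either value could then decide who counts as similar.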

score 0 · Accepted Answer · answered Apr 16 '17 at 21:33

Like Imre Piller said, you want to store interests in sets. Here is one possible solution. In addition, the function tells you which interests each pair has in common, but you can drop that if you want.

dataset={
'David':set(['Artificial Intelligence','Machine learning', 'Neural networks', 'data mining']),
'Charles':set(['embedded computing','data mining','digital filters','signal processing','virtual reality','augmented reality']),
'Ramesh':set(['molecular biology','genetics','neuro surgery','oncology','ophthalmology']),
'Suresh':set(['genetics','neurology','ENT','bioinformatics','gene processing','radiology','pharmacology'])
}

def get_common_intrests(people):
    pairs = []
    p_list = list(people)  # person names, in a fixed order
    for i, p1 in enumerate(p_list):
        # compare p1 only with the persons before it, so each pair is tested once
        for p2 in p_list[:i]:
            common_interests = people[p1].intersection(people[p2])
            if len(common_interests) > 0:
                pairs.append([p1, p2, common_interests])
    return pairs

print(get_common_intrests(dataset))

result (python 2):

[['Suresh', 'Ramesh', set(['genetics'])], ['David', 'Charles', set(['data mining'])]]

Thanks @Jonathan. But I get this error: AttributeError: 'list' object has no attribute 'intersection'. How do I rectify it? — pmdav, Apr 16 '17 at 22:06
Converting people[p1] into a set solves the issue. But how does the for loop within the for loop work? I can't follow it. Thanks — pmdav, Apr 16 '17 at 22:21
@PrashantMahato The first error is probably because you didn't change your dataset dictionary to contain sets instead of lists. The code is kind of messy (sorry about that). The nested for loops test each person's interests for an intersection with everyone else's. The slicing in the second for loop ensures that the function does not return the same pair twice (i.e. `['David', 'Charles', set(['data mining'])]` and `['Charles', 'David', set(['data mining'])]`). — Jonathan, Apr 17 '17 at 02:04
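For readers who find the sliced inner loop hard to follow, the same "each unordered pair exactly once" idea can be sketched with `itertools.combinations` from the standard library. This is an alternative formulation, not the code from the accepted answer. The name `get_common_interests_pairs` is hypothetical, and converting each value with `set(...)` is an assumption added so the sketch also accepts the original list-based dataset.

```python
from itertools import combinations

dataset = {
    'David': ['Artificial Intelligence', 'Machine learning', 'Neural networks', 'data mining'],
    'Charles': ['embedded computing', 'data mining', 'digital filters', 'signal processing',
                'virtual reality', 'augmented reality'],
    'Ramesh': ['molecular biology', 'genetics', 'neuro surgery', 'oncology', 'ophthalmology'],
    'Suresh': ['genetics', 'neurology', 'ENT', 'bioinformatics', 'gene processing',
               'radiology', 'pharmacology'],
}

def get_common_interests_pairs(people):
    """Return [person, person, shared interests] for every pair that shares something."""
    pairs = []
    # combinations(people, 2) yields each unordered pair of keys once,
    # so (David, Charles) appears but (Charles, David) does not.
    for p1, p2 in combinations(people, 2):
        common = set(people[p1]) & set(people[p2])
        if common:
            pairs.append([p1, p2, common])
    return pairs

pairs = get_common_interests_pairs(dataset)
for p1, p2, common in pairs:
    print(p1, p2, sorted(common))
```

The order of the returned pairs follows the iteration order of the dictionary, so it may differ from the output shown in the answer.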
