I am using Python and Pandas, and I am trying to sum up, in an efficient way, a dataframe's values across different rows based on lists of IDs rather than on unique IDs.
df:
Name - ID - Related IDs - Value
z - 123 - ['aaa','bbb','ccc'] - 10
w - 456 - ['aaa'] - 20
y - 789 - ['ggg','hhh','jjj'] - 50
x - 012 - ['jjj','hhh'] - 60
r - 015 - ['hhh'] - 15
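For reference, here is the same example data as an actual DataFrame (IDs kept as strings so the leading zeros survive):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['z', 'w', 'y', 'x', 'r'],
    'ID': ['123', '456', '789', '012', '015'],
    'Related IDs': [['aaa', 'bbb', 'ccc'], ['aaa'],
                    ['ggg', 'hhh', 'jjj'], ['jjj', 'hhh'], ['hhh']],
    'Value': [10, 20, 50, 60, 15],
})
```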
One option would be to explode each row by the elements of the list, but that may duplicate the values to be summed, and it might not be an efficient solution in terms of time and resources.
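To illustrate the duplication problem with a quick sketch: exploding repeats each row's Value once per ID in its list, so grouping per ID gives per-ID totals in which the same row is counted several times, and e.g. 'jjj' ends up with 110 rather than the 125 I am after:

```python
# each row's Value is repeated once per ID after explode, so any later
# aggregation that merges these per-ID groups counts the same row twice
exploded = df.explode('Related IDs')
print(exploded.groupby('Related IDs')['Value'].sum())
# aaa     30
# bbb     10
# ccc     10
# ggg     50
# hhh    125
# jjj    110
```

So instead I tried grouping on the list column directly: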
```python
f = {'Value': 'sum'}
df = df.groupby(['Related IDs']).agg(f)
# not working: this matches on the whole 'Related IDs' list
# (exact, element-wise equality) rather than on the individual
# IDs that the rows share
df = df.reset_index()
```
What I am expecting is a new column "Sum" that sums up the "Value" of all rows that have one or more Related IDs in common, as follows:
Name - ID - Related IDs - Value - Sum
z - 123 - ['aaa','bbb','ccc'] - 10 - 30
w - 456 - ['aaa'] - 20 - 30
y - 789 - ['ggg','hhh','jjj'] - 50 - 125
x - 012 - ['jjj','hhh'] - 60 - 125
r - 015 - ['hhh'] - 15 - 125
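To make the grouping rule concrete, this is the logic I have in mind: rows should be merged (transitively, I assume, so two rows also end up together if they each overlap a third row) and "Sum" is the total Value of each merged group. A rough sketch of that logic with a small hand-rolled union-find (`connected_sum` is just an illustrative helper name) reproduces the numbers above, but I doubt it is the most efficient or idiomatic pandas way, which is what I am asking for:

```python
def connected_sum(df):
    parent = {}

    def find(x):
        # path-compressing find for the union-find structure
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    # union every ID in a row with the first ID of that row,
    # so rows sharing any ID end up in the same component
    for ids in df['Related IDs']:
        for other in ids[1:]:
            union(other, ids[0])

    # label each row by the root of its first ID, then sum per component
    labels = df['Related IDs'].map(lambda ids: find(ids[0]))
    return df['Value'].groupby(labels).transform('sum')

df['Sum'] = connected_sum(df)
```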