
I have a DataFrame that looks like this:

Issue       Options     Points
Bonus       10          4000
Bonus       8           3000
Bonus       6           2000
Bonus       4           1000
Bonus       2           0
Assignment  A           0
Assignment  B           -600
Assignment  C           -1200
Assignment  D           -1800
Assignment  E           -2400
Leave       35          1600
Leave       30          1200
Leave       25          800
Leave       20          400
Leave       15          0

Is there a clever way to create different combinations of each of the issues?

The ideal output is something like this:

Combination_1: Bonus 10 + Assignment A + Leave 35 = 4000 + 0 + 1600 = 5600
Combination_2: Bonus 10 + Assignment A + Leave 30 = 4000 + 0 + 1200 = 5200
Combination_3: Bonus 10 + Assignment A + Leave 25 = 4000 + 0 + 800 = 4800
Combination_4: Bonus 10 + Assignment A + Leave 20 = 4000 + 0 + 400 = 4400
Combination_5: Bonus 10 + Assignment A + Leave 15 = 4000 + 0 + 0 = 4000

I want to give each combination a score to see which is the strongest option.

Is this possible with the shape of my DataFrame?

I have attempted the following:

import itertools

stuff = ['10', '8', '6', '4', '2', 'A', 'B', 'C', 'D', 'E']

results = []

for combination in range(0, len(stuff) + 1):
    for subset in enumerate(itertools.combinations_with_replacement(stuff, combination)):
        results.append(subset)

But it gives me every possible combination, which doesn't work here because you can only pick one option per issue, not several.

Mabel Villalba
AdrianC
  • You want to print output as above or just calculated value? – Sociopath Feb 17 '18 at 08:03
  • Both if possible? But unique combinations will suffice. I just did the calculations to give a better idea on what I'm trying to achieve – AdrianC Feb 17 '18 at 08:08
  • I will send my code shortly. Thanks for pointing out. – AdrianC Feb 17 '18 at 08:34
  • If you just want to see which combination results in the stronger option and you're merely adding the values from each category together, why not pick the maximum value of each and be done with it? – Reti43 Feb 17 '18 at 08:35
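
Reti43's point can be sketched without enumerating anything: because the score is a plain sum, the strongest combination is simply the highest-scoring option of each issue. A minimal sketch, assuming the rows are available as (issue, option, points) tuples; the names `rows`, `best`, and `best_total` are illustrative and not from the question:

```python
# Rows copied from the question's table: (issue, option, points).
rows = [
    ("Bonus", "10", 4000), ("Bonus", "8", 3000), ("Bonus", "6", 2000),
    ("Bonus", "4", 1000), ("Bonus", "2", 0),
    ("Assignment", "A", 0), ("Assignment", "B", -600), ("Assignment", "C", -1200),
    ("Assignment", "D", -1800), ("Assignment", "E", -2400),
    ("Leave", "35", 1600), ("Leave", "30", 1200), ("Leave", "25", 800),
    ("Leave", "20", 400), ("Leave", "15", 0),
]

# Keep, for every issue, the option with the highest points.
best = {}
for issue, option, points in rows:
    if issue not in best or points > best[issue][1]:
        best[issue] = (option, points)

best_total = sum(points for _, points in best.values())
print(best)
print(best_total)  # 5600
```

This only answers "which is strongest"; listing every combination still needs the enumeration asked for in the question.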

1 Answer


You can split the rows by issue into a dictionary and then take the Cartesian product of the resulting lists:

>>> df

         Issue Options  Points
0        Bonus      10    4000
1        Bonus       8    3000
2        Bonus       6    2000
3        Bonus       4    1000
4        Bonus       2       0
5   Assignment       A       0
6   Assignment       B    -600
7   Assignment       C   -1200
8   Assignment       D   -1800
9   Assignment       E   -2400
10       Leave      35    1600
11       Leave      30    1200
12       Leave      25     800
13       Leave      20     400
14       Leave      15       0

Now let's create a dictionary with each issue as a key and, as its value, a list of that issue's rows (one dictionary per row):

>>> d = {issue: df[df['Issue'] == issue].drop('Issue', axis=1).to_dict(orient='records')
         for issue in df['Issue'].unique()}
>>> d
{'Assignment': [{'Options': 'A', 'Points': 0},
  {'Options': 'B', 'Points': -600},
  {'Options': 'C', 'Points': -1200},
  {'Options': 'D', 'Points': -1800},
  {'Options': 'E', 'Points': -2400}],
 'Bonus': [{'Options': '10', 'Points': 4000},
  {'Options': '8', 'Points': 3000},
  {'Options': '6', 'Points': 2000},
  {'Options': '4', 'Points': 1000},
  {'Options': '2', 'Points': 0}],
 'Leave': [{'Options': '35', 'Points': 1600},
  {'Options': '30', 'Points': 1200},
  {'Options': '25', 'Points': 800},
  {'Options': '20', 'Points': 400},
  {'Options': '15', 'Points': 0}]}

Next, we can take the Cartesian product of the lists in the dictionary, which gives every combination with exactly one option per issue:

>>> from itertools import product
>>> combinations = [dict(zip(d, v)) for v in product(*d.values())]
>>> combinations
[{'Assignment': {'Options': 'A', 'Points': 0},
  'Bonus': {'Options': '10', 'Points': 4000},
  'Leave': {'Options': '35', 'Points': 1600}},
 {'Assignment': {'Options': 'A', 'Points': 0},
  'Bonus': {'Options': '10', 'Points': 4000},
  'Leave': {'Options': '30', 'Points': 1200}},
 {'Assignment': {'Options': 'A', 'Points': 0},...]
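
Why `product` fits here: it picks exactly one entry from each list, so three issues with five options each give 5 × 5 × 5 = 125 combinations. A small self-contained sketch, assuming a trimmed-down stand-in for `d` with two options per issue (`d_small` and `combos` are illustrative names):

```python
from itertools import product

# Trimmed stand-in for `d`: two rows per issue instead of five.
d_small = {
    "Bonus": [{"Options": "10", "Points": 4000}, {"Options": "8", "Points": 3000}],
    "Assignment": [{"Options": "A", "Points": 0}, {"Options": "B", "Points": -600}],
    "Leave": [{"Options": "35", "Points": 1600}, {"Options": "30", "Points": 1200}],
}

# product(*lists) yields one tuple per way of picking a single row from each list;
# zip(d_small, v) pairs each picked row with its issue name
# (dicts keep insertion order in Python 3.7+, so keys and values line up).
combos = [dict(zip(d_small, v)) for v in product(*d_small.values())]

print(len(combos))         # 8, i.e. 2 * 2 * 2
print(combos[0]["Leave"])  # {'Options': '35', 'Points': 1600}
```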

For the first combination we can obtain:

>>> issues = df['Issue'].unique()

>>> issues
array(['Bonus', 'Assignment', 'Leave'], dtype=object)

>>> c1 = ' + '.join([issue + ' %s'%combinations[0][issue]['Options'] 
                     for issue in issues])

>>> c1 
'Bonus 10 + Assignment A + Leave 35'

>>> c2 = ' + '.join([' %s'%combinations[0][issue]['Points'] for issue in issues])

>>> c2
' 4000 +  0 +  1600'

# Eval ' 4000 +  0 +  1600' to obtain the sum

>>> c3 = str(eval(c2))

>>> c3
'5600'
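
As an alternative to building a string and calling `eval`, the total can be computed directly from the numbers with `sum`. A minimal sketch, assuming a combination shaped like `combinations[0]` above (`combination`, `points`, and `total` are illustrative names):

```python
issues = ["Bonus", "Assignment", "Leave"]
combination = {
    "Assignment": {"Options": "A", "Points": 0},
    "Bonus": {"Options": "10", "Points": 4000},
    "Leave": {"Options": "35", "Points": 1600},
}

# Collect the integer scores in the display order given by `issues`.
points = [combination[issue]["Points"] for issue in issues]
total = sum(points)  # adds the integers; no string is evaluated

c2 = " + ".join(str(p) for p in points)
print("%s = %d" % (c2, total))  # 4000 + 0 + 1600 = 5600
```

Working on the integers avoids evaluating text as code and also removes the doubled spaces in the printed sum.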

It can all be joined this way:

>>> 'Combination_%d: %s'%(0,' = '.join([c1, c2, c3]))
'Combination_0: Bonus 10 + Assignment A + Leave 35 =  4000 +  0 +  1600 = 5600'

We can define a function to get all the strings from the list of combinations:

>>> def get_output(i,combination, issues):                       
        c1 = ' + '.join([issue + ' %s'%combination[issue]['Options']
                         for issue in issues])
        c2 = ' + '.join([' %s'%combination[issue]['Points'] 
                         for issue in issues])
        c3 = str(eval(c2))
        return 'Combination_%d: %s'%(i,' = '.join([c1, c2, c3]))

>>> [get_output(i+1,c, issues) for i, c in enumerate(combinations)]

['Combination_1: Bonus 10 + Assignment A + Leave 35 =  4000 +  0 +  1600 = 5600',
 'Combination_2: Bonus 10 + Assignment A + Leave 30 =  4000 +  0 +  1200 = 5200',
 'Combination_3: Bonus 10 + Assignment A + Leave 25 =  4000 +  0 +  800 = 4800',
 'Combination_4: Bonus 10 + Assignment A + Leave 20 =  4000 +  0 +  400 = 4400',
 'Combination_5: Bonus 10 + Assignment A + Leave 15 =  4000 +  0 +  0 = 4000',
 'Combination_6: Bonus 10 + Assignment B + Leave 35 =  4000 +  -600 +  1600 = 5000',
 'Combination_7: Bonus 10 + Assignment B + Leave 30 =  4000 +  -600 +  1200 = 4600',
 'Combination_8: Bonus 10 + Assignment B + Leave 25 =  4000 +  -600 +  800 = 4200',
 'Combination_9: Bonus 10 + Assignment B + Leave 20 =  4000 +  -600 +  400 = 3800',
 'Combination_10: Bonus 10 + Assignment B + Leave 15 =  4000 +  -600 +  0 = 3400',...]
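
Since the goal in the question is to find the strongest option, the combinations can also be scored and sorted rather than only printed. A sketch under the assumption that `d` has the shape shown earlier (trimmed here to two options per issue to keep it short); `score` is a hypothetical helper, not part of the code above:

```python
from itertools import product

# Trimmed stand-in for `d`: best and worst option of each issue.
d = {
    "Bonus": [{"Options": "10", "Points": 4000}, {"Options": "2", "Points": 0}],
    "Assignment": [{"Options": "A", "Points": 0}, {"Options": "E", "Points": -2400}],
    "Leave": [{"Options": "35", "Points": 1600}, {"Options": "15", "Points": 0}],
}
combinations = [dict(zip(d, v)) for v in product(*d.values())]

def score(combination):
    """Total points of one combination (one chosen row per issue)."""
    return sum(row["Points"] for row in combination.values())

# Highest total first.
ranked = sorted(combinations, key=score, reverse=True)
strongest, weakest = ranked[0], ranked[-1]
print(score(strongest), {k: v["Options"] for k, v in strongest.items()})
print(score(weakest), {k: v["Options"] for k, v in weakest.items()})
```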
Mabel Villalba
  • This worked perfectly. I don't understand a lot of it, but thank you so much for helping me. Question: What does the * do in this code? combinations = [dict(zip(d, v)) for v in product(*d.values())] – AdrianC Feb 18 '18 at 04:43
  • `product` is an itertools function for combining elements. Basically, we combine each element of each list in the dictionary with each element of the others to obtain all combinations. Here is a question about how to combine different dictionaries: [https://stackoverflow.com/questions/15211568/combine-python-dictionary-permutations-into-list-of-dictionaries](https://stackoverflow.com/questions/15211568/combine-python-dictionary-permutations-into-list-of-dictionaries) – Mabel Villalba Feb 18 '18 at 12:37
  • And the asterisk is used in Python to unpack a sequence into separate arguments when calling a function. So `[dict(zip(d, v)) for v in product(*d.values())]` is equivalent to taking the product of the 3 lists explicitly: `[dict(zip(d, v)) for v in product(vals[0], vals[1], vals[2])]`, where `vals = list(d.values())` (in Python 3, `d.values()` cannot be indexed directly). Here's a link to understand the asterisk in Python: [https://medium.com/understand-the-python/understanding-the-asterisk-of-python-8b9daaa4a558](https://medium.com/understand-the-python/understanding-the-asterisk-of-python-8b9daaa4a558) – Mabel Villalba Feb 18 '18 at 12:42
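
To make the asterisk concrete: `f(*seq)` passes each element of `seq` as a separate positional argument. A tiny sketch with made-up lists (`groups`, `unpacked`, and `explicit` are illustrative names):

```python
from itertools import product

groups = [["10", "8"], ["A", "B"]]

unpacked = list(product(*groups))               # same call as the next line
explicit = list(product(groups[0], groups[1]))
print(unpacked == explicit)  # True
print(unpacked)  # [('10', 'A'), ('10', 'B'), ('8', 'A'), ('8', 'B')]

# Without the asterisk, product receives ONE iterable (the outer list)
# and just wraps each inner list in a 1-tuple.
print(list(product(groups)))  # [(['10', '8'],), (['A', 'B'],)]
```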