1

Ok I have a dataset regarding game outcomes that is incomplete and I want to generate a plot with either the data present or zero values for the players that have no data in that game. Furthermore I want to add the data present via a list: some players are attackers and some defenders My data is like this:

raw data:

Game    Player  Goal    Assits  Fouls
1   Alpha       1       1       0
1   Beta        2       0       1
2   Alpha       0       1       1
2   Gamma       2       0       0
3   Beta        3       0       1
4   Alpha       1       1       1
4   Beta        2       0       1
5   Alpha       0       1       1
5   Beta        1       0       0
5   Gamma       0       1       1

desired result with Points for Goals + Assists and Attackers = ['Alpha','Beta'] and Defenders=['Gamma']

Game    Attackers   Defenders
1       4           0
2       1           2
3       3           0
4       4           0
5       2           1

I have all the raw data in a pandas dataframe and I have tried using isin function to get the data out. This leaves me with different length results, ie if it is "not in" then there is no data added. I would (as shown just like zeros instead . ==> ie in Game 1 Gamma is not mentioned so he has zero points.

thank you for your help

2705114-john
  • 762
  • 1
  • 6
  • 10

1 Answers1

2

This is a bit messy, but certainly doable.

First of all, you'll need to reset_index() on df, to make grouping easier. Groupby doesn't handle grouping on an index and a column at the same time gracefully (GH issue).

In [64]: df = df.reset_index()

Define a mapping from player to position (attacker or defender):

In [65]: kind = {'Alpha': 'Attackers', 'Beta': 'Attackers', 'Gamma': 'Defenders'}

Ideally you'd be able to do the next 3 steps in one line, but I was having trouble with the aggregation. First get the grouping by position and game.

In [66]: grouped = df.groupby(['Game', df.Player.map(kind)]).sum()

In [67]: grouped
Out[67]: 
                Goal  Assits  Fouls
Game Player                        
1    Attackers     3       1      1
2    Attackers     0       1      1
     Defenders     2       0      0
3    Attackers     3       0      1
4    Attackers     3       1      2
5    Attackers     1       1      1
     Defenders     0       1      1

[7 rows x 3 columns]

Then calculate the points, which gives a Series:

In [68]: points = grouped['Goal'] + grouped['Assits']

In [69]: points
Out[69]: 
Game  Player   
1     Attackers    4
2     Attackers    1
      Defenders    2
3     Attackers    3
4     Attackers    4
5     Attackers    2
      Defenders    1
dtype: int64

Finally unstack(). This creates NaNs where there aren't any values (e.g. Game 1, Defenders), which we'll fill with 0.

In [70]: points.unstack('Player').fillna(0)
Out[70]: 
Player  Attackers  Defenders
Game                        
1               4          0
2               1          2
3               3          0
4               4          0
5               2          1

[5 rows x 2 columns]
TomAugspurger
  • 28,234
  • 8
  • 86
  • 69
  • hi quick follow-up question: is it possible to map to an item in a dictionary list, ie: ` `kind = {'Alpha': ['Attackers',1976], 'Beta': ['Attackers',1980], 'Gamma': ['Defenders',1977]}` `grouped = df.groupby(['Game', df.Player.map(kind**[0]**)]).sum()` this does not seem to work and I am not sure if it just is not possible, I can always generate separate dictionaries from a master dictionary to get around having to update data in multiple locations thanks – 2705114-john Feb 15 '14 at 12:48
  • It may be better to create those as separate columns in your data frame. Then you can do `df.groupby(['Game', 'kind', 'year']).sum()` or something like that. – TomAugspurger Feb 15 '14 at 16:00
  • thanks @TomAugspurger but my problem is not so much having a list in the data frame but rather getting the mapping to work, ie when I am mapping to the say `'Gamma':['Defenders', 1977] ` I do not know how to just get list item [0] into the field : `df.Player.map(kind[0])` does not seem to work ----- Thanks – 2705114-john Feb 21 '14 at 10:38