I am working with pandas and have monthly data covering 1949 to 1960: each row has a month (January to December) and an associated number of people. The months are in column A and the number of people in column B. I would like to calculate the mean number of people for each month over the whole period and determine which month has the most people.

How can I do that? I thought of using a rolling mean, but I wanted to make sure there isn't a simpler way before going too far down that path.

It is organized as:

nf = 
A     B

Jan   3
Feb   5
...  ...
Jan   4
Feb   1
...  ...
Jan   0
Feb   9
...  ...

Nihilum

2 Answers

You can achieve this using the groupby() method:

nf.groupby(['A'], as_index=False).mean()
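As a minimal sketch of what that call does, assuming a small made-up frame in the question's layout (the values below are illustrative, not the asker's real data, and `monthly_mean` is just a name chosen here):

```python
import pandas as pd

# Hypothetical sample: month labels in column A, number of people in column B.
nf = pd.DataFrame({
    "A": ["Jan", "Feb", "Jan", "Feb", "Jan", "Feb"],
    "B": [3, 5, 4, 1, 0, 9],
})

# as_index=False keeps the month as a regular column instead of the index.
monthly_mean = nf.groupby("A", as_index=False)["B"].mean()
print(monthly_mean)
```

Note that groupby sorts the group keys by default, so the months come out alphabetically rather than in calendar order; passing sort=False keeps them in order of first appearance.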
adir abargil

You can do it like this:

df = nf.groupby('A').mean()

This will give you the mean for each month. Then you can sort the results:

df.sort_values(by='B', ascending=False)
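If only the top month is needed, sorting can probably be skipped. The sketch below uses idxmax() on the grouped means instead; the sample values are made up for illustration and the names `monthly_mean`, `best_month`, and `best_value` are not from the question:

```python
import pandas as pd

# Hypothetical sample: month labels in column A, number of people in column B.
nf = pd.DataFrame({
    "A": ["Jan", "Feb", "Jan", "Feb", "Jan", "Feb"],
    "B": [3, 5, 4, 1, 0, 9],
})

# Selecting column B first gives a Series of means indexed by month.
monthly_mean = nf.groupby("A")["B"].mean()

# idxmax() returns the index label (the month) of the largest mean.
best_month = monthly_mean.idxmax()
best_value = monthly_mean.max()
print(best_month, best_value)
```

This assumes "the month with the maximum" means the month with the highest mean over the period; if ties matter, idxmax() only reports the first one it finds.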
gtomer