3

Suppose I have following Data-Series:

Date      Category
2014-8    Facebook
2014-8    Vimeo
2014-8    Facebook
2014-8    Facebook
2014-9    Facebook
2014-9    Orkut
2014-9    Facebook
2014-9    Facebook
2014-9    Facebook
...
2014-10    Youtube
2014-10    DailyMotion
2014-10    Facebook
2014-10    Vimeo
2014-10    Facebook
2014-10    Facebook

I would like to make a count of each category (Unique Value/Factor in the Time Series) per month and year.

Category     Date        Count
Facebook     2014-01     5
             2014-02     6
             2014-03     8
Vimeo        2014-01     3
             2014-02     10
             2014-03     9
youtube      2014-01     13
             2014-02     61
             2014-03     8

So, when I call Facebook, I can see how many times facebook occured on each month.

What I tried is:

df['Date'] = df['Date'].map(lambda x: '{year}-{month}'.format(year=x.year,
                                                              month=x.month,
                                                              day=x.day))
a = df.groupby(['Category','year-month']).size()
pppery
  • 3,731
  • 22
  • 33
  • 46
Mohammad Saifullah
  • 1,113
  • 5
  • 16
  • 33

1 Answers1

3

You need to group by both Category and Date and then do a count on Date:

>>> df.groupby(['Category', 'Date']).Date.count()
Category     Date   
DailyMotion  2014-10    1
Facebook     2014-10    3
             2014-8     3
             2014-9     4
Orkut        2014-9     1
Vimeo        2014-10    1
             2014-8     1
Youtube      2014-10    1
Name: Date, dtype: int64

To get the monthly totals for a specific category (e.g. 'Facebook'), you first need to filter on the category:

>>> df[df.Category == 'Facebook'].groupby(['Category', 'Date']).Date.count()
Category  Date   
Facebook  2014-10    3
          2014-8     3
          2014-9     4
Name: Date, dtype: int6
Alexander
  • 105,104
  • 32
  • 201
  • 196