How to convert monthly data to yearly data with the value as the mean average over the 12 months? (Python Pandas)

Question

This is what my data looks like:

       month  total_mobile_subscription
0    1997-01                     414000
1    1997-02                     423000
2    1997-03                     431000
3    1997-04                     479000
4    1997-05                     510000
..       ...                        ...
279  2020-04                    9056300
280  2020-05                    8928800
281  2020-06                    8860000
282  2020-07                    8768500
283  2020-08                    8659000

[284 rows x 2 columns]

Basically, I'm trying to change this into a dataset sorted by year with the value being the mean average of total mobiles subscriptions for each year.

I am not sure what to do as I am still learning this.

jezrael · Answer 1 · 2023-02-02T06:37:35.153

1

Convert values of month to datetimes and aggregate by years:

y = pd.to_datetime(df['month']).dt.year.rename('year')
df1 = df.groupby(y)['total_mobile_subscription'].mean().reset_index()

df['month'] = pd.to_datetime(df['month'])
df1 = (df.groupby(df['month'].dt.year.rename('year'))['total_mobile_subscription']
         .mean().reset_index())

Or aggregate by first 4 values of month column:

df2 = (df.groupby(df['month'].str[:4].rename('year'))['total_mobile_subscription']
         .mean().reset_index())

edited Feb 02 '23 at 06:37

answered Feb 02 '23 at 06:13

jezrael

822,522
95
1,334
1,252

Thank you so much! I'll need to study harder!!! – Lim Ze Kai Feb 02 '23 at 06:37

score 0 · Answer 2 · answered Feb 02 '23 at 06:25


import pandas as pd
import numpy as np
df = pd.DataFrame({
    'year': ['1997-01', '1997-02', '1997-03', '1998-01', '1998-02', '1998-03'],
    'sale': [500, 1000, 1500, 2000, 1000, 400]
    })


for a in range(1,13):
    x = '-0'+ str(a)
    df['year'] = df['year'].str.replace(x, '')
    

df2 = (df.groupby(['year']).mean('sale'))
print(df2)

How to convert monthly data to yearly data with the value as the mean average over the 12 months? (Python Pandas)

2 Answers2