1

I have the following dataframe

data = pd.DataFrame({
'date': [1988, 1989, 1990, 1991],
'value': [11558522, 12323552, 13770958, 18412280]
}) 

Out[1]: 
   date     value
0  1988  11558522
1  1989  12323552
2  1990  13770958
3  1991  18412280

I then change the date colum to datetime index

data['date'] = pd.to_datetime(data['date'],format = '%Y')

Out[2]: 
        date     value
0 1988-01-01  11558522
1 1989-01-01  12323552
2 1990-01-01  13770958
3 1991-01-01  18412280

I set the date column as the index

data = data.set_index('date')

Out[3]: 

date          value      
1988-01-01  11558522
1989-01-01  12323552
1990-01-01  13770958
1991-01-01  18412280

Now i want to take a date value i.e 1988 and create the months ( 01 to 12 ) and take the value of that date (11558522) and divide it between the 12 months. so ultimately i want the dataset to look like this

    date          value      
1988-01-01  889117.077
1988-02-01  889117.077
1988-03-01  889117.077
1988-04-01  889117.077
      ...
1988-12-01  889117.077
1989-01-01  947965.538
1989-02-01  947965.538
1989-03-01  947965.538
1989-04-01  947965.538
      ...
1989-12-01  947965.538
etc..

i want to do this for each date value in the dataframe. How best can i do this?

Tamarie
  • 125
  • 2
  • 6
  • 18

1 Answers1

0

Use itertools.product for all combinations of years with DataFrame.merge of all rows by original:

data = pd.DataFrame({
'date': [1988, 1989, 1990, 1991],
'value': [11558522, 12323552, 13770958, 18412280]
}) 

from  itertools import product

y = data['date']
m = np.arange(1,13)
d = [1]

cols = ['year','month','day']
df = (pd.DataFrame(list(product(y, m, d)), columns=cols)
        .merge(data.rename(columns={'date':'year'}))
        .assign(value = lambda x: x['value'] / 12))

df = df.set_index(pd.to_datetime(df[cols])).drop(cols, axis=1)
print (df.head())
                    value
1988-01-01  963210.166667
1988-02-01  963210.166667
1988-03-01  963210.166667
1988-04-01  963210.166667
1988-05-01  963210.166667

Another idea is create Series with values of column value mapped by dictionary:

data = pd.DataFrame({
'date': [1988, 1989, 1990, 1991],
'value': [11558522, 12323552, 13770958, 18412280]
}) 


from  itertools import product

di = data.set_index('date')['value'].to_dict()
y = data['date']
m = np.arange(1,13)

comp = {f'{y}-{m}-01': di.get(y) for y,m,d in product(y, m, d)}
df = pd.Series(comp).div(12).to_frame('value')
df.index = pd.to_datetime(df.index)
print (df.head())
                    value
1988-01-01  963210.166667
1988-02-01  963210.166667
1988-03-01  963210.166667
1988-04-01  963210.166667
1988-05-01  963210.166667
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • the dates worked but i want the value of 1 year to be divided by 12. So in the year 1989 the value was 11558522. I want to divide this value by 12 and then that will now be the new value that populates the months of that year. i.e 1988-01-01 to 1988-12-01 will have the value 889117.077 – Tamarie Feb 21 '20 at 10:30
  • @Tamarie - I think first solutions was overcomlicated, so added simplier, also there is division by 12 – jezrael Feb 21 '20 at 10:34
  • getting the following error if i try that on my dataset `TypeError: Start and end cannot both be tz-aware with different timezones` – Tamarie Feb 21 '20 at 10:39
  • @Tamarie - Added original solution, only added division by 12 – jezrael Feb 21 '20 at 10:43
  • None working. still getting the type error i mentioned before. if i try the first examples you gave i get this errro `ValueError: cannot assemble the datetimes: time data '-1661174702541897627' does not match format '%Y%m%d' (match)` – Tamarie Feb 21 '20 at 10:48
  • @Tamarie - Not working with `data = pd.DataFrame({ 'date': [1988, 1989, 1990, 1991], 'value': [11558522, 12323552, 13770958, 18412280] })` ? Or with real data? – jezrael Feb 21 '20 at 10:52
  • @Tamarie - added full code solutions also with sample data – jezrael Feb 21 '20 at 10:58
  • @Tamarie - All 3 solutions failed? What is `print (df['date'].unique())` – jezrael Feb 21 '20 at 11:18
  • the first code example ran but the months are not incrementing – Tamarie Feb 21 '20 at 12:11
  • @Tamarie - No idea, it seems some data related problem, because with sample data working nice. – jezrael Feb 21 '20 at 12:14