I have multiple monthly time series in a DataFrame.

To build a dynamic visualization, I need to resample them monthly so that they all share the same end date.

import pandas as pd

evolution = [
    {'date': '2017-09-01', 'Name': 'A', 'Value': 200},
    {'date': '2017-12-10', 'Name': 'A', 'Value': 400},
    {'date': '2017-09-01', 'Name': 'B', 'Value': 200},
    {'date': '2018-01-20', 'Name': 'B', 'Value': 600},
]
df = pd.DataFrame(evolution)
df

Out[57]: 
  Name  Value        date
0    A    200  2017-09-01
1    A    400  2017-12-10
2    B    200  2017-09-01
3    B    600  2018-01-20

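I resampled each series to a regular month-end index:

df.index = pd.DatetimeIndex(df['date'])
# 'M' means month end here; pandas 2.2 and later spell this alias 'ME'
df = df.groupby(['Name']).resample('M').max()
# keep only the values; Name and date now live in the index
df = df.drop(['date', 'Name'], axis=1)

# fill the empty months between two known values
df = df.interpolate(method='linear')
df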

Out[58]: 
                      Value
Name date                  
A    2017-09-30  200.000000
     2017-10-31  266.666667
     2017-11-30  333.333333
     2017-12-31  400.000000
B    2017-09-30  200.000000
     2017-10-31  300.000000
     2017-11-30  400.000000
     2017-12-31  500.000000
     2018-01-31  600.000000

But from this, I cannot figure out how to extend the DatetimeIndex of A to get:

                     Value
Name date                  
A    2017-09-30  200.000000
     2017-10-31  266.666667
     2017-11-30  333.333333
     2017-12-31  400.000000
     2018-01-31  400.000000   <=== Extended Index
B    2017-09-30  200.000000
     ...
     2018-01-31  600.000000
mxdbld

1 Answer

I think you need to unstack the dates into columns, forward-fill along each row, and stack back:

print(df.unstack().ffill(axis=1).stack())
                      Value
Name date                  
A    2017-09-30  200.000000
     2017-10-31  266.666667
     2017-11-30  333.333333
     2017-12-31  400.000000
     2018-01-31  400.000000
B    2017-09-30  200.000000
     2017-10-31  300.000000
     2017-11-30  400.000000
     2017-12-31  500.000000
     2018-01-31  600.000000
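
To see why this works, here is a minimal, self-contained sketch. It rebuilds the interpolated frame by hand (the values are copied from the question's output, so the resample step is assumed rather than rerun) and splits the one-liner into three named steps. The names wide and extended are mine, for illustration only.

```python
import pandas as pd

# Month-end dates covering the longest series (B).
dates = pd.date_range("2017-09-30", "2018-01-31", freq=pd.offsets.MonthEnd())

# A stops at 2017-12-31, B runs through 2018-01-31 (as in the question).
rows = [("A", d) for d in dates[:4]] + [("B", d) for d in dates]
index = pd.MultiIndex.from_tuples(rows, names=["Name", "date"])
values = [200.0, 800 / 3, 1000 / 3, 400.0, 200.0, 300.0, 400.0, 500.0, 600.0]
df = pd.DataFrame({"Value": values}, index=index)

# Step 1: one row per Name, one column per date.
# Dates that a series lacks show up as NaN.
wide = df.unstack()

# Step 2: carry each row's last known value to the right.
wide = wide.ffill(axis=1)

# Step 3: go back to the long (Name, date) layout.
extended = wide.stack()
print(extended)
```

After unstack(), A has a NaN at 2018-01-31 because that row never existed; ffill(axis=1) fills it from 2017-12-31, which is what extends the index of A.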
jezrael
  • Thanks, that works. It also extends the beginning of the time series, but adding dropna gives me the expected values. – mxdbld Feb 10 '18 at 20:43
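
The dropna step from the comment can be sketched as follows. This is only an illustration under assumptions: the series "C" and its values are hypothetical and chosen so that one series starts later than the other. A forward fill cannot fill dates before a series' first value, so those slots stay NaN and can be dropped.

```python
import pandas as pd

# Hypothetical data: A starts in September, C only in October.
index = pd.MultiIndex.from_tuples(
    [
        ("A", pd.Timestamp("2017-09-30")),
        ("A", pd.Timestamp("2017-10-31")),
        ("C", pd.Timestamp("2017-10-31")),
        ("C", pd.Timestamp("2017-11-30")),
    ],
    names=["Name", "date"],
)
late = pd.DataFrame({"Value": [200.0, 300.0, 50.0, 75.0]}, index=index)

# Same chain as in the answer: A is extended to 2017-11-30,
# while C has nothing to carry forward into 2017-09-30.
aligned = late.unstack().ffill(axis=1).stack()

# Remove the leading slot that the forward fill could not fill.
aligned = aligned.dropna()
print(aligned)
```

Depending on the pandas version, stack() may already drop those NaN rows on its own, so the explicit dropna() is harmless either way.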