I am trying to do forecasting on data indexed by dates. The problem is that I have a dataset like this:

  LeaveStartDate  TotalLeaveDays
0     2020-03-14             1.0
1     2020-03-18             2.0
2     2020-03-20             1.0
3     2020-01-13             3.0
4     2020-02-15             1.0

I want to expand the total leave days so that each day of leave gets its own row, like this:

  LeaveStartDate  TotalLeaveDays
0     2020-03-14             1.0
1     2020-03-18             1.0
2     2020-03-19             1.0
3     2020-03-20             1.0
4     2020-01-13             1.0
5     2020-01-14             1.0
6     2020-01-15             1.0
7     2020-02-15             1.0

What should I do to get the data into this form?

1 Answer

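Use Index.repeat with the TotalLeaveDays column to duplicate each row once per leave day. Then build a counter for each group of duplicated rows with GroupBy.cumcount, convert it to day timedeltas with to_timedelta, and add it to the start date. Finally, set the TotalLeaveDays column to 1: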

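import pandas as pd

df['LeaveStartDate'] = pd.to_datetime(df['LeaveStartDate'])

# repeat each row once per leave day (repeat counts must be integers, so cast the floats)
df = df.loc[df.index.repeat(df['TotalLeaveDays'].astype(int))]
# add 0, 1, 2, ... days within each group of repeated rows
df['LeaveStartDate'] += pd.to_timedelta(df.groupby(level=0).cumcount(), unit='D')
df['TotalLeaveDays'] = 1
df = df.reset_index(drop=True)
print(df)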
  LeaveStartDate  TotalLeaveDays
0     2020-03-14               1
1     2020-03-18               1
2     2020-03-19               1
3     2020-03-20               1
4     2020-01-13               1
5     2020-01-14               1
6     2020-01-15               1
7     2020-02-15               1
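
As a self-contained sketch of the same steps, the version below wraps them in a function and builds the question's sample data inline so it can be run directly. The function name expand_leave_days is hypothetical, and the sketch assumes that TotalLeaveDays holds whole numbers and that the index has no duplicate labels.

```python
import pandas as pd


def expand_leave_days(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per leave day (hypothetical helper name)."""
    out = df.copy()
    out["LeaveStartDate"] = pd.to_datetime(out["LeaveStartDate"])

    # Repeat counts must be integers; assumes TotalLeaveDays holds whole numbers.
    repeats = out["TotalLeaveDays"].astype(int)
    out = out.loc[out.index.repeat(repeats)]

    # cumcount() numbers the copies of each original row 0, 1, 2, ...
    # (assumes the original index labels are unique).
    offsets = pd.to_timedelta(out.groupby(level=0).cumcount(), unit="D")
    out["LeaveStartDate"] = out["LeaveStartDate"] + offsets

    out["TotalLeaveDays"] = 1
    return out.reset_index(drop=True)


# Sample data copied from the question.
sample = pd.DataFrame({
    "LeaveStartDate": ["2020-03-14", "2020-03-18", "2020-03-20",
                       "2020-01-13", "2020-02-15"],
    "TotalLeaveDays": [1.0, 2.0, 1.0, 3.0, 1.0],
})

expanded = expand_leave_days(sample)
print(expanded)
```

Working on a copy leaves the input frame untouched, which may be convenient if the original totals are still needed later.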
jezrael