0

I already a group by done with getting previous row values in a data frame.

groupby_result = df.groupby([df['DateAssigned'].dt.strftime('%m').rename('Month')]).agg({'count'}).cumsum().reset_index()

The above lines give me a result like this

 Month   DateAssigned  DateCompleted        
  05        1             0

  06        18            4

So the code gets the cumulative values for the months that are only present. I want the cumulative values to be carried forwarded even if a month doesn't have a record for it.

my expected result :

Month   DateAssigned   DateCompleted   diff
05            1             0           0
06           18             4           17
07           18             4           0
08           20             4           2

4th column diff is to just explain things better

Let me explain my requirement - the 7th month doesn't have any value to it, yet it should get the values from the previous month(6th) and carry forward the value to the 8th month

data frame for reference :

DateAssigned DateCompleted month

2020-06-18    2020-06-19  2020-06-18
2020-06-18           NaT  2020-06-18
2020-06-19           NaT  2020-06-19
2020-06-18    2020-06-18  2020-06-18
2020-06-23           NaT  2020-06-23
2020-06-04           NaT  2020-06-04      
2020-06-18           NaT  2020-06-18
2020-06-18    2020-06-18  2020-06-18
2020-06-05           NaT  2020-06-05
2020-06-18    2020-06-18  2020-06-18
2020-06-05           NaT  2020-06-05
2020-06-05           NaT  2020-06-05
2020-06-17           NaT  2020-06-17
2020-06-18           NaT  2020-06-18
2020-06-17           NaT  2020-06-17
2020-06-18           NaT  2020-06-18
2020-06-04           NaT  2020-06-04
2020-06-05           NaT  2020-06-05

3 Answers3

1

This should do, use a proper index to adapt it to your actual dataframe

df.groupby(df["DateAssigned"].dt.month.rename("Month")).count().cumsum().reindex([5,6,7,8]).ffill().reset_index()
filippo
  • 5,197
  • 2
  • 21
  • 44
  • I tried your solution mate. If I reindex all I get is NaN is all the columns expect months – SivakumarTG Jun 25 '20 at 10:04
  • @SivakumarTG works fine here after converting the columns to datetime, are you sure you shared an extract of your actual dataframe? it would be great if it would include items from several months so there is a clear outcome you want to reproduce (there is only june at the moment) – filippo Jun 26 '20 at 00:20
  • I am extremely sorry brother. Your solution was right, I was the culprit here. Your solution solved my purpose. Thanks a lot – SivakumarTG Jun 26 '20 at 06:07
0

Try using ffill method. This will Forwardfill the previous values if the values are blank or NaN.

df.fillna(method='ffill')
PerlBatch
  • 200
  • 2
  • 2
  • 10
  • Thank you.....like you said fillna is used to fill the values in a certain row or column of a data frame as NaN or zero. I looking for a missing month after groupby. I appreciate your idea. – SivakumarTG Jun 25 '20 at 04:03
0

Calculate the difference using 'shift()' with the grouped and counted DFs. The output example is an arbitrary result I created, you can replace it with your own.

df["DateAssigned"] = pd.to_datetime(df["DateAssigned"])
df["DateCompleted"] = pd.to_datetime(df["DateCompleted"])
df = df.groupby(df["DateAssigned"].dt.month.rename("Month")).count()
df['diff'] = df['DateAssigned']-df['DateAssigned'].shift()

df

DateAssigned    DateCompleted   month   diff
Month               
5       1   0   1   NaN
6       18  4   18  17.0
7       9   2   9   -9.0
8       12  2   12  3.0
r-beginners
  • 31,170
  • 3
  • 14
  • 32
  • That's a really good answer, but how does this work when a certain month doesn't even have any values for it? Anyway, I appreciate your answer here. Thank you – SivakumarTG Jun 25 '20 at 04:08
  • This code groups them by the months that exist, so you can aggregate them even if you don't have a certain month. – r-beginners Jun 25 '20 at 04:15
  • I understand mate, what I want here is Even if a particular month doesn't have any values, it should get value from the previous month and pass on the values to the coming month. You can refer to my expected result section in question for a note. – SivakumarTG Jun 25 '20 at 10:32