2

I have a df with which can be represented below with a date column and 4 variable columns.

Date          A B C D
2015-10-31    6 7 3 7
2015-11-01    1 3 9 4
2015-11-02    4 5 8 1
2015-11-03    4 2 5 9

I am looking to filter the df on the last business day of each month by adding an additional column called 'EndofMonth' and taking the trailing nth row from each month:

df['EndOfMonth'] = pd.to_datetime(df['DATE'], format="%Y%m") + MonthEnd(1)
df.apply(lambda x: x.sort('dt', ascending=True).tail(1))

This is giving me an error. Any ideas on how to better solve this?

Zeugma
  • 31,231
  • 9
  • 69
  • 81
spacedinosaur10
  • 695
  • 3
  • 10
  • 24

1 Answers1

2

IIUC

df.resample('M', on='Date').apply(pd.DataFrame.tail, n=2)

enter image description here


response to comment

d1 = df.resample('M').apply(pd.DataFrame.tail, n=2)
d1.index = d1.index.droplevel(0)
d1

enter image description here

piRSquared
  • 285,575
  • 57
  • 475
  • 624
  • I get an error on this- TypeError: resample() got an unexpected keyword argument 'on'. Maybe because it is reading my Data col. not as a datetime? @piRSquared – spacedinosaur10 Nov 15 '16 at 14:56
  • 1
    `on` is a new parameter for `resample` in pandas 0.19. you can do `df.set_index('Date').resample('M').apply(pd.DataFrame.tail, n=2)` – piRSquared Nov 15 '16 at 15:01
  • @piRSquared- i am sorry for the continued questions but- this created a multi-index (2 dates). I would ideally like to get rid of one of the index columns. How can i do this? I know how to delete a column (df.drop) but i am not sure how to drop the index. – spacedinosaur10 Nov 15 '16 at 22:03