13

I'm trying to subset a DataFrame on the condition that the index date is the last day of the month. I used:

df['Month_End'] = df.index.is_month_end
sample = df[df['Month_End'] == 1]

This works, but I'm working with stock market data, so I miss every month whose calendar end falls on a weekend. I need a way to select the "last business day of the month".

Alex Riley
hernanavella

3 Answers

13

You can generate a time series with the last business day of each month by passing in freq='BM'.

For example, to create a series of the last business days of 2014:

>>> pd.date_range('1/1/2014', periods=12, freq='BM')
[2014-01-31 00:00:00, ..., 2014-12-31 00:00:00]
Length: 12, Freq: BM, Timezone: None

You could then use this timeseries to subset/reindex your DataFrame.
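As a minimal sketch of that last step, the example below builds a hypothetical frame of business-day prices (the `close` column and the date range are made up for illustration) and keeps only the rows whose index falls on a business month end. It passes the `pd.offsets.BMonthEnd()` offset object instead of the `'BM'` string, since newer pandas releases deprecate that alias in favour of `'BME'`.

```python
import pandas as pd

# Hypothetical daily data on business days only (illustrative values)
idx = pd.bdate_range("2014-01-01", "2014-03-31")
df = pd.DataFrame({"close": range(len(idx))}, index=idx)

# Last business day of every month covered by the data
bme = pd.date_range(df.index.min(), df.index.max(), freq=pd.offsets.BMonthEnd())

# Keep only the rows whose index is one of those dates
sample = df[df.index.isin(bme)]
print(sample)
```

Using `isin` rather than `df.loc[bme]` means a business month end that is absent from the data (for example an exchange holiday) is skipped silently instead of raising a `KeyError`.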

Alex Riley
  • 2
    This solution works. The only issue is that in some series, the last business day of the month could be a 'custom holiday', so you would need to factor that into the equation. – hernanavella Nov 30 '14 at 22:00
  • I am using the accepted answer in this question: https://stackoverflow.com/questions/45644857/pandas-dataframe-get-rows-where-index-matches-a-certain-condition but is there a better way to `subset/reindex`? – Koray Tugay Jul 12 '18 at 01:12
  • 1
    The comment from hernanavella is critical. How do we go about this simple question in Python? We just want a clean and easy implementation of this simple question: last stock trading day of a month. – data-monkey Sep 28 '20 at 13:09
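One way to sidestep the holiday problem raised in these comments, sketched here under the assumption that the DataFrame only contains rows for days the market actually traded, is to let the data define the calendar: group by calendar month and keep the last row of each group. The frame below is hypothetical, and 2014-01-31 is dropped on purpose to imitate a custom holiday.

```python
import pandas as pd

# Hypothetical trading calendar: business days, minus one pretend holiday
idx = pd.bdate_range("2014-01-01", "2014-02-28")
idx = idx[idx != pd.Timestamp("2014-01-31")]  # pretend the exchange was closed
df = pd.DataFrame({"close": range(len(idx))}, index=idx)

# tail(1) relies on row order, so make sure the index is sorted first
df = df.sort_index()

# Last available row within each calendar month
last_traded = df.groupby(df.index.to_period("M")).tail(1)
print(last_traded)
```

Here January resolves to 2014-01-30 because that is the last date present in the data. The trade-off is that a month with missing trailing data will report whatever row happens to be last.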
3

Instead of generating the series, you can also compute the business month end from your datetime index like this:

df['BMonthEnd'] = (df.index + pd.offsets.BMonthEnd(1)).day

Note that this currently raises a harmless warning; see http://pandas.pydata.org/pandas-docs/stable/timeseries.html#using-offsets-with-series-datetimeindex

Note: if day (d) is already the last business day of the month then d + pd.offsets.BMonthEnd(1) will give the last business day of the following month. If this is undesired, use pd.offsets.BMonthEnd(0) instead:

df['BMonthEnd'] = (df.index + pd.offsets.BMonthEnd(0)).day

Edit: To actually filter the df as requested by OP:

df = df[(df.index + pd.offsets.BMonthEnd(0)).day == df.index.day]
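To make the difference between `BMonthEnd(1)` and `BMonthEnd(0)` concrete, here is a small sketch on individual timestamps, followed by the filter applied to a hypothetical business-day frame. Comparing whole timestamps instead of `.day` is a variation on the line above, not the original code; it should also avoid matching a weekend date such as 2014-08-30, which rolls forward to 2014-09-30 and so shares its day number.

```python
import pandas as pd

bme0 = pd.offsets.BMonthEnd(0)
bme1 = pd.offsets.BMonthEnd(1)

mid = pd.Timestamp("2014-01-15")   # an ordinary mid-month Wednesday
last = pd.Timestamp("2014-01-31")  # Friday, last business day of January 2014

mid_plus_0 = mid + bme0    # rolls forward to 2014-01-31
mid_plus_1 = mid + bme1    # also 2014-01-31
last_plus_0 = last + bme0  # already on the offset, stays at 2014-01-31
last_plus_1 = last + bme1  # jumps to 2014-02-28

# Hypothetical frame indexed by business days (illustrative values)
idx = pd.bdate_range("2014-01-01", "2014-03-31")
df = pd.DataFrame({"close": range(len(idx))}, index=idx)

# Variation: compare full timestamps rather than day-of-month numbers
month_ends = df[(df.index + bme0) == df.index]
print(month_ends)
```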
fantabolous
tsando
  • 1
    Note that this method is not perfect. If day "d" is the last business day of the month, then d+BMonthEnd(1) gives you the last business day of the following month. – g.a Jul 19 '20 at 09:02
0

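Given a DataFrame whose index is of datetime type, you can resample it to business-month-end frequency. Note that .mean() returns each month's average labelled with the last business day, not that day's row; .last() returns the last observation of each month instead:
df.resample('BM').last()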
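A short sketch of what resampling at business-month-end frequency returns, using a hypothetical frame with a made-up `close` column. It passes the `pd.offsets.BMonthEnd()` offset object as the rule, on the assumption that this behaves like the `'BM'` alias while avoiding the alias rename in newer pandas releases.

```python
import pandas as pd

# Hypothetical business-day data: close = 0, 1, 2, ... (illustrative values)
idx = pd.bdate_range("2014-01-01", "2014-02-28")
df = pd.DataFrame({"close": range(len(idx))}, index=idx)

rule = pd.offsets.BMonthEnd()

# Aggregates every row in the month; the label is the business month end
monthly_mean = df.resample(rule).mean()

# Takes the last observation in each month instead of averaging
monthly_last = df.resample(rule).last()

print(monthly_mean)
print(monthly_last)
```

Both results are labelled with the business month end even when the data has no row on that date, so the label is not proof that the market traded that day.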

Tèo