
I'm trying to select certain rows in a DataFrame based on a list of dates. If the index of a DataFrame is a DatetimeIndex, you can select a single date with:

example_df['2018-12-12']

But you can't select multiple dates like this:

example_df[['2018-12-12', '2018-12-05']]

I know I can do the following, but I don't want to type out every date in case the list is longer:

example_df['2018-12-12'] & example_df['2018-12-05'] & ...

I also know I can use the isin() method, but I want to take advantage of the native date selector in pandas because I believe it is faster.

Here is the code:

import pandas as pd

genesis_block_date = pd.to_datetime('01/03/2009 18:15:05 GMT')
end_date = pd.to_datetime('01/03/2029')

# Halving dates
halving_dates = ['November 28, 2012', 'July 9th, 2016', '14 May, 2020']
halving_dates = pd.to_datetime(halving_dates)

approx_block_gen_time = pd.to_timedelta('10m')
date_range = pd.date_range(start=genesis_block_date, end=end_date, freq=approx_block_gen_time)

columns = ['days_until_halving']
df_new_features = pd.DataFrame(index=date_range, columns=columns)
df_new_features[halving_dates] = ...
skarit

1 Answer


The issue is that you have a datetime index, but you are trying to select from it using a list of strings, which the index does not contain.

You have to supply a list of datetime objects to the .loc[] selection method. Wrapping the list of date strings in pd.to_datetime() does the job:

example_df.loc[pd.to_datetime(['2018-12-12', '2018-12-05'])]
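For illustration, here is a minimal, self-contained sketch of that selection. The frame, its daily index, and the column name "value" are made up for the example and are not from the question:

```python
import pandas as pd

# Hypothetical example frame: one row per day and a single made-up column.
idx = pd.date_range("2018-12-01", "2018-12-15", freq="D")
example_df = pd.DataFrame({"value": range(len(idx))}, index=idx)

# Convert the date strings to a DatetimeIndex, then select rows by label.
wanted = pd.to_datetime(["2018-12-12", "2018-12-05"])
selected = example_df.loc[wanted]

# Rows come back in the order the labels were given.
print(selected)
```

Note that .loc with a list matches labels exactly. In a frame like the one in the question, where the index steps every 10 minutes from 18:15:05, a midnight timestamp such as 2012-11-28 00:00:00 is probably not in the index, so this lookup would be expected to raise a KeyError there.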

Please bear in mind that passing a list to plain [] indexing always selects columns, not rows:

example_df[['2018-12-12', '2018-12-05']]

So you get a KeyError, because there are no such columns.
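A small sketch of that behaviour, again using a made-up frame with a daily index and a hypothetical "value" column:

```python
import pandas as pd

# Hypothetical example frame, not from the question.
idx = pd.date_range("2018-12-01", "2018-12-15", freq="D")
example_df = pd.DataFrame({"value": range(len(idx))}, index=idx)

# A list inside plain [] is looked up among the column labels,
# so date strings are not found and pandas raises KeyError.
try:
    example_df[["2018-12-12", "2018-12-05"]]
    raised = False
except KeyError:
    raised = True

print(raised)

# The same syntax works when the list holds real column names.
cols = example_df[["value"]]
print(cols.head())
```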

marc_s
eordogh
  • Thank you very much, it worked. I don't know why but it was obvious that the .loc was expecting the same object type (the index type I mean). But it's interesting that the regular selector does accept just one date as I said in the OP: example_df['2018-12-12'] – skarit Mar 23 '20 at 16:52