
I have a pandas DataFrame that looks like the following:

df = pd.DataFrame({'foo' : ['spam', 'ham', 'eggs', 'spam'],
                   'timestamp' : ['2022-04-20 15:03:05.325618982-04:00',
                                  '2022-04-19 19:22:43.569068909-04:00', 
                                  '2022-04-18 06:38:28.928778887-04:00', 
                                  '2022-04-15 21:04:28.928778887-04:00']
                        })

The timestamp column holds timezone-aware datetimes, which were created using the following:

df['timestamp'] = df['timestamp'].dt.tz_localize('GMT').dt.tz_convert('America/New_York')

I'd like to subset the df such that only the rows between the start_date and end_date range are returned.

I tried the following:

start_date = '2022-04-18 00:00:00.000000000'
end_date = '2022-04-19 00:00:00.000000000'

df = df[df['timestamp'].isin(pd.date_range(start_date, end_date))]

But this results in an empty DataFrame.

How would I do the subset by defining start_date and end_date only in YYYY-MM-DD format?

Thank you!

zeroes_ones
    Does this answer your question? [Select DataFrame rows between two dates](https://stackoverflow.com/questions/29370057/select-dataframe-rows-between-two-dates). Your problem is that `pd.date_range` generates a sequence of days (one timestamp per day), which is not suitable for comparison with `isin`, because `isin` only matches exact values. – Ynjxsjmh Apr 22 '22 at 13:56

1 Answer


You can check with `between`, which keeps the rows whose timestamp falls within the range:

out = df[df['timestamp'].between(start_date,end_date)]
Out[219]: 
    foo                           timestamp
2  eggs 2022-04-18 06:38:28.928778887-04:00
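
For bounds given only as YYYY-MM-DD, here is a minimal sketch that makes the timezone handling explicit. It assumes the bounds are meant as midnight in America/New_York and that the end date should be exclusive; both are assumptions, as are the names `tz`, `start`, `end`, `mask`, and `exact_matches`. The frame is rebuilt with `pd.to_datetime(..., utc=True)` only so the example runs on its own.

```python
import pandas as pd

df = pd.DataFrame({
    'foo': ['spam', 'ham', 'eggs', 'spam'],
    'timestamp': ['2022-04-20 15:03:05.325618982-04:00',
                  '2022-04-19 19:22:43.569068909-04:00',
                  '2022-04-18 06:38:28.928778887-04:00',
                  '2022-04-15 21:04:28.928778887-04:00'],
})
# Parse the offset-bearing strings as UTC, then convert to New York time.
df['timestamp'] = (pd.to_datetime(df['timestamp'], utc=True)
                     .dt.tz_convert('America/New_York'))

# Assumption: the date-only bounds mean midnight in the column's timezone.
tz = 'America/New_York'
start = pd.Timestamp('2022-04-18', tz=tz)
end = pd.Timestamp('2022-04-19', tz=tz)

# Half-open interval: start <= timestamp < end.
mask = (df['timestamp'] >= start) & (df['timestamp'] < end)
out = df[mask]
print(out)

# For comparison: isin only matches the exact midnight values produced by
# date_range, so no row in this frame qualifies.
exact_matches = df['timestamp'].isin(pd.date_range(start, end))
print(exact_matches.any())
```

On this sample data the mask keeps only the `eggs` row, the same row as the `between` call above. Note that `between` includes both endpoints by default, so the two approaches differ only for a timestamp that falls exactly on `end`.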
BENY