I found this example in this post on Stack Overflow:
import pandas as pd
earliest = pd.Timestamp('2012-01-01 06:00:00')
latest = pd.Timestamp('2014-12-01 23:00:00')
dr = pd.date_range(start=earliest, end=latest, freq="30min")
df_freq = pd.DataFrame(index=dr, columns=['freq'])
df_freq = df_freq.fillna(0)
# use str datetime as key
df_freq['2012-03-04']
That post was published about five years ago, so the pandas API may have changed. Although the code still works in 2022, I get this warning:
FutureWarning: Indexing a DataFrame with a datetimelike index using a single string
to slice the rows, like `frame[string]`, is deprecated and will be removed in a future
version. Use `frame.loc[string]` instead.
However, it seems that indexing with a date string like this only works for specific index frequencies. If you change the frequency from "30min" to "D" (daily), the same snippet no longer works:
import pandas as pd
earliest = pd.Timestamp('2012-01-01')
latest = pd.Timestamp('2014-12-01')
dr = pd.date_range(start=earliest, end=latest, freq="D") # if freq is D, it does not work
df_freq = pd.DataFrame(index=dr, columns=['freq'])
df_freq = df_freq.fillna(0)
df_freq['2012-03-04']
In that case, I get the error KeyError: '2012-03-04'.
The workaround is easy: use df_freq.loc['2012-03-04'] for indexing instead.
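To make the workaround concrete, here is a minimal sketch that applies `.loc` to both frequencies. The helper `make_frame` and the shortened date range are my own illustration, not part of the original post. Note that the two lookups return different types, which may matter for downstream code.

```python
import pandas as pd


def make_frame(freq):
    # Hypothetical helper: zero-filled frame over a short range, for illustration only
    idx = pd.date_range(start="2012-03-01", end="2012-03-10", freq=freq)
    return pd.DataFrame({"freq": 0}, index=idx)


half_hourly = make_frame("30min")
daily = make_frame("D")

# With a 30-minute index, the date string selects every row on that day
rows = half_hourly.loc["2012-03-04"]   # DataFrame with 48 rows

# With a daily index, the same string matches exactly one label
row = daily.loc["2012-03-04"]          # Series holding that single row

# An explicit slice returns a DataFrame for either frequency
one_day = daily.loc["2012-03-04":"2012-03-04"]
```

If a DataFrame is needed regardless of the index frequency, the explicit slice in the last line seems like the safer form.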
However, I do not understand why the frequency of the index changes how I have to access the DataFrame.
Is this behavior documented anywhere?