
I'm working with CRYPTO_INTRADAY timeseries data from the AlphaVantage API. JSON format:

{'Meta Data': {'1. Information': 'Crypto Intraday (15min) Time Series', '2. Digital Currency Code': 'ZEC', '3. Digital Currency Name': 'Zcash', '4. Market Code': 'USD', '5. Market Name': 'United States Dollar', '6. Last Refreshed': '2022-02-15 18:15:00', '7. Interval': '15min', '8. Output Size': 'Full size', '9. Time Zone': 'UTC'}, 'Time Series Crypto (15min)': {'2022-02-15 18:15:00': {'1. open': '125.60000', '2. high': '126.00000', '3. low': '125.60000', '4. close': '125.90000', '5. volume': 711}, '2022-02-15 18:00:00': {'1. open': '125.70000', '2. high': '125.90000', '3. low': '125.50000', '4. close': '125.60000', '5. volume': 1059}, '2022-02-15 17:45:00': {'1. open': '125.80000', '2. high': '125.90000', '3. low': '125.50000', '4. close': '125.70000', '5. volume': 857}, '2022-02-15 17:30:00': {'1. open': '125.30000', '2. high': '126.20000', '3. low': '125.30000', '4. close': '125.70000', '5. volume': 1106}, '2022-02-15 17:15:00': {'1. open': '125.90000', '2. high': '126.10000', '3. low': '125.30000', '4. close': '125.30000', '5. vol

After reviewing the various functions for selecting a date range in a DataFrame, pandas.DataFrame.between_time seemed like the best option.

However, that function returns an error message:

TypeError: Index must be DatetimeIndex

To convert my index into a DatetimeIndex, I first checked its name:

print(df.index.name)

Which returns:

timeStamp

Then I ran:

df["timeStamp"]=pd.to_datetime(df["timeStamp"])

But that gives a long error message ending with:

KeyError: 'timeStamp'

So I'm wondering if I'm going about creating a DatetimeIndex correctly, or if pandas.DataFrame.between_time is even the right way to go about selecting for a date range within a DataFrame.

Thank you for any guidance on this.

dsx
  • `df.index = pd.to_datetime(df.index)`. – Quang Hoang Feb 15 '22 at 19:09
  • Thx @QuangHoang , that worked for making the object into datetime64[ns]. However, now when I run the `between_time` function: `start_time = '2022-02-10 00:00:00'` `end_time = '2022-02-11 00:00:00'` `df.between_time(start_time, end_time, include_start=True, include_end=True, axis=None)` I get the following error: `ValueError: Cannot convert arg ['2022-02-10 00:00:00'] to a time` If that's the DatetimeIndex format, don't see what other format the argument should be in? – dsx Feb 15 '22 at 19:32
  • `between_time` only works for *time*, not *datetime*. You can use usual comparison `('2022-02-10' <= df.index) & (df.index <= '2022-02-14')`, or use `pd.Series.between`: `df.index.to_series().between('2022-02-10', '2022-02-14')`. – Quang Hoang Feb 15 '22 at 20:39
  • Thank you @QuangHoang. I used the 'usual comparison' approach, which returned a list of true/false values. I understand what's happening there, but not how to get the data back into a dataframe view showing only the rows in which the dates are within the target range. Do you know what section of the pandas documentation would cover that? – dsx Feb 15 '22 at 22:16
  • Boolean indexing: `df[that_series]` – Quang Hoang Feb 15 '22 at 22:17
  • @QuangHoang Thx. Any idea what I'm doing wrong with this? `df['5. volume']('2022-02-10' <= df.index) & (df.index <= '2022-02-11')`. I'm getting error message `TypeError: 'Series' object is not callable`. The Series object would be the `index`, correct? I also tried placing `df['5. volume']('2022-02-10' <= df.index) & (df.index <= '2022-02-11')` before `df.index = pd.to_datetime(df.index)`, but received the same error message. (`5. volume` is one of my column names.) – dsx Feb 16 '22 at 01:23
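
The steps suggested in the comments can be put together roughly as follows. This is only a minimal sketch: the sample rows, their timestamps, and the variable names `mask`, `in_range`, `volume_in_range`, and `morning` are made up for illustration, and it assumes the timestamps sit in the index (named `timeStamp`) as strings, as the question describes. The `'Series' object is not callable` error in the last comment most likely comes from writing `df['5. volume'](...)` with parentheses, which calls the column like a function; indexing with square brackets or `.loc` avoids that.

```python
import pandas as pd

# Hypothetical sample mirroring the question's layout: timestamp strings as the
# index (named "timeStamp") and a "5. volume" column. Values are illustrative.
df = pd.DataFrame(
    {"5. volume": [711, 1059, 857, 1106]},
    index=pd.Index(
        [
            "2022-02-15 18:15:00",
            "2022-02-11 12:00:00",
            "2022-02-10 09:30:00",
            "2022-02-09 23:45:00",
        ],
        name="timeStamp",
    ),
)

# "timeStamp" is the index name, not a column, so df["timeStamp"] raises
# KeyError. Convert the index itself instead.
df.index = pd.to_datetime(df.index)

# Boolean mask over the index. A bare date string such as "2022-02-12" is
# interpreted as midnight at the start of that day.
mask = (df.index >= "2022-02-10") & (df.index <= "2022-02-12")

# Boolean indexing keeps only the rows where the mask is True.
in_range = df[mask]

# To get a single column for those rows, index with .loc (square brackets),
# not with parentheses.
volume_in_range = df.loc[mask, "5. volume"]

# between_time filters on time of day only, regardless of the date.
morning = df.between_time("09:00", "12:00")

print(in_range)
print(volume_in_range)
print(morning)
```

With a DatetimeIndex in place, `between_time` is still useful for questions like "all bars between 09:00 and 12:00 on any day", while the comparison mask handles date ranges.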
