I'm having difficulty accessing rows of a pandas DataFrame through its DatetimeIndex. I have created a DataFrame with a DatetimeIndex and a single column of zeros:

import pandas as pd
earliest = pd.Timestamp('2012-01-01 06:00:00')
latest = pd.Timestamp('2014-12-01 23:00:00')
dr = pd.date_range(start=earliest, end=latest, freq="30min")
df_freq = pd.DataFrame(index=dr, columns=['freq'])
df_freq = df_freq.fillna(0)

I can reference the dataframe using a date formatted as a string:

df_freq['2012-03-04']

gives

                         freq
2012-03-04 00:00:00     0
2012-03-04 00:30:00     0
2012-03-04 01:00:00     0
2012-03-04 01:30:00     0
2012-03-04 02:00:00     0
2012-03-04 02:30:00     0
2012-03-04 03:00:00     0
2012-03-04 03:30:00     0
2012-03-04 04:00:00     0
2012-03-04 04:30:00     0
2012-03-04 05:00:00     0
2012-03-04 05:30:00     0
2012-03-04 06:00:00     0
2012-03-04 06:30:00     0
2012-03-04 07:00:00     0
2012-03-04 07:30:00     0
2012-03-04 08:00:00     0
2012-03-04 08:30:00     0
2012-03-04 09:00:00     0
2012-03-04 09:30:00     0
2012-03-04 10:00:00     0
2012-03-04 10:30:00     0
2012-03-04 11:00:00     0
2012-03-04 11:30:00     0
2012-03-04 12:00:00     0
2012-03-04 12:30:00     0
2012-03-04 13:00:00     0
2012-03-04 13:30:00     0
2012-03-04 14:00:00     0
2012-03-04 14:30:00     0
2012-03-04 15:00:00     0
2012-03-04 15:30:00     0
2012-03-04 16:00:00     0
2012-03-04 16:30:00     0
2012-03-04 17:00:00     0
2012-03-04 17:30:00     0
2012-03-04 18:00:00     0
2012-03-04 18:30:00     0
2012-03-04 19:00:00     0
2012-03-04 19:30:00     0
2012-03-04 20:00:00     0
2012-03-04 20:30:00     0
2012-03-04 21:00:00     0
2012-03-04 21:30:00     0
2012-03-04 22:00:00     0
2012-03-04 22:30:00     0
2012-03-04 23:00:00     0
2012-03-04 23:30:00     0

but if I reference a specific datetime, I get an error:

df_freq['2012-03-04 21:00:00']

gives

    Traceback (most recent call last):
      File "...\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\indexes\base.py", line 1945, in get_loc
        return self._engine.get_loc(key)
      File "pandas\index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas\index.c:4154)
      File "pandas\index.pyx", line 159, in pandas.index.IndexEngine.get_loc (pandas\index.c:4018)
      File "pandas\hashtable.pyx", line 675, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12368)
      File "pandas\hashtable.pyx", line 683, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12322)
    KeyError: '2012-03-04 21:00:00'
    During handling of the above exception, another exception occurred:
    Traceback (most recent call last):
      File "...AppData\Local\Continuum\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code
        exec(code_obj, self.user_global_ns, self.user_ns)
      File "<ipython-input-31-d748d9ec4f91>", line 1, in <module>
        df_freq['2012-03-04 21:00:00']
      File "...AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\frame.py", line 1997, in __getitem__
        return self._getitem_column(key)
      File "...AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2004, in _getitem_column
        return self._get_item_cache(key)
      File "...\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\generic.py", line 1350, in _get_item_cache
        values = self._data.get(item)
      File "...\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\internals.py", line 3290, in get
        loc = self.items.get_loc(item)
      File "...\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\indexes\base.py", line 1947, in get_loc
        return self._engine.get_loc(self._maybe_cast_indexer(key))
      File "pandas\index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas\index.c:4154)
      File "pandas\index.pyx", line 159, in pandas.index.IndexEngine.get_loc (pandas\index.c:4018)
      File "pandas\hashtable.pyx", line 675, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12368)
      File "pandas\hashtable.pyx", line 683, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12322)
    KeyError: '2012-03-04 21:00:00'

Also, I don't understand why I can't reference the dataframe using a Timestamp object rather than a string:

    ts=pd.Timestamp('2012-03-04')
    df_freq[ts]

gives this error:

Traceback (most recent call last):
  File "...AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\indexes\base.py", line 1945, in get_loc
    return self._engine.get_loc(key)
  File "pandas\index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas\index.c:4154)
  File "pandas\index.pyx", line 159, in pandas.index.IndexEngine.get_loc (pandas\index.c:4018)
  File "pandas\hashtable.pyx", line 675, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12368)
  File "pandas\hashtable.pyx", line 683, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12322)
KeyError: Timestamp('2012-03-04 00:00:00')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "...AppData\Local\Continuum\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-37-29e033bc2394>", line 1, in <module>
    df_freq[ts]
  File "...AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\frame.py", line 1997, in __getitem__
    return self._getitem_column(key)
  File "...AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2004, in _getitem_column
    return self._get_item_cache(key)
  File "...AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\generic.py", line 1350, in _get_item_cache
    values = self._data.get(item)
  File "...AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\internals.py", line 3290, in get
    loc = self.items.get_loc(item)
  File "...AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\indexes\base.py", line 1947, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas\index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas\index.c:4154)
  File "pandas\index.pyx", line 159, in pandas.index.IndexEngine.get_loc (pandas\index.c:4018)
  File "pandas\hashtable.pyx", line 675, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12368)
  File "pandas\hashtable.pyx", line 683, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12322)
KeyError: Timestamp('2012-03-04 00:00:00')
doctorer

1 Answer


You need loc, which selects rows by index label:

print (df_freq.loc['2012-03-04 21:00:00'])
freq    0
Name: 2012-03-04 21:00:00, dtype: int64
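
To make the difference concrete, here is a small sketch that rebuilds the frame from the question and compares plain [] with loc. The tracebacks in the question go through _getitem_column, which suggests the key is being looked up as a column name; that reading, and the variable names column_lookup_failed, row_from_string and row_from_timestamp, are my own additions for illustration.

```python
import pandas as pd

# Same index as in the question: half-hourly steps between the two timestamps.
dr = pd.date_range(start="2012-01-01 06:00:00", end="2014-12-01 23:00:00", freq="30min")
df_freq = pd.DataFrame({"freq": 0}, index=dr)

# Plain [] with a full datetime string ends up as a column lookup.
# There is no column with that name, so a KeyError is raised.
try:
    df_freq["2012-03-04 21:00:00"]
    column_lookup_failed = False
except KeyError:
    column_lookup_failed = True

# .loc looks the key up in the row index instead.
row_from_string = df_freq.loc["2012-03-04 21:00:00"]

# A Timestamp object works with .loc too; it is matched exactly,
# so this returns the single row at 2012-03-04 00:00:00.
ts = pd.Timestamp("2012-03-04")
row_from_timestamp = df_freq.loc[ts]

print(column_lookup_failed)
print(row_from_string)
print(row_from_timestamp.name)
```

Note that the Timestamp is matched exactly, so df_freq.loc[ts] returns one row rather than the whole day that the string '2012-03-04' selects.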

The first snippet works because it uses DatetimeIndex partial string indexing:

print (df_freq['2012-03-04'])
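
Partial string indexing can also be written with loc, which is probably the safer spelling: as far as I know, recent pandas releases no longer accept a single date string in plain [] for row selection. The sketch below rebuilds the frame from the question; the names one_day and evening are mine, chosen only for illustration.

```python
import pandas as pd

# Same index as in the question: half-hourly steps between the two timestamps.
dr = pd.date_range(start="2012-01-01 06:00:00", end="2014-12-01 23:00:00", freq="30min")
df_freq = pd.DataFrame({"freq": 0}, index=dr)

# A date-only string is coarser than the index, so it selects every row on that day.
one_day = df_freq.loc["2012-03-04"]

# String slices work as well, and both endpoints are included.
evening = df_freq.loc["2012-03-04 21:00":"2012-03-04 22:00"]

print(len(one_day))                            # 48 half-hour rows
print(list(evening.index.strftime("%H:%M")))   # ['21:00', '21:30', '22:00']
```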
jezrael