I have a Pandas DataFrame whose rows and columns are both indexed by a DatetimeIndex.

import pandas as pd

data = pd.DataFrame(
    {
        "PERIOD_END_DATE": pd.date_range(start="2018-01", end="2018-04", freq="M"),
        "first": list("abc"),
        "second": list("efg")
    }
).set_index("PERIOD_END_DATE")

data.columns = pd.date_range(start="2018-01", end="2018-03", freq="M")
data

[image: the resulting DataFrame]

Unfortunately, I am getting a variety of errors when I try to pull out a value:

data['2018-01', '2018-02']       # InvalidIndexError: ('2018-01', '2018-02')
data['2018-01', ['2018-02']]     # InvalidIndexError: ('2018-01', ['2018-02'])
data.loc['2018-01', '2018-02']   # TypeError: only integer scalar arrays can be converted to a scalar index
data.loc['2018-01', ['2018-02']] # KeyError: "None of [Index(['2018-02'], dtype='object')] are in the [columns]" 

How do I extract a value from a DataFrame that uses a DatetimeIndex?

Nick Vence

4 Answers

There are 2 issues:

  1. Since you are using a DataFrame with a DatetimeIndex on both axes, the correct notation for selecting by row and column is:
a) data.loc[rows_index_name, [column__index_name]]

or

b) data.loc[rows_index_name, column__index_name]

depending on the type of output you want.

Notation (a) returns a Series, while notation (b) returns a scalar (here, a string).

  2. The index labels cannot be truncated; you must specify the whole date string.

As such, your issue will be resolved with:

data.loc['2018-01-31',['2018-01-31']] or data.loc['2018-01-31','2018-01-31']
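
To see the difference between the two notations, here is a minimal sketch. It assumes the same layout as the question's frame, but builds it from explicit dates rather than `pd.date_range`; the variable names `as_series` and `as_scalar` are only illustrative.

```python
import pandas as pd

# Same shape as the question's frame, built from explicit dates so the
# example does not depend on a frequency alias.
rows = pd.to_datetime(["2018-01-31", "2018-02-28", "2018-03-31"])
cols = pd.to_datetime(["2018-01-31", "2018-02-28"])
data = pd.DataFrame(
    [["a", "e"], ["b", "f"], ["c", "g"]],
    index=rows.rename("PERIOD_END_DATE"),
    columns=cols,
)

# (a) a list of column labels keeps the column axis: one-element Series
as_series = data.loc["2018-01-31", ["2018-01-31"]]

# (b) a single column label returns the cell value itself
as_scalar = data.loc["2018-01-31", "2018-01-31"]

print(type(as_series).__name__, type(as_scalar).__name__)
```

If the value itself is what you need, notation (b) is the shorter route; with notation (a) you would still have to take `as_series.iloc[0]`.
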
Kevin

Once the date is set as the index, you will not be able to slice or extract any data from it. You can extract the month and day from it while it is a regular column, but not once it is the index. I ran into this before, and that was the solution.

I kept it as a regular column, extracted the month, day and year into a separate column each, and then assigned the date column as the index.

SHENOOOO
  • You can slice the data; try `data.loc['2018-01', '2018-02':]`. Pandas likely has some rules about partial string indexing on both the index and the columns at once. – sammywemmy Nov 01 '22 at 22:27
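
The slicing expression from the comment can be tried with a short sketch like the one below. It assumes a frame with the same dates and values as the question's, built from explicit dates; the name `sliced` is only illustrative, and the behaviour may vary between pandas versions.

```python
import pandas as pd

rows = pd.to_datetime(["2018-01-31", "2018-02-28", "2018-03-31"])
cols = pd.to_datetime(["2018-01-31", "2018-02-28"])
data = pd.DataFrame(
    [["a", "e"], ["b", "f"], ["c", "g"]],
    index=rows,
    columns=cols,
)

# Partial string on the rows (all of January 2018) combined with an
# open-ended partial-string slice on the columns (February 2018 onwards).
sliced = data.loc["2018-01", "2018-02":]
print(sliced)
```

Because the column selector is a slice rather than a single partial string, the result stays a DataFrame, so the single value still has to be pulled out, for example with `sliced.iloc[0, 0]`.
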

You are indexing with a period (YYYY-MM) on columns that hold dates. Converting the columns to periods helps in this case:


data.columns = pd.period_range(start="2018-01", end="2018-02", freq='M')
data[['2018-01']] 


                  2018-01
PERIOD_END_DATE     
     2018-01-31     a
     2018-02-28     b
     2018-03-31     c

Naveed

Timestamp indexes are finicky. Pandas accepts each of the following expressions, but they return different types.

    data.loc['2018-01',['2018-01-31']]
    data.loc['2018-01-31',['2018-01-31']]
    data.loc['2018-01','2018-01-31']
    data.loc['2018-01-31','2018-01']
    data.loc['2018-01-31','2018-01-31']
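
One way to compare the return types is the small survey below. It is a sketch under assumptions: the frame is rebuilt from explicit dates with the question's values, `result_kind` is a hypothetical helper written for this example, and the reported types may differ between pandas versions, so the helper reports an error name instead of aborting.

```python
import pandas as pd

rows = pd.to_datetime(["2018-01-31", "2018-02-28", "2018-03-31"])
cols = pd.to_datetime(["2018-01-31", "2018-02-28"])
data = pd.DataFrame(
    [["a", "e"], ["b", "f"], ["c", "g"]],
    index=rows,
    columns=cols,
)

# The five (row, column) selector pairs listed above.
selectors = [
    ("2018-01", ["2018-01-31"]),
    ("2018-01-31", ["2018-01-31"]),
    ("2018-01", "2018-01-31"),
    ("2018-01-31", "2018-01"),
    ("2018-01-31", "2018-01-31"),
]


def result_kind(frame, row_key, col_key):
    """Return the type name of frame.loc[row_key, col_key], or the error name."""
    try:
        return type(frame.loc[row_key, col_key]).__name__
    except Exception as exc:  # behaviour may differ between pandas versions
        return f"raises {type(exc).__name__}"


kinds = [result_kind(data, r, c) for r, c in selectors]
for (r, c), kind in zip(selectors, kinds):
    print(f"data.loc[{r!r}, {c!r}] -> {kind}")
```

Only the last pair, with a full date on both axes, hands back the cell value directly; the others keep at least one axis.
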

Nick Vence