3

I have a dataframe with datetime as index. How can I extract year and month from the index? Below is my dataframe.

            1. open   2. high    3. low  4. close   5. volume
date                                                         
2019-01-07   101.64  103.2681  100.9800    102.06  35656136.0
2019-01-08   103.04  103.9700  101.7134    102.80  31294058.0

apparently df["index"].dt.month or df["date"].dt.month doesnt work.

Mayank Porwal
  • 33,470
  • 8
  • 37
  • 58
Chidu Murthy
  • 688
  • 3
  • 10
  • 26
  • Maybe this could help ? https://stackoverflow.com/questions/30405413/python-pandas-extract-year-from-datetime-dfyear-dfdate-year-is-not/33757291 – Tony Jan 09 '19 at 12:58
  • yeah just get rid of the .dt in the middle? Pandas understands datetimes and date is already a datetime object if the type is datetime. If the type is string, you'll have to convert to a datetime to use the datetime attributes – Davtho1983 Jan 09 '19 at 13:03
  • When asking questions about pandas it is better to include all relevant code and the df.info() – Davtho1983 Jan 09 '19 at 13:05
  • Possible duplicate of [python pandas extract year](https://stackoverflow.com/q/30405413/9209546), see [this answer](https://stackoverflow.com/a/54113327/9209546). – jpp Jan 09 '19 at 15:26

3 Answers3

7

You can take below example, However you can have the details usage from Docs pandas.DatetimeIndex

Example DataFrame:

>>> df
                              name  age favorite_color  grade  birth_date
Willard Morris      Willard Morris   20           blue     88  01-02-1996
Al Jennings            Al Jennings   19            red     92  08-05-1997
Omar Mullins          Omar Mullins   22         yellow     95  04-28-1996
Spencer McDaniel  Spencer McDaniel   21          green     70  12-16-1995

1) To extract year:

>>> df['year'] = pd.DatetimeIndex(df['birth_date']).year
>>> df.head()
                              name  age favorite_color  grade  birth_date  year
Willard Morris      Willard Morris   20           blue     88  01-02-1996  1996
Al Jennings            Al Jennings   19            red     92  08-05-1997  1997
Omar Mullins          Omar Mullins   22         yellow     95  04-28-1996  1996
Spencer McDaniel  Spencer McDaniel   21          green     70  12-16-1995  1995

2) To extract month:

>>> df['month'] = pd.DatetimeIndex(df['birth_date']).month
>>> df.head()
                              name  age favorite_color  grade  birth_date  year  month
Willard Morris      Willard Morris   20           blue     88  01-02-1996  1996      1
Al Jennings            Al Jennings   19            red     92  08-05-1997  1997      8
Omar Mullins          Omar Mullins   22         yellow     95  04-28-1996  1996      4
Spencer McDaniel  Spencer McDaniel   21          green     70  12-16-1995  1995     12

3) To extract year_with_month:

>>> df['month_year'] = pd.to_datetime(df['birth_date']).dt.to_period('M')
>>> df
                              name  age favorite_color  grade  birth_date  year  month month_year
Willard Morris      Willard Morris   20           blue     88  01-02-1996  1996      1    1996-01
Al Jennings            Al Jennings   19            red     92  08-05-1997  1997      8    1997-08
Omar Mullins          Omar Mullins   22         yellow     95  04-28-1996  1996      4    1996-04
Spencer McDaniel  Spencer McDaniel   21          green     70  12-16-1995  1995     12    1995-12
Karn Kumar
  • 8,518
  • 3
  • 27
  • 53
6

Use DatetimeIndex.year and DatetimeIndex.month, dt is used for select column:

print (df.index)
            1. open   2. high    3. low  4. close   5. volume
date                                                         
2019-01-07   101.64  103.2681  100.9800    102.06  35656136.0
2019-01-08   103.04  103.9700  101.7134    102.80  31294058.0


df.index = pd.to_datetime(df.index)

y = df.index.year
m = df.index.month

print (y)
Int64Index([2019, 2019], dtype='int64', name='date')

print (m)
Int64Index([1, 1], dtype='int64', name='date')
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
1

You could use two methods to extract part of the data in a specific month.

df[df.index.month == 1]

or

df['2019-01']

boths will return only rows that you want.

I recommend this tutorial https://youtu.be/r0s4slGHwzE