24

I have a table of data imported from a CSV file into a DataFrame.

The data contains around 10 categorical fields, one month column (in datetime format), and the rest are data series.

How do I convert the date column into an index across the column axis?

Garrett
MrHopko
  • [read_csv](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.io.parsers.read_csv.html) does this by itself if you call it with `index_col` and `parse_dates=True`. – behzad.nouri Apr 01 '13 at 23:16
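A minimal sketch of what that comment describes, assuming the date column is called `month`. The CSV text, the `region` and `sales` columns, and the values are made up for illustration and are not the asker's data:

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the real file.
csv_text = """month,region,sales
2013-01-01,north,10
2013-02-01,north,12
2013-03-01,north,9
"""

# index_col picks the column that becomes the index;
# parse_dates=True asks pandas to parse that index as dates.
df = pd.read_csv(io.StringIO(csv_text), index_col="month", parse_dates=True)

print(df.index)
```

With a real file you would pass the file path instead of the `io.StringIO` wrapper.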

3 Answers

37

You can use set_index:

df.set_index('month')

For example:

In [1]: from datetime import datetime

In [2]: import pandas as pd

In [3]: df = pd.DataFrame([[1, datetime(2011,1,1)], [2, datetime(2011,1,2)]], columns=['a', 'b'])

In [4]: df
Out[4]: 
   a                   b
0  1 2011-01-01 00:00:00
1  2 2011-01-02 00:00:00

In [5]: df.set_index('b')
Out[5]: 
            a
b            
2011-01-01  1
2011-01-02  2
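
Note that set_index returns a new DataFrame and leaves the original untouched, so the result has to be assigned somewhere. A short sketch using the same toy frame as above (the name `indexed` is only for illustration):

```python
from datetime import datetime

import pandas as pd

df = pd.DataFrame(
    [[1, datetime(2011, 1, 1)], [2, datetime(2011, 1, 2)]],
    columns=["a", "b"],
)

# set_index does not modify df; keep the returned frame.
indexed = df.set_index("b")

print(df.columns.tolist())   # 'b' is still an ordinary column of df
print(indexed.index.name)    # 'b' is the index of the new frame
```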
Andy Hayden
  • Thanks, Andy. Can I set the index so that the date is along the other axis? Will this group the values? – MrHopko Apr 01 '13 at 22:10
  • Could you give an example of the DataFrame you have and what you want? I'm not sure it makes sense to turn a date column into a column index... It won't group the values; it just makes one column the index. :) – Andy Hayden Apr 01 '13 at 22:15
  • I want the data arranged so that each column represents one month and each row in the table represents a different time series. Does that make sense? I would demonstrate with a table, but I have no idea how to insert a table on this website. – MrHopko Apr 02 '13 at 09:29
  • A related question would be: How do I transfer an item from one index axis to another within a hierarchical index? – MrHopko Apr 02 '13 at 10:45
  • Look at stack() and unstack() in the documentation. For more help, give more details. (There's no special trick to inserting a table, just use spaces.) – Dan Allan Apr 03 '13 at 00:29
  • Add inplace=True to set_index (or assign the result back to df); otherwise df itself is not modified. – Serhii Kushchenko Aug 05 '21 at 07:15
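
For the layout asked about in the comments (one column per month, one row per time series), one possible approach is set_index followed by unstack. This is only a sketch: it assumes the data is in long form with one value column, and the names `series`, `month`, and `value` and the numbers are invented for the example:

```python
import pandas as pd

# Hypothetical long-format data: one row per (series, month) pair.
long_df = pd.DataFrame(
    {
        "series": ["x", "x", "y", "y"],
        "month": pd.to_datetime(
            ["2013-01-01", "2013-02-01", "2013-01-01", "2013-02-01"]
        ),
        "value": [1.0, 2.0, 3.0, 4.0],
    }
)

# Put both keys in the index, then move the 'month' level
# from the row index to the column axis.
wide = long_df.set_index(["series", "month"])["value"].unstack("month")

print(wide)
```

If the frame has several categorical fields, they could all go into the list passed to set_index before unstacking the month level.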
7

I had a similar problem and solved it with reset_index. You can use either set_index or reset_index:

df_ind = df.set_index(['A', 'B'])

Or

df.reset_index(level=0, inplace=True)
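
A small sketch of how the two calls relate, assuming a toy frame with columns `A`, `B`, and `C` (the data is made up). It uses the non-inplace form of reset_index so each step can be inspected separately:

```python
import pandas as pd

df = pd.DataFrame({"A": ["x", "x", "y"], "B": [1, 2, 1], "C": [10, 20, 30]})

# Move columns A and B into a two-level row index.
df_ind = df.set_index(["A", "B"])

# Move only the first index level (A) back to a column; B stays as the index.
partial = df_ind.reset_index(level=0)

# Move every index level back to columns.
restored = df_ind.reset_index()

print(partial)
print(restored)
```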
aysebilgegunduz
3

If you don't know the name of the date column ahead of time and need to set the index automatically from whichever column holds datetimes (this needs import numpy as np):

df.set_index(df.select_dtypes(include=[np.datetime64]).columns.tolist())
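
A slightly more defensive sketch of the same idea, assuming the frame should have exactly one datetime column. The column names and the check are my additions, not part of the original one-liner:

```python
import numpy as np
import pandas as pd

# Toy frame; 'when' is a hypothetical name for the date column.
df = pd.DataFrame(
    {
        "when": pd.to_datetime(["2011-01-01", "2011-01-02"]),
        "a": [1, 2],
    }
)

# Collect the names of all datetime64 columns.
date_cols = df.select_dtypes(include=[np.datetime64]).columns.tolist()

# Assumption: exactly one datetime column is expected.
if len(date_cols) != 1:
    raise ValueError(f"expected one datetime column, found {date_cols}")

indexed = df.set_index(date_cols[0])
print(indexed)
```

This only finds columns that already have a datetime dtype. Dates that were read from a CSV as plain strings need to be parsed first, for example with pd.to_datetime or the parse_dates argument of read_csv.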
Shankar ARUL