Questions tagged [multi-index]

A multi-index (also known as a hierarchical index) allows the manipulation of higher-dimensional data in a 2-dimensional tabular structure.

A multi-index exists when an indexing column (or row) of a tabular structure contains more than one level. Layered indexing makes it possible to store and manipulate data of arbitrarily high dimension in a 2-dimensional tabular structure. This is also called "hierarchical" or "multi-level" indexing.
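
As a minimal sketch of the idea in pandas (the library most questions under this tag concern), the example below builds a two-level row index and shows how the extra dimension can be selected on or unfolded into columns. The data and the names `year`, `quarter`, and `sales` are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical data: one value per (year, quarter) pair.
index = pd.MultiIndex.from_tuples(
    [(2020, "Q1"), (2020, "Q2"), (2021, "Q1"), (2021, "Q2")],
    names=["year", "quarter"],
)
df = pd.DataFrame({"sales": [10, 12, 15, 11]}, index=index)

# Select every row under one outer-level key; the result is indexed by quarter.
sales_2020 = df.loc[2020]

# Take a cross-section on the inner level; the result is indexed by year.
q1 = df.xs("Q1", level="quarter")

# Unfold the inner level into columns to get a plain year-by-quarter table.
wide = df["sales"].unstack("quarter")
```

Here `df.loc`, `DataFrame.xs`, and `Series.unstack` are existing pandas calls; which one fits depends on whether the goal is to filter on a level or to reshape it away.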

2771 questions
82
votes
4 answers

How to remove levels from a multi-indexed dataframe?

For example, I have: In [1]: df = pd.DataFrame([8, 9], index=pd.MultiIndex.from_tuples([(1, 1, 1), (1, 3, 2)]), columns=['A']) In [2]…
Yariv
74
votes
4 answers

Using .loc with a MultiIndex in pandas

Does anyone know if it is possible to use the DataFrame.loc method to select from a MultiIndex? I have the following DataFrame and would like to be able to access the values located in the Dwell columns, at the indices of ('at', 1), ('at', 3),…
kronosapiens
73
votes
4 answers

How to move pandas data from index to column after multiple groupby

I have the following pandas dataframe: token year uses books 386 xanthos 1830 3 3 387 xanthos 1840 1 1 388 xanthos 1840 2 2 389 xanthos 1868 2 2 390 xanthos 1875 1 1 I aggregate the…
prooffreader
69
votes
3 answers

Pandas: Modify a particular level of Multiindex

I have a dataframe with Multiindex and would like to modify one particular level of the Multiindex. For instance, the first level might be strings and I may want to remove the white spaces from that index level: df.index.levels[1] = [x.replace(' ',…
user1642513
66
votes
4 answers

pandas: how to run a pivot with a multi-index?

I would like to run a pivot on a pandas DataFrame, with the index being two columns, not one. For example, one field for the year, one for the month, an 'item' field which shows 'item 1' and 'item 2' and a 'value' field with numerical values. I want…
Pythonista anonymous
66
votes
13 answers

Selecting columns from pandas MultiIndex

I have a DataFrame with MultiIndex columns that looks like this: # sample data col = pd.MultiIndex.from_arrays([['one', 'one', 'one', 'two', 'two', 'two'], ['a', 'b', 'c', 'a', 'b', 'c']]) data =…
metakermit
65
votes
2 answers

Set MultiIndex of an existing DataFrame in pandas

I have a DataFrame that looks like Emp1 Empl2 date Company 0 0 0 2012-05-01 apple 1 0 1 2012-05-29 apple 2 0 1 2013-05-02 apple 3 0 1 2013-11-22 …
user3527975
49
votes
4 answers

Creating an empty MultiIndex

I would like to create an empty DataFrame with a MultiIndex before assigning rows to it. I already found that empty DataFrames don't like to be assigned MultiIndexes on the fly, so I'm setting the MultiIndex names during creation. However, I don't…
dmvianna
49
votes
2 answers

Summing over a multiindex level in a pandas series

I would like to sum (marginalize) over one level in a series with a 3-level multiindex to produce a series with a 2 level multiindex. For example, if I have the following: ind = [tuple(x) for x in ['ABC', 'ABc', 'AbC', 'Abc', 'aBC', 'aBc', 'abC',…
dylkot
48
votes
4 answers

Pandas groupby(),agg() - how to return results without the multi index?

I have a dataframe: pe_odds[ [ 'EVENT_ID', 'SELECTION_ID', 'ODDS' ] ] Out[67]: EVENT_ID SELECTION_ID ODDS 0 100429300 5297529 18.00 1 100429300 5297529 20.00 2 100429300 5297529 21.00 3 100429300 5297529 …
Ginger
48
votes
2 answers

How to do group by on a multiindex in pandas?

Below is my dataframe. I made some transformations to create the category column and dropped the original column it was derived from. Now I need to do a group-by to remove the dups e.g. Love and Fashion can be rolled up via a groupby…
Tampa
46
votes
1 answer

reading excel sheet as multiindex dataframe through pd.read_excel()

I'm struggling to read an Excel sheet with pd.read_excel(). My Excel table looks like this in its raw form: I expected the dataframe to look like this: bar baz foo one two …
Boosted_d16
46
votes
4 answers

Change timezone of date-time column in pandas and add as hierarchical index

I have data with a time-stamp in UTC. I'd like to convert the timezone of this timestamp to 'US/Pacific' and add it as a hierarchical index to a pandas DataFrame. I've been able to convert the timestamp as an Index, but it loses the timezone…
Erik Shilts
44
votes
2 answers

Read Excel sheet with multiple headers using pandas

I have an Excel sheet with multiple headers like: _________________________________________________________________________ ____|_____| Header1 | Header2 | Header3 | ColX|ColY…
muazfaiz
43
votes
4 answers

Benefits of pandas' MultiIndex?

So I learned that I can use DataFrame.groupby without having a MultiIndex to do subsampling/cross-sections. On the other hand, when I have a MultiIndex on a DataFrame, I still need to use DataFrame.groupby to do sub-sampling/cross-sections. So what…
K.-Michael Aye