In a multi index
pandas dataframe I want to access the last element of the second index for all values of the first index. The number of levels in the second index vary depending on the value of the first index. I went through the pandas multi index documentation but could not find anything that does that.
For example, for the data frame below:
arrays = [ ['bar', 'bar', 'baz', 'foo', 'foo', 'foo', 'qux'],
['one', 'two', 'one', 'one', 'two', 'three', 'one']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
df = pd.DataFrame(np.random.randn(7, 3), index=index, columns=['A', 'B', 'C'])
df
A B C
first second
bar one 0.289163 -0.464633 -0.060487
two 0.224442 0.177609 2.156436
baz one -0.262329 -0.248384 0.925580
foo one 0.051350 0.452014 0.206809
two 2.757255 -0.739196 0.183735
three -0.064909 -0.963130 1.364771
qux one -1.330857 1.881588 -0.262170
I want to get:
A B C
first second
bar two 0.224442 0.177609 2.156436
baz one -0.262329 -0.248384 0.925580
foo three -0.064909 -0.963130 1.364771
qux one -1.330857 1.881588 -0.262170
The dataframes
I am working with have over 10M
lines so I want to avoid explicit looping.