python pandas: computing argmax of column in matrix subset

Question

Consider toy dataframes df1 and df2, where df2 is a subset of df1 (excludes the first row).

import pandas as pd
import numpy as np

df1 = pd.DataFrame({'colA':[3.0,9,45,7],'colB':['A','B','C','D']})
df2 = df1[1:]

Now let's find the argmax of colA for each frame:

np.argmax(df1.colA) ## result is "2", which is what I expected
np.argmax(df2.colA) ## result is still "2", which is not what I expected.  I expected "1"

If my matrix of interest is df2, how do I get around this indexing issue? Is this quirk related to pandas, numpy, or just Python memory?

score 1 · Accepted Answer · answered Dec 11 '15 at 19:05

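I think it's due to the index: df2 keeps the original row labels (1, 2, 3) from df1, so the result you see is an index label rather than a position within df2. You could use reset_index when you assign df2: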

df1 = pd.DataFrame({'colA':[3.0,9,45,7],'colB':['A','B','C','D']})
df2 = df1[1:].reset_index(drop=True)

In [464]: np.argmax(df1.colA)
Out[464]: 2

In [465]: np.argmax(df2.colA)
Out[465]: 1

I think it's better to use the Series method argmax instead of np.argmax:

In [467]: df2.colA.argmax()
Out[467]: 1
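
To make the label-versus-position distinction explicit, here is a minimal sketch that asks for each one separately. It assumes a pandas version that provides Series.to_numpy (0.24 or later); the variable names are only illustrative. Which of the two values a bare argmax call reports has varied across pandas versions, so requesting the one you want by name is the safer route.

```python
import pandas as pd

df1 = pd.DataFrame({'colA': [3.0, 9, 45, 7], 'colB': ['A', 'B', 'C', 'D']})
df2 = df1[1:]  # slicing keeps the original row labels 1, 2, 3

# Index label of the maximum (the "2" seen in the question)
label_of_max = df2['colA'].idxmax()

# Position of the maximum within df2 (the "1" the question expected)
pos_of_max = int(df2['colA'].to_numpy().argmax())

# Use .loc with a label and .iloc with a position; both reach the same row
row_by_label = df2.loc[label_of_max]
row_by_pos = df2.iloc[pos_of_max]

print(df2.index.tolist(), label_of_max, pos_of_max)
```

With this approach df2 does not need its index reset; you just have to pair idxmax with .loc and a positional argmax with .iloc.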

score 0 · Answer 2 · answered Dec 11 '15 at 19:05

You need to reset the index of df2:

df2.reset_index(inplace=True, drop=True)
np.argmax(df2.colA)
>> 1
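
As a rough illustration of what drop=True is doing here, the sketch below compares reset_index with and without it, using the non-inplace form so the sliced frame itself is left alone. The names subset, kept and dropped are made up for this example.

```python
import pandas as pd

df1 = pd.DataFrame({'colA': [3.0, 9, 45, 7], 'colB': ['A', 'B', 'C', 'D']})
subset = df1[1:]

# Without drop=True the old labels are preserved in a new 'index' column
kept = subset.reset_index()

# With drop=True the old labels are discarded and rows are relabelled 0..n-1
dropped = subset.reset_index(drop=True)

print(kept.columns.tolist())     # ['index', 'colA', 'colB']
print(dropped.index.tolist())    # [0, 1, 2]
print(dropped['colA'].idxmax())  # 1, label and position now coincide
```

Keeping the old labels (the kept frame) can be handy if you later need to map a result back to the corresponding row in df1.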

DeepSpace