I am trying to subtract one row from another in a pandas DataFrame. Two descriptor columns precede a single numerical column, so I set the DataFrame's index to the two descriptor columns before doing the arithmetic.

When I do this, I get a KeyError naming whichever column is listed first in the set_index() call. In this case it is 'COL_A':

import pandas as pd

df = pd.DataFrame({'COL_A': ['A', 'A'],
                   'COL_B': ['B', 'B'],
                   'COL_C': [4, 2]})
df.set_index(['COL_A', 'COL_B'], inplace=True)
df.iloc[1] = (df.iloc[1] / df.iloc[0])
df.reset_index(inplace=True)


KeyError: 'COL_A'

I did not give this a second thought when I wrote it, and I cannot figure out why it ends in a KeyError.

Henry Ecker
KidMcC

2 Answers


I came upon this question while looking for a quick answer. Here is the solution I ended up with:

>>> import pandas as pd
>>> df = pd.DataFrame(data=[[5,5,5,5], [3,3,3,3]], index=['r1', 'r2'])
>>> df
    0  1  2  3
r1  5  5  5  5
r2  3  3  3  3
>>> df.loc['r3'] = df.loc['r1'] - df.loc['r2']
>>> df
    0  1  2  3
r1  5  5  5  5
r2  3  3  3  3
r3  2  2  2  2
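The same label-based subtraction should carry over to a two-level index, as long as each row has a unique key. The sketch below is only an illustration: the labels 'B1' and 'B2' are hypothetical and replace the duplicated 'B' values from the question, so that every row can be addressed by its own tuple.

```python
import pandas as pd

# Hypothetical data: COL_B values differ so each (COL_A, COL_B) key is unique.
df = pd.DataFrame({'COL_A': ['A', 'A'],
                   'COL_B': ['B1', 'B2'],
                   'COL_C': [4, 2]}).set_index(['COL_A', 'COL_B'])

# Select each row by its full index tuple, then subtract.
# The result is a Series indexed by the remaining column names.
diff = df.loc[('A', 'B2'), :] - df.loc[('A', 'B1'), :]
```

Here `diff['COL_C']` holds the difference between the two rows. With duplicated keys, as in the question, a tuple lookup would match more than one row, so positional access with iloc is the safer choice there.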
Nick Brady

Not sure I understand you correctly:

df = pd.DataFrame({'COL_A': ['A', 'A'],
                   'COL_B': ['B', 'B'],
                   'COL_C': [4, 2]})

gives:

  COL_A COL_B  COL_C
0     A     B      4
1     A     B      2

then

df.set_index(['COL_A', 'COL_B'], inplace=True)
df.iloc[1] = (df.iloc[1] / df.iloc[0])

yields:

             COL_C
COL_A COL_B       
A     B        4.0
      B        0.5

If you now want to subtract, say, row 0 from row 1, you can use:

df.iloc[1].subtract(df.iloc[0])

to get:

COL_C   -3.5
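
If the index round trip is only there to keep the descriptor columns out of the arithmetic, one possible alternative is to leave the index alone and select the numeric column directly. This is a minimal sketch on the question's original data, not necessarily what the asker needs; the variable names `numeric`, `row_diff`, and `step_diff` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'COL_A': ['A', 'A'],
                   'COL_B': ['B', 'B'],
                   'COL_C': [4, 2]})

# Keep only the numeric column(s); the descriptor columns stay untouched.
numeric = df[['COL_C']]

# Row 1 minus row 0, selected by position.
row_diff = numeric.iloc[1] - numeric.iloc[0]

# Row-over-row differences for the whole column.
# The first row has no predecessor, so its entry is NaN.
step_diff = df['COL_C'].diff()
```

Because set_index() and reset_index() are never called, the descriptor columns cannot go missing along the way.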
Stefan