Getting Pandas Key Error on Row although the key exists

Question

I have a dataframe which consists of the columns 'App', 'Query' and 'Label':

import pandas as pd

train_data = pd.DataFrame({
    'Query': ['shoerack', 'shoerack', 'shoerack', 'shoerack', 'nike shoes'],
    'App':   ['amazon', 'amazon', 'amazon', 'amazon', 'zalando'],
    'Label': [1, 1, 1, 1, 1]})

Now if I do a simple apply:

train_data.apply(lambda row: print(row['App']))

I get:

KeyError                                  Traceback (most recent call last)
Cell In[20], line 4
  1 train_data = pd.DataFrame({'Query': ['shoerack','shoerack','shoerack','shoerack', 'nike shoes'], 
  2                            'App': ['amazon', 'amazon', 'amazon', 'amazon', 'zalando'],
  3                            'Label': [1, 1, 1, 1, 1]})
 ----> 4 train_data.apply(lambda row: print(row['App']))
 
KeyError: 'App'

According to the question "How to apply a function on every row on a dataframe?", apply should work fine because it runs per row. Why do I get a KeyError if the key exists?

Out of curiosity, what are you trying to do? I almost never have to use `apply` on `axis=1` — mozway, Apr 19 '23 at 20:52
I am generating a negative sample based on the input line. Therefore I am calling a function per row and passing the query and the app as arguments. I just simplified it with print() to make it easy to reproduce. — s.blnc, Apr 19 '23 at 21:04
Thanks. Make sure to check whether your function can be vectorized. `apply` should always be avoided whenever possible. — mozway, Apr 19 '23 at 21:08

score 0 · Accepted Answer · answered Apr 19 '23 at 20:50

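Using apply with the default axis (axis=0) runs the function on each column, not on each row. Each column is passed as a Series indexed by the row labels (0 to 4 here). There is no 'App' label in that index, hence the KeyError.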

You need to use axis=1:

train_data.apply(lambda row: print(row['App']), axis=1)

Output:

# printed
amazon
amazon
amazon
amazon
zalando

# returned value
0    None
1    None
2    None
3    None
4    None
dtype: object
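
To see the difference between the two axes directly, here is a minimal sketch that checks which labels the function actually receives in each case. The variable names (has_app_by_column, has_app_by_row, apps) are only illustrative and not part of the original question.

```python
import pandas as pd

train_data = pd.DataFrame({
    'Query': ['shoerack', 'shoerack', 'shoerack', 'shoerack', 'nike shoes'],
    'App':   ['amazon', 'amazon', 'amazon', 'amazon', 'zalando'],
    'Label': [1, 1, 1, 1, 1]})

# Default axis=0: the function is called once per column. Each Series is
# indexed by the row labels (0..4), so 'App' is not a valid key.
has_app_by_column = train_data.apply(lambda s: 'App' in s.index)

# axis=1: the function is called once per row. Each Series is indexed by
# the column names, so row['App'] resolves.
has_app_by_row = train_data.apply(lambda s: 'App' in s.index, axis=1)

# Returning the value instead of printing it gives a Series of results.
apps = train_data.apply(lambda row: row['App'], axis=1)

print(has_app_by_column)
print(has_app_by_row)
print(apps)
```

With the default axis the check should be False for every column, and with axis=1 it should be True for every row. If only the column values are needed, train_data['App'] returns them without apply.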
