I'm trying to add a constant array to each row of a DataFrame using a lambda.

This is my DataFrame d:

import pandas as pd

d = pd.DataFrame()
d['x'] = pd.Series([1, 2, 3, 4, 5, 6])
d['y'] = pd.Series([11, 22, 33, 44, 55, 66])

A classical loop approach works as expected:

transformed = pd.DataFrame(columns=('x', 'y'))
for index, row in d.iterrows():
    transformed.loc[index] = [row['x'] + 5, row['y'] + 10]
print(transformed)

Produces:

    x   y
0   6  21
1   7  32
2   8  43
3   9  54
4  10  65
5  11  76

This is the lambda version:

print(d.apply(lambda x: x + [5, 10]))

However, it raises the error: ValueError: operands could not be broadcast together with shapes (6,) (2,)

After reading the pandas documentation, I understood that my lambda approach should work. Why doesn't it?

Lourenco

2 Answers

If the number of columns equals the length of the list, the simplest approach is:

print(d + [5, 10])
    x   y
0   6  21
1   7  32
2   8  43
3   9  54
4  10  65
5  11  76
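
For what it's worth, here is a small sketch of how the list gets matched to the columns, rebuilding the question's `d`. The `offsets` Series and the `by_*` variable names are made up for illustration; as far as I know, a plain list is matched by position while a Series is matched by column label.

```python
import pandas as pd

# Same data as in the question.
d = pd.DataFrame({'x': [1, 2, 3, 4, 5, 6], 'y': [11, 22, 33, 44, 55, 66]})

# A plain list is matched to the columns by position: 5 -> 'x', 10 -> 'y'.
by_position = d + [5, 10]

# The method form of the same operation, with the axis spelled out.
by_method = d.add([5, 10], axis='columns')

# A Series (hypothetical name) is matched to the columns by label,
# so the order of its entries should not matter.
offsets = pd.Series({'y': 10, 'x': 5})
by_label = d + offsets

print(by_position)
print(by_method)
print(by_label)
```

The label-based form may be the safer choice if the column order of the real DataFrame is not guaranteed.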

If the DataFrame has more columns, select the ones you need with a list; the list of columns and the list of values must have the same length:

print(d[['x','y']] + [5, 10])
jezrael
  • Thanks for your prompt response. I've accepted U12-forward's response because he/she touched on the main problem I faced: setting the `axis=1` argument in `apply`. Even though your response is valid in a broader sense, it is infeasible for me to use `d + [5, 10]` because I removed a lot of context from the real problem I was solving. Again, thank you for your parsimonious response. – Lourenco Oct 20 '21 at 19:09
  • @Lourenco Thanks. apply is not recommended when a vectorized alternative exists, as it does here, because it is slow: it loops under the hood. It is widely used, though; people like it because they think it will be faster, which unfortunately it is not. – jezrael Oct 20 '21 at 19:50

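apply works column-wise by default, because its axis argument defaults to 0. Your lambda therefore receives each whole column (6 values) and tries to add a 2-element list to it, which is where the shapes (6,) (2,) in the error come from.

You need to specify axis=1 so that it operates row-wise: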

>>> d.apply(lambda x: x + [5, 10], axis=1)
    x   y
0   6  21
1   7  32
2   8  43
3   9  54
4  10  65
5  11  76
>>> 
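
If it helps, here is a quick sketch for checking what the function actually receives under each axis setting. It rebuilds the question's `d`; `len` stands in for the lambda purely for illustration, and the variable names are my own.

```python
import pandas as pd

# Same data as in the question.
d = pd.DataFrame({'x': [1, 2, 3, 4, 5, 6], 'y': [11, 22, 33, 44, 55, 66]})

# axis=0 (the default): the function is called once per column,
# so each call sees a Series with 6 values.
col_lengths = d.apply(len)
print(col_lengths.tolist())  # [6, 6]

# axis=1: the function is called once per row,
# so each call sees a Series with 2 values (one per column).
row_lengths = d.apply(len, axis=1)
print(row_lengths.tolist())  # [2, 2, 2, 2, 2, 2]

# With axis=1 the 2-element list lines up with the 2-element row.
fixed = d.apply(lambda row: row + [5, 10], axis=1)
print(fixed)
```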

But tbh in this situation you don't need apply anyway:

>>> d + [5, 10]
    x   y
0   6  21
1   7  32
2   8  43
3   9  54
4  10  65
5  11  76
>>> 
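
If you want to see how much the two approaches differ in speed, a rough sketch with `timeit` could look like this. The 1,000-row frame `big` and the helper function names are assumptions for illustration, not part of the question, and the numbers will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical larger frame so the difference is measurable.
big = pd.DataFrame({'x': np.arange(1000), 'y': np.arange(1000) * 11})


def with_apply():
    # Row-wise apply: the lambda is called once per row.
    return big.apply(lambda row: row + [5, 10], axis=1)


def vectorized():
    # Plain arithmetic on the whole frame at once.
    return big + [5, 10]


# Time a few repetitions of each approach.
t_apply = timeit.timeit(with_apply, number=3)
t_vec = timeit.timeit(vectorized, number=3)
print(f"apply: {t_apply:.4f}s, vectorized: {t_vec:.4f}s")
```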
U13-Forward