I'm trying to update some columns of a DataFrame for the rows where a condition is met (only some rows will meet the condition).
I'm using apply with loc, and my function returns a pandas Series.
The problem is that the columns are updated with NaN.
Simplifying my problem, we can consider the following dataframe df_test:
  col1  col2  col3  col4
0    A     1     1     2
1    B     2     1     2
2    A     3     1     2
3    B     4     1     2
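For reproducibility, df_test can be built roughly as follows. This is a minimal sketch: the integer dtypes for col2, col3 and col4 are an assumption based on the printed values.

```python
import pandas as pd

# Rebuild the example frame; integer dtypes are assumed from the printed output.
df_test = pd.DataFrame({
    "col1": ["A", "B", "A", "B"],
    "col2": [1, 2, 3, 4],
    "col3": [1, 1, 1, 1],
    "col4": [2, 2, 2, 2],
})

print(df_test)
```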
I now want to update col3 and col4 for the rows where col1 == 'A'. For that I'll use the apply method:
df_test.loc[df_test['col1']=='A', ['col3', 'col4']] = df_test[df_test['col1']=='A'].apply(lambda row: pd.Series([10,20]), axis=1)
Doing that I get:
  col1  col2  col3  col4
0    A     1   NaN   NaN
1    B     2   1.0   2.0
2    A     3   NaN   NaN
3    B     4   1.0   2.0
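A quick check that may explain the NaN values: assigning a DataFrame through loc aligns on both row and column labels, so it helps to inspect the labels of what apply returns. The sketch below rebuilds the example frame (dtypes assumed) and prints those labels; the names mask and result are only illustrative.

```python
import pandas as pd

# Example frame rebuilt from the question; dtypes are an assumption.
df_test = pd.DataFrame({
    "col1": ["A", "B", "A", "B"],
    "col2": [1, 2, 3, 4],
    "col3": [1, 1, 1, 1],
    "col4": [2, 2, 2, 2],
})

mask = df_test["col1"] == "A"

# Same apply call as in the question, kept separate so it can be inspected.
result = df_test[mask].apply(lambda row: pd.Series([10, 20]), axis=1)

# The unlabeled Series produces default integer column labels.
print(list(result.columns))  # [0, 1]
print(list(result.index))    # [0, 2]
```

The row labels match the selected rows, but the column labels 0 and 1 do not match 'col3' and 'col4', so alignment finds nothing to assign for those columns.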
If, instead of pd.Series([10, 20]), I return np.array([10, 20]) or [10, 20], I get the following error:
ValueError: shape mismatch: value array of shape (2,2) could not be broadcast to indexing result of shape (2,)
What do I need to return to obtain the following?
  col1  col2  col3  col4
0    A     1    10    20
1    B     2     1     2
2    A     3    10    20
3    B     4     1     2
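Two possible ways to get this result, sketched under the assumption that the NaN values come from label alignment in the loc assignment: either label the returned Series with the target column names, or strip the labels with to_numpy() so the values are assigned by position. The names target_cols, df_labeled and df_array are illustrative only.

```python
import pandas as pd

# Example frame rebuilt from the question; dtypes are an assumption.
df_test = pd.DataFrame({
    "col1": ["A", "B", "A", "B"],
    "col2": [1, 2, 3, 4],
    "col3": [1, 1, 1, 1],
    "col4": [2, 2, 2, 2],
})

mask = df_test["col1"] == "A"
target_cols = ["col3", "col4"]

# Option 1: give the returned Series the target column names,
# so the apply result aligns with col3 and col4.
df_labeled = df_test.copy()
df_labeled.loc[mask, target_cols] = df_labeled[mask].apply(
    lambda row: pd.Series([10, 20], index=target_cols), axis=1
)

# Option 2: convert the apply result to a plain array,
# so the assignment is positional and no label alignment happens.
df_array = df_test.copy()
df_array.loc[mask, target_cols] = df_array[mask].apply(
    lambda row: pd.Series([10, 20]), axis=1
).to_numpy()

print(df_labeled)
print(df_array)
```

Option 1 keeps the assignment label-based, which is safer if the row order of the two sides could ever differ. Option 2 relies on the rows being in the same order on both sides.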
Thanks!