fill NA of a column with elements of another column

Question

I'm in this situation: my DataFrame looks like this:

    A   B   
0   0.0 2.0 
1   3.0 4.0 
2   NaN 1.0 
3   2.0 NaN 
4   NaN 1.0 
5   4.8 NaN 
6   NaN 1.0

and I want to apply this line of code: df['A'] = df['B'].fillna(df['A'])

I expect the intermediate result and the final output to look like this:

    A   B   
0   2.0 2.0 
1   4.0 4.0 
2   1.0 1.0 
3   NaN NaN 
4   1.0 1.0 
5   NaN NaN 
6   1.0 1.0 

    A   B   
0   2.0 2.0 
1   4.0 4.0 
2   1.0 1.0 
3   2.0 NaN 
4   1.0 1.0 
5   4.8 NaN 
6   1.0 1.0

but I receive this error:

TypeError: Unsupported type Series

Probably because, each time there is an NA, it tries to fill it with the whole Series rather than with the single element at the same index as the missing value in column B.

I get the same error with a line like df['C'] = df['B'].fillna(df['A']), so the problem does not seem to be that I first overwrite the values of A with those of B and then try to fill the NAs in B from a column that is technically the same as B.

I'm in a Databricks environment and I'm working with Koalas DataFrames, but they work like the pandas ones. Can you help me?
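
For reference, here is a minimal sketch of what the same line does in plain pandas, where `fillna` accepts a Series and matches values by index label instead of filling with the whole Series. The data is the frame from the question. That Koalas rejects a Series argument here is inferred only from the error message above, so treat that part as an assumption.

```python
import numpy as np
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    "A": [0.0, 3.0, np.nan, 2.0, np.nan, 4.8, np.nan],
    "B": [2.0, 4.0, 1.0, np.nan, 1.0, np.nan, 1.0],
})

# In pandas, fillna with a Series aligns on the index: each NaN in B
# is replaced by the value of A at the same row label.
filled = df["B"].fillna(df["A"])
df["A"] = filled
print(df)
```

Column B keeps its NaNs; only A receives the combined values.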

see https://stackoverflow.com/help/someone-answers#:~:text=Choose%20one%20answer%20that%20you,the%20answer%2C%20at%20any%20time. — Anurag Dabas, Jul 30 '21 at 16:44

score 0 · Answer 1 · answered Jul 29 '21 at 14:05

IIUC, try with max():

df['A']=df[['A','B']].max(axis=1)

output of df:

    A       B
0   2.0     2.0
1   4.0     4.0
2   1.0     1.0
3   2.0     NaN
4   1.0     1.0
5   4.8     NaN
6   1.0     1.0
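
One caveat worth checking: `max(axis=1)` matches the expected output here only because B is at least as large as A in every row where both are present. A small sketch with a hypothetical frame (not from the question) shows where the two approaches diverge:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: in row 0, A is larger than B.
df = pd.DataFrame({"A": [5.0, np.nan, 2.0], "B": [2.0, 1.0, np.nan]})

# B wins whenever it is present; A is only a fallback.
prefer_b = df["B"].fillna(df["A"])

# The larger value wins; NaN is skipped by default.
row_max = df[["A", "B"]].max(axis=1)

print(prefer_b.tolist())
print(row_max.tolist())
```

If A can exceed B in your real data, the row-wise maximum will not reproduce the "B first, then A" rule.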

Anurag Dabas

score 0 · Accepted Answer · answered Jul 29 '21 at 14:25

Another option.

Suppose you have the following dataset:

import pandas as pd
import numpy as np

df = pd.DataFrame(data={'State':[1,2,3,4,5,6, 7, 8, 9, 10], 
                         'Sno Center': ["Guntur", "Nellore", "Visakhapatnam", "Biswanath", "Doom-Dooma", "Guntur", "Labac-Silchar", "Numaligarh", "Sibsagar", "Munger-Jamalpu"], 
                         'Mar-21': [121, 118.8, 131.6, 123.7, 127.8, 125.9, 114.2, 114.2, 117.7, 117.7],
                         # np.nan (lowercase) is used because the np.NaN alias was removed in NumPy 2.0
                         'Apr-21': [121.1, 118.3, 131.5, np.nan, 128.2, 128.2, 115.4, 115.1, np.nan, 118.3]})

df
    State  Sno Center      Mar-21  Apr-21
0   1      Guntur          121.0   121.1
1   2      Nellore         118.8   118.3
2   3      Visakhapatnam   131.6   131.5
3   4      Biswanath       123.7   NaN
4   5      Doom-Dooma      127.8   128.2
5   6      Guntur          125.9   128.2
6   7      Labac-Silchar   114.2   115.4
7   8      Numaligarh      114.2   115.1
8   9      Sibsagar        117.7   NaN
9   10     Munger-Jamalpu  117.7   118.3

Then fill the missing Apr-21 values from Mar-21, only in rows where Mar-21 is present:

df.loc[(df["Mar-21"].notnull()) & (df["Apr-21"].isna()), "Apr-21"] = df["Mar-21"]

df
    State  Sno Center      Mar-21  Apr-21
0   1      Guntur          121.0   121.1
1   2      Nellore         118.8   118.3
2   3      Visakhapatnam   131.6   131.5
3   4      Biswanath       123.7   123.7
4   5      Doom-Dooma      127.8   128.2
5   6      Guntur          125.9   128.2
6   7      Labac-Silchar   114.2   115.4
7   8      Numaligarh      114.2   115.1
8   9      Sibsagar        117.7   117.7
9   10     Munger-Jamalpu  117.7   118.3
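
The same rule (keep Apr-21, fall back to Mar-21 where it is missing) can also be sketched with `Series.combine_first`, shown here in plain pandas on a shortened, hypothetical three-row version of the frame above. Whether your Koalas version offers an equivalent method is an assumption to verify against its documentation.

```python
import numpy as np
import pandas as pd

# Shortened, hypothetical subset of the dataset above.
df = pd.DataFrame({
    "Mar-21": [121.0, 123.7, 117.7],
    "Apr-21": [121.1, np.nan, np.nan],
})

# combine_first keeps the caller's values and fills its NaNs from the
# other Series, matching rows by index label.
df["Apr-21"] = df["Apr-21"].combine_first(df["Mar-21"])
print(df)
```

Unlike the boolean-mask assignment, this needs no explicit condition, since rows where both columns are NaN simply stay NaN.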
