15

I have a pandas dataframe like the following:

    Col1  Col2  Col3
0   A     7     NaN
1   B     16    NaN
1   B     16    15

What I want to do is to swap Col2 with Col3 where the value of Col3 is NaN. Based on other posts and answers on SO, I have this code so far:

df[['Col2', 'Col3']] = df[['Col3', 'Col2']].where(df[['Col3']].isnull())

But this does not seem to be working properly and gives me the following:

    Col1  Col2  Col3
0   A     NaN   NaN
1   B     NaN   NaN
1   B     NaN   NaN

Is there something that I might be missing here?

Update: My desired output looks like:

    Col1  Col2  Col3
0   A     NaN   7
1   B     NaN   16
1   B     16    15

Thanks

dagg3r
  • 341
  • 3
  • 5
  • 12
  • What do you mean by when Col3 is None? If all elements are NaN or would a single element suffice? – ayhan Aug 11 '16 at 16:21
  • 1
    @ayhan For some rows, the value of Col3 is NaN and when that happens I want to swap it with Col2. I edited the question, hope it is clear now. – dagg3r Aug 11 '16 at 16:32

2 Answers2

17

You can use loc to do the swap:

df.loc[df['Col3'].isnull(), ['Col2', 'Col3']] = df.loc[df['Col3'].isnull(), ['Col3', 'Col2']].values

Note that .values is required to make sure the swap is done properly, otherwise Pandas would try to align based on index and column names, and no swap would occur.

You can also just reassign each row individually, if you feel the code is cleaner:

null_idx = df['Col3'].isnull()
df.loc[null_idx, 'Col3'] = df['Col2']
df.loc[null_idx, 'Col2'] = np.nan

The resulting output:

  Col1  Col2  Col3
0    A   NaN   7.0
1    B   NaN  16.0
2    B  16.0  15.0
root
  • 32,715
  • 6
  • 74
  • 87
11

Try this: (its faster)

df["Col3"], df["Col2"] = np.where(df['Col3'].isnull(), [df["Col2"], df["Col3"]], [df["Col3"], df["Col2"] ])
df

     Col1  Col2  Col3
0    A   NaN   7.0
1    B   NaN  16.0
1    B  16.0  15.0


    %timeit df.loc[df['Col3'].isnull(), ['Col2', 'Col3']] = df.loc[df['Col3'].isnull(), ['Col3', 'Col2']].values
100 loops, best of 3: 2.68 ms per loop


    %timeit df["Col3"], df["Col2"] = np.where(df['Col3'].isnull(), [df["Col2"], df["Col3"]], [df["Col3"], df["Col2"] ])
1000 loops, best of 3: 592 µs per loop
Merlin
  • 24,552
  • 41
  • 131
  • 206