
I've been stuck on this for a bit, so hopefully someone has better guidance. I currently have a dataframe that looks something like this (only with way more rows):

+---------------+----------+
| released_date | status   |
+---------------+----------+
| 12/12/20      | released |
+---------------+----------+
| 10/01/20      | NaN      |
+---------------+----------+
| NaN           | NaN      |
+---------------+----------+
| NaN           | released |
+---------------+----------+

I wanted to do df['status'].fillna('released' if df.released_date.notnull())

In other words, fill any NaN value in the status column of df with "released" as long as df.released_date isn't null in that row.

I keep getting various error messages when I try different variations of this. The code above gives a syntax error, which I imagine is because notnull() returns a boolean array?

I feel like there is a simple answer for this and I'm somehow not seeing it. I haven't found any questions like this where someone is trying to organize something based on the null values in a dataframe, which leads me to wonder if my methodology isn't ideal in the first place. How can I filter values in a dataframe column based on null values in a different column without using isnull() or notnull(), if those only return boolean arrays anyway? Using == Null doesn't seem to work either...

sharathnatraj
Erica

1 Answer


Try:

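# Index labels of rows where status is missing but released_date is present
idx = df[(df['status'].isnull()) & (~df['released_date'].isnull())].index
# Set status to 'released' for those rows only
df.loc[idx, 'status'] = 'released'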

First get the index of all rows where 'status' is null and 'released_date' is not null. Then use df.loc to set the status column to 'released' for those rows.

Prints:

  released_date    status
0      12/12/20  released
1      10/01/20  released
2           NaN       NaN
3           NaN  released
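
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same idea. It passes the boolean mask straight to df.loc instead of going through the index first, which should give the same result. The sample frame is rebuilt by hand from the table in the question, and the name `mask` is just an illustrative choice. The last line is only there to hint at why an equality check does not find missing values.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the four rows shown in the question (assumed data).
df = pd.DataFrame({
    "released_date": ["12/12/20", "10/01/20", np.nan, np.nan],
    "status": ["released", np.nan, np.nan, "released"],
})

# Boolean mask: True where status is missing AND released_date is present.
mask = df["status"].isna() & df["released_date"].notna()

# Assign only on the rows where the mask is True; other rows are untouched.
df.loc[mask, "status"] = "released"

print(df)

# NaN does not compare equal to anything, not even itself, so an equality
# test cannot locate missing values; isna()/notna() are the tools for that.
print(np.nan == np.nan)
```

The boolean array returned by isna() / notna() is what makes this work: df.loc accepts it as a row selector, so there is no need to avoid it.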
sharathnatraj