Let's say I have a PySpark DataFrame:
| Column A | Column B |
| -------- | -------- |
| val1 | val1B |
| null | val2B |
| val2 | null |
| val3 | val3B |
Can someone help me replace every null value in every column (across the whole DataFrame) with the value directly below it? The final table should look like this:
| Column A | Column B |
| -------- | -------- |
| val1 | val1B |
| val2 | val2B |
| val2 | val3B |
| val3 | val3B |
How could this be done? Can I get a code demo if possible? Thank you!
All I've really managed so far is numbering the rows and building a condition to find the rows that contain a null in any column. That leaves me with a table like this:
| Column A | Column B | row_num |
| -------- | -------- | ------- |
| null | val2B | 2 |
| val2 | null | 3 |
But I don't think this step is actually needed, and I'm stuck on what to do next.