I'm trying to understand how I can convert a lambda function to a normal one. I have this lambda function that is supposed to fill the null values of each column with that column's mode:

def fill_nn(data):
    df = data.apply(lambda column: column.fillna(column.mode()[0]))
    return df

I tried this:

def fill_nn(df):
    for column in df:
        if df[column].isnull().any():
            return df[column].fillna(df[column].mode()[0])
Barmar
Dani
  • It would be just `def func(column): return column.fillna(column.mode()[0])` – juanpa.arrivillaga Nov 23 '22 at 18:44
  • There's no need for the `if`. If there aren't any null values, `fillna()` won't do anything. – Barmar Nov 23 '22 at 18:45
  • A lambda expression `lambda <args>: <expr>` is always equivalent to `def name(<args>): return <expr>` – juanpa.arrivillaga Nov 23 '22 at 18:45
  • @juanpa.arrivillaga You're answering the title, not the actual question. She's trying to remove the use of `apply()` entirely. – Barmar Nov 23 '22 at 18:46
  • @Barmar in this case what I want is to replace the null values of a column (if any) with the mode, but if there are no null values, keep the ones that are already there – Dani Nov 23 '22 at 19:18
  • That's what `fillna()` does. It leaves all the non-null values alone. If there are no null values, it leaves the entire column unchanged. So the check before calling it is unnecessary. – Barmar Nov 23 '22 at 20:29
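
The two points made in the comments can be checked with a short sketch. The names `fill_lambda`, `fill_def`, `with_nulls`, and `no_nulls` are made up for illustration; only `fillna()` and `mode()` come from the thread.

```python
import pandas as pd

# The lambda from the question, bound to a name for comparison.
fill_lambda = lambda column: column.fillna(column.mode()[0])


# The equivalent named function: same argument, same returned expression.
def fill_def(column: pd.Series) -> pd.Series:
    return column.fillna(column.mode()[0])


with_nulls = pd.Series([1.0, None, 1.0, 3.0])  # mode is 1.0
no_nulls = pd.Series([1, 2, 2, 3])             # nothing to fill

# Both forms produce the same result.
assert fill_lambda(with_nulls).equals(fill_def(with_nulls))

# fillna() on a column without nulls hands back the same values,
# so no isnull().any() check is needed beforehand.
assert fill_def(no_nulls).equals(no_nulls)
```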

1 Answer

Hi, hope you are doing well!

If I understood your question correctly, the simplest approach would be something like this:

import pandas as pd


def fill_missing_values(series: pd.Series) -> pd.Series:
    """Fill missing values in series/column."""

    value_to_use = series.mode()[0]
    return series.fillna(value=value_to_use)


df = pd.DataFrame(
    {
        "A": [1, 2, 3, 4, 5],
        "B": [None, 2, 3, 4, None],
        "C": [None, None, 3, 4, None],
    }
)

df = df.apply(fill_missing_values)  # type: ignore

print(df)
#    A    B    C
# 0  1  2.0  3.0
# 1  2  2.0  3.0
# 2  3  3.0  3.0
# 3  4  4.0  4.0
# 4  5  2.0  3.0

Personally, though, I would still use the lambda, since it requires less code and is easier to handle, especially for such a small task.
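
If the goal is to drop `apply()` entirely, as one of the comments suggests, a plain loop is one possible way to do it. The attempt in the question has a `return` inside the loop, so it stops at the first column containing nulls and returns only that column (or `None` if no column has nulls). The sketch below assigns each filled column back and returns the whole frame after the loop. The name `fill_nn_loop` and the guard for all-null columns are my own additions, not something from the question.

```python
import pandas as pd


def fill_nn_loop(data: pd.DataFrame) -> pd.DataFrame:
    """Fill nulls in every column with that column's mode, without apply()."""
    result = data.copy()  # leave the caller's frame untouched
    for column in result.columns:
        modes = result[column].mode()
        if modes.empty:
            # Assumption: a column with no non-null values is left as-is,
            # because it has no mode to fill with.
            continue
        result[column] = result[column].fillna(modes.iloc[0])
    return result  # return once, after every column has been processed


df = pd.DataFrame(
    {
        "A": [1, 2, 3, 4, 5],
        "B": [None, 2, 3, 4, None],
        "C": [None, None, 3, 4, None],
    }
)

filled = fill_nn_loop(df)
print(filled)
```

When several values tie for the mode, `mode()` returns them sorted, so the smallest one is used.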