I have a data frame that contains some NaN values. I want to delete the rows that have two or more NaN values, and replace the NaN values in the remaining rows with the mean of that row (computed over the val columns, not id). Here is a simple example:
import numpy as np
import pandas as pd
df = pd.DataFrame()
df['id'] = [1, 2, 3, 4, 5, 6]
df['val1'] = [1, np.nan, 2, np.nan, 3, 5]
df['val2'] = [np.nan, np.nan, 2, np.nan, np.nan, 5]
df['val3'] = [4, np.nan, 2, np.nan, 7, 5]
df['val4'] = [3, np.nan, 2, np.nan, np.nan, 5]
   id  val1  val2  val3  val4
0   1   1.0   NaN   4.0   3.0
1   2   NaN   NaN   NaN   NaN
2   3   2.0   2.0   2.0   2.0
3   4   NaN   NaN   NaN   NaN
4   5   3.0   NaN   7.0   NaN
5   6   5.0   5.0   5.0   5.0
The output that I want is:
   id  val1  val2  val3  val4
0   1     1  2.67     4     3
1   3     2  2.00     2     2
2   6     5  5.00     5     5
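
For reference, here is a minimal sketch of one way the two steps could be done. It assumes that NaNs are counted only in the val columns and that the row mean excludes id; the names val_cols, result and row_means are just illustrative. It is not necessarily the best or only approach.

```python
import numpy as np
import pandas as pd

# Same example data as above, built in one call.
df = pd.DataFrame({
    'id': [1, 2, 3, 4, 5, 6],
    'val1': [1, np.nan, 2, np.nan, 3, 5],
    'val2': [np.nan, np.nan, 2, np.nan, np.nan, 5],
    'val3': [4, np.nan, 2, np.nan, 7, 5],
    'val4': [3, np.nan, 2, np.nan, np.nan, 5],
})

# Assumption: only these columns take part in the NaN count and the mean.
val_cols = ['val1', 'val2', 'val3', 'val4']

# Step 1: keep rows with fewer than two NaN values, then renumber the index.
nan_count = df[val_cols].isna().sum(axis=1)
result = df[nan_count < 2].reset_index(drop=True)

# Step 2: row-wise mean of the value columns (NaN is skipped by default).
row_means = result[val_cols].mean(axis=1)

# Fill each remaining NaN with the mean of its own row.
# fillna with a Series aligns on the row index, so each row gets its own mean.
result[val_cols] = result[val_cols].apply(lambda col: col.fillna(row_means))

print(result.round(2))
```

The filled values stay as floats, so the printed frame will show decimals in every val column rather than the mixed integers shown in the desired output above.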