I have a DataFrame with 200 rows and 151 columns; the output variable has dtype object.

I am trying to impute null values in the 150 input columns with the column mean, computed separately for each value of the output variable (that is, grouped by the output variable).

Is there a way to use sklearn's Imputer for this? Does anyone know of an example I could follow? Thanks.

Fabien
Hello_Boy
  • Can you share the code of what you have tried? – Bonifacio2 Aug 15 '17 at 12:30
  • I tried the following: for column in df: df[column] = df.groupby("column_name").transform(lambda x: x.fillna(x.mean())). I think it did the job, but (a) it took ages to compute (over 5 minutes) and (b) it did not use a machine learning algorithm. I am wondering whether there is a better way of doing it. Thanks – Hello_Boy Aug 15 '17 at 19:53
  • Can you add it to the body of your question? :) – Bonifacio2 Aug 15 '17 at 21:27
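
For reference, here is a minimal sketch of group-wise mean imputation done in one vectorized pass rather than once per column. The loop in the comment above appears to recompute the whole grouped transform on every iteration, which may explain the runtime. The toy data and the names `target`, `x1`, `x2`, `feature_cols` and `group_means` are assumptions made for illustration; they are not from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: "target" stands in for the object-typed output column.
df = pd.DataFrame({
    "x1": [1.0, np.nan, 3.0, 10.0, np.nan, 30.0],
    "x2": [np.nan, 2.0, 4.0, 5.0, 7.0, np.nan],
    "target": ["a", "a", "a", "b", "b", "b"],
})

# Every column except the output variable is treated as an input.
feature_cols = [c for c in df.columns if c != "target"]

# Per-class column means, broadcast back to the original row order.
group_means = df.groupby("target")[feature_cols].transform("mean")

# Fill each missing value with the mean of its own class.
df[feature_cols] = df[feature_cols].fillna(group_means)
```

Using the built-in "mean" aggregation instead of a Python lambda lets pandas handle all input columns in a single grouped operation. One caveat: if a column is entirely missing within a class, that class mean is itself NaN and those cells stay unfilled.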

0 Answers