I've run into a small problem while working on a dataset from the UCI Machine Learning Repository (ILPD, specifically). There are 4 missing values in one column. Rather than imputing them with the mean or median, they can be worked out from the existing columns using a simple formula.

I'm trying to fill in the missing values for the albumin/globulin ratio with the formula albumin / (total proteins - albumin), but the following code keeps running into errors.

IndianLiver['Albumin Globulin Ratio']
.fillna(IndianLiver.groupby('Class')['Albumin Globulin Ratio']
.transform(['Albumin']/(['Total Proteins']-['Albumin']), inplace=True)

SyntaxError: unexpected EOF while parsing

Any thoughts?

Thanks

CodeNoob
projecthunder

1 Answer

Managed to fix it. The column references in the calculation at the end were missing the DataFrame name:

    Albumin_Globulin_Ratio = IndianLiver['AG Ratio'].fillna(IndianLiver['Albumin']/(IndianLiver['Total Proteins']-IndianLiver['Albumin']))

This now fills the 4 missing values in the column, calculating each one from the existing columns.
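For anyone who wants to try the idea without the full dataset, here is a minimal sketch on a made-up frame. The numbers are invented for illustration and are not taken from ILPD; the column names simply mirror the ones used in this answer. It relies on `fillna` accepting a Series and aligning it on the index, so only the rows that are actually missing pick up the computed value.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the ILPD data; the values are made up for illustration.
IndianLiver = pd.DataFrame({
    "Total Proteins": [6.8, 7.5, 7.0, 6.0],
    "Albumin": [3.4, 3.0, 3.5, 2.0],
    "AG Ratio": [1.0, np.nan, 1.0, np.nan],
})

# Ratio computed from the existing columns: albumin / (total proteins - albumin).
computed = IndianLiver["Albumin"] / (
    IndianLiver["Total Proteins"] - IndianLiver["Albumin"]
)

# fillna aligns on the index, so only the missing rows take the computed value;
# rows that already have a ratio are left untouched.
IndianLiver["AG Ratio"] = IndianLiver["AG Ratio"].fillna(computed)

print(IndianLiver)
```

Assigning the result back to the column, as above, updates the DataFrame itself; assigning it to a separate variable leaves the original column unchanged.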

projecthunder