I am a newbie trying to perform some EDA on a dataset that also contains object-type columns. I have already cleaned up the dataset, removing NaN values and duplicates, and I am now trying to eliminate outliers using the IQR method. I have calculated Q1 and Q3 of the numeric columns:

Q1 = df.quantile(0.25 , axis=0, numeric_only=True)
Q3 = df.quantile(0.75 , axis=0, numeric_only=True)
IQR = Q3 - Q1
print(IQR)

Now I am trying to eliminate the outliers with the following expression:

df = df[~((df < (Q1 - 1.5 * IQR)) | (df > (Q3 + 1.5 * IQR))).any(axis=1)]

but I get the following error:

"ValueError: Operands are not aligned. Do left, right = left.align(right, axis=1, copy=False) before operating."

I assume this has something to do with the object-type columns in df, but I don't see how to get out of this impasse.

I would expect to obtain a DataFrame (including the object-type columns) but with fewer rows (basically without the outliers).
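
For illustration, here is a minimal sketch of the result I am after, under the assumption that the error comes from comparing the full df (object columns included) against Q1 and Q3, which only have entries for the numeric columns. The toy data and the names num, outlier and df_clean are made up for this example and are not from my real dataset. The idea would be to build the outlier mask from the numeric columns only and then use it to index the full DataFrame:

```python
import pandas as pd

# Hypothetical toy data: one object column and two numeric columns.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e", "f"],
    "value": [10.0, 11.0, 12.0, 13.0, 14.0, 100.0],
    "count": [1, 2, 3, 4, 5, 6],
})

# Restrict the comparison to numeric columns so that the column labels
# of the frame match the index labels of Q1, Q3 and IQR.
num = df.select_dtypes(include="number")

Q1 = num.quantile(0.25)
Q3 = num.quantile(0.75)
IQR = Q3 - Q1

# Boolean Series, True for rows with at least one value outside the IQR fences.
outlier = ((num < (Q1 - 1.5 * IQR)) | (num > (Q3 + 1.5 * IQR))).any(axis=1)

# Index the full DataFrame with the mask, so the object columns are kept.
df_clean = df[~outlier]
print(df_clean)
```

In this toy case only the row with value 100.0 falls outside the fences, so df_clean keeps the other five rows together with the name column.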

AlexM