I have an Isolation Forest implementation in which I take the features (all of them numerical) and scale them to the range [0, 1]:
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# scale every feature into [0, 1]
scaler = MinMaxScaler()
data = scaler.fit_transform(df)
x = pd.DataFrame(data, columns=df.columns)  # keep the original column names
Then I fit the model and call predict:
import matplotlib.pyplot as plt
from sklearn.ensemble import IsolationForest

# predict() returns 1 for inliers and -1 for outliers
clf = IsolationForest(max_samples=100, random_state=42).fit(x)
clf.predict(x)
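For reference, here is a minimal sketch of how I could look at the raw anomaly scores rather than the hard labels (this assumes the same clf and x as above and is untested):

scores = clf.decision_function(x)  # negative scores correspond to a prediction of -1
print(scores.min(), scores.mean(), scores.max())

# plot the score distribution to see how far the scores sit from the 0 threshold
plt.hist(scores, bins=50)
plt.axvline(0, color="red")
plt.show()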
In this instance I have 23 numerical features. When I run the script with all of them, predict returns 1 for every single row. When I limit the feature set to 2 columns, it returns a mixture of 1 and -1.
How can I get around this?
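Would explicitly setting the contamination parameter be the right way to handle this? Something like the sketch below, where 0.05 is just a placeholder rather than a tuned value:

clf2 = IsolationForest(max_samples=100, contamination=0.05, random_state=42).fit(x)
labels = clf2.predict(x)  # roughly 5% of the rows should now be labelled -1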
Thanks