How to alter output data format for isolation forest

Question

I built an isolation forest to detect anomalies in a CSV file, and I would like to change the format of its output. Right now the anomalies are output as a pandas DataFrame, but I would like to write them to a JSON file in the following format:

{seconds: #seconds for that row, size2: size2, pages: #pages for that row}

I have attached some of the code and a sample of the data. Thank you so much!

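# Note: fit_predict below refits the model, so this first fit is overwritten
model.fit(df[['label']])
df['anomaly'] = model.fit_predict(df[['size2', 'size3', 'size4']])
# df['anomaly'] = model.predict(df[['pages']])
print(model.predict(X_test))  # X_test is not shown in this excerpt
# Keep only the rows flagged as anomalies (-1)
anomaly = df.loc[df['anomaly'] == -1]
anomaly_index = list(anomaly.index)
print(anomaly)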

The output data looks something like this:

Unnamed:  seconds:  size2:  ...  size4:  pages:  anomaly:
1         40        32      ...  654     1       -1

score 1 · Answer 1 · answered Jun 19 '20 at 16:35

I figured out a way to do this. I made several dictionaries: one mapping each row's index to its timestamp, and one mapping each row's index to its label. I could then keep track of which indexes appeared in the output data and look up the rest of the information in those dictionaries.
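A minimal sketch of that dictionary approach might look like the following. The sample DataFrame and its values are made up for illustration, the variable names (seconds_by_index and so on) are hypothetical, and the three output fields are taken from the format requested in the question rather than from the answerer's actual code.

```python
import json

import pandas as pd

# Hypothetical stand-in for the frame after fit_predict; values are made up.
df = pd.DataFrame(
    {
        "seconds": [40, 55, 70],
        "size2": [32, 30, 31],
        "pages": [1, 2, 1],
        "anomaly": [-1, 1, -1],
    }
)

# One dictionary per field, each mapping row index -> value.
seconds_by_index = df["seconds"].to_dict()
size2_by_index = df["size2"].to_dict()
pages_by_index = df["pages"].to_dict()

# Indexes of the rows flagged as anomalies (-1).
anomaly_index = list(df.index[df["anomaly"] == -1])

# Look each flagged index up in the dictionaries and build one record per row.
# int() guards against NumPy integers, which json cannot serialize.
records = [
    {
        "seconds": int(seconds_by_index[i]),
        "size2": int(size2_by_index[i]),
        "pages": int(pages_by_index[i]),
    }
    for i in anomaly_index
]

anomalies_json = json.dumps(records)
print(anomalies_json)
```

If the dictionaries are not needed for anything else, selecting the three columns from the flagged rows and calling to_json(orient="records") on the result should give the same list of records directly. To write a file instead of a string, json.dump(records, f) with an open file handle f does the job.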
