
I'm using iforest as described here: https://github.com/titicaca/spark-iforest, but model.save() is throwing an exception:

Exception: scala.NotImplementedError: The default jsonEncode only supports string, vector and matrix. org.apache.spark.ml.param.Param must override jsonEncode for java.lang.Double.

I followed the code snippet under the "Python API" section of the linked GitHub page.

from pyspark.ml.feature import VectorAssembler
import os
import tempfile
from pyspark_iforest.ml.iforest import *

# Input DataFrame schema:
# col_1: integer
# col_2: integer
# col_3: integer
in_cols = ["col_1", "col_2", "col_3"]

assembler = VectorAssembler(inputCols=in_cols, outputCol="features")
featurized = assembler.transform(df)

iforest = IForest(contamination=0.5, maxDepth=2)
model = iforest.fit(featurized)

model.save("model_path")

model.save() should be able to save model files.
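For reference, model.save(path) is just the standard Spark ML MLWritable entry point; the equivalent explicit-writer form (handy when the target path already exists) looks like this, assuming IForestModel follows the usual JavaMLWritable pattern:

# Equivalent to model.save("model_path"), via the explicit MLWriter;
# overwrite() avoids "path already exists" errors on re-runs.
model.write().overwrite().save("model_path")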

Below is the schema of the output DataFrame I get from model.transform(featurized); a quick sanity check on it follows the schema:

col_1:integer
col_2:integer
col_3:integer
features:udt
anomalyScore:double
prediction:double
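
Scoring itself works; for example, a quick check like the one below runs without error (a minimal sketch, assuming prediction == 1.0 marks anomalies, with the cut-off driven by the contamination setting):

from pyspark.sql import functions as F

scored = model.transform(featurized)

# Keep only the rows flagged as anomalies and inspect their scores.
anomalies = scored.filter(F.col("prediction") == 1.0)
anomalies.select("col_1", "col_2", "col_3", "anomalyScore").show()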
  • I have a similar problem when using the Scala version. I created an issue: https://github.com/titicaca/spark-iforest/issues/15 – florins Jul 01 '19 at 14:10

1 Answer


I have just fixed this issue. It was caused by an incorrect param type. You can check out the latest code in the master branch and try again.
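Once the jar and the Python package are rebuilt and reinstalled from the updated master, the save/load round trip should go through. Roughly (a sketch assuming the model class is IForestModel with the standard MLReadable load()):

import tempfile
from pyspark_iforest.ml.iforest import IForestModel

# Save to a throwaway directory, then load it back to confirm the
# model (and its params) now serialize correctly.
model_path = tempfile.mkdtemp() + "/iforest_model"
model.save(model_path)

loaded = IForestModel.load(model_path)
loaded.transform(featurized).show(5)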
