I am trying to use the Apache Spark ML library for logistic regression, but every time I fit the model I get an error message like this:
"ERROR OWLQN: Failure! Resetting history: breeze.optimize.NaNHistory: "
A sample of the data set used for the logistic regression looks like this:
+-----+---------+---------+---------+--------+-------------+
|state|dayOfWeek|hourOfDay|minOfHour|secOfMin|     features|
+-----+---------+---------+---------+--------+-------------+
|  1.0|      7.0|      0.0|      0.0|      0.0|(4,[0],[7.0])|
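For readers unfamiliar with the notation in the features column: Spark prints a sparse vector as (size,[indices],[values]), so (4,[0],[7.0]) is a 4-element vector whose only non-zero entry is 7.0 at index 0. The following is a minimal sketch in plain Java, not Spark code; the class SparseVectorDemo and the method toDense are hypothetical names used only for illustration.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Spark: expands the (size, indices, values)
// triple that Spark prints for a sparse vector into a dense array.
final class SparseVectorDemo {

    static double[] toDense(int size, int[] indices, double[] values) {
        if (indices.length != values.length) {
            throw new IllegalArgumentException("indices and values must have the same length");
        }
        double[] dense = new double[size]; // Java zero-initializes the array
        for (int i = 0; i < indices.length; i++) {
            dense[indices[i]] = values[i];
        }
        return dense;
    }

    public static void main(String[] args) {
        // (4,[0],[7.0]) from the sample row above
        double[] dense = toDense(4, new int[]{0}, new double[]{7.0});
        System.out.println(Arrays.toString(dense)); // prints [7.0, 0.0, 0.0, 0.0]
    }
}
```

In other words, the sample row has dayOfWeek = 7.0 and zeros for hourOfDay, minOfHour, and secOfMin, which matches the individual columns.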
Here is the code for the logistic regression:
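// Imports used by the data set snippet
import java.util.List;
import org.apache.spark.ml.feature.StringIndexer;
import org.apache.spark.ml.feature.VectorAssembler;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.Metadata;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

// Data set: one label column ("state") and four numeric feature columns
StructType schema = new StructType(
    new StructField[]{
        new StructField("state", DataTypes.DoubleType, false, Metadata.empty()),
        new StructField("dayOfWeek", DataTypes.DoubleType, false, Metadata.empty()),
        new StructField("hourOfDay", DataTypes.DoubleType, false, Metadata.empty()),
        new StructField("minOfHour", DataTypes.DoubleType, false, Metadata.empty()),
        new StructField("secOfMin", DataTypes.DoubleType, false, Metadata.empty())
    });
// Convert each labeled point into a Row (note: collect() pulls all rows to the driver)
List<Row> dataFromRDD = bucketsForMLs.map(p -> {
    return RowFactory.create(p.label(), p.features().apply(0), p.features().apply(1), p.features().apply(2), p.features().apply(3));
}).collect();
Dataset<Row> stateDF = sparkSession.createDataFrame(dataFromRDD, schema);
// Assemble the four feature columns into a single "features" vector column
String[] featureCols = new String[]{"dayOfWeek", "hourOfDay", "minOfHour", "secOfMin"};
VectorAssembler vectorAssembler = new VectorAssembler().setInputCols(featureCols).setOutputCol("features");
Dataset<Row> stateDFWithFeatures = vectorAssembler.transform(stateDF);
// Index the "state" column into the "label" column expected by the estimator
StringIndexer labelIndexer = new StringIndexer().setInputCol("state").setOutputCol("label");
Dataset<Row> stateDFWithLabelAndFeatures = labelIndexer.fit(stateDFWithFeatures).transform(stateDFWithFeatures);
// Hand the prepared DataFrame to my wrapper class, which runs the regression below
MLRExecutionForDF mlrExe = new MLRExecutionForDF(javaSparkContext);
mlrExe.execute(stateDFWithLabelAndFeatures);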
// Logistic regression part
// (uses org.apache.spark.ml.classification.LogisticRegression and LogisticRegressionModel)
LogisticRegressionModel lrModel = new LogisticRegression()
    .setMaxIter(maxItr)
    .setRegParam(regParam)
    .setElasticNetParam(elasticNetParam)
    // The call to fit() is where the error occurs
    .fit(stateDFWithLabelAndFeatures);
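
The exception name NaNHistory suggests that the optimizer ran into a NaN value, so one thing that may be worth ruling out is non-finite input. This is only a minimal sketch in plain Java, assuming the feature rows are available as double arrays; FiniteCheckDemo and firstNonFiniteRow are hypothetical names, not Spark or Breeze API, and a clean result here would not by itself explain the error.

```java
// Hypothetical helper, not part of Spark: scans feature rows for NaN or
// infinite values before they are handed to an optimizer.
final class FiniteCheckDemo {

    // Returns the index of the first row containing NaN or +/-Infinity,
    // or -1 if every value in every row is finite.
    static int firstNonFiniteRow(double[][] rows) {
        for (int r = 0; r < rows.length; r++) {
            for (double v : rows[r]) {
                if (Double.isNaN(v) || Double.isInfinite(v)) {
                    return r;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        double[][] rows = {
            {7.0, 0.0, 0.0, 0.0},        // the sample row from the question
            {1.0, Double.NaN, 0.0, 0.0}  // made-up row with a NaN, for illustration only
        };
        System.out.println(firstNonFiniteRow(rows)); // prints 1
    }
}
```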