I have data in a Spark dataframe (df) with 24 feature columns; the 25th column is my target variable. I want to fit my dl4j model on this dataset. The model takes input in the form of org.nd4j.linalg.api.ndarray.INDArray, org.nd4j.linalg.dataset.DataSet, or org.nd4j.linalg.dataset.api.iterator.DataSetIterator. How can I convert my dataframe to one of the required types?
I have also tried the Pipeline approach to pass the Spark dataframe to the model directly, but the sbt dependency for dl4j-spark-ml is not working. My build.sbt file is:
scalaVersion := "2.11.8"
libraryDependencies += "org.deeplearning4j" %% "dl4j-spark-ml" % "0.8.0_spark_2-SNAPSHOT"
libraryDependencies += "org.deeplearning4j" % "deeplearning4j-core" % "0.8.0"
libraryDependencies += "org.nd4j" % "nd4j" % "0.8.0"
libraryDependencies += "org.nd4j" % "nd4j-native-platform" % "0.8.0"
libraryDependencies += "org.nd4j" % "nd4j-backends" % "0.8.0"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.1"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.0.1"
Can someone guide me from here? Thanks in advance.