
I am trying to build an application on Spark using the Deeplearning4j library. I have a cluster where I am going to run my jar (built using IntelliJ) with the spark-submit command. Here's my code:

package Com.Spark.Examples

import scala.collection.mutable.ListBuffer
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.canova.api.records.reader.RecordReader
import org.canova.api.records.reader.impl.CSVRecordReader
import org.deeplearning4j.nn.api.OptimizationAlgorithm
import org.deeplearning4j.nn.conf.MultiLayerConfiguration
import org.deeplearning4j.nn.conf.NeuralNetConfiguration
import org.deeplearning4j.nn.conf.layers.DenseLayer
import org.deeplearning4j.nn.conf.layers.OutputLayer
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork
import org.deeplearning4j.nn.weights.WeightInit
import org.deeplearning4j.spark.impl.multilayer.SparkDl4jMultiLayer
import org.nd4j.linalg.lossfunctions.LossFunctions

object FeedForwardNetworkWithSpark {
  def main(args: Array[String]): Unit = {
    val recordReader: RecordReader = new CSVRecordReader(0, ",")
    val conf = new SparkConf()
      .setAppName("FeedForwardNetwork-Iris")
    val sc = new SparkContext(conf)
    val numInputs: Int = 4
    val outputNum = 3
    val iterations = 1
    val multiLayerConfig: MultiLayerConfiguration = new NeuralNetConfiguration.Builder()
      .seed(12345)
      .iterations(iterations)
      .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
      .learningRate(1e-1)
      .l1(0.01).regularization(true).l2(1e-3)
      .list(3)
      .layer(0, new DenseLayer.Builder().nIn(numInputs).nOut(3).activation("tanh").weightInit(WeightInit.XAVIER).build())
      .layer(1, new DenseLayer.Builder().nIn(3).nOut(2).activation("tanh").weightInit(WeightInit.XAVIER).build())
      .layer(2, new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT).weightInit(WeightInit.XAVIER)
        .activation("softmax")
        .nIn(2).nOut(outputNum).build())
      .backprop(true).pretrain(false)
      .build()
    val network: MultiLayerNetwork = new MultiLayerNetwork(multiLayerConfig)
    network.init()
    network.setUpdater(null)
    val sparkNetwork: SparkDl4jMultiLayer = new SparkDl4jMultiLayer(sc, network)
    val nEpochs: Int = 6
    val listBuffer = new ListBuffer[Array[Float]]()
    (0 until nEpochs).foreach { i =>
      val net: MultiLayerNetwork = sparkNetwork.fit("/user/iris.txt", 4, recordReader)
      listBuffer += net.params.data.asFloat().clone()
    }
    println("Parameters vs. iteration Output: ")
    (0 until listBuffer.size).foreach { i =>
      println(i + "\t" + listBuffer(i).mkString)
    }
  }
}

Here is my build.sbt file:

name := "HWApp"

version := "0.1"

scalaVersion := "2.12.3"

libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.6.0" % "provided"
libraryDependencies += "org.apache.spark" % "spark-mllib_2.10" % "1.6.0" % "provided"
libraryDependencies += "org.deeplearning4j" % "deeplearning4j-nlp" % "0.4-rc3.8"
libraryDependencies += "org.deeplearning4j" % "dl4j-spark" % "0.4-rc3.8"
libraryDependencies += "org.deeplearning4j" % "deeplearning4j-core" % "0.4-rc3.8"
libraryDependencies += "org.nd4j" % "nd4j-x86" % "0.4-rc3.8" % "test"
libraryDependencies += "org.nd4j" % "nd4j-api" % "0.4-rc3.8"
libraryDependencies += "org.nd4j" % "nd4j-jcublas-7.0" % "0.4-rc3.8"
libraryDependencies += "org.nd4j" % "canova-api" % "0.0.0.14"

When I open my code in IntelliJ, it does not show any errors, but when I execute the application on the cluster I get something like this:

[screenshot of the error]

I don't know what it wants from me. Even a little help will be appreciated. Thanks.

Prasad Khode

1 Answer


I'm not sure how you came up with this list of versions (I'm assuming you picked them at random? Please don't do that.)

You are using a 1.5-year-old version of dl4j together with dependencies that are a year older still and no longer exist.

Start from scratch and follow our getting-started guide and examples, like you would for any other open source project.

Those can be found here: https://deeplearning4j.org/quickstart

with example projects here: https://github.com/deeplearnin4j/dl4j-examples

A few more things: Canova doesn't exist anymore; it was renamed to DataVec more than a year ago.

All dl4j, datavec, nd4j, etc. versions must be the same.

If you are using any of our Scala modules, like Spark, those must also always use the same Scala version.

So you are mixing Scala 2.12 with Scala 2.10 dependencies, which is a Scala no-no (that's not even dl4j specific).

Dl4j only supports Scala 2.11 at most. This is mainly because Hadoop distros like CDH and Hortonworks don't support Scala 2.12 yet.
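The version-alignment advice above can be sketched as a build.sbt fragment. This is only an illustration: the 0.9.1 version is taken from the release string quoted elsewhere in this answer, and the exact artifact names should be checked against the quickstart.

```scala
// Sketch of a consistent build.sbt, assuming the 0.9.1 release used as an
// example in this answer; check the quickstart for current versions.
scalaVersion := "2.11.11"

// One shared version for all dl4j/datavec artifacts -- they must never diverge.
val dl4jVersion = "0.9.1"

libraryDependencies += "org.deeplearning4j" % "deeplearning4j-core" % dl4jVersion
libraryDependencies += "org.datavec" % "datavec-api" % dl4jVersion
```

Defining one shared version value makes it impossible for the dl4j and datavec artifacts to drift apart when you upgrade.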

Edit: Another dl4j-specific thing to watch out for is how we do Spark versions. Spark 1 and 2 are supported. Your artifact id should be:

dl4j-spark_${your scala version} (usually 2.10 or 2.11), with a version like: 0.9.1_spark_${YOUR VERSION OF SPARK}

This is applicable for our NLP modules as well.
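Putting that artifact-id pattern into sbt terms, a sketch might look like the following, where Scala 2.11, Spark 1, and the 0.9.1 release are assumptions you would substitute with your own versions:

```scala
// dl4j-spark coordinates: Scala version goes in the artifact id, Spark major
// version goes in the version string. 2.11, spark_1, and 0.9.1 are assumptions.
libraryDependencies += "org.deeplearning4j" % "dl4j-spark_2.11" % "0.9.1_spark_1"
```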

Edit for more folks who haven't followed our getting-started guide (please do that; we keep it up to date): you also always need an nd4j backend. Usually this is nd4j-native-platform, but it may be CUDA if you are using GPUs, with: nd4j-cuda-${YOUR CUDA VERSION}-platform
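As a sketch of the backend dependency described above (the 0.9.1 version and CUDA 8.0 are assumptions for illustration, not prescribed values):

```scala
// CPU backend:
libraryDependencies += "org.nd4j" % "nd4j-native-platform" % "0.9.1"

// Or, for GPUs, the CUDA backend instead -- pick the artifact matching your
// installed CUDA toolkit version:
// libraryDependencies += "org.nd4j" % "nd4j-cuda-8.0-platform" % "0.9.1"
```

You need exactly one backend on the classpath; the backend version must match the nd4j/dl4j version like everything else.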

Adam Gibson
  • Thanks for the quick and concise answer. Let me re-tune my build.sbt file. – Mohit Kumar Sep 12 '17 at 06:34
  • I need a favor. These are my current scala and spark versions which i am planning to use. scalaVersion := "2.11.11" libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "1.6.0" % "provided" libraryDependencies += "org.apache.spark" % "spark-mllib_2.11" % "1.6.0" % "provided" Could you please tell me which dl4j and nd4j versions are compatible to above scala and spark versions? – Mohit Kumar Sep 12 '17 at 06:49
  • Please follow our examples which show everything end to end. The group id, artifact id, and version all line up with sbt. – Adam Gibson Sep 12 '17 at 06:50
  • Example Link is not working: https://github.com/deeplearnin4j/dl4j-examples please check it once. – Mohit Kumar Sep 12 '17 at 06:52
  • One last reply to your question: I already specified what you wanted above you *literally* only need to make them all the same version. They never diverge. Please read my answer more carefully. I already gave you the last bits of spark info you need too. If you *still* need help, go to our live chat (which we put in nice bright letters on the website): https://gitter.im/deeplearning4j/deeplearning4j – Adam Gibson Sep 12 '17 at 07:02