
I have a single Ubuntu server on which I run a Master and a Slave (one executor), and both show up in the UI on port 8080.
I can run spark-shell --master spark://foo.bar:7077 successfully, but I can't submit my program (a fat jar) in standalone mode; spark-submit fails with the errors below.

My Main object extends App instead of defining a main method, and everything lives in the package myProject. I am running the program like this:

 spark-submit --master spark://foo.bar:7077 \
--class myProject.Main \
--deploy-mode client \
--num-executors 1 \
--executor-memory 58g \
--executor-cores 39 \
--driver-memory 4g \
--driver-cores 2 \
--conf spark.driver.memoryOverhead=819m \
--conf spark.executor.memoryOverhead=819m \
target/scala-2.12/myProject-assembly-0.1.jar
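
For completeness, the entry point looks roughly like this (a simplified sketch; the real job body is omitted):

package myProject

import org.apache.spark.sql.SparkSession

// Entry point as a top-level App object (no explicit main method).
object Main extends App {
  val spark = SparkSession.builder()
    .appName("myProject")
    .getOrCreate()

  // ... actual job logic elided ...

  spark.stop()
}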

client mode's output:

Exception in thread "main" java.lang.NoSuchMethodError: scala.App.$init$(Lscala/App;)V
        at mstproject.Main$.<init>(Main.scala:8)
        at mstproject.Main$.<clinit>(Main.scala)
        at mstproject.Main.main(Main.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:855)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:930)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:939)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
log4j:WARN No appenders could be found for logger (org.apache.spark.util.ShutdownHookManager).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

I already checked a similar error, but all of my packages seem compatible with Scala 2.12, as my build.sbt shows (I am not sure about my assemblyMergeStrategy):

scalaVersion := "2.12.12"
libraryDependencies ++= Seq(
  "com.github.pathikrit" %% "better-files" % "3.9.1",
  "org.scalatest" %% "scalatest" % "3.2.3" % Test,
  "org.apache.spark" %% "spark-core" % "2.4.8",
  "org.apache.spark" %% "spark-sql" % "2.4.8",
  "org.apache.spark" %% "spark-graphx" % "2.4.8",
  "redis.clients" % "jedis" % "3.5.1",
  "com.redislabs" %% "spark-redis" % "2.4.2"
)

assemblyMergeStrategy in assembly := {
  case PathList("org","aopalliance", xs @ _*) => MergeStrategy.last
  case PathList("javax", "inject", xs @ _*) => MergeStrategy.last
  case PathList("javax", "servlet", xs @ _*) => MergeStrategy.last
  case PathList("javax", "activation", xs @ _*) => MergeStrategy.last
  case PathList("org", "apache", xs @ _*) => MergeStrategy.last
  case PathList("com", "google", xs @ _*) => MergeStrategy.last
  case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.last
  case PathList("com", "codahale", xs @ _*) => MergeStrategy.last
  case PathList("com", "yammer", xs @ _*) => MergeStrategy.last
  case "about.html" => MergeStrategy.rename
  case "META-INF/ECLIPSEF.RSA" => MergeStrategy.last
  case "META-INF/mailcap" => MergeStrategy.last
  case "META-INF/mimetypes.default" => MergeStrategy.last
  case PathList("META-INF", xs @ _*) =>
    xs map {_.toLowerCase} match {
      case "manifest.mf" :: Nil | "index.list" :: Nil | "dependencies" :: Nil =>
        MergeStrategy.discard
      case ps @ x :: xs if ps.last.endsWith(".sf") || ps.last.endsWith(".dsa") =>
        MergeStrategy.discard
      case "plexus" :: xs =>
        MergeStrategy.discard
      case "services" :: xs =>
        MergeStrategy.filterDistinctLines
      case "spring.schemas" :: Nil | "spring.handlers" :: Nil =>
        MergeStrategy.filterDistinctLines
      case _ => MergeStrategy.first
    }
  case "application.conf" => MergeStrategy.concat
  case "reference.conf" => MergeStrategy.concat
  case "plugin.properties" => MergeStrategy.last
  case "log4j.properties" => MergeStrategy.last
  case _ => MergeStrategy.first
//  case x =>
//    val oldStrategy = (assemblyMergeStrategy in assembly).value
//    oldStrategy(x)
}
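
For comparison, a leaner setup that is commonly suggested (I have not tried it here) marks the Spark artifacts as "provided", since spark-submit already puts them on the driver and executor classpaths, and keeps the merge strategy minimal:

scalaVersion := "2.12.12"
libraryDependencies ++= Seq(
  "com.github.pathikrit" %% "better-files" % "3.9.1",
  "org.scalatest" %% "scalatest" % "3.2.3" % Test,
  // Spark is supplied by the cluster at runtime, so keep it out of the fat jar.
  "org.apache.spark" %% "spark-core" % "2.4.8" % "provided",
  "org.apache.spark" %% "spark-sql" % "2.4.8" % "provided",
  "org.apache.spark" %% "spark-graphx" % "2.4.8" % "provided",
  "redis.clients" % "jedis" % "3.5.1",
  "com.redislabs" %% "spark-redis" % "2.4.2"
)

assemblyMergeStrategy in assembly := {
  case PathList("META-INF", "services", xs @ _*) => MergeStrategy.filterDistinctLines
  case PathList("META-INF", xs @ _*)             => MergeStrategy.discard
  case "reference.conf"                          => MergeStrategy.concat
  case _                                         => MergeStrategy.first
}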

cluster mode's output:

log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.NativeCodeLoader).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
21/07/17 17:35:40 INFO SecurityManager: Changing view acls to: root
21/07/17 17:35:40 INFO SecurityManager: Changing modify acls to: root
21/07/17 17:35:40 INFO SecurityManager: Changing view acls groups to:
21/07/17 17:35:40 INFO SecurityManager: Changing modify acls groups to:
21/07/17 17:35:40 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
21/07/17 17:35:40 INFO Utils: Successfully started service 'driverClient' on port 34218.
21/07/17 17:35:40 INFO TransportClientFactory: Successfully created connection to foo.bar/10.0.8.137:7077 after 39 ms (0 ms spent in bootstraps)
21/07/17 17:35:41 INFO ClientEndpoint: Driver successfully submitted as driver-20210717173541-0003
21/07/17 17:35:41 INFO ClientEndpoint: ... waiting before polling master for driver state
21/07/17 17:35:46 INFO ClientEndpoint: ... polling master for driver state
21/07/17 17:35:46 INFO ClientEndpoint: State of driver-20210717173541-0003 is FAILED
21/07/17 17:35:46 INFO ShutdownHookManager: Shutdown hook called
21/07/17 17:35:46 INFO ShutdownHookManager: Deleting directory /tmp/spark-c15b4457-664f-43b7-9699-b62839ec83c0
  • Cluster mode gives the same output even if I pass a random --class name, but client mode just emits the error above.
  • Submitting in cluster mode adds a "Completed Driver" entry with a "FAILED" state in the Master's UI (port 8080).
  • I can run my program successfully in local mode.
  • There is no master or worker log output in client mode.
  • In cluster mode, the worker log reports copying the jar and then the driver's failure.
  • In cluster mode, the master log shows:
21/07/17 17:45:08 INFO Master: Driver submitted org.apache.spark.deploy.worker.DriverWrapper
21/07/17 17:45:08 INFO Master: Launching driver driver-20210717174508-0007 on worker worker-20210716205255-10.0.8.137-45558
21/07/17 17:45:12 INFO Master: Removing driver: driver-20210717174508-0007
21/07/17 17:45:14 INFO Master: 10.0.8.137:34806 got disassociated, removing it.
21/07/17 17:45:14 INFO Master: 10.0.8.137:38394 got disassociated, removing it.

I suspect the same thing is happening in both deploy modes. In client mode the error is visible through spark-submit, but in cluster mode it is hidden and has to be checked from the Master UI (I don't know why it does not appear in the Master and Worker log files).

UPDATE: As Luis said, it was a Scala incompatibility: my Spark cluster was using an embedded Scala 2.11, not 2.12. I fixed it by downgrading my fat jar's Scala version to 2.11.
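
The concrete change was aligning the project's Scala version with the one embedded in the prebuilt Spark binaries (a minimal sketch; the exact 2.11 patch version should match whatever the spark-shell banner on the cluster reports):

// build.sbt: match the Scala line that the cluster's Spark binaries were built with.
scalaVersion := "2.11.12"

// Quick check inside spark-shell on the cluster; this prints the Scala version
// Spark itself embeds, regardless of the system-wide scala installation:
// scala> scala.util.Properties.versionString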

  • You didn't show that your libraries are using `2.11`; also, are you sure your cluster is using `2.11`? Maybe the cluster is using `2.12`. BTW, are you sure you are also using the same **Spark** versions? Finally, did you exclude the **Spark** libraries and the **Scala** stdlib jars from the uber jar? – Luis Miguel Mejía Suárez Jul 17 '21 at 18:16
  • I am using Scala 2.12, not 2.11. The server and the build both use Spark 2.4.8; you can check my `build.sbt` above. And I don't think I excluded anything; I just handled the "deduplication" issue by copy/pasting, and I don't know if this mergeStrategy is okay. – MalekMFS Jul 17 '21 at 19:59
  • You still don't show your **Scala** version in your `build.sbt`; also, you haven't confirmed what the **Scala** and **Spark** versions of your cluster are. Also, that merge strategy feels very complex, and you still haven't done the basic configuration shown here: https://stackoverflow.com/questions/52371961/spark-build-sbt-file-versioning/52375099#52375099 – Luis Miguel Mejía Suárez Jul 17 '21 at 20:16
  • Thanks. I added the Scala version from my `build.sbt` above. Also, I checked `scala -version` on my cluster, and it was `2.11`, so I downloaded and installed `2.12.12` and restarted the Master and the Slave, yet got the same logs... – MalekMFS Jul 17 '21 at 20:54
  • No matter what `scala -version` says, **Spark** doesn't use the system **Scala**; it uses its own embedded **Scala**. You need to check that using `spark-shell`. – Luis Miguel Mejía Suárez Jul 17 '21 at 21:28
  • spark-shell is showing `2.11`, unfortunately, while 2.4.8 is the latest Spark 2 release. I have "spark-2.4.8-bin-hadoop2.7.tgz" from https://archive.apache.org/dist/spark/spark-2.4.8/, but I wonder if I could replace it with "spark-2.4.8-bin-without-hadoop-scala-2.12.tgz"? – MalekMFS Jul 17 '21 at 22:07
  • If you want to use **Scala** `2.12`, then you need to download and install **Spark** binaries that were built against that version. Otherwise, just change the **Scala** version in your project and produce a new fat jar. – Luis Miguel Mejía Suárez Jul 17 '21 at 22:22
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/235014/discussion-between-malekmfs-and-luis-miguel-mejia-suarez). – MalekMFS Jul 18 '21 at 09:40
