I'm trying to write an Apache Spark job in Scala. I'm new to Scala and have previously used PySpark. I get an error when the job starts. Code:
import org.apache.spark.sql.SparkSession

object SparkRMSP_full {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder
      .appName("parse_full_rmsp_job")
      .getOrCreate()

    // Structured Streaming source reading from the Kafka topic
    val raw_data_df = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "10.1.24.111:9092")
      .option("subscribe", "dev.etl.fns.rmsp.raw-data")
      .load()

    println(raw_data_df.isStreaming)
    raw_data_df.printSchema()
  }
}
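One thing that may matter here (the package name my.example below is purely hypothetical; the real layout is only visible in the linked project structure): if the source file sits under a package, the name spark-submit has to load is the fully qualified one, not the bare object name. A minimal sketch:

package my.example // hypothetical package, for illustration only

import org.apache.spark.sql.SparkSession

object SparkRMSP_full {
  def main(args: Array[String]): Unit = {
    // With the package declaration above, the class spark-submit must load
    // is "my.example.SparkRMSP_full", not just "SparkRMSP_full".
    val spark = SparkSession.builder.appName("parse_full_rmsp_job").getOrCreate()
    println(spark.version)
  }
}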
spark-submit command:
spark-submit --packages org.apache.spark:spark-streaming-kafka-0-10-assembly_2.11:2.1.0 --master local --num-executors 2 --executor-memory 2g --driver-memory 1g --executor-cores 2 "C:\tools\jar\streaming_spark.jar"
And I get this error:
20/07/15 15:05:32 WARN SparkSubmit$$anon$2: Failed to load SparkRMSP_full.
java.lang.ClassNotFoundException: SparkRMSP_full
How should I declare the class correctly?
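For reference, spark-submit resolves the entry point either from the Main-Class entry in the jar manifest or from an explicit --class option; a minimal sketch of the same command with the class spelled out, assuming the object really lives in the default package (otherwise the fully qualified name goes there):

spark-submit --class SparkRMSP_full --packages org.apache.spark:spark-streaming-kafka-0-10-assembly_2.11:2.1.0 --master local --num-executors 2 --executor-memory 2g --driver-memory 1g --executor-cores 2 "C:\tools\jar\streaming_spark.jar"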
UPD:
build.sbt:
name := "streaming_spark"
version := "0.1"
scalaVersion := "2.11.12"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.1"
libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka-0-10-assembly" % "2.3.1"
project structure on pastebin
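For completeness, a minimal sketch of a build.sbt that pins the entry point at packaging time; the mainClass setting and the spark-sql-kafka-0-10 dependency are assumptions on my side rather than something taken from the original build (the latter is the artifact the Structured Streaming format("kafka") source normally comes from, as opposed to the DStream assembly used above):

name := "streaming_spark"

version := "0.1"

scalaVersion := "2.11.12"

// Assumed: write "Main-Class: SparkRMSP_full" into the manifest of the jar
// produced by `sbt package` (sbt 1.x slash syntax would be Compile / mainClass)
mainClass in Compile := Some("SparkRMSP_full")

libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.1"

// Assumed: the Kafka source for spark.readStream.format("kafka") comes from
// spark-sql-kafka-0-10 rather than spark-streaming-kafka-0-10-assembly
libraryDependencies += "org.apache.spark" %% "spark-sql-kafka-0-10" % "2.3.1"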