
I've built a Twitter streaming job using Twitter's streaming user API and Spark Streaming. It runs successfully on my local machine. But when I run the program on the cluster in local mode, it only runs successfully the very first time. On later runs it throws the following exception:

"Exception in thread "main" org.apache.spark.SparkException: Found both spark.executor.extraClassPath and SPARK_CLASSPATH. Use only the former."

SPARK_CLASSPATH is already unset! I have to create a new checkpoint directory each time to make it run successfully; otherwise it throws the exception above.

Can anyone help me to resolve this issue? Thanks :)

Sadaf

2 Answers


Try using:

#!/bin/bash
HBASE_HOME=/opt/cloudera/parcels/CDH/lib/hbase
# Use a variable name other than SPARK_CLASSPATH so the deprecated variable is never set.
EXTRA_CLASSPATH="$HBASE_HOME/conf/:$HBASE_HOME/hbase-client.jar:$HBASE_HOME/hbase-protocol.jar:$HBASE_HOME/lib/htrace-core.jar:$HBASE_HOME/lib/htrace-core-3.1.0-incubating.jar"
# Options such as --class must come before the application jar; anything after the jar is passed to the program.
spark-submit --class your_entry_class --num-executors 2 --executor-cores 2 --executor-memory 10G --conf "spark.executor.extraClassPath=$EXTRA_CLASSPATH" your_spark_program.jar

The most important part is --conf spark.executor.extraClassPath=$EXTRA_CLASSPATH.
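Since the exception is raised when Spark sees both settings at once, it may help to make sure the deprecated variable is not inherited from the calling shell before launching. The following is only a sketch: `clear_spark_classpath` is a hypothetical helper name, and it will not help if SPARK_CLASSPATH is set in conf/spark-env.sh, because Spark's launch scripts source that file themselves.

```shell
#!/bin/bash
# Sketch: drop the deprecated SPARK_CLASSPATH variable from the current shell
# so that spark-submit only sees spark.executor.extraClassPath.
# clear_spark_classpath is a hypothetical helper, not part of Spark.
clear_spark_classpath() {
  if [ -n "${SPARK_CLASSPATH:-}" ]; then
    # Report on stderr what is being removed, then remove it.
    echo "Unsetting SPARK_CLASSPATH (was: $SPARK_CLASSPATH)" >&2
    unset SPARK_CLASSPATH
  fi
}

clear_spark_classpath
# Your own spark-submit command would follow here, for example:
# spark-submit --conf "spark.executor.extraClassPath=..." ...
```

Running `env | grep SPARK_CLASSPATH` in the same shell afterwards is a quick way to confirm that nothing is left over.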

shgy

I faced a similar issue. Setting SPARK_CLASSPATH causes problems because it is deprecated, so don't use it. Pass the dependency jars with --jars instead:

export LIB_JARS=dependency/jcodings-1.0.8.jar,dependency.....etc
spark-submit --deploy-mode client --master local --class org.xyz.spark.driver.SomeClass --num-executors 10 --jars ${LIB_JARS} your_spark_program.jar
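
Note that the two options use different separators: --jars expects a comma-separated list, while spark.executor.extraClassPath is an ordinary classpath string (colon-separated on Linux). A small sketch of building both from one list follows; `join_by` is a hypothetical helper and `dependency/example-lib.jar` is a made-up second entry used only for illustration.

```shell
#!/bin/bash
# Hypothetical helper: join all remaining arguments with the separator given first.
join_by() {
  local IFS="$1"   # "$*" joins its words with the first character of IFS
  shift
  echo "$*"
}

# Illustrative jar list; only the jcodings entry comes from the answer above.
JARS=(dependency/jcodings-1.0.8.jar dependency/example-lib.jar)

LIB_JARS=$(join_by , "${JARS[@]}")   # comma-separated, for --jars
EXTRA_CP=$(join_by : "${JARS[@]}")   # colon-separated, for extraClassPath

echo "--jars $LIB_JARS"
echo "--conf spark.executor.extraClassPath=$EXTRA_CP"
```

Jars passed with --jars are shipped with the application, whereas paths in extraClassPath are expected to already exist on each node, so --jars is usually the simpler choice when the files live next to the job.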
Abhiram