I would like to submit a spark job with configuring additional jar on hdfs, however the hadoop gives me a warning on skipping remote jar. Although I can still get my final results on hdfs, I cannot obtain the effect of additional remote jar. I would appreciate if you can give me some suggestions.
Many thanks,
root@cluster-1-m:~# hadoop fs -ls hdfs://10.146.0.4:8020/tmp/jvm-profiler-1.0.0.jar
-rwxr-xr-x 2 hdfs hadoop 7097056 2019-01-23 14:44 hdfs://10.146.0.4:8020/tmp/jvm-profiler-1.0.0.jar
root@cluster-1-m:~#/usr/lib/spark/bin/spark-submit \
--deploy-mode cluster \
--master yarn \
--conf spark.jars=hdfs://10.146.0.4:8020/tmp/jvm-profiler-1.0.0.jar \
--conf spark.driver.extraJavaOptions=-javaagent:jvm-profiler-1.0.0.jar \
--conf spark.executor.extraJavaOptions=-javaagent:jvm-profiler-1.0.0.jar \
--class com.github.ehiggs.spark.terasort.TeraSort \
/root/spark-terasort-master/target/spark-terasort-1.1-SNAPSHOT-jar-with-dependencies.jar /tmp/data/terasort_in /tmp/data/terasort_out
Warning: Skip remote jar hdfs://10.146.0.4:8020/tmp/jvm-profiler-1.0.0.jar.
19/01/24 02:20:31 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at cluster-1-m/10.146.0.4:8032
19/01/24 02:20:31 INFO org.apache.hadoop.yarn.client.AHSProxy: Connecting to Application History server at cluster-1-m/10.146.0.4:10200
19/01/24 02:20:34 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl: Submitted application application_1548293702222_0002