I'm trying to use spark-submit to connect to my remote cluster and run a jar file that I've put on HDFS.
I have the following property in my $SPARK_HOME/libexec/conf/core-site.xml, which is also in $HADOOP_HOME/libexec/etc/hadoop/:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mydns.asuscomm.com:8021</value>
</property>
I can successfully list the file from my laptop using:
hdfs dfs -ls hdfs:///user/stevenhurwitt/jars/
But when I run spark-submit with:
spark-submit --deploy-mode cluster \
--class com.steven.redditStreaming.dataFrameFromCSVFile \
hdfs:///user/stevenhurwitt/jars/redditStreaming-1.0-SNAPSHOT.jar
I get the following error:
ERROR deploy.ClientEndpoint: Exception from cluster was: java.io.IOException: Incomplete HDFS URI, no host: hdfs:///user/stevenhurwitt/jars/redditStreaming-1.0-SNAPSHOT.jar
I have also tried hdfs:///mydns.asuscomm.com:8021/user..., but I still get the same error.
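For context, the "no host" part of the error can be reproduced outside Spark with plain URI parsing (Hadoop parses the authority of an hdfs:// URI the same generic way). This is just an illustration with Python's urlparse, not Spark's actual code path; it shows that a triple slash leaves the host empty, and that appending host:port after three slashes only turns it into the first path component:

```python
from urllib.parse import urlparse

# Three slashes: the authority (host:port) between "//" and the next "/"
# is empty, so there is no host to connect to.
no_host = urlparse("hdfs:///user/stevenhurwitt/jars/redditStreaming-1.0-SNAPSHOT.jar")
print(repr(no_host.netloc))  # ''

# Fully qualified: host:port sits between the second and third slash.
with_host = urlparse("hdfs://mydns.asuscomm.com:8021/user/stevenhurwitt/jars/redditStreaming-1.0-SNAPSHOT.jar")
print(repr(with_host.netloc))  # 'mydns.asuscomm.com:8021'

# Three slashes plus the host: the host becomes a path component,
# and the authority is still empty.
still_no_host = urlparse("hdfs:///mydns.asuscomm.com:8021/user/stevenhurwitt/jars/redditStreaming-1.0-SNAPSHOT.jar")
print(repr(still_no_host.netloc))  # ''
```

(The `hdfs dfs -ls hdfs:///...` command works locally presumably because the client fills the empty authority from fs.defaultFS in my core-site.xml, whereas the cluster-mode driver apparently does not.)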