Spark config, org.apache.spark.shuffle.FetchFailedException Failed to connect

Question

I installed Hadoop 3.1.0 and Spark 2.4.7 on 4 virtual machines, with 32 cores and 128G of memory in total. I have been running the following spark-shell test:

[hadoop@hadoop1 bin]$ hadoop fs -mkdir -p /user/hadoop/testdata
[hadoop@hadoop1 bin]$ hadoop fs -put /app/hadoop/hadoop-2.2.0/etc/hadoop/core-site.xml /user/hadoop/testdata
[hadoop@hadoop1 bin]$ spark-shell --master spark://hadoop1:7077

scala> val rdd = sc.textFile("hdfs://hadoop1:9000/user/hadoop/testdata/core-site.xml")
scala> rdd.cache()
scala> val wordcount = rdd.flatMap(_.split(" ")).map(x => (x, 1)).reduceByKey(_ + _)
scala> wordcount.take(10)
scala> val wordsort = wordcount.map(x => (x._2, x._1)).sortByKey(false).map(x => (x._2, x._1))
scala> wordsort.take(10)

I have been experimenting with the following parameters:

spark.core.connection.ack.wait.timeout 600s
spark.default.parallelism 4
spark.driver.memory 6g
spark.executor.memory 6g
spark.cores.max 21
spark.executor.cores 3

With these settings, I ran into either org.apache.spark.shuffle.FetchFailedException: Failed to connect 192.168.0.XXX, or this warning: WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
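
To make the resource arithmetic behind these settings concrete, here is a minimal sketch of how I understand the standalone scheduler to size executors. It assumes the 32 cores and 128G are split evenly across the 4 VMs (8 cores and 32G per worker), which is my assumption and not something measured on the cluster. The names ExecutorFit, Worker and ExecutorRequest are hypothetical helpers, not Spark APIs.

```scala
// Rough check of whether the requested executors fit on the workers.
// Assumption: each of the 4 workers offers 8 cores and 32G (an even split
// of the stated totals); the real per-worker values may differ.
object ExecutorFit {
  final case class Worker(cores: Int, memoryGb: Int)
  final case class ExecutorRequest(coresMax: Int, executorCores: Int, executorMemoryGb: Int)

  // Executors a single worker can host: limited by both cores and memory.
  def executorsPerWorker(w: Worker, r: ExecutorRequest): Int =
    math.min(w.cores / r.executorCores, w.memoryGb / r.executorMemoryGb)

  // Executors the application asks for: spark.cores.max / spark.executor.cores.
  def executorsRequested(r: ExecutorRequest): Int =
    r.coresMax / r.executorCores

  val workers: Seq[Worker] = Seq.fill(4)(Worker(cores = 8, memoryGb = 32))

  // Values taken from the parameter list above.
  val request: ExecutorRequest =
    ExecutorRequest(coresMax = 21, executorCores = 3, executorMemoryGb = 6)

  def main(args: Array[String]): Unit = {
    val capacity = workers.map(executorsPerWorker(_, request)).sum
    val wanted = executorsRequested(request)
    println(s"requested executors: $wanted, cluster capacity: $capacity")
  }
}
```

Under these assumptions the request is for 7 executors and the cluster could host 8, so on paper the settings fit. If the workers actually advertise fewer cores or less memory than assumed, a single executor may not fit on any worker, which would be one possible cause of the "has not accepted any resources" warning.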

Is there a general guide to fine-tune these and any other parameters?

https://stackoverflow.com/questions/39268624/how-to-fix-org-apache-spark-shuffle-fetchfailedexception-failed-to-connect-in — itIsNaz, Feb 24 '21 at 11:59
I tried it and got WARN TaskSchedulerImpl: Initial job has not accepted any resources; This is why I ask for guidelines of fine-tuning the parameters as a whole. — davidzxc574, Feb 24 '21 at 14:02
