
I have a simple program in Spark:

/* SimpleApp.scala */
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SimpleApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("spark://10.250.7.117:7077")
      .setAppName("Simple Application")
      .set("spark.cores.max", "2")
    val sc = new SparkContext(conf)
    val ratingsFile = sc.textFile("hdfs://hostname:8020/user/hdfs/mydata/movieLens/ds_small/ratings.csv")

    // print the first 10 records
    println("Getting the first 10 records: ")
    ratingsFile.take(10).foreach(println)

    // print the number of records in the movie ratings file
    println("The number of records in the movie list are: ")
    println(ratingsFile.count())

    sc.stop()
  }
}

When I try to run this program from the spark-shell, i.e. I log into the name node (Cloudera installation) and run the commands sequentially in the spark-shell:

val ratingsFile = sc.textFile("hdfs://hostname:8020/user/hdfs/mydata/movieLens/ds_small/ratings.csv")
println("Getting the first 10 records: ")
ratingsFile.take(10)    
println("The number of records in the movie list are : ")
ratingsFile.count() 

I get correct results, but if I try to run the program from Eclipse, no resources are assigned to the program and all I see in the console log is:

WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

Also, in the Spark UI, I see this:

[Screenshot from the Spark UI: "Job keeps Running - Spark"]

Also, it should be noted that this version of Spark was installed with Cloudera (hence no worker nodes show up).

What should I do to make this work?

EDIT:

I checked the History Server and these jobs don't show up there (not even under incomplete applications).

vineet sinha
  • Related question on the first part of the error message: [`TaskSchedulerImpl: Initial job has not accepted any resources;`](http://stackoverflow.com/q/29469462/1804173) – bluenote10 Aug 14 '16 at 10:17

5 Answers


I have done configuration and performance tuning for many Spark clusters, and this is a very common message to see when you are first configuring a cluster to handle your workloads.

This is unequivocally due to insufficient resources to launch the job. The job is requesting one of the following (see the sketch after this list for one way to cap the request):

  • more memory per worker than is allocated to it (1 GB)
  • more CPU cores than are available on the cluster
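
As a sketch of how to keep the request inside those limits (the master URL is the one from the question; the object name, memory and core values are only illustrative and must be sized to what your workers actually advertise in the cluster UI):

import org.apache.spark.{SparkConf, SparkContext}

object ResourceCappedApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("Simple Application")
      .setMaster("spark://10.250.7.117:7077")
      .set("spark.cores.max", "2")           // total cores the app may take across the cluster
      .set("spark.executor.memory", "512m")  // must not exceed the memory a single worker offers
    val sc = new SparkContext(conf)
    try {
      // trivial job just to confirm that executors are actually granted
      println(sc.parallelize(1 to 100).count())
    } finally {
      sc.stop()
    }
  }
}

If the warning disappears with these small values, the original request was simply larger than the cluster could satisfy.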
WestCoastProjects
  • I think the problem resides with the way this job is being deployed. The master URL should be specified only when Spark has its own master and slaves. However, in my case, the program is running on a YARN cluster. Not sure how the deployment works in that case. – vineet sinha Feb 29 '16 at 00:11
  • For YARN the master is simply `--master yarn`: http://spark.apache.org/docs/latest/running-on-yarn.html – WestCoastProjects Feb 29 '16 at 04:52
  • https://spark.apache.org/docs/latest/configuration. The cause is that the default in standalone mode is bad: if you do not turn down the number of cores requested (e.g. with `--total-executor-cores`), you will block your entire cluster. Per that page, the default is 1 in YARN mode, but all the available cores on the worker in standalone and Mesos coarse-grained modes. – mathtick Feb 27 '18 at 10:25

Finally figured out what the answer is.

When deploying a Spark program on a YARN cluster, the master URL is just `yarn`.

So in the program, the Spark configuration should just look like:

val conf = new SparkConf().setAppName("SimpleApp")

Then the Eclipse project should be built using Maven, and the generated jar should be deployed by copying it to the cluster and running the following command:

spark-submit --master yarn --class "SimpleApp" Recommender_2-0.0.1-SNAPSHOT.jar

This means that running directly from Eclipse would not work.
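
Putting the pieces together, a minimal sketch of the adjusted program (the HDFS path is the one from the question; the body of the job is only illustrative):

/* SimpleApp.scala, adjusted for spark-submit on YARN */
import org.apache.spark.{SparkConf, SparkContext}

object SimpleApp {
  def main(args: Array[String]): Unit = {
    // no setMaster here: the master is supplied by spark-submit via --master yarn
    val conf = new SparkConf().setAppName("SimpleApp")
    val sc = new SparkContext(conf)

    val ratingsFile = sc.textFile("hdfs://hostname:8020/user/hdfs/mydata/movieLens/ds_small/ratings.csv")
    ratingsFile.take(10).foreach(println)
    println(s"Number of ratings: ${ratingsFile.count()}")

    sc.stop()
  }
}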

vineet sinha
  • I get a similar error in HDP 2.4. When I set HDP as the master in standalone mode, I can run spark-shell from a slave server against the master server and run something like `val distData = sc.parallelize(Array(1, 2, 3, 4, 5))`. But when I try to read files from HDFS, it throws the same error. It seems it could work in Eclipse. Hope someone can help~ – Decula Mar 23 '16 at 00:20

Check your cluster's worker node cores: your application can't exceed that. For example, if you have two worker nodes with 4 cores each and 2 applications to run, you can give each application 4 cores to run its job.

You can set it like this in the code:

SparkConf sparkConf = new SparkConf().setAppName("JianSheJieDuan")
                          .set("spark.cores.max", "4");

It works for me.
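
For a Scala application like the one in the question, the equivalent setting (the app name and the value "4" are simply taken from the snippet above) would be:

import org.apache.spark.SparkConf

val sparkConf = new SparkConf()
  .setAppName("JianSheJieDuan")
  .set("spark.cores.max", "4") // cap total cores so two such apps fit on two 4-core workers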

iwwenbo

There are also other causes of this same error message besides those posted here.

For a Spark-on-Mesos cluster, make sure you have Java 8 or a newer Java version on the Mesos slaves.

For Spark standalone, make sure you have Java 8 (or newer) on the workers.
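
A rough way to confirm which JVM the executors actually run, once they do register, is to read the java.version system property from inside a small job (a sketch assuming a working SparkContext named sc, e.g. in spark-shell):

println("driver JVM: " + System.getProperty("java.version"))
sc.parallelize(1 to sc.defaultParallelism, sc.defaultParallelism)
  .map(_ => java.net.InetAddress.getLocalHost.getHostName + " -> " + System.getProperty("java.version"))
  .distinct()
  .collect()
  .foreach(println)

If the executors never come up at all because of an old JVM, this check will hang as well, and you have to look at the worker logs instead.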

Ayoub Omari

You don't have any workers to execute the job. There are no cores available for the job to execute on, and that's the reason the job's state is still 'Waiting'.

If you have no workers registered with Cloudera, how will the jobs execute?

Saket
  • According to what I know, if Spark is running over YARN, worker nodes don't show up in the UI because the workers are again managed by YARN? – vineet sinha Feb 26 '16 at 23:25
  • I have typically seen these errors when there are no workers which are available or there are not enough free cores for the job. – Saket Feb 26 '16 at 23:27
  • You're right. But since I'm running this program on YARN, this "master URL" format might be wrong. Any insights on that? – vineet sinha Feb 29 '16 at 00:12