
I have set up a Spark Standalone cluster with two virtual machines.
On the 1st VM (8 cores, 64 GB memory), I started the master manually with the command bin/spark-class org.apache.spark.deploy.master.Master.
On the 2nd VM (8 cores, 64 GB memory), I started a worker manually with
bin/spark-class org.apache.spark.deploy.worker.Worker spark://<hostname of master>:7077.
Then on the 1st VM, I also started a worker using the same worker command. As the screenshot below shows, both workers and the master are started and ALIVE.
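For reference, the same daemons can also be launched with the launcher scripts bundled with Spark. This is a minimal sketch assuming a stock layout under $SPARK_HOME (in newer Spark releases start-slave.sh is renamed start-worker.sh):

# On the master VM
$SPARK_HOME/sbin/start-master.sh

# On each worker VM, pointing at the master's URL
$SPARK_HOME/sbin/start-slave.sh spark://<hostname of master>:7077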

But when I run my Spark applications, only the worker on the 2nd VM is used (worker-20160613102937-10.0.37.150-47668). The worker on the 1st VM (worker-20160613103042-10.0.37.142-52601) doesn't run any executors. See the screenshot below.

Spark Standalone Cluster UI

I want both workers to be used by my Spark applications. How can this be done?

EDIT: See this screenshot of the Executor Summary, where the executors corresponding to the worker on the 1st VM have failed.

Executor Summary

When I click on any stdout or stderr link, it shows an "invalid log directory" error. See the screenshot below.

error
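For anyone debugging the same "invalid log directory" message, a minimal sketch of what to check on the failing VM (the paths are assumptions based on a default install under /usr/local/spark; a worker started directly via spark-class logs to its console, while the sbin scripts write under logs/):

# Can the worker's user actually write here?
ls -ld /usr/local/spark /usr/local/spark/work

# If the worker was started with the sbin scripts, check its log for the root cause
tail -n 50 /usr/local/spark/logs/spark-*-org.apache.spark.deploy.worker.Worker-*.out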

1 Answer


The error is resolved. Spark was not able to create the log directory on the 1st VM: the user I was submitting the Spark job as didn't have permission to create files under /usr/local/spark. Changing the read/write permissions of the directory (chmod -R 777 /usr/local/spark) did the trick.
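If world-writable permissions are broader than you want, a narrower sketch is to hand ownership of just the directories Spark writes to over to the submitting user (sparkuser is a hypothetical name; by default the worker creates work/ under the install directory):

# Give the submitting user ownership of the directories Spark writes to
sudo chown -R sparkuser:sparkuser /usr/local/spark/work /usr/local/spark/logs

# Then ensure the owner can read/write them (X keeps directories traversable)
sudo chmod -R u+rwX /usr/local/spark/work /usr/local/spark/logs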
