I hear people talking about an "Apache Standalone Cluster", which confuses me because I understand a "cluster" to be several machines connected by a (potentially fast) network and working in parallel, and "standalone" to mean a machine or program that is isolated. So my question is: can Apache Standalone do distributed work across a network? And if it can, how does it differ from the non-standalone versions?
Standalone (not to be confused with local mode) in Spark means that you don't use an external resource manager (YARN, Mesos) but Spark's own resource-management utilities. It can be distributed in the same way as Spark on other cluster managers.

Spark in local mode runs in a single JVM. It cannot be distributed (though, within the limits of a single machine, it is still parallelized with threads and processes) and is useful only for development.
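As a rough sketch, the difference shows up in the `--master` URL you pass to `spark-submit`; the host name and application file below are placeholders:

```shell
# Local mode: everything runs in a single JVM on this machine,
# parallelized only with threads -- no network distribution.
spark-submit --master "local[*]" my_app.py

# Standalone mode: Spark's own built-in cluster manager. Workers on
# other machines register with the master at spark://<host>:7077,
# so work is distributed across the network.
spark-submit --master spark://master-host:7077 my_app.py

# The same application on an external resource manager instead (YARN).
spark-submit --master yarn --deploy-mode cluster my_app.py
```

In all three cases the application code is unchanged; only the cluster manager answering the `--master` URL differs.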

Alper t. Turker
Thanks! That clarified the issue a lot! – Martin Ventura Aug 04 '17 at 21:02