I have a computer cluster consisting of a master node and a bunch of worker nodes, each with 8GB of physical RAM. I would like to run some Spark jobs on this cluster, but limit the amount of memory each worker is allowed to use for a given job. Is this possible within Spark?
For example, can I run a job so that each worker will only be allowed to use at most 1GB of physical RAM to do its job?
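For concreteness, here is a sketch of what I have in mind when submitting the job. I'm guessing at a setting like `spark.executor.memory` here; whether such a setting exists, and whether it actually caps physical RAM usage rather than just the JVM heap, is exactly what I'm asking:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical sketch: cap each worker/executor at roughly 1GB for this job.
// The "spark.executor.memory" setting is my guess at how this might be done;
// I don't know if it limits physical RAM or only the executor heap.
val conf = new SparkConf()
  .setAppName("memory-capped-job")
  .set("spark.executor.memory", "1g")

val sc = new SparkContext(conf)
```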
If not, is there some other way to artificially "limit" the RAM that each worker has?
If it is possible to limit RAM one way or another, what actually happens when Spark "runs out" of physical RAM? Does it page to disk, or does the Spark job simply fail?