I want to run a Spark wordcount application on four different files at the same time.
I have a standalone cluster with 4 worker nodes, each worker having one core and 1 GB of memory.
Spark runs in standalone mode with:

1. 4 worker nodes
2. 1 core per worker node
3. 1 GB memory per node
4. spark.cores.max set to 1
In ./conf/spark-env.sh:
```sh
export SPARK_MASTER_OPTS="-Dspark.deploy.defaultCores=1"
export SPARK_WORKER_OPTS="-Dspark.deploy.defaultCores=1"
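# (spark.deploy.defaultCores is read by the standalone master, not by the
# workers, so the SPARK_WORKER_OPTS line above is effectively a no-op)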
export SPARK_WORKER_CORES=1
export SPARK_WORKER_MEMORY=1g
export SPARK_WORKER_INSTANCES=4
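# launches 4 worker processes on this one machine, each with 1 core and 1g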
```
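If I read the docs correctly, spark.deploy.defaultCores only applies when an application does not set spark.cores.max itself. The same one-core cap can also be passed per application at submit time; a minimal sketch using the standalone-mode --total-executor-cores flag:

```sh
# cap this application at one core, equivalent to spark.cores.max=1
./bin/spark-submit --master spark://-Aspire-E5-001:7077 \
  --total-executor-cores 1 \
  ./wordcount.R txt1
```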
I executed them from a .sh file:
```sh
./bin/spark-submit --master spark://-Aspire-E5-001:7077 ./wordcount.R txt1 &
./bin/spark-submit --master spark://-Aspire-E5-001:7077 ./wordcount.R txt2 &
./bin/spark-submit --master spark://-Aspire-E5-001:7077 ./wordcount.R txt3 &
./bin/spark-submit --master spark://-Aspire-E5-001:7077 ./wordcount.R txt4
```
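The same submission written as a loop, backgrounding every job and waiting for all of them to finish (a sketch, assuming txt1 ... txt4 are in the working directory):

```sh
#!/bin/bash
# submit one wordcount application per input file; backgrounding each
# job lets the master accept all four at the same time
for f in txt1 txt2 txt3 txt4; do
  ./bin/spark-submit --master spark://-Aspire-E5-001:7077 ./wordcount.R "$f" &
done
wait  # block until all four applications have exited
```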
Is this the correct way to submit the applications in parallel?
When one application runs alone it takes about 2 seconds (using only one core), but when all 4 are submitted simultaneously each one takes more than 4 seconds. How do I run Spark applications on different files in parallel?