So I'm trying to run a Spark pipeline on EMR, and I'm creating a step like so:
import com.amazonaws.services.elasticmapreduce.model.{ActionOnFailure, HadoopJarStepConfig, StepConfig}

// Build the Spark job submission step for the EMR cluster
val runSparkJob = new StepConfig()
  .withName("Run Pipeline")
  .withActionOnFailure(ActionOnFailure.TERMINATE_CLUSTER)
  .withHadoopJarStep(
    new HadoopJarStepConfig()
      .withJar(jarS3Path)                     // S3 path to the assembled application JAR
      .withMainClass("com.example.SparkApp")
  )
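For context, I then submit the step to an existing cluster roughly like this (a simplified sketch; the cluster ID is a placeholder and the client setup is stripped down):

import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClientBuilder
import com.amazonaws.services.elasticmapreduce.model.AddJobFlowStepsRequest

// Placeholder cluster ID; the real one comes from configuration
val emr = AmazonElasticMapReduceClientBuilder.defaultClient()
emr.addJobFlowSteps(
  new AddJobFlowStepsRequest()
    .withJobFlowId("j-XXXXXXXXXXXXX")
    .withSteps(runSparkJob)
)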
The problem is that when I run this, the step fails with the following exception:
org.apache.spark.SparkException: A master URL must be set in your configuration
The thing is, I can't figure out where the master URL is supposed to be specified. Do I set it when configuring the step that runs the pipeline, or do I need to somehow get the master's IP:port into the application and set it in the main function?
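For reference, the main function builds the SparkSession roughly like this (a simplified sketch; the app name is a placeholder, and there is no .master(...) call anywhere):

import org.apache.spark.sql.SparkSession

object SparkApp {
  def main(args: Array[String]): Unit = {
    // No .master(...) set here -- is this where it should go, and if so, with what value?
    val spark = SparkSession.builder()
      .appName("Run Pipeline")
      .getOrCreate()

    // ... pipeline logic ...

    spark.stop()
  }
}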