
I have a Spark 2.1.1 job running in a Mesos cluster. The Spark UI shows 32 active executors, and RDD.getNumPartitions reports 28 partitions, but only one (random) executor does any work; all the others are marked as completed. I added debug statements (printed to stdout) in the executor code, and only one executor shows them. The pipeline is structured as follows: get a list of ids -> download JSON data for each id -> parse the JSON -> save to S3.

stage 1: val ids=session.sparkContext.textFile(path).repartition(28) -> RDD[String]

//ids.getNumPartitions shows 28
stage 2: val json = ids.mapPartitions { keys =>
  val urlBuilder ...
  val buffer ...
  keys map { key =>
    val url = urlBuilder.createUrl(key) // java.net.URL
    val json = url.openStream() ... // download text into buffer, close stream
    (key, json.toString)
  }
} -> RDD[(String, String)]
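A runnable sketch of stage 2 might look like the following. The URL template is an assumption, since the real urlBuilder is not shown in the question, and scala.io.Source stands in for the manual openStream/buffer handling:

```scala
import scala.io.Source

// Hypothetical stand-in for stage 2: the endpoint pattern below is a
// placeholder, not the actual urlBuilder from the question.
val json = ids.mapPartitions { keys =>
  keys.map { key =>
    val url = s"https://example.com/api/$key.json" // assumed URL scheme
    val source = Source.fromURL(url)               // opens the stream
    try (key, source.mkString)                     // read full body as text
    finally source.close()                         // always close the stream
  }
}
```

Because mapPartitions returns an iterator, the downloads happen lazily as each partition is consumed, one key at a time per task.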

stage 3: val output = json flatMap { t =>
  val values = ... // parse JSON; extract values, or an empty sequence if none found
  values map { value => (t._1, value) }
} -> RDD[(String, String)]

stage 4: output.saveAsTextFile("s3://...")

These are the config settings for the Spark job: --driver-memory 32g --conf spark.driver.cores=4 --executor-memory 4g --conf spark.cores.max=128 --conf spark.executor.cores=4

The stage that runs on only one executor is the second one. I explicitly set the number of partitions (repartition(28)) in stage one. Has anyone seen this behavior before? Thanks,

M

SOLUTION

I went the other way (see the suggestion from Travis below) and increased the number of partitions after stage 1 to 100. That worked; the job finished in a matter of minutes. But there is a side effect: I now have 100 partial files sitting in S3.
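One way to tame that side effect (a sketch, not part of the original post) is to narrow the partition count just before the write, so the download and parse stages keep their 100-way parallelism while fewer part files land in S3:

```scala
// coalesce(10) narrows 100 partitions into 10 without a full shuffle,
// so roughly 10 part files are written instead of 100, while the
// earlier stages still run with 100 tasks. The bucket path is a
// placeholder, as in the question.
output.coalesce(10).saveAsTextFile("s3://...")
```

coalesce is preferable to repartition here because it avoids shuffling the data a second time; the trade-off is that each writer task handles about ten partitions' worth of output.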

1 Answer


Make sure that your .repartition() stage is happening after you "Get list of ids".

It sounds like you are generating an empty set with 28 partitions first, and then getting the list of ids into a single partition.

EDIT after example code provided:

Is it possible that each task completes quickly (i.e. within a few seconds)? I have seen Spark decline to schedule tasks to idle executors when tasks complete in a short amount of time, even with thousands of outstanding tasks. If that is the case, you may need fewer partitions so that each task takes a little longer; sometimes that is enough to get the task scheduler to spread tasks across idle executors.
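If the scheduler is instead holding tasks back while it waits for a data-local slot, another knob worth trying (this is an assumption about the cause, not something confirmed by the question) is lowering spark.locality.wait so pending tasks fall through to idle executors sooner:

```shell
# Lower the locality wait so the scheduler gives up on preferred-location
# placement quickly and hands pending tasks to any idle executor.
# The rest of the submit command is elided, as in the question.
spark-submit \
  --conf spark.locality.wait=0s \
  ...
```

The default in Spark 2.x is 3s per locality level, which can leave executors idle when tasks are short-lived.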

Travis Hegner
  • @TravisHegner, I am having the same problem. My pipeline reads a df from Oracle with numOfPartitions = 20 and inserts into Cassandra. Only one executor out of 20 does any work; the rest finish in ms. What should I do in the code to fix this? – BdEngineer Nov 22 '18 at 10:32
  • First, make sure this is not your problem: https://stackoverflow.com/a/40938905/2639647. If that doesn't cover it, I'd post a new question with the code included. – Travis Hegner Nov 25 '18 at 01:56