
When I post simultaneous jobserver requests, they always seem to be processed in FIFO mode. This is despite my best efforts to enable the FAIR scheduler. How can I ensure that my requests are always processed in parallel?

Background: On my cluster there is one SparkContext to which users can post requests to process data. Each request may act on a different chunk of data but the operations are always the same. A small one-minute job should not have to wait for a large one-hour job to finish.

Intuitively I would expect the following to happen (see my configuration below): The context runs within a FAIR pool. Every time a user sends a request to process some data, Spark should split up the fair pool and give a fraction of the cluster resources to process that new request. Each request is then run in FIFO mode parallel to any other concurrent requests.

Here's what actually happens when I run simultaneous jobs: The interface says "1 Fair Scheduler Pools" and it lists one active (FIFO) pool named "default." It seems that everything is executing within the same FIFO pool, which itself is running alone within the FAIR pool. I can see that my fair pool details are loaded correctly on Spark's Environment page, but my requests are all processed in FIFO fashion.

How do I configure my environment/application so that every request actually runs in parallel to others? Do I need to create a separate context for each request? Do I create an arbitrary number of identical FIFO pools within my FAIR pool and then somehow pick an empty pool every time a request is made? Considering the objectives of Jobserver, it seems like this should all be automatic and not very complicated to set up. Below are some details from my configuration in case I've made a simple mistake.

From local.conf:

contexts {
 mycontext {
   spark.scheduler.mode = FAIR
   spark.scheduler.allocation file = /home/spark/job-server-1.6.0/scheduler.xml
   spark.scheduler.pool = fair_pool
 }
}

From scheduler.xml:

<?xml version="1.0"?>
<allocations>
  <pool name="fair_pool">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
  </pool>
</allocations>

Thanks for any ideas or pointers. Sorry for any confusion with terminology - the word "job" has two meanings in jobserver.

Graham S

1 Answer


Looking at your configuration, I found that

spark.scheduler.allocation file should be spark.scheduler.allocation.file

and all the values should be quoted, like this:

contexts {
  mycontext {
    spark.scheduler.mode = "FAIR"
    spark.scheduler.allocation.file = "/home/spark/job-server-1.6.0/scheduler.xml"
    spark.scheduler.pool = "fair_pool"
  }
}

Also, make sure that mycontext is actually created and that you pass mycontext as the context when submitting a job.

You can also verify that mycontext is using the FAIR scheduler via the Spark Master UI.

noorul
  • Thanks for pointing out the missing period. With that changed, fair_pool started showing up in the UI. However, the jobs were still running in the default pool. The critical change was to add `sc.setLocalProperty("spark.scheduler.pool", "fair_pool")` in my Scala code (see the sketch after these comments). I thought this could be specified in my Jobserver config file - apparently not! – Graham S Aug 25 '16 at 15:20
  • @grahamS can you create an issue? https://github.com/spark-jobserver/spark-jobserver/issues – ZeoS Aug 31 '16 at 11:25
  • @ZeoS [New issue here](https://github.com/spark-jobserver/spark-jobserver/issues/581) – Graham S Aug 31 '16 at 18:09
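
For reference, here is a minimal sketch of what the accepted fix from the comments looks like inside a job, assuming the older spark.jobserver.SparkJob API (which the job-server-1.6.0 path suggests). The object name MyPooledJob and the placeholder workload are illustrative; the pool name must match the one defined in scheduler.xml.

import com.typesafe.config.Config
import org.apache.spark.SparkContext
import spark.jobserver.{SparkJob, SparkJobValid, SparkJobValidation}

object MyPooledJob extends SparkJob {

  override def validate(sc: SparkContext, config: Config): SparkJobValidation = SparkJobValid

  override def runJob(sc: SparkContext, config: Config): Any = {
    // Spark jobs submitted from this thread go to fair_pool (as defined in scheduler.xml)
    // rather than the "default" pool.
    sc.setLocalProperty("spark.scheduler.pool", "fair_pool")

    // Placeholder workload; replace with the actual processing logic.
    sc.parallelize(1 to 1000).map(_ * 2).count()
  }
}

Setting the local property per thread is what routes each request into the fair pool; the pool definition in the allocation file alone does not assign jobs to it.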