
I am working with Spark inside Zeppelin in a collaborative environment, so we have a single interpreter shared by many users. For this reason, I configured it with per-user instantiation in scoped mode.

With this configuration, a job from user X waits for the resources allocated to jobs from other users.

To change this behavior and allow jobs from different users to run at the same time, I set the Spark property spark.scheduler.mode to FAIR in the Zeppelin interpreter configuration. For this to have the desired effect, each user needs to manually define their own Spark pool in their notebook (jobs from different pools can run at the same time: https://spark.apache.org/docs/latest/job-scheduling.html#scheduling-within-an-application) with this code:

sc.setLocalProperty("spark.scheduler.pool", "pool1")

P.S.: After one hour, the interpreter shuts down. If users forget to execute this command the next time, their jobs fall into the default pool, which is not good.

What I want to know: is it possible to assign a Spark scheduler pool to a user automatically whenever they execute their paragraphs, without this manual effort every time?

If there is another way to achieve this, please let me know.

Igor Uchôa
  • In 0.7.1 there was an **experimental** feature called [Interpreter Execution Hooks](https://zeppelin.apache.org/docs/0.7.1/manual/interpreterexechooks.html). It sounds like it is run for each paragraph, but for your example it probably won't matter. Could this work for you? (see the sketch below the comments) – GMc May 16 '19 at 22:39
  • Thanks for your solution. I was reading about it and it seems to be a viable approach. The only drawback is that the end users (Zeppelin users) will need to worry about infrastructure issues. Thanks! – Igor Uchôa May 17 '19 at 08:33
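
Following up on the hooks suggestion above: a minimal sketch, assuming the experimental ZeppelinContext hook API from the linked 0.7.1 docs (z.registerHook), of registering a pre-execution hook so the pool is set before every paragraph. The pool name is a placeholder:

```scala
// Minimal sketch, assuming Zeppelin 0.7.1's experimental hook API.
// Run once (e.g., in an init paragraph); Zeppelin then executes the
// registered snippet before every subsequent paragraph in the note.
// "pool1" is a placeholder pool name.
z.registerHook("pre_exec", """sc.setLocalProperty("spark.scheduler.pool", "pool1")""")

// To stop injecting the property, unregister the hook (assuming the
// matching unregister call from the same experimental API).
z.unregisterHook("pre_exec")
```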

0 Answers