
I am running a Spark Streaming job in cluster mode, and I have created a pool with 200 GB of memory (CDH). I want to run my Spark Streaming job in that pool, so I tried setting

sc.setLocalProperty("spark.scheduler.pool", "pool")

in the code, but it is not working. I also tried setting spark.scheduler.pool as a configuration property, but that does not seem to work in Spark Streaming either; whenever I run the job it goes into the default pool. What could the issue be? Is there any configuration I can add while submitting the job?

Justin

2 Answers


On YARN we can add

--conf spark.yarn.queue="que_name"

to the spark-submit command. The job will then use that particular queue and its resources only.
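For example, a full submission might look like this (a sketch; the class name and jar are placeholders, and que_name must be a queue defined in your YARN scheduler):

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.yarn.queue=que_name \
  --class com.example.StreamingJob \
  streaming-job.jar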

Justin

I ran into this same issue with Spark 2.4. In my case, the problem was resolved by removing the default "spark.scheduler.pool" option from my Spark config.

I traced the issue to a bug in Spark: https://issues.apache.org/jira/browse/SPARK-26988. The problem is that if you set the config property "spark.scheduler.pool" in the base configuration, you can't override it later using setLocalProperty. Removing it from the base configuration made it work correctly. See the bug description for more detail.
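As a sketch of the working setup (assuming a FAIR scheduler pool named "pool" exists in your allocation file; the app name and batch interval are placeholders):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Do not set spark.scheduler.pool here: per SPARK-26988, a value in the
// base config cannot later be overridden with setLocalProperty.
val conf = new SparkConf()
  .setAppName("streaming-job")         // placeholder app name
  .set("spark.scheduler.mode", "FAIR") // enable the fair scheduler

val ssc = new StreamingContext(conf, Seconds(10)) // placeholder batch interval

// Set the pool as a thread-local property before starting the context;
// streaming jobs submitted afterwards are scheduled in that pool.
ssc.sparkContext.setLocalProperty("spark.scheduler.pool", "pool")

// ... define the streaming computation, then:
ssc.start()
ssc.awaitTermination()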

Dave DeCaprio