
I have a series of Spark jobs executed sequentially in a single class, as shown below:

SparkMainClass:

  • Job1 (broadcast join using 4 new DataFrames)
  • Job2 (broadcast join using 3 new DataFrames)
  • Job3 (broadcast join using 4 new DataFrames)
  • Job4 (broadcast join using 2 new DataFrames)

All four jobs execute one after another in sequence. After each job completes, I want to clear the DataFrames used in its broadcast joins to free driver and executor memory; otherwise I hit out-of-memory errors, or Spark suggests increasing driver memory.

Is there any way to clear broadcasted data from driver and executor memory dynamically? For example, once Job1 completes, clear its 4 broadcasted DataFrames, then start Job2, and so on.

Please assist.
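For reference, here is a minimal sketch of the two cleanup mechanisms I am aware of (the job bodies and DataFrames are placeholders, not my actual code): explicit `Broadcast` variables can be released with `unpersist`/`destroy`, while broadcasts created internally by the `broadcast()` join hint are reclaimed by Spark's ContextCleaner only after they become unreferenced on the driver.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object SparkMainClass {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("SparkMainClass").getOrCreate()
    import spark.implicits._

    // Job1: an explicitly broadcast value can be released once the job is done.
    val lookup = spark.sparkContext.broadcast(Map("a" -> 1, "b" -> 2))
    // ... run Job1 using lookup.value ...
    lookup.unpersist(blocking = true) // remove copies from executors
    lookup.destroy()                  // also release the driver-side copy

    // Job2: with the broadcast-join hint, the Broadcast object is internal.
    // Unpersist the small DataFrame and drop references so the ContextCleaner
    // can reclaim the underlying broadcast.
    val small = Seq((1, "x")).toDF("id", "v")
    val big   = Seq((1, "y"), (2, "z")).toDF("id", "w")
    big.join(broadcast(small), "id").count() // action triggers the broadcast
    small.unpersist(blocking = true)
    System.gc() // nudge driver GC so the cleaner can remove stale broadcasts
  }
}
```

This is what I have tried so far; I am unsure whether `System.gc()` is a reliable way to force cleanup between jobs, which is why I am asking for a proper approach.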

Ku002