I have a series of Spark jobs executed from a single class, as shown below:
SparkMainClass:
- Job1 (Using 4 new dataframes in broadcast join)
- Job2 (Using 3 new dataframes in broadcast join)
- Job3 (Using 4 new dataframes in broadcast join)
- Job4 (Using 2 new dataframes in broadcast join)
All four jobs execute one after another in sequence. After each job completes, I want to clear the DataFrames used in its broadcast join to free driver and executor memory; otherwise I run into out-of-memory errors or have to keep increasing driver memory.
Is there a way to clear broadcast data from driver and executor memory dynamically? For example: once Job1 completes, clear its 4 broadcast DataFrames, then start Job2, and so on.
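Here is a sketch of the kind of cleanup I am hoping for. The job methods (`runJob1`, `loadJob1Dims`, etc.) are placeholders for my actual code; the cleanup calls (`DataFrame.unpersist`, `Catalog.clearCache`, `Broadcast.unpersist`/`destroy`) are real Spark APIs, but I am not sure they actually reclaim the memory used by an automatic broadcast join:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object SparkMainClass {

  // Run one job, then release the DataFrames it broadcast before the next job starts.
  def runThenRelease(dfs: Seq[DataFrame])(job: => Unit): Unit = {
    job
    // Unpersist each DataFrame; blocking = true waits until executors confirm removal.
    dfs.foreach(_.unpersist(blocking = true))
    // Nudge GC so Spark's ContextCleaner can reclaim unreferenced broadcast blocks.
    System.gc()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("sequential-jobs").getOrCreate()

    val job1Dims: Seq[DataFrame] = loadJob1Dims(spark) // placeholder loader
    runThenRelease(job1Dims)(runJob1(spark, job1Dims))

    val job2Dims: Seq[DataFrame] = loadJob2Dims(spark) // placeholder loader
    runThenRelease(job2Dims)(runJob2(spark, job2Dims))

    // ... Job3 and Job4 follow the same pattern ...

    // Heavier option between jobs: drop everything cached in this session.
    spark.catalog.clearCache()
  }
}
```

For explicit broadcast variables created with `sparkContext.broadcast(...)`, I know `bc.unpersist(blocking = true)` removes the copies on executors and `bc.destroy()` also frees the driver-side copy. What I cannot find is the equivalent for the broadcast hash tables that Spark builds internally during a broadcast join.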
Please assist.