
I have the following code, which calls hiveContext.sql() most of the time. My task is to create a few tables and insert values into them after processing every Hive table partition.

So I first run SHOW PARTITIONS and, using its output in a for-loop, call a few methods that create each table (if it doesn't exist) and insert into it using hiveContext.sql().

Now, hiveContext can't be used inside an executor, so I have to run this for-loop in the driver program, where it executes serially, one partition at a time. When I submit this Spark job on a YARN cluster, my executors almost always get lost because of a "shuffle not found" exception.

This happens because YARN kills my executors for using too much memory. I don't understand why, since the data set for each Hive partition is very small, yet YARN still kills my executors.

Will the following code do everything in parallel and try to hold the data of all Hive partitions in memory at the same time?

public static void main(String[] args) throws IOException {
    SparkConf conf = new SparkConf();
    SparkContext sc = new SparkContext(conf);
    HiveContext hc = new HiveContext(sc);
    FileSystem fs = FileSystem.get(sc.hadoopConfiguration());

    // One row per partition, e.g. "server=abc/date=2015-08-05"
    DataFrame partitionFrame = hc.sql("SHOW PARTITIONS dbdata PARTITION (date='2015-08-05')");

    Row[] rowArr = partitionFrame.collect();   // collected on the driver
    for (Row row : rowArr) {
        String[] splitArr = row.getString(0).split("/");
        String server = splitArr[0].split("=")[1];
        String date = splitArr[1].split("=")[1];
        String csvPath = "hdfs:///user/db/ext/" + server + ".csv";
        if (fs.exists(new Path(csvPath))) {
            hc.sql("ADD FILE " + csvPath);
        }
        // Each helper creates its table if it doesn't exist and inserts into it via hc.sql()
        createInsertIntoTableABC(hc, server, date);
        createInsertIntoTableDEF(hc, server, date);
        createInsertIntoTableGHI(hc, server, date);
        createInsertIntoTableJKL(hc, server, date);
        createInsertIntoTableMNO(hc, server, date);
    }
}

2 Answers


Generally, you should always dig into the logs to get the real exception out (at least in Spark 1.3.1).

tl;dr
Safe config for Spark under YARN:
spark.shuffle.memoryFraction=0.5 - this allows the shuffle to use more of the allocated memory
spark.yarn.executor.memoryOverhead=1024 - this is set in MB. YARN kills an executor when its memory usage is larger than (executor-memory + executor.memoryOverhead)

A little more info

You mention in your question that you get a "shuffle not found" exception.

In the case of org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle, you should increase spark.shuffle.memoryFraction, for example to 0.5.

The most common reason for YARN killing off my executors was memory usage beyond what it expected. To avoid that, increase spark.yarn.executor.memoryOverhead; I've set it to 1024, even though my executors use only 2-3 GB of memory.
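As a reference, here is a minimal sketch of how those two settings could be applied programmatically when building the context (they can equally be passed to spark-submit via --conf); the class name and values are illustrative only:

import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;
import org.apache.spark.sql.hive.HiveContext;

public class PartitionJob {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
            .set("spark.shuffle.memoryFraction", "0.5")        // fraction of the heap available to shuffle buffers
            .set("spark.yarn.executor.memoryOverhead", "1024"); // extra off-heap headroom for YARN, in MB

        SparkContext sc = new SparkContext(conf);
        HiveContext hc = new HiveContext(sc);
        // ... run the per-partition loop from the question with hc ...
    }
}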

  • Hmm Barak what about repartitioning the dataset so that every partition holds less data? – gsamaras Aug 04 '16 at 16:35
  • @gsamaras Data resides in a different memory area, and in Spark 1.3.1 it wasn't dynamic. So you wouldn't actually "free" some memory on the executor for the shuffle; you have to explicitly increase the shuffle area. That said, you might have smaller shuffle memory needs on the map side if you decrease the data per partition, so it might help somewhat. Bear in mind that repartitioning has other effects on the process, so I wouldn't use it as a solution to this specific problem. It might be a good idea, but it is a bigger subject :) – Barak1731475 Aug 05 '16 at 14:50

My assumption is that you have a limited number of executors on your cluster, and the job might be running in a shared environment.

Since, as you said, your file size is small, you can set a smaller number of executors, increase the executor cores, and, importantly, set the memoryOverhead property:

  1. Set number of executors = 5
  2. Set number of executor cores = 4
  3. Set memory overhead = 2G
  4. Set shuffle partitions = 20 (to use the maximum parallelism based on executors × cores)

Using the above properties, I am sure you will avoid any executor out-of-memory issues without compromising performance; a sketch of how to set them follows.
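A minimal sketch of how those four settings might be applied programmatically (they can equally be passed through spark-submit), assuming the shuffle-partition setting maps to spark.sql.shuffle.partitions since everything goes through hiveContext.sql(); the class name is illustrative:

import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;
import org.apache.spark.sql.hive.HiveContext;

public class SmallPartitionJob {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
            .set("spark.executor.instances", "5")              // 1. number of executors
            .set("spark.executor.cores", "4")                  // 2. cores per executor
            .set("spark.yarn.executor.memoryOverhead", "2048") // 3. overhead in MB (~2G)
            .set("spark.sql.shuffle.partitions", "20");        // 4. shuffle partitions for SQL queries

        SparkContext sc = new SparkContext(conf);
        HiveContext hc = new HiveContext(sc);
        // ... run the per-partition loop from the question with hc ...
    }
}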
