
I'm doing a simple Spark aggregation: reading data from an Avro file as a DataFrame, mapping the rows to case classes with rdd.map, and then running aggregation operations such as count. Most of the time it works fine, but sometimes it throws a weird CodeGen exception:

[ERROR] 2017-03-24 08:43:20,595 org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator logError - failed to compile: java.lang.NullPointerException
/* 001 */ public java.lang.Object generate(Object[] references) {
/* 002 */   return new SpecificUnsafeProjection(references);
/* 003 */ }
/* 004 */
/* 005 */ class SpecificUnsafeProjection extends org.apache.spark.sql.catalyst.expressions.UnsafeProjection {

I am using this code:

    val deliveries = sqlContext.read.format("com.databricks.spark.avro").load(deliveryDir)
                      .filter("valid = true")
                      .selectExpr("FROM_UNIXTIME(timestamp/1000, 'yyyyMMdd') as day",
                        "FROM_UNIXTIME(timestamp/1000, 'yyyyMMdd_HH') as hour",
                        "deliveryId", "uid", "adId", "adSpaceId", "deviceModelId"
                       )
                      .rdd
                      .map(row => {
                        val deliveryId = row.getAs[Long]("deliveryId")
                        val uid = row.getAs[Long]("uid")
                        // deviceModelId is nullable in the Avro schema; default to 0
                        val deviceModelId: Integer = if(row.getAs[Integer]("deviceModelId") == null) {
                          0
                        } else {
                          row.getAs[Integer]("deviceModelId")
                        }
                        val delivery = new DeliveryEvent(deliveryId, row.getAs[Integer]("adId"), row.getAs[Integer]("adSpaceId"), uid, deviceModelId)
                        eventCache.getDeliverCache().put(new Element(deliveryId, delivery))
                        new InteractedAdInfo(row.getAs[String]("day"), delivery.deliveryId, delivery.adId, delivery.adSpaceId, uid, deviceModelId, deliveryEvent=1)
                      })
    deliveries.count()

I can't reproduce the problem, but I get it irregularly in production. I'm running from a Java app, using the spark-core_2.11:2.1.0 and spark-avro_2.11:3.1.0 Maven coordinates.

Where might the problem be? I'm setting java -Xms8G -Xmx12G -XX:PermSize=1G -XX:MaxPermSize=1G while running the app.
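As a side note on the null check above: wrapping the nullable column value in Option is the more idiomatic Scala pattern than comparing a boxed Integer against null. A minimal standalone sketch (deviceModelIdOrDefault is a hypothetical helper name, not from the original code):

```scala
// Sketch: null-safe extraction of a nullable boxed Integer,
// mirroring the deviceModelId check in the map function above.
// Option(null) yields None, so getOrElse supplies the default
// without an explicit null comparison.
object NullSafeColumn {
  def deviceModelIdOrDefault(raw: java.lang.Integer): Int =
    Option(raw).map(_.toInt).getOrElse(0)
}

// Usage:
NullSafeColumn.deviceModelIdOrDefault(null) // 0
NullSafeColumn.deviceModelIdOrDefault(7)    // 7
```

Inside the map this would read `Option(row.getAs[Integer]("deviceModelId")).map(_.toInt).getOrElse(0)`, which also keeps the value an unboxed Int.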

Zer001

1 Answer


I'm seeing a similar error with the very simple action spark.read.format("com.databricks.spark.avro").load(fn).cache.count, which occurs intermittently when applied to large Avro files (4 GB-10 GB range in my tests). However, I can eliminate the error by removing the setting --conf spark.executor.cores=4 and letting it default to 1.

WARN TaskSetManager: Lost task 58.0 in stage 2.0 (TID 82, foo.com executor 10): java.lang.RuntimeException: 
Error while encoding: 
java.util.concurrent.ExecutionException: 
java.lang.Exception: failed to compile: java.lang.NullPointerException
/* 001 */ public java.lang.Object generate(Object[] references) {
/* 002 */   return new SpecificUnsafeProjection(references);
/* 003 */ }
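Concretely, the workaround is just to drop the executor-cores override from the launch command. A sketch of the two invocations, assuming the job is launched via spark-submit (the class and jar names are placeholders, not from the original post):

```shell
# Invocation that intermittently hit the codegen NPE on large Avro files:
spark-submit \
  --conf spark.executor.cores=4 \
  --class com.example.AvroCount \
  app.jar

# Workaround: omit the override so spark.executor.cores defaults to 1.
spark-submit \
  --class com.example.AvroCount \
  app.jar
```

This only sidesteps the symptom by serializing tasks within each executor; it suggests the failure is triggered by concurrent codegen compilation rather than by the data itself.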