
We have a MapR cluster on which this code was running, but it has suddenly stopped working, and it does not work on the MapR demo cluster either. We are running MapR 5.1 and Spark 1.6.1.

from pyspark import SparkConf, SparkContext
from pyspark.sql import HiveContext
from pyspark.sql import DataFrameWriter
conf = SparkConf().setAppName('test')
sc = SparkContext(conf=conf)
sqlContext = HiveContext(sc)

df = sqlContext.createDataFrame(
    [(2012, 8, "Batman", 9.8), (2012, 8, "Hero", 8.7), (2012, 7, "Robot", 5.5), (2011, 7, "Git", 2.0)],
    ["year", "month", "title", "rating"])
df.show()
df.write.mode("append").format("com.databricks.spark.avro").save("/user/bedrock/output_avro")
sc.stop()

But now I am getting this error:

java.lang.IllegalAccessError: tried to access class org.apache.avro.SchemaBuilder$FieldDefault from class com.databricks.spark.avro.SchemaConverters$$anonfun$convertStructToAvro$1

Any ideas? This follows the instructions on the Databricks GitHub page. I am invoking the pyspark shell (or spark-submit) with these packages:

/opt/mapr/spark/spark-1.6.1/bin/pyspark --packages com.databricks:spark-avro_2.10:2.0.1 --driver-class-path /opt/mapr/spark/spark-1.6.1/lib/avro-1.7.7.jar --conf spark.executor.extraClassPath=/opt/mapr/spark/spark-1.6.1/lib/avro-1.7.7.jar --master yarn-client
manmeet

1 Answer


I have experienced this error in the past, but not with pyspark. I'm hoping that my experience can help.

In my case, it turned out that a badly configured Java CLASSPATH placed avro-1.7.5.jar ahead of every other Avro jar. You might be able to solve this by ensuring that your cluster configuration puts avro-1.7.7.jar first.
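If you want to confirm which Avro jar is actually winning, one rough check (a sketch using py4j reflection from the same pyspark session; I have not run it on MapR, and it only inspects the driver, not the executors) is to ask the JVM where it loaded SchemaBuilder from:

# Hypothetical check from the pyspark shell: ask the driver JVM which jar
# org.apache.avro.SchemaBuilder was resolved from.
klass = sc._jvm.java.lang.Class.forName("org.apache.avro.SchemaBuilder")
print(klass.getProtectionDomain().getCodeSource().getLocation())
# A path ending in avro-1.7.5.jar here would confirm the version conflict.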

You can usually make your own avro-1.7.7.jar take precedence by setting the spark.driver.userClassPathFirst and spark.executor.userClassPathFirst configuration variables to true.
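For example (untested on MapR, but both settings are standard Spark 1.6 configuration), the invocation from the question could be extended like this:

/opt/mapr/spark/spark-1.6.1/bin/pyspark \
  --packages com.databricks:spark-avro_2.10:2.0.1 \
  --driver-class-path /opt/mapr/spark/spark-1.6.1/lib/avro-1.7.7.jar \
  --conf spark.executor.extraClassPath=/opt/mapr/spark/spark-1.6.1/lib/avro-1.7.7.jar \
  --conf spark.driver.userClassPathFirst=true \
  --conf spark.executor.userClassPathFirst=true \
  --master yarn-client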

The specific error is triggered by a change to Avro between 1.7.5 and 1.7.6 (see https://github.com/apache/avro/blob/release-1.7.5/lang/java/avro/src/main/java/org/apache/avro/SchemaBuilder.java#L2136).

Ryan Skraba