The app I'm working on has to run under Spark 2.1 on Azure. I updated the app's build.sbt to a slightly newer version of json4s (from 3.2.11 to 3.3.0) to pick up a bug fix for Double serialization:
libraryDependencies += "org.json4s" %% "json4s-native" % "3.3.0",
libraryDependencies += "org.json4s" %% "json4s-ext" % "3.3.0",
However, when using json4s's FieldSerializer under Spark 2.1:
import org.json4s._
import org.json4s.FieldSerializer._
implicit val formats = DefaultFormats +
  new FieldSerializer[MyClass](renameTo("name", "anotherName"))
I get this error:
java.lang.NoSuchMethodError: org.json4s.FieldSerializer$.$lessinit$greater$default$3()Z
Note the change in FieldSerializer's declaration, from 3.2.11:
case class FieldSerializer[A](
  serializer: PartialFunction[(String, Any), Option[(String, Any)]] = Map(),
  deserializer: PartialFunction[JField, JField] = Map()
)(implicit val mf: Manifest[A])
to 3.3.0:
case class FieldSerializer[A](
  serializer: PartialFunction[(String, Any), Option[(String, Any)]] = Map(),
  deserializer: PartialFunction[JField, JField] = Map(),
  includeLazyVal: Boolean = false
)(implicit val mf: Manifest[A])
My reading of the error is that the compiler-generated method supplying the default value for FieldSerializer's third constructor parameter does not exist at runtime, which would indicate that json4s 3.2.11 is somehow still being used. Given that, I declared the json4s libraries as 'provided' and named them explicitly in the start script.
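The 'provided' declarations in build.sbt look roughly like this:
libraryDependencies += "org.json4s" %% "json4s-native" % "3.3.0" % "provided"
libraryDependencies += "org.json4s" %% "json4s-ext" % "3.3.0" % "provided"
The start script then names the same artifacts via --packages: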
spark-submit \
  ...
  --packages org.json4s:json4s-native_2.11:3.3.0,org.json4s:json4s-ext_2.11:3.3.0 \
  ...
This does cause the jars to be downloaded to ~/.ivy2/jars, but the same error occurs. Is there a way to dump the libraries Spark is actually using for the executors? I used --conf spark.executor.extraJavaOptions="-verbose", but while it showed a long list of class loads, it never showed FieldSerializer being loaded.
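For reference, that option went on the submit line roughly like this:
spark-submit \
  ...
  --conf "spark.executor.extraJavaOptions=-verbose" \
  ...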
Is there a way to force Spark to use version 3.3.0 on the executors, or to stop it from loading 3.2.11?
EDIT:
I don't have a general solution for printing library versions, but for json4s I can print BuildInfo.version at runtime, which shows that version 3.2.11 is being used with or without --packages.
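A rough sketch of that runtime check (it assumes a SparkContext named sc is in scope, and that the generated object is org.json4s.BuildInfo, which is what I'm printing):
import org.json4s.BuildInfo
// json4s version seen by the driver JVM
println(s"driver json4s version: ${BuildInfo.version}")
// json4s version seen by the executor JVMs
val executorVersions = sc.parallelize(1 to 10)
  .map(_ => BuildInfo.version)
  .distinct()
  .collect()
println(s"executor json4s versions: ${executorVersions.mkString(", ")}")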