
I'm trying to use the GeoIP2 v2.10.0 Java API (https://github.com/maxmind/GeoIP2-java) with Apache Spark v2.2.0 and Scala 2.11.8. The problem is that Apache Spark declares jackson-databind v2.6.5 in its pom file, whereas GeoIP2 requires a minimum jackson-databind version of 2.9.2. Hence, I'm trying to shade the associated libraries using sbt-assembly. I'm running the job with spark-submit on AWS EMR, but I keep getting the following dependency error:

    Exception in thread "main" java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.node.ArrayNode.<init>(Lcom/fasterxml/jackson/databind/node/JsonNodeFactory;Ljava/util/List;)V
        at com.maxmind.db.Decoder.decodeArray(Decoder.java:272)

The above exception occurs because jackson-databind v2.6.5 is being pulled in from Spark instead of the required v2.9.2. Can anyone tell me whether I'm doing the shading correctly, or whether I'm missing something? Also, how do I verify in IntelliJ that com.fasterxml.jackson.core.* is actually shaded to shaded.jackson.core.*? I've tried the sbt dependency-tree plugin, but it just shows the dependency as com.fasterxml.jackson.core.*.

The following is my build.sbt file:

    name := "Sessionization"
    version := "0.1"
    scalaVersion := "2.11.8"

    assemblyShadeRules in assembly := Seq(
      ShadeRule.rename("com.fasterxml.jackson.core.**" -> "shaded.jackson.core.@1").inAll
        //.inLibrary("com.fasterxml.jackson.core" % "jackson-databind" % "2.9.2")
        //.inLibrary("com.fasterxml.jackson.core" % "jackson-core" % "2.9.2")
        //.inLibrary("com.fasterxml.jackson.core" % "jackson-annotations" % "2.9.0")
        //.inProject
    )

    // Have to use "provided" because of some deduplicate error in merging many same-name class files.
    libraryDependencies ++= Seq(
      "com.maxmind.geoip2" % "geoip2" % "2.10.0",
      //"org.apache.spark" %% "spark-core" % "2.2.0" % "provided",
      "org.apache.spark" %% "spark-sql" % "2.2.0" % "provided"
      //"org.apache.spark" %% "spark-core" % "2.2.0" excludeAll(ExclusionRule(organization = "com.fasterxml.jackson.core")),
      //"org.apache.spark" %% "spark-sql" % "2.2.0" excludeAll(ExclusionRule(organization = "com.fasterxml.jackson.core"))
    )
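
For reference, a minimal, IDE-independent check is to list what actually ended up in the assembly jar, either with jar tf on the fat jar or with a small Scala snippet like the sketch below (the jar path is an assumption based on the name/version/scalaVersion above):

    import java.util.jar.JarFile
    import scala.collection.JavaConverters._

    // List every jackson class that made it into the fat jar. If the shade rule
    // took effect, the jackson-core classes should show up under
    // shaded/jackson/core/... instead of com/fasterxml/jackson/core/...
    val assemblyJar = new JarFile("target/scala-2.11/Sessionization-assembly-0.1.jar")
    assemblyJar.entries().asScala
      .map(_.getName)
      .filter(name => name.contains("jackson") && name.endsWith(".class"))
      .foreach(println)
    assemblyJar.close()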

Thanks in advance!


1 Answer


I'll be honest, I've never used the assembly shade rules, but have you tried either of these methods:

  1. Un-comment your exclusion-rule attempt and replace the exclusion rule with ExclusionRule(organization = "*", name = "jackson-core") (a sketch of this is shown after the merge strategy below).
  2. Add a merge strategy to the bottom of your build.sbt. Something along the lines of:

    assemblyMergeStrategy in assembly := {
      case x if x.contains("decodeArray") => MergeStrategy.deduplicate
      case PathList("META-INF", xs @ _*)  => MergeStrategy.discard
      case x                              => MergeStrategy.first
    }
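
For option 1, a rough, untested sketch of what the dependency section could look like, reusing the commented-out lines from your question with the broader exclusion rule:

    libraryDependencies ++= Seq(
      "com.maxmind.geoip2" % "geoip2" % "2.10.0",
      // Drop jackson-core wherever Spark pulls it in transitively, so the
      // version required by geoip2 wins.
      "org.apache.spark" %% "spark-core" % "2.2.0" excludeAll(
        ExclusionRule(organization = "*", name = "jackson-core")
      ),
      "org.apache.spark" %% "spark-sql" % "2.2.0" excludeAll(
        ExclusionRule(organization = "*", name = "jackson-core")
      )
    )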

If you are still having problems, another useful thing to do is to add conflictManager := ConflictManager.strict to your build.sbt to surface the conflicting versions.
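
That is a one-line setting; with strict conflict management, resolution fails and reports the clashing artifacts instead of silently evicting one version:

    // Fail dependency resolution when two different versions of the same
    // artifact are pulled in, instead of silently evicting one of them.
    conflictManager := ConflictManager.strict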
