
There are a couple of previous questions about this, with answers, but those answers often don't have enough information to actually solve the problem.

I am using Apache Spark to ingest data into Elasticsearch. We are using X-Pack security and its corresponding transport client. I use the transport client to create/delete indices in special cases, then use Spark for the ingestion itself. When our code reaches client.close(), an exception is thrown:

Exception in thread "elasticsearch[_client_][generic][T#2]" java.lang.NoSuchMethodError: io.netty.bootstrap.Bootstrap.config()Lio/netty/bootstrap/BootstrapConfig;
        at org.elasticsearch.transport.netty4.Netty4Transport.lambda$stopInternal$5(Netty4Transport.java:443)
        at org.apache.lucene.util.IOUtils.close(IOUtils.java:89)
        at org.elasticsearch.common.lease.Releasables.close(Releasables.java:36)
        at org.elasticsearch.common.lease.Releasables.close(Releasables.java:46)
        at org.elasticsearch.common.lease.Releasables.close(Releasables.java:51)
        at org.elasticsearch.transport.netty4.Netty4Transport.stopInternal(Netty4Transport.java:426)
        at org.elasticsearch.transport.TcpTransport.lambda$doStop$5(TcpTransport.java:959)
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

At first, I believed that the X-Pack transport client was picking up the Netty version pulled in by Spark, so I excluded it. Even after excluding it, we run into the same issue. Here is our set of dependencies:

    libraryDependencies ++= Seq(
      "com.crealytics" % "spark-excel_2.11" % "0.9.1" exclude("io.netty", "netty-all"),
      "com.github.alexarchambault" %% "scalacheck-shapeless_1.13" % "1.1.6" % Test,
      "com.holdenkarau" % "spark-testing-base_2.11" % "2.2.0_0.7.4" % Test exclude("org.scalatest", "scalatest_2.11"),
      "com.opentable.components" % "otj-pg-embedded" % "0.9.0" % Test,
      "org.apache.spark" % "spark-core_2.11" % "2.2.0" % "provided" exclude("org.scalatest", "scalatest_2.11") exclude("io.netty", "netty-all"),
      "org.apache.spark" % "spark-sql_2.11" % "2.2.0" % "provided" exclude("org.scalatest", "scalatest_2.11") exclude("io.netty", "netty-all"),
      "org.apache.spark" % "spark-hive_2.11" % "2.2.0" % "provided" exclude("org.scalatest", "scalatest_2.11") exclude("io.netty", "netty-all"),
      "org.apache.logging.log4j" % "log4j-core" % "2.8.2",
      "org.elasticsearch" % "elasticsearch-spark-20_2.11" % "5.5.0" exclude("org.scalatest", "scalatest_2.11") exclude("io.netty", "netty-all"),
      "org.elasticsearch.client" % "x-pack-transport" % "5.5.0",
      "org.elasticsearch.client" % "transport" % "5.5.0",
      "org.elasticsearch.test" % "framework" % "5.4.3" % Test,
      "org.postgresql" % "postgresql" % "42.1.4",
      "org.scalamock" %% "scalamock-scalatest-support" % "3.5.0" % Test,
      "org.scalatest" % "scalatest_2.11" % "3.0.1" % Test,
      "org.scalacheck" %% "scalacheck" % "1.13.4" % Test,
      "org.scalactic" %% "scalactic" % "3.0.1",
      "org.scalatest" %% "scalatest" % "3.0.1" % Test,
      "mysql" % "mysql-connector-java" % "5.1.44"
    )

I verified with sbt dependencyTree that SBT is not actually excluding Netty from Spark and spark-excel, and I'm not sure why. We're using SBT 1.0.4.
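
For reference, dependencyTree is not built into sbt 1.0.x; it comes from the sbt-dependency-graph plugin. A minimal sketch of the wiring assumed here (the plugin version is illustrative):

    // project/plugins.sbt -- provides the dependencyTree task on sbt 1.0.x
    // (plugin version shown is illustrative)
    addSbtPlugin("net.virtual-void" % "sbt-dependency-graph" % "0.9.0")

With that in place, running sbt dependencyTree prints the resolved dependency graph so you can confirm whether the exclude rules actually took effect.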

UPDATE: spark-submit/Spark was the culprit, answer below!

skylerl
    `ExclusionRule()` doesn't seem to accept a version... – skylerl Dec 02 '17 at 22:45
  • Changed excludeAll to exclude to no avail – skylerl Dec 02 '17 at 23:02
  • `excludeAll ExclusionRule(organization = "com.fasterxml.jackson.core"), "io.confluent" % "common-config" % "3.1.2"` Can you do something like this? – Achilleus Dec 02 '17 at 23:56
  • @AkhilanandBenkalVenkanna actually, my exclusions are working now, I am only seeing 4.1.11 and higher. However, I unpacked the jar that SBT makes, and noticed that org.jboss.netty is being included, however pulling up the dependency tree in SBT does not reveal this dependency being included whatsoever. – skylerl Dec 03 '17 at 00:11
  • Cool. Please do post the solution that finally worked so that someone else can benefit from it! – Achilleus Dec 03 '17 at 00:12
  • The exclusions above worked to get rid of `io.netty:netty-all` however, the code still fails due to `org.jboss.netty` coming from somewhere... I am removing my dependencies one by one to see which one is bringing that code in, obviously it's not coming from ivy. – skylerl Dec 03 '17 at 00:14
  • @AkhilanandBenkalVenkanna thank you for your continued help, I posted my answer below. – skylerl Dec 03 '17 at 03:24

3 Answers


Okay, after many trials and tribulations, I figured it out. The issue was not that SBT was failing to exclude libraries; it was excluding them perfectly. The issue was that even though I was excluding every version of Netty that wasn't 4.1.11.Final, Spark was using its own jars, external to SBT and my built jar.

When spark-submit is run, it puts the jars from the $SPARK_HOME/jars directory on the classpath. One of those is an older version of Netty 4. The problem is shown by this call:

bootstrap.getClass().getProtectionDomain().getCodeSource()

The result is a jar location of /usr/local/Cellar/apache-spark/2.2.0/libexec/jars/netty-all-4.0.43.Final.jar
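
A self-contained sketch of that check (the object name is just for illustration; run it inside the job submitted via spark-submit to see whose Netty wins):

    // Print the jar that Netty's Bootstrap class is actually loaded from at runtime.
    object WhichNetty {
      def main(args: Array[String]): Unit = {
        val location = classOf[io.netty.bootstrap.Bootstrap]
          .getProtectionDomain
          .getCodeSource
          .getLocation
        println(location) // e.g. .../libexec/jars/netty-all-4.0.43.Final.jar
      }
    }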

So, Spark was including its own Netty dependency. When I created my jar in SBT, it had the right jars. Spark has a configuration option for this, spark.driver.userClassPathFirst, documented in the Spark configuration docs; however, when I set it to true, I ran into other issues caused by using the later version of Netty.

I decided to ditch using the Transport client, and use trusty old HTTP requests instead.
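
For illustration, a minimal sketch of what the HTTP replacement for index creation/deletion can look like using only the JDK (the host, port, and index settings are placeholders; with X-Pack security you would also send a Basic auth Authorization header):

    import java.net.{HttpURLConnection, URL}
    import java.nio.charset.StandardCharsets

    // Talk to Elasticsearch's REST API instead of the transport client,
    // so no Netty version has to match whatever Spark ships with.
    object EsHttp {
      private def request(method: String, url: String, body: Option[String] = None): Int = {
        val conn = new URL(url).openConnection().asInstanceOf[HttpURLConnection]
        conn.setRequestMethod(method)
        body.foreach { json =>
          conn.setDoOutput(true)
          conn.setRequestProperty("Content-Type", "application/json")
          conn.getOutputStream.write(json.getBytes(StandardCharsets.UTF_8))
        }
        conn.getResponseCode // 200 OK on success
      }

      def createIndex(host: String, index: String): Int =
        request("PUT", s"http://$host:9200/$index", Some("""{"settings":{"number_of_shards":1}}"""))

      def deleteIndex(host: String, index: String): Int =
        request("DELETE", s"http://$host:9200/$index")
    }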

skylerl

I came across the same issue of needing a dependency that uses Netty in conjunction with Spark. I also tried the spark.driver.userClassPathFirst option, and it did not work. I did find another workaround that I thought I'd share in case it's helpful to others in the future.

Since we are creating an assembly jar for use with spark-submit, I figured I could just shade the Netty dependency inside the assembly jar, so that spark-submit could pull in its own Netty version without conflict. We are using the https://github.com/sbt/sbt-assembly plugin, so all I needed to do was include this in my build.sbt within the module in question:

    assemblyShadeRules in assembly := Seq(
      ShadeRule.rename("io.netty.**" -> "shadenetty.@1").inAll
    )
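
This assumes sbt-assembly is already wired into the build; if not, the usual plugin declaration (version shown is illustrative) is:

    // project/plugins.sbt -- sbt-assembly provides the assembly task and ShadeRule
    addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.5")

After sbt assembly, the Netty classes in the fat jar live under the shadenetty package, so they can no longer clash with the Netty that spark-submit puts on the classpath.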
Nathaniel Wendt

Excluding the Netty dependencies from spark-core worked for us:

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>${spark.version}</version>
            <exclusions>
                <exclusion>
                    <groupId>io.netty</groupId>
                    <artifactId>netty-all</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>io.netty</groupId>
                    <artifactId>netty</artifactId>
                </exclusion>
            </exclusions>
        </dependency>