
I'm trying to run a small Spark application and am getting the following exception:

Exception in thread "main" java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.<init>()V from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:262)
    at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:217)
    at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:95)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)

The relevant Gradle dependencies section:

compile('org.apache.spark:spark-core_2.10:1.3.1')
compile('org.apache.hadoop:hadoop-mapreduce-client-core:2.6.2') {force = true}
compile('org.apache.hadoop:hadoop-mapreduce-client-app:2.6.2') {force = true}
compile('org.apache.hadoop:hadoop-mapreduce-client-shuffle:2.6.2') {force = true}
compile('com.google.guava:guava:19.0') { force = true }
Lika

10 Answers


Version 2.6.2 of hadoop-mapreduce-client-core can't be used together with recent Guava versions (I tried 17.0 through 19.0), because Guava's Stopwatch constructor can no longer be accessed (causing the IllegalAccessError above).

Using hadoop-mapreduce-client-core's latest version, 2.7.2 (in which the method above uses Hadoop's own org.apache.hadoop.util.StopWatch rather than Guava's Stopwatch), solved the problem, together with two additional dependencies that were required:

compile('org.apache.hadoop:hadoop-mapreduce-client-core:2.7.2') {force = true}

compile('org.apache.hadoop:hadoop-common:2.7.2') {force = true} // required for org.apache.hadoop.util.StopWatch  

compile('commons-io:commons-io:2.4') {force = true} // required for org.apache.commons.io.Charsets that is used internally

Note: there are two org.apache.commons.io packages: commons-io:commons-io (the one we need here) and org.apache.commons:commons-io (the old one, from 2007). Make sure to include the correct one.
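To double-check which Guava version Gradle actually resolved after all the forcing, you can print the dependency report (assuming the old `compile` configuration used above):

    gradle dependencies --configuration compile

Conflicting transitive versions show up in that tree with an arrow pointing to the version that won.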

Lika

We just experienced the same situation using IntelliJ and Spark.

When using

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.1"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.1"

com.google.guava:guava 20.0 and hadoop-client 2.6.5 are pulled in transitively.

The quickest solution is to force the Guava library to version 15.0 (SBT):

dependencyOverrides += "com.google.guava" % "guava" % "15.0"
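If you are unsure which library pulled in the newer Guava in the first place, sbt's built-in `evicted` task reports which dependency versions were evicted and what replaced them:

    sbt evicted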
  • It is working for me for the issue in loading a saved Pipeline Model of the Spark-NLP library, with the following environment details: Windows 10, Spark 2.4.3, Spark-NLP 2.2.1 – amandeep1991 Sep 11 '19 at 07:37

I just changed my Guava version from 19.0 to 15.0 and it worked. I am currently using Spark 2.2:

    <dependency>
        <groupId>com.google.guava</groupId>
        <artifactId>guava</artifactId>
        <version>15.0</version>
    </dependency>
pranaygoyal02
  • It is working for me for the issue in loading a saved Pipeline Model of the Spark-NLP library, with the following environment details: Windows 10, Spark 2.4.3, Spark-NLP 2.2.1. Thanks @pranaygoyal02 – amandeep1991 Sep 11 '19 at 07:36

I had this problem with Spark 1.6.1 because one of our additional dependencies evicted Guava 14.0.1 and replaced it with 18.0. Spark has a base dependency on hadoop-client 2.2; see the [Maven Repo](https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.10/1.6.1).

The solution that worked for us was to add the following to the sbt libraryDependencies:

    libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.7.2"

ekrich

Sounds like you've got a Guava version mismatch.

Something in your codebase is trying to invoke the Stopwatch constructor, but the public constructors were removed in Guava 17.0 in favor of the static factory methods (createStarted() and createUnstarted()) that were added in Guava 15.0.

You should update whatever code is trying to use the constructors to use the static factory methods instead.
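For illustration, here is a minimal sketch of that change in calling code (assuming Guava 15.0 or newer on the classpath):

    import com.google.common.base.Stopwatch;
    import java.util.concurrent.TimeUnit;

    public class StopwatchDemo {
        public static void main(String[] args) throws InterruptedException {
            // Old style, removed from the public API in Guava 17.0; code compiled
            // against an older Guava fails at runtime with IllegalAccessError:
            // Stopwatch sw = new Stopwatch();

            // New style: static factory methods, available since Guava 15.0
            Stopwatch sw = Stopwatch.createStarted(); // or createUnstarted() to start it later
            Thread.sleep(50);
            sw.stop();
            System.out.println("elapsed: " + sw.elapsed(TimeUnit.MILLISECONDS) + " ms");
        }
    }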

  • Thanks! It indeed was a mismatch; I had to update several dependencies (elaborated below) to make it stop using Guava's Stopwatch. – Lika Apr 06 '16 at 07:09

In my case, adding Guava 21.0 resulted in the error:

    <dependency>
        <groupId>com.google.guava</groupId>
        <artifactId>guava</artifactId>
        <version>21.0</version>
    </dependency>

After downgrading to Guava 15.0 (or simply removing the dependency above), my code works well.
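If you would rather keep the dependency that drags in the newer Guava, Maven's dependencyManagement can pin Guava for the whole build instead (a sketch; 15.0 is the version the other answers settled on):

    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>com.google.guava</groupId>
                <artifactId>guava</artifactId>
                <version>15.0</version>
            </dependency>
        </dependencies>
    </dependencyManagement>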

VanThaoNguyen

Solution

  1. Multiple versions of guava.jar are pulled in as conflicting transitive dependencies, which causes this exception.
  2. Identify the conflicting version and add it as an exclusion in pom.xml to resolve the issue.
  3. In my case, adding the pmml-evaluator 1.4.1 dependency caused this exception.
  4. I identified it through the dependency hierarchy and added a Maven exclusion, which resolved the issue:

    <dependency>
        <groupId>org.jpmml</groupId>
        <artifactId>pmml-evaluator</artifactId>
        <version>1.4.1</version>
        <scope>test</scope>
        <exclusions>
            <exclusion>
                <groupId>com.google.guava</groupId>
                <artifactId>guava</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
Praveen Kumar K S
  • For Maven, use `mvn dependency:tree -Dincludes=com.google.guava:guava` to identify which dependencies pull in other Guava versions – goozez Jun 21 '21 at 08:38

It seems that the problem comes from dependent libraries.

Basically, you will hit this problem when you try to put data into an HBase table.

Initially I had used:

    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-client</artifactId>
        <version>1.1.2</version>
    </dependency>

I got a similar problem to yours, and later I changed to hbase-shaded-client:

    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-shaded-client</artifactId>
        <version>1.1.2</version>
    </dependency>

Now the problem is resolved; the shaded client bundles relocated copies of its third-party dependencies (including Guava), so they no longer clash with the versions on your classpath.

Ram Jaddu

If you want to work around this problem without rebuilding Spark, for instance when using a pre-built distribution, then I found the following worked on Apache Spark 2.3.0 (using the pre-built 'spark-2.3.0-bin-without-hadoop'):

  1. Rename or remove the errant version of the 'hadoop-mapreduce-client-core' jar file (in my case this was 'hadoop-mapreduce-client-core-2.6.5.jar') from the Spark 'jars' directory.
  2. Copy-in (or soft-link) the compatible version (from your Hadoop installation) of the 'hadoop-mapreduce-client-core' jar into the Spark 'jars' directory.

It may also be possible to force the desired 'hadoop-mapreduce-client-core' jar file to be used by altering your classpath (so that Spark finds the version from Hadoop rather than the one distributed with Spark).
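Concretely, the swap can look something like this (a sketch only; the paths and the 2.7.3 version are illustrative and depend on your Spark and Hadoop installations):

    # move the errant jar out of the Spark distribution
    mv $SPARK_HOME/jars/hadoop-mapreduce-client-core-2.6.5.jar /tmp/
    # soft-link the compatible jar from the Hadoop installation
    ln -s $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.7.3.jar $SPARK_HOME/jars/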


Spark version: 2.4.8

Scala version: 2.12.12

I added the three dependencies below:

    libraryDependencies += "com.google.guava" % "guava" % "15.0"
    libraryDependencies += "org.apache.hadoop" % "hadoop-mapreduce-client-core" % "3.3.4"
    libraryDependencies += "org.apache.hadoop" % "hadoop-common" % "3.3.4"

It worked for me.

rajashree