I'm trying to upgrade my project from Spark 2.1.1 to 2.3.1. When I change the dependency, I get the following exception:
java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.<init>()V from class org.apache.hadoop.mapred.FileInputFormat
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:312)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
at scala.Option.getOrElse(Option.scala:121)
...
I found the following question, which seems to explain what is going on: IllegalAccessError to guava's StopWatch from org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus
However, I don't depend on Hadoop directly; I'm just using spark-2.3.1-bin-hadoop2.7 as my Spark home.
My assumption is that Guava was included implicitly in Spark 2.1.1 but no longer is, and that Hadoop hasn't been updated to match. Does this mean I now need to explicitly include Guava in my project?
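
To test that assumption, a small probe along these lines might help. It is only a sketch: the class name `ClasspathProbe` and its two helper methods are made up for illustration, and it has to run with the same classpath as the failing job to say anything useful. It prints which jar `com.google.common.base.Stopwatch` is loaded from and whether its no-argument constructor (the `<init>()V` in the error) is public.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Modifier;
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Spark, Hadoop, or Guava.
public class ClasspathProbe {

    // Returns the location (usually a jar URL) a class is loaded from.
    static String locate(String className) {
        try {
            // initialize=false: load the class without running its static initializers
            Class<?> cls = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            return src == null ? "bootstrap or unknown source" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on classpath";
        }
    }

    // Reports whether the class declares a public no-argument constructor.
    static String noArgConstructorAccess(String className) {
        try {
            Class<?> cls = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            Constructor<?> ctor = cls.getDeclaredConstructor();
            return Modifier.isPublic(ctor.getModifiers()) ? "public" : "not public";
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on classpath";
        } catch (NoSuchMethodException e) {
            return "no no-arg constructor";
        }
    }

    public static void main(String[] args) {
        String name = "com.google.common.base.Stopwatch";
        System.out.println(name + " loaded from: " + locate(name));
        System.out.println("no-arg constructor: " + noArgConstructorAccess(name));
    }
}
```

If the reported jar is a Guava version other than the one Hadoop expects, or the constructor shows up as "not public", that would point to a Guava version conflict on the classpath rather than Guava being missing altogether.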