
I was trying to set up an IntelliJ build for Spark with JanusGraph using Gremlin-Scala, but I am running into errors.

My build.sbt file is:

version := "1.0"

scalaVersion := "2.11.11"

libraryDependencies += "com.michaelpollmeier" % "gremlin-scala" % "2.3.0"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.1"
// https://mvnrepository.com/artifact/org.apache.spark/spark-sql
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.2.1"
// https://mvnrepository.com/artifact/org.apache.spark/spark-mllib
libraryDependencies += "org.apache.spark" %% "spark-mllib" % "2.2.1"
// https://mvnrepository.com/artifact/org.apache.spark/spark-hive
libraryDependencies += "org.apache.spark" %% "spark-hive" % "2.2.1"
// https://mvnrepository.com/artifact/org.janusgraph/janusgraph-core
libraryDependencies += "org.janusgraph" % "janusgraph-core" % "0.2.0"

libraryDependencies ++= Seq(
  "ch.qos.logback" % "logback-classic" % "1.2.3" % Test,
  "org.scalatest" %% "scalatest" % "3.0.3" % Test
)

resolvers ++= Seq(
  Resolver.mavenLocal,
  "Sonatype OSS" at "https://oss.sonatype.org/content/repositories/public"
) 

But I am getting errors when I try to compile code that uses the Gremlin-Scala library or scala.io.Source. Can someone share their build file or tell me what I should modify to fix it? Thanks in advance.

So, I was trying to compile this code:

import gremlin.scala._
import org.apache.commons.configuration.BaseConfiguration
import org.janusgraph.core.JanusGraphFactory


class Test1() {
  // configure an in-memory JanusGraph instance
  val conf = new BaseConfiguration()
  conf.setProperty("storage.backend", "inmemory")
  val gr = JanusGraphFactory.open(conf)
  val graph = gr.asScala() // wrap in the Gremlin-Scala API; this is the line that fails
  graph.close()
}

object Test{
  def main(args: Array[String]): Unit = {
    val t = new Test1()
    println("in Main")
  }
}

The errors I get are:

Error:(1, 8) not found: object gremlin
  import gremlin.scala._

Error:(10, 18) value asScala is not a member of org.janusgraph.core.JanusGraph
  val graph = gr.asScala()

J.Doe
1 Answer


If you go to the Gremlin-Scala GitHub page, you'll see that the current version is "3.3.1.1" and that

Typically you just need to add a dependency on "com.michaelpollmeier" %% "gremlin-scala" % "SOME_VERSION" and one for the graph db of your choice to your build.sbt (this readme assumes tinkergraph). The latest version is displayed at the top of this readme in the maven badge.

It is no surprise that the API has changed when the major version of the library is different. Note also that the 3.x artifacts are cross-published per Scala version, so you need %% rather than %. If I change your first dependency to

//libraryDependencies += "com.michaelpollmeier" % "gremlin-scala" % "2.3.0" //old!
libraryDependencies += "com.michaelpollmeier" %% "gremlin-scala" % "3.3.1.1"

then your example code compiles for me.
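
For completeness, here is the full build.sbt with only that one line changed (everything else is taken verbatim from your file; I have not re-verified the other version choices):

version := "1.0"

scalaVersion := "2.11.11"

// %% appends the Scala binary suffix (_2.11) to the artifact name
libraryDependencies += "com.michaelpollmeier" %% "gremlin-scala" % "3.3.1.1"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.1"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.2.1"
libraryDependencies += "org.apache.spark" %% "spark-mllib" % "2.2.1"
libraryDependencies += "org.apache.spark" %% "spark-hive" % "2.2.1"
libraryDependencies += "org.janusgraph" % "janusgraph-core" % "0.2.0"

libraryDependencies ++= Seq(
  "ch.qos.logback" % "logback-classic" % "1.2.3" % Test,
  "org.scalatest" %% "scalatest" % "3.0.3" % Test
)

resolvers ++= Seq(
  Resolver.mavenLocal,
  "Sonatype OSS" at "https://oss.sonatype.org/content/repositories/public"
)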

SergGr
  • Thanks, this helped. But now, when I write some code to connect to the Hive metastore and retrieve a table into Spark SQL, I am getting a "java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.<init>()V from class org.apache.hadoop.mapred.FileInputFormat" error. What could possibly be wrong? Adding a Guava dependency did not work. – J.Doe Jan 25 '18 at 09:03
  • @J.Doe, I'd bet that you are using old Hadoop binaries with new Guava binaries. As you can see, [Hadoop hasn't used Guava's Stopwatch for quite some time](https://github.com/apache/hadoop/commit/a6ed4894b518351bf1b3290e725a475570a21296). It looks like you are following some quite old manual that has not been updated in years. But the moment you introduce a new dependency, it may result in "upgrading" a lot of other transitive dependencies. I'm not sure you can easily roll Guava back; I think it might be easier to update Hadoop (a sketch for finding the artifact that pins Guava follows this thread). P.S. If the answer helps, it is customary to mark it as Accepted. – SergGr Jan 25 '18 at 16:38
  • I updated the dependencies to the latest versions of all the libraries I am using. Now I get a new error that starts with: Exception in thread "main" java.lang.RuntimeException: Error in configuring object at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) at org.apache.spark.rdd.HadoopRDD.getInputFormat(HadoopRDD.scala:185) at .... – J.Doe Jan 27 '18 at 03:51
  • @J.Doe, with that I'm out of my depth. This is a totally new error and I'm not sure about the reasons. You should investigate it first (even SO has a few similar questions) and if it doesn't help, create a new question with more details on how you reproduce that. – SergGr Jan 27 '18 at 13:24
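
A sketch for tracking down the Guava conflict mentioned in the comments above: the sbt-dependency-graph plugin can show which artifact pins the conflicting Guava version (this assumes sbt 0.13; on sbt 1.x use plugin version 0.9.2 instead):

// project/plugins.sbt
addSbtPlugin("net.virtual-void" % "sbt-dependency-graph" % "0.8.2")

Then, from the sbt shell, run dependencyTree to print the whole tree, or a reverse lookup for Guava specifically (14.0.1 here is just a placeholder; substitute the version that dependencyTree actually reports):

dependencyTree
whatDependsOn com.google.guava guava 14.0.1

The whatDependsOn output shows every path that pulls in that Guava version, which points at the module (often a Hadoop artifact) to upgrade or exclude.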