I am building an application that uses Spark and Spark-mllib. The build.sbt states the dependencies as follows:
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.0" withSources() withJavadoc(),
  "org.apache.spark" %% "spark-mllib" % "1.6.0" withSources() withJavadoc()
)
This works fine. Now I would like to change some code in mllib and recompile the application using sbt. Here is what I did:
- Download the source code of spark-1.6.0, modify the code in mllib, and recompile it into a jar named spark-mllib_2.10-1.6.0.jar.
- Put the aforementioned jar into the lib directory of the project.
- Also put the spark-core_2.10-1.6.0.jar into the lib directory of the project.
- Delete the libraryDependencies statement in the build.sbt file (a sketch of the resulting build.sbt follows after this list).
- Run sbt clean package.
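For reference, this is roughly what the build looked like after step 4. My understanding is that sbt treats jars dropped into lib/ as unmanaged dependencies without any extra configuration; the unmanagedBase line below is only the sbt default spelled out for clarity, and the project name is a placeholder:

// build.sbt after removing libraryDependencies; both Spark jars now live in lib/
name := "myApp"          // placeholder, not my real project name
scalaVersion := "2.10.5"

// sbt's default directory for unmanaged jars is lib/; written out here only for clarity
unmanagedBase := baseDirectory.value / "lib"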
However, this doesn't compile because of the missing dependencies that spark-core and spark-mllib need in order to run. Those dependencies are managed by sbt automatically, but only when the libraryDependencies statement is present in build.sbt.
So I put the libraryDependencies statement back into build.sbt, hoping that sbt would resolve the dependency issues and still use the local spark-mllib instead of the one from the remote repository. However, running my application showed that this was not the case.
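For reference, the classpath that sbt actually resolves can be inspected from the sbt shell with standard sbt 0.13 tasks; the first command lists every jar on the compile classpath, the second only the jars picked up from lib/:

> show compile:dependencyClasspath
> show compile:unmanagedJars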
So I am wondering: is there a way to use my local spark-mllib jar without having to resolve its dependencies manually?
UPDATE: I followed the first approach in Roberto Congiu's answer and successfully built the package using the following build.sbt:
lazy val commonSettings = Seq(
  scalaVersion := "2.10.5",
  libraryDependencies ++= Seq(
    "org.apache.spark" %% "spark-core" % "1.6.0" withSources() withJavadoc(),
    "org.apache.spark" %% "spark-streaming" % "1.6.0" withSources() withJavadoc(),
    "org.apache.spark" %% "spark-sql" % "1.6.0" withSources() withJavadoc(),
    "org.scalanlp" %% "breeze" % "0.11.2"
  )
)

lazy val core = project.
  settings(commonSettings: _*).
  settings(
    name := "xSpark",
    version := "0.01"
  )

lazy val example = project.
  settings(commonSettings: _*).
  settings(
    name := "xSparkExample",
    version := "0.01"
  ).
  dependsOn(core)
xSparkExample includes a KMeans example which calls xSpark, and xSpark calls the KMeans function in spark-mllib. This spark-mllib is the customized jar, which I put in the core/lib directory so that sbt picks it up as a local (unmanaged) dependency.
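For clarity, the layout looks roughly like this; everything except core/lib is just the default sbt multi-project structure that I assume here:

build.sbt
core/
  lib/
    spark-mllib_2.10-1.6.0.jar   (the customized build)
  src/main/scala/                (xSpark sources)
example/
  src/main/scala/                (xSparkExample, including the KMeans example)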
However, running my application still does not use the customized jar for some reason. I even ran find . -name "spark-mllib_2.10-1.6.0.jar" to make sure there is no other copy of that jar on my system.
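For completeness, a quick runtime check along these lines (just a sketch, not code that is currently in the project) should print which jar the KMeans class is actually loaded from:

// prints the jar the MLlib KMeans class was loaded from at runtime
println(classOf[org.apache.spark.mllib.clustering.KMeans]
  .getProtectionDomain.getCodeSource.getLocation)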