I have been using IntelliJ to get up to speed with developing Spark applications in Scala using sbt. I understand the basics, but IntelliJ hides a lot of the scaffolding, so I'd like to get something up and running from the command line (i.e. using a REPL). I am using macOS.
Here's what I've done:
mkdir -p ~/tmp/scalasparkrepl
cd !$
echo 'scalaVersion := "2.11.12"' > build.sbt
echo 'libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.0"' >> build.sbt
echo 'libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.0"' >> build.sbt
echo 'libraryDependencies += "org.apache.spark" %% "spark-hive" % "2.3.0"' >> build.sbt
sbt console
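For reference, after the echo commands above, build.sbt should contain the following. This is reconstructed from those commands rather than pasted from the actual file, and the comment line is added here for illustration:

```scala
// build.sbt as the echo commands above should leave it
scalaVersion := "2.11.12"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.0"
libraryDependencies += "org.apache.spark" %% "spark-hive" % "2.3.0"
```

With that file in place, the final step is sbt console.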
That opens a Scala REPL (after downloading all the dependencies), in which I run:
import org.apache.spark.SparkConf
import org.apache.spark.sql.{SparkSession, DataFrame}
val conf = new SparkConf().setMaster("local[*]")
val spark = SparkSession.builder().appName("spark repl").config(conf).config("spark.sql.warehouse.dir", "~/tmp/scalasparkreplhive").enableHiveSupport().getOrCreate()
spark.range(0, 1000).toDF()
which fails with the error access denied org.apache.derby.security.SystemPermission( "engine", "usederbyinternals" ):
scala> spark.range(0, 1000).toDF()
18/05/08 11:51:11 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('~/tmp/scalasparkreplhive').
18/05/08 11:51:11 INFO SharedState: Warehouse path is '/tmp/scalasparkreplhive'.
18/05/08 11:51:12 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
18/05/08 11:51:12 INFO HiveUtils: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
18/05/08 11:51:12 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
18/05/08 11:51:12 INFO ObjectStore: ObjectStore, initialize called
18/05/08 11:51:13 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
18/05/08 11:51:13 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
java.security.AccessControlException: access denied org.apache.derby.security.SystemPermission( "engine", "usederbyinternals" )
I've googled around and found some information on this error, but nothing that I've been able to use to solve it. I find it strange that a Scala/sbt project on the command line would have this problem whereas an sbt project in IntelliJ works fine (I pretty much copied and pasted the code from an IntelliJ project). I guess IntelliJ is doing something on my behalf, but I don't know what, which is why I'm undertaking this exercise.
Can anyone advise how to solve this problem?