3

I have been using IntelliJ to get up to speed with developing Spark applications in Scala using sbt. I understand the basics, but IntelliJ hides a lot of the scaffolding, so I'd like to get something up and running from the command line (i.e. using a REPL). I am using macOS.

Here's what I've done:

mkdir -p ~/tmp/scalasparkrepl
cd !$
echo 'scalaVersion := "2.11.12"' > build.sbt
echo 'libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.0"' >> build.sbt
echo 'libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.0"' >> build.sbt
echo 'libraryDependencies += "org.apache.spark" %% "spark-hive" % "2.3.0"' >> build.sbt
sbt console

That downloads all the dependencies and opens a Scala REPL, in which I run:

import org.apache.spark.SparkConf
import org.apache.spark.sql.{SparkSession, DataFrame}
val conf = new SparkConf().setMaster("local[*]")
val spark = SparkSession.builder().appName("spark repl").config(conf).config("spark.sql.warehouse.dir", "~/tmp/scalasparkreplhive").enableHiveSupport().getOrCreate()
spark.range(0, 1000).toDF()

which fails with a java.security.AccessControlException (access denied org.apache.derby.security.SystemPermission( "engine", "usederbyinternals" )):

scala> spark.range(0, 1000).toDF()
18/05/08 11:51:11 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('~/tmp/scalasparkreplhive').
18/05/08 11:51:11 INFO SharedState: Warehouse path is '/tmp/scalasparkreplhive'.
18/05/08 11:51:12 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
18/05/08 11:51:12 INFO HiveUtils: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
18/05/08 11:51:12 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
18/05/08 11:51:12 INFO ObjectStore: ObjectStore, initialize called
18/05/08 11:51:13 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
18/05/08 11:51:13 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
java.security.AccessControlException: access denied org.apache.derby.security.SystemPermission( "engine", "usederbyinternals" )

I've googled around and there is some information on this error, but nothing that I've been able to use to solve it. I find it strange that a Scala/sbt project on the command line would have this problem whereas an sbt project in IntelliJ works fine (I pretty much copied and pasted the code from an IntelliJ project). I guess IntelliJ is doing something on my behalf, but I don't know what, which is why I'm undertaking this exercise.

Can anyone advise how to solve this problem?

jamiet

2 Answers

7

I'm not going to take full credit for this, but it looks similar to the question "SBT test does not work for spark test".

The solution is to issue this line before running the Scala code:

System.setSecurityManager(null)

So in full:

System.setSecurityManager(null)
import org.apache.spark.SparkConf
import org.apache.spark.sql.{SparkSession, DataFrame}
val conf = new SparkConf().setMaster("local[*]")
val spark = SparkSession.builder().appName("spark repl").config(conf).config("spark.sql.warehouse.dir", "~/tmp/scalasparkreplhive").enableHiveSupport().getOrCreate()
spark.range(0, 1000).toDF()
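
To see what that call changes, the following sketch can be pasted into the sbt console before the Spark code. It only uses the standard `System.getSecurityManager` and `System.setSecurityManager` calls; the helper name `describeSecurityManager` is made up for illustration. It assumes a Java 8 style runtime, as used with Spark 2.3; some newer JDKs reject `setSecurityManager`, which is why the call is wrapped in `Try`. If a manager class is printed on the "before" line, that would be consistent with the console JVM having a security manager installed, which Derby's permission check then runs against.

```scala
import scala.util.Try

// Hypothetical helper: name of the active security manager class, or "none".
def describeSecurityManager(): String =
  Option(System.getSecurityManager)   // null when no manager is installed
    .map(_.getClass.getName)
    .getOrElse("none")

println(s"before: ${describeSecurityManager()}")

// Same call as in the answer above; wrapped in Try because
// some newer JDKs refuse to change the security manager.
val cleared = Try(System.setSecurityManager(null)).isSuccess

println(s"cleared: $cleared, after: ${describeSecurityManager()}")
```

Note that clearing the manager affects the whole console JVM, so this is probably best kept to throwaway REPL sessions.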
JGC
0

You can grant the required permission instead. Add this to your pre-init script:

export SBT_OPTS="-Djava.security.policy=runtime.policy"

Create a runtime.policy file:

grant codeBase "file:/home/user/.ivy2/cache/org.apache.derby/derby/jars/*" {
    permission org.apache.derby.security.SystemPermission "engine", "usederbyinternals";
};

This assumes that your runtime.policy file resides in the current working directory and that you're pulling Derby from your locally cached Ivy repository. Change the path to reflect the actual parent folder of the Derby JAR if necessary (on macOS, for example, the home directory is under /Users rather than /home). The placement of the asterisk is significant: this is policy-file syntax, not a traditional shell glob.
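
If you are unsure which folder the Derby JAR actually lives in, one way to find out is to ask the JVM where it loaded a Derby class from. This is a minimal sketch to run in the sbt console; `codeBaseOf` is a hypothetical helper, and it assumes that `org.apache.derby.jdbc.EmbeddedDriver` is on the console classpath (Derby is normally pulled in transitively by spark-hive). It returns `None` if the class cannot be found or its location cannot be read.

```scala
import scala.util.Try

// Hypothetical helper: URL of the JAR or directory a class was loaded from, if known.
def codeBaseOf(className: String): Option[java.net.URL] =
  Try(Class.forName(className).getProtectionDomain.getCodeSource).toOption
    .flatMap(Option(_))                      // the code source may be null
    .flatMap(cs => Option(cs.getLocation))   // and so may its location

// Assumption: Derby's embedded driver class is available on the classpath.
println(codeBaseOf("org.apache.derby.jdbc.EmbeddedDriver"))
```

The parent folder of the printed URL is what the codeBase entry in runtime.policy should point at. Depending on the sbt version, this may be a dependency cache other than ~/.ivy2.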

See also: https://docs.oracle.com/javase/7/docs/technotes/guides/security/PolicyFiles.html

Coder Guy