
Is it possible to list the Spark packages that have been added to the Spark session?

The class org.apache.spark.deploy.SparkSubmitArguments has a variable for the packages:

var packages: String = null

Assuming this holds the list of Spark packages, is it available via SparkContext or somewhere else?
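A hypothetical sketch of the kind of lookup I mean, assuming the --packages value is recorded somewhere in the session's SparkConf (with sc being the active SparkContext in spark-shell):

scala> sc.getConf.getAll.filter { case (k, _) => k.startsWith("spark.jars") }.foreach(println)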

Chris Snow

1 Answer


I use the following method to retrieve that information: spark.sparkContext.listJars

For example:
$ spark-shell --packages elsevierlabs-os:spark-xml-utils:1.4.0

scala> spark.sparkContext.listJars.foreach(println)
spark://192.168.0.255:51167/jars/elsevierlabs-os_spark-xml-utils-1.4.0.jar
spark://192.168.0.255:51167/jars/commons-io_commons-io-2.4.jar
spark://192.168.0.255:51167/jars/commons-logging_commons-logging-1.2.jar
spark://192.168.0.255:51167/jars/org.apache.commons_commons-lang3-3.4.jar
spark://192.168.0.255:51167/jars/net.sf.saxon_Saxon-HE-9.6.0-7.jar

In this case, I loaded the spark-xml-utils package, and the other jars were loaded as dependencies.
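If you want the original package coordinates rather than the resolved jar URLs, the --packages argument should also be recorded under the spark.jars.packages configuration entry. A sketch, assuming that key is set in your session:

scala> spark.sparkContext.getConf.getOption("spark.jars.packages").foreach(println)

For the session above, this would be expected to print elsevierlabs-os:spark-xml-utils:1.4.0.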

JamCon
  • Upvoted, but not accepted yet. The answer is for Scala, but I was looking for something that would work with SparkR. – Chris Snow Feb 20 '17 at 09:12
  • I've spent a couple of hours combing through the SparkR documentation, code, and the shell. It does not appear that this functionality is directly exposed in the SparkR implementation. I also made a few cursory attempts at retrieving the information via sparkR.callJStatic, without success. Sorry! – JamCon Feb 20 '17 at 19:59
  • That's as good as an answer, i.e. there isn't a way to do this. Thanks for the help. – Chris Snow Feb 20 '17 at 20:13