
I want to see the jars my Spark context is using. I found how to do it in Scala:

$ spark-shell --master=spark://datasci:7077 --jars /opt/jars/xgboost4j-spark-0.7-jar-with-dependencies.jar --packages elsevierlabs-os:spark-xml-utils:1.6.0

scala> spark.sparkContext.listJars.foreach(println)
spark://datasci:42661/jars/net.sf.saxon_Saxon-HE-9.6.0-7.jar
spark://datasci:42661/jars/elsevierlabs-os_spark-xml-utils-1.6.0.jar
spark://datasci:42661/jars/org.apache.commons_commons-lang3-3.4.jar
spark://datasci:42661/jars/commons-logging_commons-logging-1.2.jar
spark://datasci:42661/jars/xgboost4j-spark-0.7-jar-with-dependencies.jar
spark://datasci:42661/jars/commons-io_commons-io-2.4.jar

Source: List All Additional Jars Loaded in Spark

But I could not find how to do it in PySpark. Any suggestions?

Thanks

Eli Simhayev

2 Answers


I was able to list the extra jars with this command:

print(spark.sparkContext._jsc.sc().listJars())
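
If you need the jar paths as Python strings rather than the printed toString of the JVM object, note that listJars() returns a Scala Seq wrapped as a py4j JavaObject. A minimal sketch, assuming an active SparkSession named `spark` (and keeping in mind that `_jsc` is an internal attribute that may change between Spark versions):

# listJars() returns a Scala Seq[String]; py4j exposes the Seq's
# size() and apply(i) methods, so you can index into it from Python.
jars = spark.sparkContext._jsc.sc().listJars()
jar_list = [jars.apply(i) for i in range(jars.size())]
for jar in jar_list:
    print(jar)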
neves
sparkContext._jsc.sc().listJars()

`_jsc` is the Java Spark context.
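
To spell out the call chain, a short sketch (assuming `spark` is an active SparkSession):

jsc = spark.sparkContext._jsc   # py4j handle to the JVM JavaSparkContext
sc = jsc.sc()                   # the underlying Scala SparkContext
print(sc.listJars())            # the Seq[String] of jar URLs, printed via toString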

Job Evers
How do I unpack this object? All that listJars() is returning for me is "JavaObject id=o30"? It doesn't appear to be iterable or have any methods that can be seen via introspection? – Adam Luchjenbroers May 12 '21 at 02:05
    @AdamLuchjenbroers when you `print` the object, it shows the jar filename: `print(spark.sparkContext._jsc.sc().listJars())` – Gunther Struyf Jun 04 '21 at 06:59