
Tech stack:

  • Spark 2.4.4
  • Hive 2.3.3
  • HBase 1.4.8
  • sbt 1.5.8

What is the best practice for Spark dependency overriding?

Suppose that the Spark app (running in cluster mode) already has a spark-hive (2.4.4) dependency marked as provided.

I compiled and assembled a "custom" spark-hive jar that I want to use in the Spark app.
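For context, the assumed build definition looks roughly like the sketch below; the artifact list and Scala version are my assumptions, not taken from the question. spark-hive stays Provided because the cluster supplies its own copy at runtime, and the goal is for that copy to be the custom build:

    // build.sbt -- minimal sketch of the assumed setup (sbt 1.5.8, Spark 2.4.4 on Scala 2.11).
    // spark-hive is Provided: the cluster supplies it at runtime, so it is not bundled
    // into the application's assembly jar.
    ThisBuild / scalaVersion := "2.11.12"

    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "2.4.4" % Provided,
      "org.apache.spark" %% "spark-sql"  % "2.4.4" % Provided,
      "org.apache.spark" %% "spark-hive" % "2.4.4" % Provided
    )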

1 Answer


There is not a lot of information about how you're running Spark, so it's hard to answer exactly.

But typically, you'll have Spark running on some kind of server or container or pod (in k8s).

  • If you're running on a server, go to $SPARK_HOME/jars. In there, you should find the spark-hive jar that you want to replace. Replace that one with your new one (a quick way to verify the swap from inside the app is sketched after this list).
  • If running in a container/pod, do the same as above and rebuild your image from the directory with the replaced jar.
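After swapping the jar, it can be worth confirming from inside the application that the custom build is the one the JVM actually loaded. Here is a minimal sketch in Scala; the class picked, org.apache.spark.sql.hive.HiveExternalCatalog, is just one class that ships in the spark-hive module, and any class from that module would work the same way:

    // VerifyHiveJar.scala -- print which jar a spark-hive class was loaded from,
    // to confirm the replaced jar under $SPARK_HOME/jars is the one on the classpath.
    object VerifyHiveJar {
      def main(args: Array[String]): Unit = {
        val clazz = Class.forName("org.apache.spark.sql.hive.HiveExternalCatalog")
        // getCodeSource can be null for bootstrap-loaded classes, hence the Option wrapper.
        val location = Option(clazz.getProtectionDomain.getCodeSource)
          .map(_.getLocation.toString)
          .getOrElse("<unknown code source>")
        println(s"spark-hive loaded from: $location")
      }
    }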

Hope this helps!
