
I am trying to execute a Hive query in my Spark code, but the query needs a jar library: I created the table with a Hive storage handler from that jar, so to query the table I have to load the jar first. My Spark code:

val hiveContext = ...
hiveContext.sql("ADD JAR hive-jdbc-handler-2.3.4.jar")
hiveContext.sql("SELECT * FROM TABLE")

Following this previous question: How to add jar using HiveContext in the spark job, I added the following parameter to my spark-submit call:

--jars "LOCAL PATH to hive-jdbc-handler-2.3.4.jar"
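For reference, the full spark-submit call looks roughly like this (the master, class name, and jar paths below are placeholders, not the real values from my cluster). If the storage handler needs a JDBC driver jar too, such as the PostgreSQL driver, it can be passed through the same --jars list, comma-separated:

```shell
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --jars /local/path/hive-jdbc-handler-2.3.4.jar,/local/path/postgresql-42.2.5.jar \
  --class com.example.MyApp \
  my-app.jar
```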

In the logs of my application, I am getting the following messages:

18/08/02 14:10:41,271 | INFO | 180802140805 | SessionState         | Added   [hive-jdbc-handler-2.3.4.jar] to class path
18/08/02 14:10:41,271 | INFO | 180802140805 | SessionState         | Added resources: [hive-jdbc-handler-2.3.4.jar]
18/08/02 14:10:42,179 | ERROR | 180802140805 | org.apache.hive.storage.jdbc.dao.GenericJdbcDatabaseAccessor | Error while trying to get column names.
org.apache.commons.dbcp.SQLNestedException: Cannot load JDBC driver class 'org.postgresql.Driver'

Note that I want to run my application on a cluster. What can I do?

AngryCoder

1 Answer


The way I was trying to add the jar in Spark was correct (there is no need to use the "addFile" method in cluster mode). The error I was getting occurred because the jar I was using was corrupted; I replaced it with a new one and it worked.
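A quick way to catch a corrupted jar before shipping it with spark-submit is to check locally that the archive opens and contains the class you expect. This is a hypothetical helper of my own (not part of the original code), written in plain Scala on top of `java.util.zip.ZipFile`:

```scala
import java.io.IOException
import java.util.zip.ZipFile

// Hypothetical sanity check: returns true if the jar can be opened and
// contains the given class. A truncated or corrupted archive typically
// fails to open at all, so this catches the problem before spark-submit.
def jarContainsClass(jarPath: String, className: String): Boolean = {
  val entryName = className.replace('.', '/') + ".class"
  try {
    val zip = new ZipFile(jarPath)
    try {
      val entries = zip.entries()
      var found = false
      while (entries.hasMoreElements && !found) {
        if (entries.nextElement().getName == entryName) found = true
      }
      found
    } finally zip.close()
  } catch {
    case _: IOException => false // unreadable or corrupted archive
  }
}

// Example (class and file names are illustrative):
// jarContainsClass("postgresql-42.2.5.jar", "org.postgresql.Driver")
```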

AngryCoder
  • Isn't there a better way to do this? Like adding the driver's Hive lib directory to the classpath? – mohit_d Sep 04 '19 at 16:42
  • Yes, if you have permission to add the .jar to your Hive lib directory (that would be the best solution). But in my case I don't have that permission right now, so the solution described in my first message is a workaround. – AngryCoder Sep 09 '19 at 09:39