
I'm new to Spark and have tried other solutions from Stack Overflow, but no luck.

I have installed Spark 3.1.2 and added a few configuration settings (under spark/conf/spark-defaults.conf) to point to AWS RDS MySQL as a remote metastore:

spark.jars.packages com.amazonaws:aws-java-sdk:1.12.63,org.apache.hadoop:hadoop-aws:3.2.0
spark.jars /home/newdependencies/jtds-1.3.1.jar, /home/newdependencies/mysql-connector-java-6.0.6.jar, /home/newdependencies/postgresql-42.2.20.jar
spark.hadoop.javax.jdo.option.ConnectionURL jdbc:mysql://testhivemetastore.asdfasfar.us-west-2.rds.amazonaws.com:3306/metastore
spark.hadoop.javax.jdo.option.ConnectionUserName username
spark.hadoop.javax.jdo.option.ConnectionPassword password
spark.hadoop.javax.jdo.option.ConnectionDriverName com.mysql.jdbc.Driver
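
For reference, a minimal sketch of passing the same metastore settings programmatically through the session builder instead of spark-defaults.conf (endpoint, user name, and password below are just the placeholders from above):

from pyspark.sql import SparkSession

# Sketch only: the same javax.jdo.option.* settings supplied via the builder
# instead of spark-defaults.conf; endpoint/user/password are placeholders
spark = (
    SparkSession.builder
    .appName("metastore-config-sketch")
    .config("spark.hadoop.javax.jdo.option.ConnectionURL",
            "jdbc:mysql://testhivemetastore.asdfasfar.us-west-2.rds.amazonaws.com:3306/metastore")
    .config("spark.hadoop.javax.jdo.option.ConnectionUserName", "username")
    .config("spark.hadoop.javax.jdo.option.ConnectionPassword", "password")
    .config("spark.hadoop.javax.jdo.option.ConnectionDriverName", "com.mysql.jdbc.Driver")
    .enableHiveSupport()
    .getOrCreate()
)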

I get the following error when trying to run show databases:

import os.path, sys
# Make the parent directory importable (realpath('__file__') resolves against
# the current working directory when run from a notebook/REPL)
sys.path.append(os.path.join(os.path.dirname(os.path.realpath('__file__')), os.pardir))

import findspark
findspark.init()

import pyspark

# Build a Hive-enabled session so the external metastore settings are used
sp = pyspark.sql.SparkSession.builder.enableHiveSupport().appName("Test spark configurations").getOrCreate()
sqlStr = 'show databases'
sp.sql(sqlStr).show()

Error: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
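
A quick sanity check (a sketch using the internal _jsc handle) is to confirm that the session is Hive-backed and that the JDO URL actually reached the Hadoop configuration:

# Should print "hive" and the RDS ConnectionURL if the settings were picked up
print(sp.conf.get("spark.sql.catalogImplementation"))
print(sp.sparkContext._jsc.hadoopConfiguration().get("javax.jdo.option.ConnectionURL"))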

FYI - I didn't install Hadoop or Hive either (I don't know whether that's mandatory).

  • Have you found any solution to this issue? I came across something similar – NikSp Oct 29 '21 at 13:04
  • You have to install Hive (by configuring hive/conf/hive-site.xml) and install Hadoop (to get Hive up and running, since it uses Hadoop libraries under the hood), and be sure to start the metastore service (hive --service metastore &) – user1531248 Nov 01 '21 at 14:39
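
Following the last comment, a minimal sketch of pointing Spark at a standalone Hive metastore service (assuming hive --service metastore & is already running on the default localhost:9083):

from pyspark.sql import SparkSession

# Sketch: talk to a running standalone Hive metastore service instead of
# connecting to the RDS database directly; assumes thrift://localhost:9083
spark = (
    SparkSession.builder
    .appName("remote-metastore-sketch")
    .config("spark.hadoop.hive.metastore.uris", "thrift://localhost:9083")
    .enableHiveSupport()
    .getOrCreate()
)
spark.sql("show databases").show()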

0 Answers