I have a PySpark program which connects to a MySQL database successfully and reads a table. Now I am trying to pass the database credentials from a properties file instead of embedding them in the code, but I am not able to make it work.
from pyspark.sql import SparkSession
from pyspark.sql.types import *

# spark-submit --packages mysql:mysql-connector-java:8.0.13 workWithMySQL.py
spark = SparkSession.builder.appName("MySQL connection").getOrCreate()

# create spark context from spark session
sc = spark.sparkContext

# read from mysql
# configuration details
hostname = "localhost"
jdbcport = 3306
dbname = "TEST"
username = "kanchan@localhost"
password = "password"

mysql_url = "jdbc:mysql://{0}:{1}/{2}?user={3}&password={4}".format(hostname, jdbcport, dbname, username, password)
mysql_driver = "com.mysql.jdbc.Driver"
query = "(select * from cats) t1_alias"

df4 = spark.read.format("jdbc").options(driver=mysql_driver, url=mysql_url, dbtable=query).load()
df4.show()
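For context, this is roughly the shape I am aiming for once the credentials come from configuration rather than the source code: the jdbc reader variant that takes a properties dict, if I have understood that API correctly. The username/password values here are placeholders for whatever the config lookup will return.

# Target shape (sketch): credentials supplied via a properties dict instead of
# being interpolated into the JDBC URL.
connection_props = {
    "user": username,        # to come from configuration
    "password": password,    # to come from configuration
    "driver": "com.mysql.jdbc.Driver",
}
df4 = spark.read.jdbc(
    url="jdbc:mysql://{0}:{1}/{2}".format(hostname, jdbcport, dbname),
    table="(select * from cats) t1_alias",
    properties=connection_props,
)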
Now I have created a properties file, jdbc.properties, at $SPARK_HOME/conf:
spark.mysql.user kanchan@localhost
spark.mysql.password password
and added it to the spark-submit call:

spark-submit --packages mysql:mysql-connector-java:8.0.13 --files $SPARK_HOME/conf/jdbc.properties workWithMySQL.py
Then I replaced the credential assignments with:

username = sc.getConf.getOption("spark.mysql.user")
password = sc.getConf.getOption("spark.mysql.password")
When I run it, it throws an error saying there is no attribute 'getOption'. I could not locate the appropriate documentation for this in the Python API. Can anyone help?
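To be explicit about what I am guessing at: the snippet below is the access pattern I expected to work in Python, but getConf/getOption may be Scala-isms on my part, so please correct the API usage if it is wrong.

# My best guess at the Python equivalent; I am not sure whether SparkConf even
# exposes getOption in Python, so I fall back to plain get() here.
conf = sc.getConf()
username = conf.get("spark.mysql.user")
password = conf.get("spark.mysql.password")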
Further, is it possible to encrypt the credentials or ensure data security by any other means?
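One idea I had (not tested) is to keep the password out of files entirely and read it from an environment variable exported just before spark-submit; the variable name below is my own placeholder, nothing Spark-specific. I would also be interested in whether something like a Hadoop credential provider is the more standard route.

import os

# MYSQL_PASSWORD is a hypothetical variable exported in the shell that launches
# spark-submit, so the secret never lands in the code or a checked-in file.
password = os.environ.get("MYSQL_PASSWORD")
if password is None:
    raise RuntimeError("MYSQL_PASSWORD is not set")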