I am trying to get a Spark cluster to write to SQL Server using JavaKerberos with Microsoft's JDBC driver (v7.0.0), i.e. I specify integratedSecurity=true;authenticationScheme=JavaKerberos
in the connection string, with credentials supplied in a keytab file, and I am not having much success (the problem is the same if I specify credentials in the connection string).
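For context, the connection string is assembled along these lines (a sketch; the host, port, database, and table names are placeholders for my actual values):

```java
import java.util.Properties;

public class JdbcKerberosUrl {
    // Build the JDBC URL used for the write; host/port/database are placeholders.
    static String buildUrl(String host, int port, String database) {
        return "jdbc:sqlserver://" + host + ":" + port
                + ";databaseName=" + database
                + ";integratedSecurity=true"
                + ";authenticationScheme=JavaKerberos";
    }

    public static void main(String[] args) {
        String url = buildUrl("sqlhost.example.com", 1433, "mydb");
        Properties props = new Properties();
        props.setProperty("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver");
        System.out.println(url);
        // In the application this url/props pair is what gets passed to
        // DataFrame.write().jdbc(url, "dbo.mytable", props).
    }
}
```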
I am submitting the job to the cluster (4-node YARN mode, Spark v2.3.0) with:
spark-submit --driver-class-path mssql-jdbc-7.0.0.jre8.jar \
--jars /path/to/mssql-jdbc-7.0.0.jre8.jar \
--conf spark.executor.extraClassPath=/path/to/mssql-jdbc-7.0.0.jre8.jar \
--conf "spark.driver.extraJavaOptions=-Djava.security.auth.login.config=/path/to/SQLJDBCDriver.conf" \
--conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=/path/to/SQLJDBCDriver.conf" \
application.jar
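For reference, SQLJDBCDriver.conf contains a standard JAAS entry along these lines (the principal and keytab path are placeholders; "SQLJDBCDriver" is the entry name the Microsoft driver looks up by default):

```
SQLJDBCDriver {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/path/to/my.keytab"
  principal="user@EXAMPLE.COM"
  doNotPrompt=true;
};
```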
Things work partially: the Spark driver authenticates correctly and creates the table; however, when any of the executors come to write to the table, they fail with an exception:
java.security.PrivilegedActionException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
Observations:
- I can get everything to work if I specify SQL Server credentials (however, I need to use integrated security in my application)
- The keytab and login module file “SQLJDBCDriver.conf” seem to be specified correctly since they work for the driver
- I can see in the Spark UI that the executors pick up the correct command line options:
-Djava.security.auth.login.config=/path/to/SQLJDBCDriver.conf
After a lot of logging and debugging of the difference between Spark driver and executor behaviour, it seems to come down to the executor trying to use the wrong credentials, even though the options specified should make it use those in the keytab file, as it does successfully for the Spark driver. (That would explain this particular exception, which is also what I get if I deliberately supply incorrect credentials.)
Strangely, I can see in the debug output that the JDBC driver finds and reads the SQLJDBCDriver.conf file, and the keytab must be present (otherwise I get a file-not-found failure), yet it then promptly ignores both and falls back to default behaviour/local user credentials.
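One diagnostic that can be run inside an executor to check whether the JAAS entry is actually being consulted is an explicit login attempt (a minimal sketch; the entry name must match the one in the conf file):

```java
import javax.security.auth.login.LoginContext;
import javax.security.auth.login.LoginException;

public class KerberosLoginCheck {
    // Attempt an explicit JAAS login against the "SQLJDBCDriver" entry
    // (the entry name the Microsoft JDBC driver looks up by default).
    // Returns a description of the outcome rather than throwing.
    static String tryLogin() {
        try {
            LoginContext lc = new LoginContext("SQLJDBCDriver");
            lc.login();
            return "Logged in as: " + lc.getSubject().getPrincipals();
        } catch (LoginException e) {
            // Reports why the login failed, e.g. when the conf file or
            // keytab is not visible to this JVM.
            return "JAAS login failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryLogin());
    }
}
```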
Can anyone help me understand how I can force the executors to use credentials provided in a keytab or otherwise get JavaKerberos/SQL Server authentication to work with Spark?