
I have a working example of executing a stored procedure in SQL Server with the Scala code below in Databricks. But I'm wondering if it is possible to do the same with Python over JDBC? I cannot make it work. Please see the examples below:

WORKING code in SCALA

import java.sql.DriverManager
import java.sql.Connection
import java.sql.ResultSet

val username = "xxxxx"
val pass = "xxxxx"
val url = "jdbc:sqlserver://xxx.database.windows.net:1433;databaseName=xxx"
val table = "SalesLT.Temp3"
val query = s"EXEC sp_truncate_table '${table}'"

val conn = DriverManager.getConnection(url, username, pass)
val rs = conn.createStatement.execute(query)

Python Code so far

connector_type_sql_server_driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
jdbc_hostname = "xxxx.database.windows.net"
jdbc_database = "xxxx"
jdbc_port = 1433
jdbc_url = f"jdbc:sqlserver://{jdbc_hostname}:{jdbc_port};database={jdbc_database}"
jdbc_spa_user = "xxx"
jdbc_spa_password = "xxx"


query = "EXEC sys.sp_tables"
query2 = "SELECT * FROM sys.tables"

jdbc_db = (spark.read
          .format("jdbc")
          .option("driver", connector_type_sql_server_driver)
          .option("url", jdbc_url)
          .option("query", query)
          .option("user", jdbc_spa_user)
          .option("password", jdbc_spa_password)
          .load()
          )

query2 works in Python, but anything starting with EXEC does not seem to work...

If it is not possible, can someone explain in detail why it can be done in Scala but not in Python in Databricks? I want to make it work in Python because the rest of the notebooks are already in Python.

Thank you.

/Baatch

baatchen
  • _does not seem to work_ means absolutely nothing. Please explain why it is not functioning as expected. If an error is returned _post the error_ – Nick.Mc Dec 14 '21 at 11:12

1 Answer


Yes, it's possible; you just need to get access to the underlying Java JDBC classes, something like this:

# the first line is the main entry point into JDBC world
driver_manager = spark._sc._gateway.jvm.java.sql.DriverManager
connection = driver_manager.getConnection(mssql_url, mssql_user, mssql_pass)
connection.prepareCall("EXEC sys.sp_tables").execute()
connection.close()
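If you also need the rows a stored procedure returns (a question that comes up below), you can keep walking the JDBC `CallableStatement`/`ResultSet` objects through the same py4j gateway. This is a minimal sketch, assuming the `connection` object created above; `fetch_rows` and `read_proc_rows` are hypothetical helper names, not part of any library:

```python
def fetch_rows(result_set, column_count):
    # Walk a JDBC-style ResultSet (next()/getObject(i)) into a list of tuples.
    # Note: JDBC columns are 1-indexed.
    rows = []
    while result_set.next():
        rows.append(tuple(result_set.getObject(i)
                          for i in range(1, column_count + 1)))
    return rows


def read_proc_rows(connection, sql):
    # Execute a stored procedure via CallableStatement and collect its
    # first result set, if it produces one.
    stmt = connection.prepareCall(sql)
    try:
        has_result_set = stmt.execute()
        if not has_result_set:
            return []
        rs = stmt.getResultSet()
        return fetch_rows(rs, rs.getMetaData().getColumnCount())
    finally:
        stmt.close()

# e.g. rows = read_proc_rows(connection, "EXEC sys.sp_tables")
```

From there you could build a local DataFrame with `spark.createDataFrame(rows, schema)` if needed; procedures that return multiple result sets would additionally need `stmt.getMoreResults()`.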

Alex Ott
  • Alex - It works as expected. However, if one wants to read the rows returned by the Stored Procedure how will one code it? – Gopinath Rajee Dec 08 '21 at 16:54
  • just follow the JDBC documentation, specifically Listing 6.7: https://docs.oracle.com/cd/E17952_01/connector-j-5.1-en/connector-j-usagenotes-statements-callable.html – Alex Ott Dec 08 '21 at 18:05
  • @AlexOtt How can we do it using `Apache Spark Connector` - a question posted [here](https://stackoverflow.com/q/72092252/1232087). – nam May 02 '22 at 20:52
  • see as well https://medium.com/delaware-pro/executing-ddl-statements-stored-procedures-on-sql-server-using-pyspark-in-databricks-2b31d9276811 – ice1e0 Apr 18 '23 at 10:30
  • 1
    Can someone please letme know how to capture the output of the stored procedures , like DataFrame or Integer or String from the Stored Procedure – Rajesh Akshith May 12 '23 at 17:21
  • @AlexOtt link is broken – jonathan-dufault-kr Aug 22 '23 at 15:25