I'm trying to load streaming data from Kafka into SQL Server Big Data Clusters Data Pools. I'm using Spark 2.4.5 (the Bitnami Spark 2.4.5 image).
If I want to load data into regular tables, I use this statement and it works:
logs_df.write.format('jdbc').mode('append') \
    .option('driver', 'com.microsoft.sqlserver.jdbc.SQLServerDriver') \
    .option('url', 'jdbc:sqlserver://XXX.XXX.XXX.XXXX:31433;databaseName=sales;') \
    .option('user', user) \
    .option('password', password) \
    .option('dbtable', 'SYSLOG_TEST_TABLE') \
    .save()
But the same statement, pointed at a SQL Data Pool table, gives me this error:
py4j.protocol.Py4JJavaError: An error occurred while calling o93.save.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 4.0 failed 1 times, most recent failure: Lost task 0.0 in stage 4.0 (TID 3, localhost, executor driver): java.sql.BatchUpdateException: External Data Pool Table DML statement cannot be used inside a user transaction.
I found that the way to load data into a SQL Data Pool is to use the 'com.microsoft.sqlserver.jdbc.spark' format, like this:
logs_df.write.format('com.microsoft.sqlserver.jdbc.spark').mode('append') \
    .option('url', url) \
    .option('dbtable', datapool_table) \
    .option('user', user) \
    .option('password', password) \
    .option('dataPoolDataSource', datasource_name) \
    .save()
But it's giving me this error:
py4j.protocol.Py4JJavaError: An error occurred while calling o93.save.
: java.lang.ClassNotFoundException: Failed to find data source: com.microsoft.sqlserver.jdbc.spark. Please find packages at http://spark.apache.org/third-party-projects.html
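As far as I understand, when Spark is given a fully qualified format name it tries to load that name as a class and then the same name with ".DefaultSource" appended, so this exception suggests that neither class is on the classpath. Below is a minimal sketch for checking whether a given jar contains such a class. The function names are my own, and it only covers the class-name lookup, not data sources registered under a short alias.

```python
import sys
import zipfile


def candidate_class_entries(format_name):
    """Jar entries Spark could load for a fully qualified format name."""
    base = format_name.replace(".", "/")
    # Spark tries the name itself, then "<name>.DefaultSource".
    return [base + ".class", base + "/DefaultSource.class"]


def jar_provides_format(jar, format_name):
    """Return True if the jar (a path or file-like object) contains one of
    the candidate classes for format_name."""
    with zipfile.ZipFile(jar) as zf:
        names = set(zf.namelist())
    return any(entry in names for entry in candidate_class_entries(format_name))


if __name__ == "__main__":
    # Usage: python check_jar.py /path/to/first.jar /path/to/second.jar
    fmt = "com.microsoft.sqlserver.jdbc.spark"
    for jar_path in sys.argv[1:]:
        print(jar_path, jar_provides_format(jar_path, fmt))
```

Running this against each jar passed to spark-submit shows whether any of them could actually satisfy the 'com.microsoft.sqlserver.jdbc.spark' format.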
I'm running the script with spark-submit like this:
docker exec spark245_spark_1 /opt/bitnami/spark/bin/spark-submit --driver-class-path /opt/bitnami/spark/jars/mssql-jdbc-8.2.2.jre8.jar --jars /opt/bitnami/spark/jars/mssql-jdbc-8.2.2.jre8.jar --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.5 /storage/scripts/some_script.py
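To make the invocation easier to read and modify, here is a minimal sketch that assembles the same command in Python. It assumes that the 'com.microsoft.sqlserver.jdbc.spark' format ships in a connector jar separate from the mssql-jdbc driver; the connector jar path below is hypothetical and only marks where such a jar would go. The helper name build_spark_submit is my own.

```python
import shlex

SPARK_SUBMIT = "/opt/bitnami/spark/bin/spark-submit"


def build_spark_submit(script, jars=(), packages=()):
    """Assemble the spark-submit argument list for the given script."""
    cmd = [SPARK_SUBMIT]
    if jars:
        # --driver-class-path is a classpath (':'-separated on Linux),
        # while --jars takes a comma-separated list.
        cmd += ["--driver-class-path", ":".join(jars)]
        cmd += ["--jars", ",".join(jars)]
    if packages:
        # --packages takes comma-separated Maven coordinates.
        cmd += ["--packages", ",".join(packages)]
    cmd.append(script)
    return cmd


jars = [
    "/opt/bitnami/spark/jars/mssql-jdbc-8.2.2.jre8.jar",
    # Hypothetical path: placeholder for a separate connector jar (assumption).
    "/opt/bitnami/spark/jars/spark-mssql-connector.jar",
]
packages = ["org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.5"]

cmd = ["docker", "exec", "spark245_spark_1"] + build_spark_submit(
    "/storage/scripts/some_script.py", jars, packages
)
print(" ".join(shlex.quote(part) for part in cmd))
```

Building the list this way keeps the driver classpath and the --jars list in sync, which is easy to get wrong when editing the one-line command by hand.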
Is there any other package I should include or some special import I'm missing?
Thanks in advance
Edit: I've also tried this in Scala, with the same results.