
I am trying to use the bulkCopyToSqlDB function of the SQL Spark connector (found here) with the Microsoft SQL Server JDBC driver.

This is the command I use to launch the Spark shell:

spark-shell --jars /developer/sqljdbc_6.4/enu/mssql-jdbc-6.4.0.jre8.jar,/developer/azure-sqldb-spark-master/azure-sqldb-spark-master/target/azure-sqldb-spark-1.0.0-jar-with-dependencies.jar

After I create the bulkCopyConfig, running the following line in the Spark shell produces an error:

df.bulkCopyToSqlDB(bulkCopyConfig)

The full error message is:

  Caused by: java.lang.NoSuchMethodError: com.microsoft.sqlserver.jdbc.SQLServerBulkCopyOptions.setAllowEncryptedValueModifications(Z)V
  at com.microsoft.azure.sqldb.spark.bulk.BulkCopyUtils$.getBulkCopyOptions(BulkCopyUtils.scala:109)
  at com.microsoft.azure.sqldb.spark.connect.DataFrameFunctions.com$microsoft$azure$sqldb$spark$connect$DataFrameFunctions$$bulkCopy(DataFrameFunctions.scala:126)
  at com.microsoft.azure.sqldb.spark.connect.DataFrameFunctions$$anonfun$bulkCopyToSqlDB$1.apply(DataFrameFunctions.scala:72)
  at com.microsoft.azure.sqldb.spark.connect.DataFrameFunctions$$anonfun$bulkCopyToSqlDB$1.apply(DataFrameFunctions.scala:72)
  at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:929)
  at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:929)
  at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2074)
  at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2074)
  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
  at org.apache.spark.scheduler.Task.run(Task.scala:109)
  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
John

1 Answer


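This is caused by an old version of the Microsoft SQL Server JDBC driver jar being loaded at runtime: a NoSuchMethodError means the class that was actually loaded does not have the method the connector was compiled against. Make sure the driver is version 6.4.0 or later. If you're using Maven, the dependency looks like this: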

    <dependency>
        <groupId>com.microsoft.sqlserver</groupId>
        <artifactId>mssql-jdbc</artifactId>
        <version>6.4.0.jre8</version>
    </dependency>

Note that the artifactId used to be sqljdbc42, not mssql-jdbc.
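If the right jar is already being passed with --jars, it may help to check which jar the class is really loaded from. Below is a minimal sketch you could paste into spark-shell; the helper names jarOf and hasMethod are made up for illustration and are not part of the connector or the driver. It only inspects the driver JVM, so on a cluster the executors could still see a different classpath.

```scala
import scala.util.Try

// Hypothetical helper: where was this class loaded from?
// Returns None if the class is missing or comes from the JDK itself.
def jarOf(className: String): Option[String] =
  for {
    cls <- Try(Class.forName(className)).toOption
    pd  <- Option(cls.getProtectionDomain)
    src <- Option(pd.getCodeSource)
    loc <- Option(src.getLocation)
  } yield loc.toString

// Hypothetical helper: does the loaded class expose this public method?
def hasMethod(className: String, method: String, paramTypes: Class[_]*): Boolean =
  Try(Class.forName(className).getMethod(method, paramTypes: _*)).isSuccess

val optionsClass = "com.microsoft.sqlserver.jdbc.SQLServerBulkCopyOptions"

// Location of the jar that actually provides the class (None if not found)
println(jarOf(optionsClass))

// The (Z)V in the error is a method taking a primitive boolean and returning void
println(hasMethod(optionsClass, "setAllowEncryptedValueModifications", classOf[Boolean]))
```

If the printed location is not the 6.4.0 jar, or the second line prints false, an older driver is probably shadowing the one you intended to use.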

Matthew
  • In the pom.xml file for the SQL Spark connector (https://github.com/Azure/azure-sqldb-spark/blob/master/pom.xml), the Maven dependency looks identical to the one you wrote above. – John Aug 02 '18 at 18:12