
I'm trying to insert a Spark DataFrame into a SQL Server table using the Apache Spark connector for SQL Server: https://learn.microsoft.com/en-us/sql/connect/spark/connector?view=sql-server-ver15.

I'm getting the following error -

: org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 1.0 failed 4 times, most recent failure: Lost task 1.3 in stage 1.0 (TID 56, a76b60538ed0405e9a110786aae369ca001abe49040, executor 1): com.microsoft.sqlserver.jdbc.SQLServerException: PdwManagedToNativeInteropException ErrorNumber: 46724, MajorCode: 467, MinorCode: 24, Severity: 20, State: 2, Exception of type 'Microsoft.SqlServer.DataWarehouse.Tds.PdwManagedToNativeInteropException' was thrown.

Following is the code used -

  df.write \
    .format("com.microsoft.sqlserver.jdbc.spark") \
    .mode("append") \
    .option("url", url) \
    .option("dbtable", table_name) \
    .option("user", username) \
    .option("password", password) \
    .option("mssqlIsolationLevel", "READ_UNCOMMITTED") \
    .save()

There is a column of TimestampType in the DataFrame whose corresponding data type in the SQL table is Datetime. I was able to write the data successfully after removing this column from both the DataFrame and the SQL table (I tried this to identify that the problem was with the timestamp column).

I also tried changing the SQL column's data type to Datetime2, which gave the following error -

ErrorMsg: [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Invalid column type from bcp client for colid 1.

I want the data type in the SQL table to be Datetime or Datetime2. Is there a way to write the TimestampType column successfully? Am I missing something?
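Not an answer, but one detail worth checking: SQL Server's Datetime stores fractional seconds only in 1/300-second increments, and Datetime2 defaults to 100ns precision, while Spark timestamps carry microseconds. A workaround sometimes suggested for bcp type mismatches like this (an assumption on my part, not verified against this connector) is to truncate the timestamps to millisecond precision on the Spark side before writing, e.g. via `date_format` plus a cast back to timestamp. The truncation itself, sketched in plain Python:

```python
from datetime import datetime

def truncate_to_millis(ts: datetime) -> datetime:
    """Drop sub-millisecond precision so the value fits a datetime2(3)
    column; Datetime itself then rounds to 1/300-second increments."""
    return ts.replace(microsecond=(ts.microsecond // 1000) * 1000)

# A microsecond-precision timestamp like Spark would produce
ts = datetime(2021, 8, 2, 14, 58, 3, 123456)
print(truncate_to_millis(ts))  # 2021-08-02 14:58:03.123000
```

In PySpark the equivalent would be a `withColumn` over the timestamp column; whether that avoids the interop exception here is untested.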

MK22
Seems to be an API limitation: https://stackoverflow.com/questions/56847335/writing-to-sql-server-datetime-datatype-from-scala-spark – tintin Aug 02 '21 at 14:58

0 Answers