
While trying to write a dataframe to BigQuery using the Simba JDBC driver, I am getting the exception below. The dataframe schema is as follows; I have created a table in BigQuery with the same schema.

df.printSchema
root
 |-- empid: integer (nullable = true)
 |-- firstname: string (nullable = true)
 |-- middle: string (nullable = true)
 |-- last: string (nullable = true)
 |-- gender: string (nullable = true)
 |-- age: double (nullable = true)
 |-- weight: integer (nullable = true)
 |-- salary: integer (nullable = true)
 |-- city: string (nullable = true)

The Simba driver throws the following error:

 Caused by: com.simba.googlebigquery.support.exceptions.GeneralException: [Simba][BigQueryJDBCDriver](100032) Error executing query job. Message: 400 Bad Request
    {
      "code" : 400,
      "errors" : [ {
        "domain" : "global",
        "location" : "q",
        "locationType" : "parameter",
        "message" : "Syntax error: Unexpected string literal \"empid\" at [1:38]",
        "reason" : "invalidQuery"
      } ],
      "message" : "Syntax error: Unexpected string literal \"empid\" at [1:38]",
      "status" : "INVALID_ARGUMENT"
    }
      ... 24 more

Below is the code I am using:

val url = "jdbc:bigquery://https://www.googleapis.com/bigquery/v2;ProjectId=my_project_id;OAuthType=0;OAuthPvtKeyPath=service_account_jsonfile;OAuthServiceAcctEmail=googleaccount"
df.write.mode(SaveMode.Append).jdbc(url,"orders_dataset.employee",new java.util.Properties)

Please let me know if I am missing any other configuration or where I am going wrong. Thanks in advance!

Mohan

1 Answer


It seems that this behavior is caused by Spark, which adds extra quotes around column names in the generated SQL.

To fix this in Spark, register a custom JDBC dialect after creating the Spark context and before writing the dataframe:

import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}

JdbcDialects.registerDialect(new JdbcDialect() {
  // Apply this dialect to BigQuery JDBC URLs
  override def canHandle(url: String): Boolean =
    url.toLowerCase.startsWith("jdbc:bigquery:")

  // Return column names unquoted: the default dialect wraps them in
  // double quotes, which BigQuery parses as string literals
  override def quoteIdentifier(column: String): String = column
})
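To see why the identity `quoteIdentifier` helps, here is a simplified sketch of how a JDBC writer assembles an INSERT statement (the `insertStatement` helper is hypothetical, not Spark's actual code). With the default double-quoting, `"empid"` lands at character 38 of the statement, matching the `Unexpected string literal "empid" at [1:38]` error, because BigQuery treats double-quoted tokens as string literals rather than identifiers:

```scala
// Simplified sketch of how a JDBC writer builds an INSERT statement.
// Each column name is passed through the dialect's quote function.
def insertStatement(table: String, columns: Seq[String],
                    quote: String => String): String = {
  val cols   = columns.map(quote).mkString(", ")
  val params = columns.map(_ => "?").mkString(", ")
  s"INSERT INTO $table ($cols) VALUES ($params)"
}

val columns = Seq("empid", "firstname")

// Default-style quoting: BigQuery sees "empid" as a string literal.
val quoted = insertStatement("orders_dataset.employee", columns,
                             c => "\"" + c + "\"")
// -> INSERT INTO orders_dataset.employee ("empid", "firstname") VALUES (?, ?)

// Identity quoting (the fix above) leaves the names bare.
val bare = insertStatement("orders_dataset.employee", columns, identity)
// -> INSERT INTO orders_dataset.employee (empid, firstname) VALUES (?, ?)
```
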