
I'm playing with Spark Thrift Server and want to save a dataframe to it via the JDBC port it opens, but I'm facing an error with the SQL code generated by Spark.

I've started the Thrift Server with the following command:

spark-submit --master local[2] \
    --packages io.delta:delta-core_2.13:2.3.0 \
    --conf "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension" \
    --conf "spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog" \
    --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 \
    --name "Thrift JDBC/ODBC Server"

I'm using spark-shell to write the Spark code, and it's started as follows:

spark-shell --packages io.delta:delta-core_2.13:2.3.0 \
    --conf "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension" \
    --conf "spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog"

The dataframe is very simple: it contains data about movie reviews. The schema is the following:

[Dataframe schema]

I'm trying to write the dataframe with the following code:

df.write
  .format("delta")
  .mode("overwrite")
  .jdbc("jdbc:hive2://localhost:10000","movies", connectionProperties)

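For reference, connectionProperties is a plain java.util.Properties instance, built roughly like this (the driver class and the empty credentials below are placeholders rather than the exact values used):

import java.util.Properties

// JDBC properties for the Thrift Server endpoint.
// Driver class and credentials are illustrative placeholders.
val connectionProperties = new Properties()
connectionProperties.put("driver", "org.apache.hive.jdbc.HiveDriver")
connectionProperties.put("user", "")
connectionProperties.put("password", "")
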
Finally, the error I'm facing is:

[Error]

I've tried creating the table with beeline and then writing the data in append mode, and I've also tried creating a case class to represent the rows of the dataset (see the sketch below), but I wasn't able to save the data either way.
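
The case class attempt looked roughly like this (the field names are illustrative, since the actual schema is only visible in the screenshot above):

// Illustrative only: field names/types guessed from the movie-review data.
case class MovieReview(movieId: Int, title: String, review: String, rating: Double)

// spark-shell already provides spark.implicits._, so the encoder is in scope.
val reviews = df.as[MovieReview]

reviews.write
  .mode("append")
  .jdbc("jdbc:hive2://localhost:10000", "movies", connectionProperties)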
