I'm playing with the Spark Thrift Server and want to save a DataFrame to it through the JDBC port it opens, but I'm facing an error with the SQL code generated by Spark.
I've started the Thrift Server with the following command:
spark-submit --master local[2] \
--packages io.delta:delta-core_2.13:2.3.0 \
--conf "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension" \
--conf "spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog" \
--class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 \
--name "Thrift JDBC/ODBC Server"
I'm using spark-shell to write the Spark code, and it's started as follows:
spark-shell --packages io.delta:delta-core_2.13:2.3.0 \
--conf "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension" \
--conf "spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog"
The DataFrame is very simple: it contains data about movie reviews. The schema is the following:
I'm trying to write the DataFrame with the following code:
df.write
.format("delta")
.mode("overwrite")
.jdbc("jdbc:hive2://localhost:10000","movies", connectionProperties)
Finally, the error I'm facing is:
I've tried creating the table with beeline and writing the data in append mode, and I also tried creating a case class to represent the rows of the dataset, but I still wasn't able to save the data.
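For reference, the case-class attempt looked roughly like the sketch below. The field names (`user`, `movie`, `rating`) are assumptions, since the actual schema isn't shown here; the Thrift Server must already be running on port 10000 for this to do anything.

```scala
// Run inside spark-shell; `spark` and the toDF implicits are already in scope there.
import java.util.Properties
import spark.implicits._

// Hypothetical case class: field names are assumptions, not the real schema
case class Review(user: String, movie: String, rating: Double)

val reviews = Seq(
  Review("alice", "Heat", 4.5),
  Review("bob", "Alien", 5.0)
).toDF()

val connectionProperties = new Properties()
// Hive JDBC driver, assuming it is on the classpath
connectionProperties.put("driver", "org.apache.hive.jdbc.HiveDriver")

// Append into a table created beforehand via beeline, instead of overwrite
reviews.write
  .mode("append")
  .jdbc("jdbc:hive2://localhost:10000", "movies", connectionProperties)
```

This still fails with the same kind of generated-SQL error as the original overwrite attempt.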