import spark.implicits._

// First write: age is an Int, so the Delta table at /location is created with age as IntegerType
val data = Seq(("James", "Sales", 34))
val df1 = data.toDF("name", "dept", "age")
df1.printSchema()
df1.write.option("mergeSchema", "true").format("delta").save("/location")

// Second write: age is a String ("34"), so the schema no longer matches the existing table
val data2 = Seq(("Tiger", "Sales", "34"))
val df2 = data2.toDF("name", "dept", "age")
df2.printSchema()
df2.write.option("mergeSchema", "true").format("delta").save("/location")
df2.show(false)

When we write df2, it fails because age in the Delta table is IntegerType while age in df2 is StringType. How do we handle such a situation so that the code deals with this case smoothly?

  • You will have to provide the mode, i.e. whether you want to append or overwrite the data. You also need to set the overwriteSchema option to true. – Nikunj Kakadiya Dec 17 '21 at 05:09
  • See this link: https://medium.com/@amany.m.abdelhalim/appending-overwriting-with-different-schema-to-delta-lake-vs-parquet-6b39c4a5d5dc – Nikunj Kakadiya Dec 17 '21 at 05:09

1 Answer


You can set the overwriteSchema option to true (together with overwrite mode) and that should work. mergeSchema only lets you add new columns to an existing Delta table; it cannot change the type of an existing column from IntegerType to StringType, so the table schema has to be overwritten instead.

val data = Seq(("James", "Sales", 34))
val df1 = data.toDF("name", "dept", "age")
df1.printSchema()
df1.write.option("mergeSchema", "true").format("delta").save("/location")

// Overwrite both the data and the schema so the table picks up age as StringType
val data2 = Seq(("Tiger", "Sales", "34"))
val df2 = data2.toDF("name", "dept", "age")
df2.printSchema()
df2.write.option("overwriteSchema", "true").mode("overwrite").format("delta").save("/location")
Nikunj Kakadiya
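
If you want to keep the rows that are already in the table rather than overwrite them, another option is to cast the incoming DataFrame to the table's existing schema before appending. This is a minimal sketch, not part of the original answer, assuming the Delta table at /location already exists and the only mismatch is column types that can be cast (df2 and the /location path are the ones from the question):

import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

// Read the current schema of the Delta table, cast each matching column of the
// incoming DataFrame to that type, then append without touching the table schema.
val targetSchema = spark.read.format("delta").load("/location").schema

val df2Aligned: DataFrame = targetSchema.fields.foldLeft(df2) { (df, field) =>
  df.withColumn(field.name, col(field.name).cast(field.dataType))
}

df2Aligned.write.format("delta").mode("append").save("/location")

This keeps age as IntegerType in the table; values that cannot be cast (for example a non-numeric string) become null on append, so overwriteSchema remains the right choice when you actually want the column's type to change.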