
I have an existing Delta file with 4 columns in its schema, which I convert into a Dataset at runtime using these case classes:

case class MyObj2(x: Int)
case class MyObj1(p: MyObj3, q: MyObj3)
case class MyCaseClass(a: Int, b: MyObj1, c: Int, d: MyObj2)

Now I have added a new field to the case class MyObj2:

case class MyObj2(x: Int, y: Int)

This is how I read the Delta file as a Dataset:

val df = spark.read.format("delta").load(path)
df.as[MyCaseClass]

When I try to convert the existing Delta file (written with the old schema) into a Dataset with the new schema, I get a schema mismatch error.

I have already tried:

Creating an empty DataFrame with the new schema and using schema merging (mergeSchema). That fails on other fields such as MyObj1: the Delta file stores a map of key-value pairs for that field, so the types do not match when the schemas are merged, although the same field converts fine with the .as method.
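
For reference, here is a minimal sketch of the kind of conversion I am after, under some assumptions: the class names InnerOld, OuterOld, Inner and Outer are simplified, hypothetical stand-ins for my real classes, the new field is declared as Option[Int] so that old rows can hold a null, and the old data is built in memory instead of being loaded from Delta. It relies on Column.withField, which needs Spark 3.1 or later. I have not confirmed that this is the recommended way to handle it.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.functions.{col, lit}

// Hypothetical, simplified stand-ins for the classes in the question.
case class InnerOld(x: Int)                 // old nested schema
case class OuterOld(a: Int, d: InnerOld)
case class Inner(x: Int, y: Option[Int])    // new schema: y added as an optional field
case class Outer(a: Int, d: Inner)

object ReadOldSchemaSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("read-old-schema-sketch")
      .getOrCreate()
    import spark.implicits._

    // Stands in for spark.read.format("delta").load(path) on data written with the old schema.
    val oldDf = Seq(OuterOld(1, InnerOld(2))).toDF()

    // Check whether the nested struct already has the new field.
    val hasY = oldDf.select("d.*").columns.contains("y")

    // If it is missing, add it to the struct as a null Int so the shape matches the new case class.
    val patched =
      if (hasY) oldDf
      else oldDf.withColumn("d", col("d").withField("y", lit(null).cast("int")))

    val ds: Dataset[Outer] = patched.as[Outer]
    ds.show()

    spark.stop()
  }
}
```

With a real Delta file, oldDf would be replaced by the DataFrame loaded from the path; the struct patching step stays the same.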

Koedlt
P Mittal
