I have a Delta table created like this:
# Load the data from its source.
df = spark.read.load("/databricks-datasets/learning-spark-v2/people/people-10m.delta")
# Write the data to a table.
table_name = "people_10m"
df.write.saveAsTable(table_name)
I now have a schema change that I want to apply: maybe a single new column, maybe a few columns, maybe nested arrays. I can't predict ahead of time what will come up during code execution.
I used Python's set API to find the new columns, and now I want to add them to the Delta table, ideally through the Python API.
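Roughly, the diff looks like this (just a sketch; df_new is a placeholder for whatever DataFrame shows up at runtime):

# Compare the incoming DataFrame's columns against the existing table's columns.
# df_new is a placeholder for the incoming data with the changed schema.
existing_cols = set(spark.table(table_name).columns)
incoming_cols = set(df_new.columns)
new_cols = incoming_cols - existing_cols   # top-level columns only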
One thought was to modify the schema of the DataFrame and then somehow tell the table to match it. I don't want to read the whole dataset and rewrite it, and I don't want to lose the table history either. I would be fine with schema evolution if it can be done without writing any data (just a schema update) and with all column deletions blocked.
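To make the idea concrete, this is roughly what I had in mind (an untested sketch; the new column name and type are placeholders, since I won't know them until runtime):

# Sketch of the "schema update only, no data" idea. Assumes I can build the
# evolved schema: the existing fields plus whatever new columns appeared.
from pyspark.sql.types import StructType, StructField, StringType

evolved_schema = StructType(spark.table(table_name).schema.fields + [
    StructField("new_col", StringType(), True),   # placeholder for a new column
])

# Append zero rows with mergeSchema, so (I think) only the table's schema
# changes and no data files are rewritten. Append mode never drops existing
# columns, which should also rule out column deletions.
(spark.createDataFrame([], evolved_schema)
    .write.format("delta")
    .mode("append")
    .option("mergeSchema", "true")
    .saveAsTable(table_name))

# Alternatively, if I can spell out the column definitions, a plain ALTER TABLE
# should also avoid a rewrite:
# spark.sql(f"ALTER TABLE {table_name} ADD COLUMNS (new_col STRING)")

Is one of these the right approach, or is there a better way to do a schema-only update from Python?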