I have a database table in Azure Databricks that already has data in it, and I need to append data to that table.
I have a pandas DataFrame (df_allfeatures) that I want to append to that table.
The function that I use to write to my database table:
def write_to_delta(df_name, db_name, table_name, write_mode, num_part=10):
    df_name \
        .repartition(num_part) \
        .write \
        .mode(write_mode) \
        .insertInto("{}.{}".format(db_name, table_name), overwrite=True)
When using this function to write into my database:
df_allfeatures = spark.createDataFrame(df_allfeatures)
write_to_delta(df_allfeatures, 'production', 'feed_to_output_all_features', 'append', num_part=10)
However, I keep getting an AnalysisException, and the columns that are singled out are the following:

AnalysisException: Cannot write incompatible data to table 'production.feed_to_output_all_features':
- Cannot safely cast 'LEAD_CONCENTRATE_GRADES_PB': string to double
- Cannot safely cast 'TAILINGS_RECOVERIES_PB': timestamp to double
- Cannot safely cast 'DATE': double to timestamp
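As a sanity check, this is how the incoming schema can be compared against the target table (a minimal sketch; it assumes the cluster can read the production database):

# Sketch: compare what Spark infers from the pandas frame with the table schema.
# Note that insertInto resolves columns by position, not by name, so the
# column order has to match the table as well as the types.
spark.createDataFrame(df_allfeatures).printSchema()
spark.table("production.feed_to_output_all_features").printSchema()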
I have already changed the data types in the pandas DataFrame to rectify this error:
df_allfeatures = df_allfeatures.astype({"LEAD_CONCENTRATE_GRADES_PB": 'float32'})
df_allfeatures = df_allfeatures.astype({"TAILINGS_RECOVERIES_PB": 'float32'})
df_allfeatures['DATE'] = pd.to_datetime(df_allfeatures['DATE'])
But I keep getting the same error.
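To confirm the conversions actually take effect before the frame is handed to Spark, the pandas dtypes can be printed just ahead of the createDataFrame call (a minimal sketch; the expected output below is what the conversions above should produce):

# Sketch: verify the pandas dtypes right before creating the Spark DataFrame.
cols = ["LEAD_CONCENTRATE_GRADES_PB", "TAILINGS_RECOVERIES_PB", "DATE"]
print(df_allfeatures[cols].dtypes)
# Expected after the conversions above:
# LEAD_CONCENTRATE_GRADES_PB           float32
# TAILINGS_RECOVERIES_PB               float32
# DATE                          datetime64[ns]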