Delta Lake merge missing records

Asked Feb 01 '23 at 21:41

Active Feb 01 '23 at 21:41

Viewed 59 times

I am executing a Delta Lake merge on AWS, but I am not getting the correct result. Below is the PySpark script. It ran successfully; however, the output contains fewer records than the original table.

    deltaTable.alias("old") \
        .merge(df.alias("new"), join_string) \
        .whenMatchedUpdateAll() \
        .whenNotMatchedInsertAll() \
        .execute()
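
For reference, the row-count behaviour one would normally expect from this kind of upsert can be modelled in plain Python. This is only a sketch and not Delta Lake code: `upsert`, the `id` key, and the sample rows are hypothetical, and it assumes `join_string` is a simple equality on one key with at most one source row per key. Under those assumptions the target should never end up with fewer rows than it started with.

```python
def upsert(target, source, key="id"):
    """Plain-Python model of whenMatchedUpdateAll + whenNotMatchedInsertAll.

    Assumption: rows match on equality of a single (hypothetical) key column,
    and each key appears at most once in source.
    """
    source_by_key = {row[key]: row for row in source}
    target_keys = {row[key] for row in target}

    # Matched target rows take all columns from the source row;
    # unmatched target rows are kept unchanged.
    merged = [dict(source_by_key.get(row[key], row)) for row in target]

    # Source rows with no matching target row are inserted.
    merged += [dict(row) for row in source if row[key] not in target_keys]
    return merged


# Hypothetical sample data: one source row matches, one is new.
target = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 3, "v": "c"}]
source = [{"id": 2, "v": "B"}, {"id": 4, "v": "d"}]

result = upsert(target, source)
print(len(target), len(result))  # 3 4
```

In this model the final count is the original target count plus the number of source rows that found no match, so a smaller final count would point at something outside the merge itself (for example, which table or snapshot is being counted).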

As the image below shows, numOutputRows should be ~226k. However, I only get 21k in the final result.

[Screenshot of the merge operation metrics]

~226k records in the output table.
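
One way to cross-check these numbers is to read the metrics of the most recent MERGE commit from the table history and print them next to an actual row count. The sketch below assumes `deltaTable` is the `delta.tables.DeltaTable` object from the script; `latest_merge_metrics` and `report` are hypothetical helper names. It relies on `DeltaTable.history()` returning commits newest first, with `operation` and `operationMetrics` columns.

```python
def latest_merge_metrics(delta_table):
    """Return the operationMetrics of the most recent MERGE commit, or None."""
    # DeltaTable.history() returns a DataFrame of commits, newest first.
    for row in delta_table.history().collect():
        if row["operation"] == "MERGE":
            return dict(row["operationMetrics"])
    return None


def report(delta_table):
    """Print selected merge metrics next to the table's actual row count."""
    metrics = latest_merge_metrics(delta_table) or {}
    for name in ("numSourceRows", "numTargetRowsInserted",
                 "numTargetRowsUpdated", "numOutputRows"):
        print(name, metrics.get(name))
    # Count the rows of the current table version directly.
    print("rows in table:", delta_table.toDF().count())
```

Calling `report(deltaTable)` right after `.execute()` would show whether the 21k figure comes from the Delta table itself or from wherever the final result is being read.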

Scarlett
