Snowflake- After MERGE t1 into t2, how do I ensure that next time I run MERGE, old records from t1 don't get merged again?

Question

I'm trying to merge data in t1 that I Snowpiped from my S3 into t2. What is best practice to do with the data in t1 after they have been merged with t2? Do I delete the data in t1 or there is some metadata job tracker to ensure that the same data doesn't get merged more than once?

Snowflake streams are pretty helpful for this use case: https://docs.snowflake.com/en/user-guide/streams.html — Simon D, Oct 26 '20 at 12:44

score 0 · Answer 1 · answered Oct 26 '20 at 06:44

When you do a merge match you can tell the merge to update if it finds a match and insert if it does not find a match.

please see the syntax below

merge into t1 using t2 on t1.t1key = t2.t2key
    when matched and t2.marked = 1 then delete
    when matched and t2.isnewstatus = 1 then update set val = t2.newval, status = t2.newstatus
    when matched then update set val = t2.newval
    when not matched then insert (val, status) values (t2.newval, t2.newstatus);

But ideally once you merge from T1, it will be better to truncate as then your merge query will do less partition scanning next time when it runs and the updates will also be less.

I think that can be done as well, because for every new run we are cloning it anyway — Rajib Deb, Oct 26 '20 at 16:48

Snowflake- After MERGE t1 into t2, how do I ensure that next time I run MERGE, old records from t1 don't get merged again?

1 Answers1