When I run the following query:
merge into test_records t
using (
select id, "senior developer" title, country from test_records where country = 'Brazil'
) u
on t.id = u.id
when matched and (t.id <> u.id) then -- this is just to be sure that nothing will get updated
update set t.title = u.title, t.updated_at = now()
when not matched then
insert (id, title, country, created_at, updated_at) values (id, title, country, now(), now());
I still see the following operation metrics when I run DESCRIBE HISTORY on the target table:
{"numTargetRowsCopied": "2", "numTargetRowsDeleted": "0", "numTargetFilesAdded": "1", "numTargetRowsInserted": "0", "numTargetRowsUpdated": "0", "numOutputRows": "2", "numSourceRows": "2", "numTargetFilesRemoved": "1"}
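To make the mismatch explicit, here is a minimal Python sketch of how I read these metrics. The helper name `summarize_merge_metrics` and the derived keys `rowsChanged` and `rewriteWithoutChange` are my own illustrative names, not part of Delta Lake; the JSON string is copied from the history output above.

```python
import json

# operationMetrics copied verbatim from the DESCRIBE HISTORY output above
metrics_json = (
    '{"numTargetRowsCopied": "2", "numTargetRowsDeleted": "0", '
    '"numTargetFilesAdded": "1", "numTargetRowsInserted": "0", '
    '"numTargetRowsUpdated": "0", "numOutputRows": "2", '
    '"numSourceRows": "2", "numTargetFilesRemoved": "1"}'
)


def summarize_merge_metrics(raw: str) -> dict:
    """Hypothetical helper: parse the string-valued metrics into ints and
    flag a commit that rewrote files without changing any row."""
    m = {key: int(value) for key, value in json.loads(raw).items()}
    # Rows that were logically changed by the merge
    m["rowsChanged"] = (
        m["numTargetRowsInserted"]
        + m["numTargetRowsUpdated"]
        + m["numTargetRowsDeleted"]
    )
    # Files were added even though no row was inserted, updated, or deleted
    m["rewriteWithoutChange"] = m["rowsChanged"] == 0 and m["numTargetFilesAdded"] > 0
    return m


summary = summarize_merge_metrics(metrics_json)
print(summary["rowsChanged"], summary["rewriteWithoutChange"])
```

In other words, every output row is a copied row: one file was removed and one file was added, yet no row was inserted, updated, or deleted.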
And in the Spark UI I see this:
So the unmodified rows are being rewritten for no apparent reason. Why is that?