I'm working in an environment where data exists on numerous client databases and one by one is pulled into a central data repository via sql.
To automate a testing process, I've written a really nice, streamlined push-button script that backups, purges and re-extracts data on a user-specified client database. It then restores the data from backup tables. It makes heavy use of synonyms to streamline the code.
I'm running into a performance problem with the purge process where the DELETE query incurs a Remote Scan. It exactly the same query as the INSERT/SELECT which is simply passed as a Remote Query.
This INSERT works great:
INSERT INTO origChild
SELECT child.*
FROM
bakParent par
JOIN bakChild child ON par.GUID = child.GUID
WHERE
par.DateInserted = '2013-08-12 20:30:42.920'
This DELETE performs poorly:
DELETE
bakChild
FROM
bakParent par
JOIN bakChild child ON par.GUID = child.GUID
WHERE
par.DateInserted = '2013-08-12 20:30:42.920'
Below are the estimated query execution plans. The Remote Scan pulls 5M+ records while the INSERT/SELECT only deals with ~16,000 records.
I can't figure out why the plans are so different. I understand that queries to linked servers can lead to performance issues but the two JOINs are identical. I would expect them to be the same. (Or there should be a way for me to get the DELETE to perform similarly to the INSERT.)
I have confirmed that removing the INSERT part in the first query has the same execution plan.
Any suggestions?