I have a merge replication that has been running fine for a year (the system has been active for several years but I recreated the replication a year ago because of some other problems).
Everything is still working but when someone is (re-)initializing a subscription (downloading a new subscription-database) it takes much longer time and there are some locks appearing. the subscribers are ca 300 windows ce devices with sql server compact.
What I can see is that a stored procedure called MSenumgenerations90
is the culprit and it takes up alot of IO and CPU. the most common wait i can see in activity monitor i CXPACKET and I understand that this is parallelism. I can see some pageiolatch and at least some of them is pointing to the tempdb. the table msMerge_GenHistory contains a bit more then 1,5 million records and i tried to add indexes to it to make the expensive operation in the stored procedure to run quicker but with no success.
My retention period is high. it is set to 60 days but i can still see that there is generations in MSMerge_GenHistory
that is created (coldate) when I recreated the replication a year ago however there seems to be some generations removed since the generationid of the latest one is above 2,2 million. could there be something wrong with the metadata cleanup or is this normal behavior?
When the CXPACKET waits are showing i get Buffer I/O wait times at 2000 ms/s and there is also a lot of IO_COMPLETION waits, when I monitor the disk at the server i can see that tempdb-file is getting the most reads and writes. One of the things I have read is that you can setup multiple files for tempdb and that it could relieve the pressure on tempdb, could this be something that would help my situation?
Is it safe to add more files to tempdb when you have a merge replication running?
UPDATE. I ran the stored procedure to get the actual execution plan and there is a sort warning that says the sort is spilling to tempdb. I am guessing this is the reason for my problem, the sp is running in parallel and it spills to tempdb and 8 consecutive threads are accessing the the tempdb to try and sort results and that is why the waits in IO is high and why CXPACKET is showing!? my sort waring how can i get it to stop spilling to tempdb? could it be done by indexing better? I cannot modify the sp since it is part of the replication. Any help is welcome.