0

We have a merge publication that uses a range of replication methods across ~80 articles: host_name filtering on one table and join filters on several others; some tables set to bi-directional synchronization and others to 'download only, prohibit subscriber changes'; some tables using identity range management, others not needing it.

We are using push replication to subscriber databases that already have all the necessary tables, so we use 'delete data' for the 'action if name is in use' option. The tables have identical schemas in the subscriber and publication databases, but are empty until replication has been initialized.

The issue is that initialization of the subscription sometimes takes ~3 minutes, but at other times it times out after ~20 minutes, with identically templated subscriber databases and identical starting datasets (~10,000 records). After initialization, synchronizing takes ~20 minutes instead of ~5 seconds. Looking at the history in Replication Monitor, the synchronization history shows thousands of schema changes being applied, even when there have been no data changes.

I turned on verbose logging to see what the schema changes are, and the agent seems to be looping repeatedly through all the tables, turning all the constraints off and then back on again. I'm at a loss as to why it's doing this.
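For reference, the verbose output described above can be captured by adding the `-Output` and `-OutputVerboseLevel` parameters to the Merge Agent job step (the log file path below is illustrative):

```
-Output "C:\ReplLogs\merge_agent.log" -OutputVerboseLevel 2
```

Level 2 logs all error and progress messages, which is what exposed the repeated constraint off/on loop here.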

Note in case it's relevant: I have been using a 100-character Unicode string (created randomly from the full Unicode range of characters) as the host_name for different subscribers. I suspected this might be causing the issue, but I have since reproduced the problem using a 50-character lowercase-letter string.

Finally, all servers are hosted in the same data center, so I do not think network latency is an issue.

willycs40
  • 63
  • 6
  • I've had some cases where the Merge Agent repeatedly enumerates schema changes when the -ParallelUploadDownload Merge Agent parameter is being used. Removing it or setting it to 0 alleviated the issue. Another way I've alleviated this is by checking the schemaversion in sysmergesubscriptions on both Publisher and Subscriber and making sure they match as well as the schemaguids. – Brandon Williams Oct 25 '13 at 15:56
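The check described in the comment above can be sketched as follows, run against both the Publisher and the Subscriber databases (column names per the `sysmergesubscriptions` system table):

```sql
-- Compare these values between Publisher and Subscriber; mismatched
-- schemaversion/schemaguid values can cause repeated schema-change enumeration.
SELECT subscriber_server, db_name, schemaversion, schemaguid
FROM dbo.sysmergesubscriptions;
```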

2 Answers

0

In case anyone comes across this, here's the solution:

When we provisioned a new 'subscriber', we created a new set of data in the tables for that subscriber (based on defaults). However, we took a shortcut when creating this new copy of data: we turned all the constraints off, ran our SELECT ... INSERT statements, and then turned the constraints back on. This was because there were lots of tables with lots of constraints, and we didn't want to have to go through each table in the right order (and besides, we knew we were inserting data with good integrity).
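The shortcut looked roughly like this (a sketch; `sp_MSforeachtable` is an undocumented but commonly used SQL Server procedure):

```sql
-- Disable every constraint in the database
-- (merge replication logs a schema change per ALTER)
EXEC sp_MSforeachtable 'ALTER TABLE ? NOCHECK CONSTRAINT ALL';

-- ... SELECT ... INSERT statements copying the default data ...

-- Re-enable and re-validate the constraints (more logged schema changes)
EXEC sp_MSforeachtable 'ALTER TABLE ? WITH CHECK CHECK CONSTRAINT ALL';
```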

The problem is that turning all the constraints off and then back on is recorded by merge replication as two schema changes per table, rather than as no change. So every time we added a subscriber, we created loads of schema changes, and the next time any subscriber synced, it had to apply all these pointless constraint on/offs.

Due to the particulars of our shortcut, it actually added more than two schema changes like this per new subscriber. So if a subscriber didn't sync for a while, it would end up having thousands of new schema changes.
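Assuming access to the publication database, the accumulated schema changes can be inspected in the `sysmergeschemachange` system table, e.g.:

```sql
-- Count the schema changes merge replication has recorded; in our case
-- this grew by many rows every time a new subscriber was provisioned.
SELECT COUNT(*) AS recorded_schema_changes
FROM dbo.sysmergeschemachange;
```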

And unless we refreshed the snapshot, every new subscription had an outdated schema as soon as it was created, so new subscriptions took longer and longer to initialize until they started timing out.

Solution: remove the 'shortcut' and just copy the data in the right order, without touching the constraints. No further problems.

willycs40
  • 63
  • 6
-1

If it is merge replication, some questions: are your publisher and subscriber databases exactly the same? Are you using network shares rather than FTP to transfer the snapshots?

grillazz
  • 527
  • 6
  • 15