We're moving several terabytes of mailboxes from one Exchange 2010 DAG to another. The active mailbox copies are on the same subnet as the source; the destination DAG also has members across a WAN link, but those currently hold only passive copies. The moves seem to have been faster when we started and are running quite slowly now. I don't have solid historical metrics, but BytesTransferredPerMinute is now often down in the double digits, where it was previously in the multiple hundreds.
All of our target databases (in the new DAG) were set to "SecondCopy" for DataMoveReplicationConstraint. We have two members here at HQ, in the same subnet as the source servers, and the replication queues for those two members are in the single digits, mostly zero. The queues are longer for the members across the WAN link, but that is the same behavior we saw when things were running well: the move to the new DAG is fast on the HQ servers and lags on the DR servers because it has to traverse a 50 Mb WAN link. So that shouldn't be the problem, though I'm open to investigating further if there's a bug or unexpected behavior in the MRS data guarantee API.
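For a sense of scale on the DR lag, here is a rough back-of-the-envelope sketch of how long replication across the WAN link would take at best. It assumes the full 50 Mb/s of the link is available to replication and uses 1 TB as a placeholder volume; neither figure is measured, and `transfer_hours` is just an illustrative helper name.

```python
def transfer_hours(size_bytes, link_mbps):
    """Best-case hours to push size_bytes over a link rated at link_mbps.

    Assumes the whole link is available and ignores protocol overhead,
    so the real figure would be higher.
    """
    bytes_per_second = link_mbps * 1_000_000 / 8  # megabits/s -> bytes/s
    return size_bytes / bytes_per_second / 3600


ONE_TB = 10**12  # placeholder volume (decimal terabyte), not a measured value

if __name__ == "__main__":
    # Best case for 1 TB over the assumed 50 Mb/s link.
    print(round(transfer_hours(ONE_TB, 50), 1), "hours per TB")
```

Even in this best case, each terabyte needs well over a day to reach the DR members, which is consistent with their queues running higher than the HQ ones.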
I've throttled the MRS service back to run only 2 moves at a time, and it hasn't improved much. Running PerfMon now, I'm seeing what I would consider high values for "Stalled moves (database replication)", "Transient Failures (network)", and "Move requests: Stalls". I've turned diagnostic logging up to "Expert" for both items under MRS, and it isn't showing me much of anything other than "initial seeding", etc. The DB copy statuses are good.
Is there anywhere else to look to get an idea of what Exchange is seeing as the reason for the problem? I'd prefer to not have to start with the infrastructure and verify that everything is great; I'd like to know specifically what Exchange is unhappy about and go from there.