Exchange 2010 mailbox moves - high "stall" count

Question

We're moving several terabytes of mailboxes from one Exchange 2010 DAG to another. The active mailbox databases are on the same subnet; the destination DAG also has members across a WAN link, but those currently hold only passive copies. The moves seemed faster when we started and are running pretty slowly now. I don't have solid metrics from the past, but BytesTransferredPerMinute is now often down in the double digits, where it was previously in the multiple hundreds.

All of our target databases (in the new DAG) are set to "SecondCopy" for DataMoveReplicationConstraint. We have two members here at HQ, in the same subnet as the source servers, and the replication queues for those two members are in the single digits, mostly zeroes. We do have higher numbers for the members across the WAN link, but that is the same behavior we saw when things were running well: replication to the new DAG is fast on the HQ servers and lags on the DR servers because it has to traverse a 50 Mb WAN link. So that shouldn't be the problem, but I'm open to investigating further if there's a bug or unexpected behavior in the MRS data guarantee API.

I've throttled the MRS service back to run only 2 moves at a time, and it's not much improved. Running PerfMon now, I'm seeing what I would think are high values in "Stalled moves (database replication)", "Transient Failures (network)", and "Move requests: Stalls". I've turned up diagnostic logging for both items under MRS to "Expert", and it's not really showing me much of anything other than "initial seeding", etc. The DB copy statuses are good.

Is there anywhere else to look to get an idea of what Exchange is seeing as the reason for the problem? I'd prefer to not have to start with the infrastructure and verify that everything is great; I'd like to know specifically what Exchange is unhappy about and go from there.

Maybe this script can shed some light on it - http://blogs.technet.com/b/exchange/archive/2014/03/24/mailbox-migration-performance-analysis.aspx — joeqwerty, Jan 19 '16 at 16:30
I don't work with Exchange a lot, but two other things I'd be curious about are transaction logging and the Copy Queue and Replay Queue on the passive database copies. Is the transaction log drive filling up, perhaps causing the moves to be throttled? Are the copy and replay queues high? Are they increasing? — joeqwerty, Jan 19 '16 at 16:38
@joeqwerty - edited my question with more details about queue length. Shouldn't be an issue with our DataMoveReplicationConstraint of "second copy". — mfinni, Jan 19 '16 at 17:02

score 0 · Accepted Answer · answered Mar 10 '16 at 14:31


I'm a maroon. There's really good logging available from "Get-MoveRequestStatistics -IncludeReport".
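For anyone landing here later, a minimal sketch of how that report might be pulled from the Exchange Management Shell. The property names in the first command (StatusDetail and the TotalStalledDueTo... durations) are what I'd expect on Exchange 2010 SP1 or later and are an assumption for other builds; "user@example.com" and the output file name are placeholders, not values from this migration.

```powershell
# Run from the Exchange Management Shell on an Exchange 2010 server.

# Summarize every move that is still in flight, including how long each one
# has spent stalled on HA (database replication), content indexing, or
# transient failures. Property names assume Exchange 2010 SP1 or later.
Get-MoveRequest -MoveStatus InProgress |
    Get-MoveRequestStatistics |
    Format-List DisplayName, StatusDetail, PercentComplete, BytesTransferredPerMinute,
        TotalStalledDueToHADuration, TotalStalledDueToCIDuration,
        TotalTransientFailureDuration

# Dump the full move report for a single mailbox to a text file.
# "user@example.com" is a placeholder identity.
Get-MoveRequestStatistics -Identity "user@example.com" -IncludeReport |
    Format-List Report |
    Out-File -FilePath .\move-report.txt
```

The report lists timestamped entries for the move, which is typically where the reason for a stall or transient failure shows up when the MRS event logs say little.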


mfinni
