We currently have a SQL Server 2012 availability group with 2 replicas. We want to move the replicas to new hardware and upgrade to SQL Server 2016. We plan on doing this as a rolling upgrade, as follows:
- Server A - SQL Server 2012 replica
- Server B - SQL Server 2012 replica
- Server C - SQL Server 2016 replica
- Server D - SQL Server 2016 replica
Add Servers C and D to the AG and wait until they are synchronized. Fail over to one of them and verify that everything still works. Remove Servers A and B from the AG.
We can add Servers C and D to the AG, but when we try to join them to the AG, we get an error:
Failed to join local availability replica to AG. The operation encountered SQL Server error 41106 and has been rolled back...
I did some searching and found some things to check, such as firewall ports and permissions on the endpoints. We ran out of time today, so we haven't actually checked these yet, but I wanted to ask something here that might determine whether this is even a feasible plan.
The issue is that SQL Server A is running under a Managed Service Account (call it MSA1) on Windows Server 2012. SQL Server B is running under another MSA, call it MSA2, on Windows Server 2012. SQL Servers C and D are running under a single Group Managed Service Account, call it GMSA1, on Windows Server 2016.
I know gMSAs are not supported in AGs for SQL Server 2012 and they can cause wonky behavior if used. We have no intention of running in this configuration other than for the few minutes it takes to add the new servers, fail over to them, and remove the old ones.
Is this possible? Could the gMSA be the cause of the 41106 error when trying to join the server to the AG?
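
For the firewall part of the checks mentioned above, this is roughly what I plan to run from each server. It is only a minimal sketch: it assumes the mirroring endpoints listen on TCP port 5022 (the usual default, which I have not yet confirmed on our servers), and the host names are placeholders for our real ones. It only shows whether the port is reachable; it says nothing about endpoint permissions or the service accounts.

```python
import socket

# Assumption: 5022 is the commonly used default port for the database
# mirroring endpoint; the actual port on our servers is not yet confirmed.
DEFAULT_ENDPOINT_PORT = 5022


def endpoint_reachable(host, port=DEFAULT_ENDPOINT_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_replicas(hosts, port=DEFAULT_ENDPOINT_PORT, timeout=3.0):
    """Map each host name to whether its endpoint port is reachable."""
    return {host: endpoint_reachable(host, port, timeout) for host in hosts}
```

Calling `check_replicas(["ServerA", "ServerB", "ServerC", "ServerD"])` (placeholder names) on each of the four servers in turn should show whether any replica is blocked from reaching another replica's endpoint port.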