We are experiencing an issue with a simple .NET Core (3.1) console app connected to a SQL Server database and deployed in a container on AWS Fargate (ECS). SQL Server is a mirrored setup (principal/mirror, with a witness to allow for automatic failover).
The problem occurs when we manually fail over the database. Observing SQL Server Profiler, we see the SQL client repeatedly attempt to connect to the old principal (now the mirror, in restoring mode). It never attempts to connect to the new principal after the failover. We are using System.Data.SqlClient.dll as the client. The following exception is logged:
System.Exception Cannot connect to SQL Server Browser. Ensure SQL Server Browser has been started.
at System.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultipleObjectsTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection)
at System.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal& connection)
at System.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, DbConnectionInternal& connection)
at System.Data.ProviderBase.DbConnectionInternal.TryOpenConnectionInternal(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource`1 retry, DbConnectionOptions userOptions)
at System.Data.SqlClient.SqlConnection.TryOpen(TaskCompletionSource`1 retry)
at System.Data.SqlClient.SqlConnection.OpenAsync(CancellationToken cancellationToken)
--- End of stack trace from previous location where exception was thrown ---
Although the error suggests that SQL Server Browser is not running (we are connecting to a named instance), we have confirmed that the service is started. We have also created a temporary security group that allows all traffic (all protocols and ports, including UDP port 1434 for SQL browsing) between the application and the database, so no traffic should be blocked at any point. When we manually fail back to the original principal (i.e. the server specified as 'data source' in the connection string), the application can successfully connect again.
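To rule out the Browser lookup itself, a probe like the one below could be run from inside the Fargate task against both partners. This is only a rough sketch: it assumes the SQL Server Resolution Protocol unicast instance request (byte 0x04 followed by the null-terminated instance name, sent to UDP 1434), and the class name, host, and instance values are hypothetical placeholders.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

// Hypothetical helper: asks SQL Server Browser (UDP 1434) about one named instance.
public static class SqlBrowserProbe
{
    // Builds a unicast instance request: 0x04, the instance name, then a null terminator.
    public static byte[] BuildRequest(string instanceName)
    {
        byte[] name = Encoding.ASCII.GetBytes(instanceName);
        var request = new byte[name.Length + 2]; // last byte stays 0x00
        request[0] = 0x04;
        Buffer.BlockCopy(name, 0, request, 1, name.Length);
        return request;
    }

    // Sends the request and returns the text payload of the reply.
    // Throws SocketException if no reply arrives within timeoutMs.
    public static string Query(string host, string instanceName, int timeoutMs = 3000)
    {
        using (var udp = new UdpClient())
        {
            udp.Client.ReceiveTimeout = timeoutMs;
            byte[] request = BuildRequest(instanceName);
            udp.Send(request, request.Length, host, 1434);

            var remote = new IPEndPoint(IPAddress.Any, 0);
            byte[] response = udp.Receive(ref remote);

            // Reply layout: 0x05, a 2-byte length, then an ASCII string of key;value pairs.
            return response.Length > 3
                ? Encoding.ASCII.GetString(response, 3, response.Length - 3)
                : string.Empty;
        }
    }

    public static void Main(string[] args)
    {
        // Placeholder defaults; pass the real host and instance as arguments.
        string host = args.Length > 0 ? args[0] : "sql-mirror.example.internal";
        string instance = args.Length > 1 ? args[1] : "INSTANCE1";
        try
        {
            Console.WriteLine(Query(host, instance));
        }
        catch (SocketException ex)
        {
            Console.WriteLine("No reply from SQL Server Browser: " + ex.SocketErrorCode);
        }
    }
}
```

A timeout against one partner but not the other, from Fargate only, would point at UDP reachability rather than at the client's failover logic.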
The strange thing is that we have run the exact same application in the following AWS configurations and have seen that the client will immediately and successfully connect to the failover partner (i.e. the new principal) following a manual failover:
- .NET Core application (3.1) running in a Docker container on an Ubuntu EC2 instance
- .NET Core application (3.1) running as a self-contained app on an Ubuntu EC2 instance
- .NET Core application (3.1) running on a Windows Server EC2 instance (.NET Core runtime installed)
- .NET Framework (4.8) application running on a Windows Server EC2 instance (.NET Framework runtime installed)
All of the above deployments are located in the same VPC/subnet as the Fargate service. The problem therefore appears to be specific to running the app in a Fargate service. Has anybody encountered this type of issue with AWS Fargate before?
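For reference, a mirrored connection of the kind described here can be sketched as follows. All server, instance, database, and credential values are placeholders rather than our real ones, and the class name is hypothetical. The sketch relies only on the `Failover Partner` keyword, which requires an initial catalog to be set as well.

```csharp
using System;
using System.Data.SqlClient;

// Hypothetical sketch of a connection string for a mirrored database.
public static class MirroredConnectionSketch
{
    public static string Build()
    {
        var builder = new SqlConnectionStringBuilder
        {
            // Principal named instance (placeholder).
            DataSource = @"sql-principal.example.internal\INSTANCE1",
            // Mirror named instance the client should try after a failover (placeholder).
            FailoverPartner = @"sql-mirror.example.internal\INSTANCE1",
            // Required when a failover partner is specified.
            InitialCatalog = "AppDb",
            UserID = "app_user",
            Password = "<secret>",
            ConnectTimeout = 15
        };
        return builder.ConnectionString;
    }

    public static void Main()
    {
        // Print only the non-sensitive parts to confirm what the client will use.
        var parsed = new SqlConnectionStringBuilder(Build());
        Console.WriteLine("Data Source: " + parsed.DataSource);
        Console.WriteLine("Failover Partner: " + parsed.FailoverPartner);
    }
}
```

Because both partners are named instances, the client has to resolve each instance's port through SQL Server Browser, which is where the logged exception originates.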