I have an async socket server written in C#, hosted on a Lightsail instance running Amazon Linux. It consists of a TcpListener that accepts connections, starts up a new thread to listen when someone connects, initiates an SSL connection, and then acts as a server for an online game.
This server works fine for at least about a day, until suddenly all networking on it stops working. The crash takes anywhere from 22 hours to one week to occur. The symptoms are as follows:
- Anyone already connected to the server will suddenly stop receiving/sending data. I can see in the logs that my inactivity checking code will eventually kick them for not sending heartbeat packets.
- The server will also be unable to connect to its MySQL database (which is running on the same system, so it's unable to connect to localhost? I can still access it through PHPMyAdmin during this time).
- It is, however, still able to write both to files and to console, as my logger is still able to write to both.
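Since file and console output keep working while every socket operation stalls, it might help to log process-wide resources that sockets depend on and compare the values before and after the failure. The following is only a sketch under assumptions: `ResourceSnapshot` is a hypothetical helper that is not part of the server, and the descriptor count relies on `/proc/self/fd`, which exists only on Linux.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical diagnostic helper, not part of the original server.
public static class ResourceSnapshot
{
    // Returns a one-line summary that can be handed to any logger.
    public static string Describe()
    {
        // How many thread pool threads are currently free, and the configured maximum.
        ThreadPool.GetAvailableThreads(out int freeWorkers, out int freeIo);
        ThreadPool.GetMaxThreads(out int maxWorkers, out int maxIo);

        // Linux-only assumption: each entry in /proc/self/fd is one open
        // descriptor (files and sockets alike). Stays at -1 if it cannot be read.
        int openFds = -1;
        const string fdDir = "/proc/self/fd";
        try
        {
            if (Directory.Exists(fdDir))
            {
                openFds = Directory.GetFileSystemEntries(fdDir).Length;
            }
        }
        catch (IOException) { }
        catch (UnauthorizedAccessException) { }

        return $"workers {freeWorkers}/{maxWorkers} free, io {freeIo}/{maxIo} free, open fds {openFds}";
    }
}
```

Writing `ResourceSnapshot.Describe()` to the log every minute or so would show whether the free thread count or the descriptor count drifts in the hours before networking stops. If neither moves, both can probably be ruled out.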
The code looks like everyone else's (I did try the changes suggested for this question, but it still crashed after ~24 hours). Nothing is logged at ERROR level before the crash, so it looks like the server never encounters an exception, which is why I've been having problems figuring this one out.
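An empty error log only covers exceptions that reach one of the try/catch blocks. To check whether anything escapes on another thread or inside a task, process-wide handlers could be registered at startup. This is a minimal sketch, not part of the server: `CrashDiagnostics` and its `log` parameter are hypothetical names, with `log` standing in for a call to the logger.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original server.
public static class CrashDiagnostics
{
    // Call once at startup, before the listen loop begins.
    public static void Install(Action<string> log)
    {
        // Raised for exceptions that no try/catch handled, on any thread.
        AppDomain.CurrentDomain.UnhandledException += (sender, args) =>
            log($"Unhandled exception (terminating={args.IsTerminating}): {args.ExceptionObject}");

        // Raised when a faulted Task is garbage-collected without anyone
        // having observed its exception.
        TaskScheduler.UnobservedTaskException += (sender, args) =>
        {
            log($"Unobserved task exception: {args.Exception}");
            args.SetObserved();
        };
    }
}
```

With the logger used in this post, the call would look something like `CrashDiagnostics.Install(msg => Logger.Write(Logger.Level.ERROR, msg));`. If these handlers also stay silent, an unlogged exception becomes a much less likely explanation.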
For completeness, here is my main loop:
    public void ListenLoop()
    {
        TcpListener listener = new TcpListener(IPAddress.Any, 26000);
        listener.Start();
        while (true)
        {
            try
            {
                if (listener.Pending())
                {
                    listener.BeginAcceptTcpClient(new AsyncCallback(AcceptConnection), listener);
                    Logger.Write(Logger.Level.INFO, "continuing the main loop");
                }
                // Yield so we're not stuck in a busy-loop
                Thread.Sleep(5);
            }
            catch (Exception e)
            {
                Logger.Write(Logger.Level.ERROR, $"Error while waiting for listeners: {e.Message}\n{e.StackTrace}");
            }
        }
    }
and here are the accept parts:
    /// <summary>
    /// Finish an async callback but spawn a new thread to handle it if necessary
    /// </summary>
    /// <param name="ar"></param>
    private void AcceptConnection(IAsyncResult ar)
    {
        if (ar.CompletedSynchronously)
        {
            // Force the accept logic to run async, to keep our listening
            // thread free.
            Action accept = () => AcceptCallback(ar);
            accept.BeginInvoke(accept.EndInvoke, null);
        }
        else
        {
            AcceptCallback(ar);
        }
    }

    private void AcceptCallback(IAsyncResult ar)
    {
        try
        {
            TcpListener listener = (TcpListener) ar.AsyncState;
            TcpClient client = listener.EndAcceptTcpClient(ar);

            // If the SSL connection takes longer than 5s we have a problem, and should stop
            client.Client.ReceiveTimeout = 5000;

            // Attempt to get the IP address of the client we're connecting to
            IPEndPoint ipep = (IPEndPoint)client.Client.RemoteEndPoint;
            string ip = ipep.Address.ToString();
            Logger.Write(Logger.Level.INFO, $"Connection begun to {ip}");

            // Authenticate and begin communicating with the client
            SslStream stream = new SslStream(client.GetStream(), false);
            try
            {
                stream.AuthenticateAsServer(
                    serverCertificate,
                    enabledSslProtocols: System.Security.Authentication.SslProtocols.Tls12,
                    clientCertificateRequired: false,
                    checkCertificateRevocation: true
                );
                stream.ReadTimeout = 3600000;
                stream.WriteTimeout = 3600000;

                NetworkPlayer player = new NetworkPlayer();
                player.Name = ip;
                player.Connection.Stream = stream;
                player.Connection.Connected = true;
                player.Connection.Client = client;
                stream.BeginRead(player.Connection.Buffer, 0, 1024, new AsyncCallback(ReadCallback), player);
            }
            catch (Exception e)
            {
                Logger.Write(Logger.Level.ERROR, $"Error while starting the connection to {ip}: {e.Message}");
                // The following code just calls stream.Close(); and client.Close(); but sends exceptions to my logger.
                CloseConnectionSafely(client, stream);
            }
        }
        catch (Exception e)
        {
            Logger.Write(Logger.Level.ERROR, $"Error while starting a connection to an unknown user: {e.Message}");
        }
    }