We randomly get this error from our .NET Core web application. Connecting is not the problem: the error occurs occasionally in the middle of a job that fetches report data, which can take anywhere from 10 to 50 seconds.

Full stack trace:

System.AggregateException
One or more errors occurred.
System.AggregateException: One or more errors occurred. ---> FirebirdSql.Data.FirebirdClient.FbException: Unable to complete network request to host "
No message for error code 335544721 found. ---> FirebirdSql.Data.Common.IscException: Unable to complete network request to host "
No message for error code 335544721 found. ---> System.IO.IOException: Unable to write data to the transport connection: An existing connection was forcibly closed by the remote host. ---> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host
   at System.Net.Sockets.Socket.Send(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
   at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 size)
   --- End of inner exception stack trace ---
   at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 size)
   at FirebirdSql.Data.Client.Managed.FirebirdNetworkStream.Flush()
   at FirebirdSql.Data.Client.Managed.Version10.GdsTransaction.BeginTransaction(TransactionParameterBuffer tpb)
   --- End of inner exception stack trace ---
   at FirebirdSql.Data.Client.Managed.Version10.GdsTransaction.BeginTransaction(TransactionParameterBuffer tpb)
   at FirebirdSql.Data.Client.Managed.Version10.GdsDatabase.BeginTransaction(TransactionParameterBuffer tpb)
   at FirebirdSql.Data.FirebirdClient.FbTransaction.BeginTransaction()
   at FirebirdSql.Data.FirebirdClient.FbCommand.Prepare(Boolean returnsSet)
   at FirebirdSql.Data.FirebirdClient.FbCommand.ExecuteCommand(CommandBehavior behavior, Boolean returnsSet)
   at FirebirdSql.Data.FirebirdClient.FbCommand.ExecuteReader(CommandBehavior behavior)
   --- End of inner exception stack trace ---
   at FirebirdSql.Data.FirebirdClient.FbCommand.ExecuteReader(CommandBehavior behavior)
   at Dapper.SqlMapper.ExecuteReaderWithFlagsFallback(IDbCommand cmd, Boolean wasClosed, CommandBehavior behavior)
   at Dapper.SqlMapper.<QueryImpl>d__140`1.MoveNext()
   at System.Collections.Generic.List`1..ctor(IEnumerable`1 collection)
   at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)
   at Dapper.SqlMapper.Query[T](IDbConnection cnn, String sql, Object param, IDbTransaction transaction, Boolean buffered, Nullable`1 commandTimeout, Nullable`1 commandType)
   at SMP.Legacy.Utility.FirebirdHelper.ExecuteSql[T](String sql) in C:\bamboo-home\xml-data\build-dir\SMAT-SMPAPIB-JOB1\SMP.Legacy\Utility\FirebirdHelper.cs:line 27

The helper class that is calling the database uses this code:

using System;
using System.Collections.Generic;
using System.Linq;
using Dapper;
using FirebirdSql.Data.FirebirdClient;

namespace SMP.Legacy.Utility
{
    internal static class FirebirdHelper
    {
        public static List<T> ExecuteSql<T>(string sql)
        {
            try
            {
                using (var connection = new FbConnection(DatabaseProvider.ConnectionString))
                {
                    connection.Open();
                    var data = connection.Query<T>(sql).ToList();
                    connection.Close();
                    return data;
                }
            }
            catch (FbException e)
            {
                var error = $"Firebird Exception: {e.ErrorCode} - {e.Message}";
                Console.WriteLine(error);
                throw;
            }
            catch (Exception e)
            {
                Console.WriteLine(e.Message);
                throw;
            }
        }
    }
}

Firebird version information:

Firebird Server version: 3.0.7.33374

Firebird Sql Data Client version: 7.10.1.0

Connection string: Database=inet://10.100.0.100:3050/F:\Firebird\DATABASE.FDB;User=User;Password=Password;

Firebird conf:

ServerMode = Super
DefaultDbCachePages = 100K
FileSystemCacheThreshold = 100M
TempBlockSize = 2M
TempCacheLimit = 8000M
AuthServer = Legacy_Auth, Srp, Win_Sspi
AuthClient = Legacy_Auth, Srp, Win_Sspi
UserManager = Legacy_UserManager, Srp
WireCrypt = Enabled 
RemoteServicePort = 3050
LockMemSize = 30M
LockHashSlots = 30011
RemoteAccess = true

I asked our network guy to investigate and he could find no problem. He ran tests from a normal PC to the Firebird server over our LAN, and from a PC to AWS (cloud). Here are his findings.

LOCAL NETWORK (LAN) – TEST CONDUCTED FROM PC TO FIREBIRD:

• Internal Latency: 0.6ms

• TCP UP Speed: 768 Mbps / TCP Down 461 Mbps

• UDP UP Speed: 831 Mbps / UDP Down 521 Mbps

• Losses over UDP protocol (this does not apply to SMP/Firebird) – Firebird uses TCP 3050 – These tests and numbers are relevant for STREAMING and VOICE/VIDEO solutions.

o Loss on UPLOAD: 0.2%

o Loss on Download: 2.1%

These results mean that our local network (LAN) is running perfectly. The losses don’t cause performance issues for applications or data transfer.

AWS VPC – SITE TO SITE VPN – TEST CONDUCTED FROM PC TO SMATA-PROD:

• Site-to-Site Latency (AWS VPC): 4ms

• TCP UP Speed: 252 Mbps / TCP Down 325 Mbps

• UDP UP Speed: 98 Mbps / UDP Down 68 Mbps

• Losses over UDP protocol (this does not apply to SMP/Firebird) – Firebird uses TCP 3050 – These tests and numbers are relevant for STREAMING and VOICE/VIDEO solutions.

o Loss on UPLOAD: 0.1% - This is happening because there is QoS implemented from our server to AWS for any TCP and UDP traffic

o Loss on DOWNLOAD: 92% - This is happening because there is no QoS implemented from AWS to our server – This is not an issue at this moment because we do not use UDP from AWS to our server.

These results mean that our VPC is running perfectly. The losses don’t cause performance issues for applications or data transfer.

I've checked other questions about this error, but in those cases the client could not connect at all, whereas we can connect and run queries; the connection just fails at random. When Firebird 3 was installed on our server, the disk was swept to remove all pre-existing Firebird files, such as fbclient.dll.

I'm not sure what the problem is. The listening port is set up correctly and there are no antivirus or firewall issues that we're aware of. Has anyone had a similar issue?

Updated:

TCP Settings

Netstrata
  • These tests only show that VPN works for data flow. What you got is most likely timeout for inactive TCP connection. The network guy must test stale connection with no packets sent inside. – user13964273 Jun 17 '21 at 10:47
  • This usually indicates that something (e.g. network hardware, firewall, etc) is closing idle connections (though < 60 seconds seems a bit short to me). If you can't address that, you may need to change config of your system to lower the TCP keepalive delay (tcp_keepalive_time), or enable Firebird to send dummy packets (setting `DummyPacketInterval` in firebird.conf), though tweaking the TCP keepalive delay is probably better. – Mark Rotteveel Jun 17 '21 at 13:30
  • The default command timeout for the .NET Provider is 30 seconds. You said your statements usually take from 10 to 50 seconds, yet I don't see an explicit command timeout being set. Add the commandTimeout parameter to your Dapper Query call and increase it to 60 seconds to see if it still fails. – Ed Mendez Jun 17 '21 at 21:57
  • Thanks @user13964273 I've mentioned this to the guy. – Netstrata Jun 18 '21 at 02:48
  • Thanks @MarkRotteveel We've gone through the TCP Settings. I've updated the question with our TCP settings for the firewall. The handshake is short but then they usually are right? – Netstrata Jun 18 '21 at 02:50
  • Thanks @EdMendez we'll test adding that parameter. – Netstrata Jun 18 '21 at 02:53
  • @Netstrata Here is a paper that can help to find out source of connection reset: https://portal.nutanix.com/page/documents/kbs/details?targetId=kA00e000000LTA6CAO – user13964273 Jun 18 '21 at 10:48
  • Thanks @user13964273 I've passed this information on. – Netstrata Jun 22 '21 at 06:55

1 Answer

Managed to resolve this by increasing the MTU size on both the database server and the web client server.

We found this by running the following tests:

C:\Users\itsupport>ping <server> -l 4096

Pinging <SERVER> [IP ADDRESS] with 4096 bytes of data:
Reply from 169.254.32.138: Packet needs to be fragmented but DF set.
Reply from 169.254.32.138: Packet needs to be fragmented but DF set.
Reply from 169.254.32.138: Packet needs to be fragmented but DF set.
Reply from 169.254.32.138: Packet needs to be fragmented but DF set.

This is not good and can be the cause of the problem, because Firebird can send large packets.

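Here's a decent article about MTU size: https://techcommunity.microsoft.com/t5/core-infrastructure-and-security/mtu-size-matters/ba-p/1025286

We then checked with the don't-fragment flag (-f). The largest ping payload that gets through is 1394 bytes; since -l sets only the ICMP payload, that corresponds to a path MTU of 1394 + 28 bytes of IPv4 and ICMP headers = 1422: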

C:\Users\itsupport>ping <server> -l 1394 -f

Pinging <SERVER> [IP ADDRESS] with 1394 bytes of data:
Reply from 10.111.0.133: bytes=1394 time=4ms TTL=128
Reply from 10.111.0.133: bytes=1394 time=3ms TTL=128
Reply from 10.111.0.133: bytes=1394 time=3ms TTL=128
Reply from 10.111.0.133: bytes=1394 time=4ms TTL=128

Ping statistics for 10.111.0.133:
     Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
     Minimum = 3ms, Maximum = 4ms, Average = 3ms

C:\Users\itsupport>ping <server> -l 1395 -f

Pinging smp-fb30.netstrata.local [10.111.0.133] with 1395 bytes of data:
Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.

Ping statistics for 10.111.0.133:
     Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),
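
The manual probing above (1394 passes, 1395 fails) can be automated. The following is only a minimal sketch, not what we actually ran: the class and method names (MtuProbe, FindLargestPayload, PingFits) are made up for illustration, the 2000 ms timeout and TTL of 64 are arbitrary, and it assumes a Windows host where ICMP echo with a custom payload is permitted. It binary-searches for the largest payload that gets a reply with the don't-fragment flag set.

```csharp
using System;
using System.Net.NetworkInformation;

// Hypothetical helper, not part of any library.
public static class MtuProbe
{
    // IPv4 header (20 bytes, no options) + ICMP echo header (8 bytes).
    public const int HeaderOverhead = 28;

    // Binary search for the largest size in [low, high] for which fits(size)
    // is true. Assumes fits is monotonic: once a size fails, every larger
    // size fails too. Returns -1 if even `low` does not fit.
    public static int FindLargestPayload(Func<int, bool> fits, int low, int high)
    {
        if (!fits(low)) return -1;
        while (low < high)
        {
            int mid = low + (high - low + 1) / 2; // round up so the loop terminates
            if (fits(mid)) low = mid;
            else high = mid - 1;
        }
        return low;
    }

    // Sends one echo request of payloadSize bytes with DF set and reports
    // whether a successful reply came back (equivalent to: ping -l N -f).
    public static bool PingFits(string host, int payloadSize)
    {
        using (var ping = new Ping())
        {
            var options = new PingOptions(64, true); // ttl, dontFragment
            PingReply reply = ping.Send(host, 2000, new byte[payloadSize], options);
            return reply.Status == IPStatus.Success;
        }
    }
}
```

Calling MtuProbe.FindLargestPayload(n => MtuProbe.PingFits("<server>", n), 0, 4096) should return the largest payload that passes unfragmented; adding MtuProbe.HeaderOverhead gives the path MTU. A single lost reply counts as "does not fit" in this sketch, so on a lossy link each size would need a retry or two.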
Netstrata