
I have a TCP listener in a Windows service that listens for incoming TCP requests on a specific port and processes each message. It works fine when accessed directly, but once it runs behind a load balancer (on the intranet) it stops accepting requests: clients get errors such as "unable to connect to remote server" or "operation timed out". After a while the service terminates with an "out of memory" exception. What could be the reason for this? The code is pasted below. I also tried an async implementation (to avoid launching threads explicitly), but that didn't help.

using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;

public class SampleListener: IDisposable
{
    public delegate void JobRecieved(HttpMessage msg);
    public event JobRecieved OnJobRecieved;

    #region Property

    private TcpListener _tcpListener;
    private Thread _listenerThread;

    public int Port { get; private set; }

    public string Url
    {
        get
        {
            return new UriBuilder { Scheme = "http", Port = Port, Host = Dns.GetHostName() }.ToString();
        }
    }

    #endregion

    public SampleListener(int port)
    {
        Port = port;
    }

    ~SampleListener()
    {
        DisposeImpl(false);
    }

    public void Start()
    {
        _tcpListener = new TcpListener(IPAddress.Any, Port);
        _tcpListener.Start();

        _listenerThread = new Thread(ListenCallback);
        _listenerThread.Start();
    }

    public void ListenCallback()
    {
        try
        {
            while (true)
            {
                using (TcpClient client = _tcpListener.AcceptTcpClient())
                using (var clientStream = client.GetStream())
                {
                    var msg = new HttpMessage();
                    msg.Receive(clientStream);
                    SendOKResponse(client, "");
                    var handler = OnJobRecieved;
                    if (handler != null) handler(msg); // the event is null when nobody has subscribed
                    client.Close();
                }
            }
        }
        catch (System.Net.Sockets.SocketException)
        {
            // Expected, TcpListener.Stop called
        }
        catch (System.Threading.ThreadAbortException)
        {
            // Expected, thread going away
        }
        catch (System.IO.IOException)
        {
            // Expected, shutdown while reading
        }
    }

    private void SendOKResponse(TcpClient tcpClient, String responseBody)
    {
        var response = new HttpMessage
        {
            Status = "200",
            Reason = "OK",
            Version = "HTTP/1.1"
        };
        response.Send(tcpClient.GetStream(), responseBody);
    }

    public void Shutdown()
    {
        lock (this)
        {
            if (_listenerThread != null)
            {
                _listenerThread.Abort();
                _listenerThread = null;
            }

            if (_tcpListener != null)
            {
                _tcpListener.Stop();
                _tcpListener.Server.Close();
                _tcpListener = null;
            }                
        }
    }

    #region IDisposable Members

    private void DisposeImpl(Boolean bDisposing)
    {
        lock (this)
        {
            Shutdown();
        }
    }

    public void Dispose()
    {
        GC.SuppressFinalize(this);
        DisposeImpl(true);
    }

    #endregion

}
RKP
  • There could be several reasons depending on the type of load balancer, but most likely the load balancer does not forward the port because it simply does not know it exists. – Joachim Isaksson May 06 '13 at 18:40
  • I agree with Joachim's analysis. Also, the fact that you are receiving an OutOfMemory exception may suggest that your client isn't handling the unsuccessful connection attempts gracefully, leaving objects in memory. That's potentially a very serious issue. – OnoSendai May 06 '13 at 19:02
  • @OnoSendai, if you look at the code above, I am disposing all objects in the Shutdown method. I also tried an asynchronous implementation, and that throws an out of memory exception too. That is a separate issue I am facing, apart from the main issue with the LB. – RKP May 07 '13 at 06:52
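
For the asynchronous variant mentioned in the question, the following is a minimal sketch, not the asker's actual implementation. It accepts connections asynchronously and hands each one to the thread pool with a read timeout, so that a client which connects but never sends data cannot hold up the accept loop. Many load balancers open such idle connections as health checks, but whether that happens in this setup is an assumption. The class name `AsyncListenerSketch`, the 5-second timeout, and the single `Read` standing in for `HttpMessage.Receive` are all hypothetical.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading.Tasks;

// Hypothetical sketch; not part of the original SampleListener.
public sealed class AsyncListenerSketch
{
    private readonly TcpListener _listener;
    private volatile bool _stopping;

    public AsyncListenerSketch(int port)
    {
        _listener = new TcpListener(IPAddress.Any, port);
    }

    // Port actually bound (differs from the requested one when 0 is passed).
    public int Port { get { return ((IPEndPoint)_listener.LocalEndpoint).Port; } }

    public Task Start()
    {
        _listener.Start();
        return AcceptLoopAsync();
    }

    public void Stop()
    {
        _stopping = true;
        _listener.Stop(); // the pending accept fails and the loop exits
    }

    private async Task AcceptLoopAsync()
    {
        while (!_stopping)
        {
            TcpClient client;
            try
            {
                client = await _listener.AcceptTcpClientAsync();
            }
            catch (ObjectDisposedException) { break; }
            catch (SocketException) { if (_stopping) break; continue; }

            // Handle the connection off the accept loop so a slow or silent
            // client does not delay the next accept.
            Task ignored = Task.Run(() => HandleClient(client));
        }
    }

    private static void HandleClient(TcpClient client)
    {
        try
        {
            using (client)
            using (NetworkStream stream = client.GetStream())
            {
                stream.ReadTimeout = 5000; // assumed value; applies to synchronous reads
                var buffer = new byte[4096];
                // Stand-in for HttpMessage.Receive: one read, no HTTP parsing.
                int read = stream.Read(buffer, 0, buffer.Length);
                if (read == 0) return; // peer connected and closed without sending

                byte[] response = Encoding.ASCII.GetBytes(
                    "HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: close\r\n\r\n");
                stream.Write(response, 0, response.Length);
            }
        }
        catch (IOException) { /* read timed out or connection reset */ }
        catch (SocketException) { /* connection failed mid-request */ }
    }
}
```

Because every `TcpClient` is disposed in `HandleClient` whether the request completes, times out, or fails, idle connections should not pile up for longer than the timeout.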

1 Answer


That's because NLB on Windows by default expects your application to be a clustered one. If it is not (which is your case), you must use Sticky Sessions. Apparently your NLB is not using Sticky Sessions, so requests may go to a different server on each pass. That's why you get those exceptions (take a look at this).

That happened to me on one of my own projects (a high-performance TCP server - the opposite of what you are doing).

Kaveh Shahbazian
  • Thanks for your reply. As I said, I tried changing the code to asynchronous mode (using callbacks), but I am still getting these out of memory exceptions. I checked http://stackoverflow.com/questions/6023264/high-performance-tcp-server-in-c-sharp on this site, and that also suggests going async. – RKP May 07 '13 at 03:59
  • You say my application needs to be a clustered one. What does that mean? Please explain. – RKP May 07 '13 at 04:00
  • I have gone through the article, and I do not use any sessions. Every incoming TCP request can be handled independently, without keeping any state about previous requests in memory, so "no affinity" mode is right for me. – RKP May 07 '13 at 06:47
  • 1 - NLB settings have nothing to do with your code. It's about keeping a TCP connection (conversation) on the same server with the same client (Sticky Sessions). Otherwise a TCP connection may be made on one server, the client responds to it, and then NLB sends that response to another server (that's the balancing part). But you need to get the response on the same server, which means you need to use Sticky Sessions on your load balancer. – Kaveh Shahbazian May 07 '13 at 16:05
  • 2 - You do not need to make a clustered app, because it's way too complicated. The answer to your problem is just using Sticky Sessions on NLB. But FYI, Windows services can be designed to be clustered, like IIS or MSMQ, which means Windows (and the app itself) will take care of synchronizing the "state" (values of different parameters in RAM or on disk or ...) across the servers. I hope this helps. – Kaveh Shahbazian May 07 '13 at 16:09
  • Thanks for the tip about sticky sessions. Here both the TCP client and the server are behind NLB. We even tried disabling all nodes on the client side, keeping just one open, and we still got timeout and "unable to connect" errors from the client, so I am not sure sticky sessions are going to help in that case. Moreover, is it still better to go for an asynchronous server regardless of this issue? – RKP May 07 '13 at 17:47
  • To be honest, scaling out is a tricky game and a totally different battlefield. It's no longer just about async programming; it involves distributed programming too. At this level, the only other test I can offer is to check whether it works with just one server and just one client (inside the whole system/infrastructure). If it does, then I still think it's about the NLB strategy. – Kaveh Shahbazian May 07 '13 at 17:55
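
The one-server, one-client test suggested in the last comment can be sketched as a small connectivity probe. This is only an illustration under assumptions: `ConnectivityProbe` and `CanConnect` are hypothetical names, and the host names in the usage comment are placeholders, not values from this thread.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical helper; not part of the original code.
public static class ConnectivityProbe
{
    // Returns true if a TCP connection to host:port is established
    // within the given timeout, false otherwise.
    public static bool CanConnect(string host, int port, TimeSpan timeout)
    {
        using (var client = new TcpClient())
        {
            try
            {
                // Wait returns false if the timeout elapses before the connect completes.
                return client.ConnectAsync(host, port).Wait(timeout) && client.Connected;
            }
            catch (AggregateException)
            {
                // Connection refused, host unreachable, or name resolution failed.
                return false;
            }
        }
    }
}

// Usage (placeholder host names):
//   ConnectivityProbe.CanConnect("node1.example.local", 8080, TimeSpan.FromSeconds(3));
//   ConnectivityProbe.CanConnect("nlb-vip.example.local", 8080, TimeSpan.FromSeconds(3));
```

If the probe succeeds against the node's own address but fails against the load balancer's virtual address, that would point at the balancer's port rules or affinity settings rather than at the listener code.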