8

My server program (a stream socket server) is running and accepting clients. Due to some abnormal condition, the server gets terminated while the clients on the other side are still waiting for its reply. How do I reconnect the running clients to a new server? Are there any socket functions for this?

saeed
loganaayahee

5 Answers

13

A socket that has been connect()ed once cannot be reused with another call to connect().

The steps to connect to a TCP server and read/write some data are as follows (pseudocode):

sd = socket(...) // create socket descriptor (allocate socket resource)
connect(sd, server-address, ...) // connect to server
read/write(sd, data) // read from / write to server
close(sd) // close socket descriptor (free socket resource)

If the server goes down after connect(), all the client can and should do is

close(sd) // close socket descriptor (free socket resource)

and then start over beginning with:

sd = socket(...) // create socket descriptor (allocate socket resource)
...

Starting over beginning with:

connect(sd, server-address, ...) // connect to server
...

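would not work: at best connect() fails with an error (for example EISCONN if the socket still counts as connected), and POSIX leaves the state of a socket unspecified after a failed connect().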
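The close-and-start-over sequence above can be sketched in C roughly as follows. This is only a minimal illustration: the helper name `reconnect_fresh` is made up here, and it assumes a TCP (SOCK_STREAM) connection whose address family is taken from the address passed in.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the sockets API): drop a dead
 * connection and open a fresh one to the same address.
 * Returns the new connected descriptor, or -1 on failure.
 */
int reconnect_fresh(int old_sd, const struct sockaddr *addr, socklen_t alen)
{
    int sd;

    if (old_sd >= 0)
        close(old_sd);                  /* free the dead socket first */

    /* Assumption: a TCP stream socket in the address's own family. */
    sd = socket(addr->sa_family, SOCK_STREAM, 0);
    if (sd < 0)
        return -1;

    if (connect(sd, addr, alen) < 0) {  /* server may still be down */
        close(sd);                      /* do not keep the failed socket */
        return -1;
    }
    return sd;
}
```

The caller replaces its old descriptor with the return value; on -1 it has no open socket left and can simply call the helper again later with -1 as `old_sd`.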

alk
0
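#include <sys/socket.h>
#include <unistd.h>

#define MAXSLEEP 128    /* maximum backoff in seconds; value assumed here */

/*
 * Try to connect with exponential backoff.
 * Caveat: this retries connect() on the same descriptor, which is not
 * portable; POSIX leaves a socket's state unspecified after a failed connect().
 */
int
connect_retry(int sockfd, const struct sockaddr *addr, socklen_t alen)
{
    int nsec;

    for (nsec = 1; nsec <= MAXSLEEP; nsec <<= 1) {
        if (connect(sockfd, addr, alen) == 0)
            return 0;       /* connection accepted */

        /* Delay before trying again, except after the last attempt. */
        if (nsec <= MAXSLEEP / 2)
            sleep(nsec);
    }
    return -1;
}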

Taken from the book Advanced Programming in the UNIX Environment.

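You can also use SO_REUSEADDR with setsockopt(). It allows the reuse of local addresses, which lets a restarted server bind to its port again right away. Note that it does not make an already used client socket reconnectable.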
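As a rough sketch of where SO_REUSEADDR fits, the server side could set it before bind() so that a restarted server is not refused its port while old connections linger. The helper name `listen_reusable` and the choice of IPv4 with INADDR_ANY are assumptions for this example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a listening TCP socket with SO_REUSEADDR set. */
int listen_reusable(unsigned short port)
{
    struct sockaddr_in addr = {0};
    int on = 1;
    int sd = socket(AF_INET, SOCK_STREAM, 0);

    if (sd < 0)
        return -1;

    /* The option must be set before bind() to have any effect. */
    if (setsockopt(sd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on) < 0) {
        close(sd);
        return -1;
    }

    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(sd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(sd, SOMAXCONN) < 0) {
        close(sd);
        return -1;
    }
    return sd;
}
```

This only helps the server come back quickly; the clients still have to close their old sockets and connect again.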

alk
Anitha Mani
  • Isn't the OP talking about the case where the server goes down **after having successfully connected**? – alk Apr 25 '13 at 12:22
  • This code does not work. You can't reconnect a TCP socket, even if the prior connect attempt failed. You have to close it and create a new one. Don't post untested code techniques here. – user207421 May 05 '20 at 07:38
0

You connect to the new server the same way you connected to the original server. There isn't a different API for this. I don't see why you would think otherwise.

user207421
0

You can't handle this in the server, but you can create a session for each client; when a client reconnects, restore its settings and carry on sending and receiving messages. In the client-side application, create a thread that checks at a fixed interval whether the server is available and, if it is, runs a reconnect procedure. I would also suggest checking your server-side program to find out why it goes down in the first place.

saeed
0

Let me start by saying that anything is possible. There is a function which does that for you. It is the same connect that you might have used for your TCP client. You just need to think about when you need to call this connect again.

So when do I use that connect function now?

Let me present one possible solution.

You need to have some sort of monitoring software (maybe a daemon) which keeps track of the state of the server process. It could, say, periodically poke the server process to see if it's alive.
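One simple way such a monitor could "poke" the server, assuming it runs on the same machine and knows the server's process ID, is to send signal 0 with kill(). The helper name `process_alive` is hypothetical and this is only a sketch of the idea.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/*
 * Hypothetical helper: returns 1 if a process with this pid exists,
 * 0 if it does not, and -1 on any other error.
 * Signal 0 performs the existence and permission checks without
 * actually delivering a signal.
 */
int process_alive(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return 1;
    if (errno == EPERM)
        return 1;       /* exists, but belongs to another user */
    if (errno == ESRCH)
        return 0;       /* no such process */
    return -1;
}
```

This only tells the monitor that the process exists, not that it is still serving requests, and a terminated child that has not been reaped yet still counts as existing.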

Consider the case of a single client and server. The client is running on system A; the server, on system B.

Say the server has run and crashed right before it received anything. This means the client would have successfully connected to the server, and its send will fail (possibly not on the first call, since the data may initially just be buffered locally). When the send fails, you can contact your monitoring software on system B to see what happened.

If the monitoring software reports that it didn't find anything wrong with the server, something else went wrong then (possibly an interrupt, your NIC went kaput, whatever). Those reasons are outside the scope of this discussion.

If your monitoring software replies saying that it found that the server program died, then you can:

  • reply to the monitoring software asking it to start the server again
  • OR tell it to shut itself down
  • OR do something else that you see fit.

Now, in the client on system A, begin the process of socket, connect, send, recv, etc. again.

Essentially, you are creating another server, X, which looks after your current server, Y. When server Y dies, you look to server X for reasons.

Community
Anish Ramaswamy
  • What exactly can server X do that the client can't? And what's wrong with an ordinary read timeout reading the response, and no server X at all? – user207421 May 05 '20 at 07:37
  • Not saying you can't do that. The trade-off is if that client dies or is compromised integrally you'd need to handle that. Also, if your devops team suddenly decides to re-provision another server located elsewhere, will you issue a client update (which potentially costs money depending on platform)? There are trade-offs to every situation. – Anish Ramaswamy May 05 '20 at 08:11
  • You're over-complicating a very simple situation. Every network client ever written needs a read timeout facility. You can't write magical code in server X: you can only write code that should have been in the clients in the first place. Read timeout is one line of code to set and one catch block to detect. This is very basic. – user207421 May 08 '20 at 02:29
  • Repeating your earlier statement in a condescending tone isn't going to change what I said. I am not at all saying you can't or shouldn't do client read-timeouts. All I am saying is that there are trade-offs to both our solutions. I also explicitly stated that you can simply reconnect right in my first paragraph. – Anish Ramaswamy May 08 '20 at 11:21
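For reference, the client-side read timeout discussed in these comments could look roughly like this in C, using the SO_RCVTIMEO socket option. The helper names `set_read_timeout` and `recv_or_timeout` and the -2 return value for a timeout are conventions invented for this sketch.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical helper: make recv() on sd give up after `seconds`. */
int set_read_timeout(int sd, long seconds)
{
    struct timeval tv;

    tv.tv_sec = seconds;
    tv.tv_usec = 0;
    return setsockopt(sd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
}

/*
 * Hypothetical helper: returns the number of bytes read, 0 if the peer
 * closed the connection, -1 on error, or -2 if the timeout expired.
 */
ssize_t recv_or_timeout(int sd, void *buf, size_t len)
{
    ssize_t n = recv(sd, buf, len, 0);

    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;      /* no reply within the timeout */
    return n;
}
```

A client that gets -2, 0, or -1 here can treat the server as gone, close the socket, and start over with socket() and connect().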