I want to create a socket on Android with the NDK, but connecting to the server sometimes fails, even though I have confirmed that the network on the users' phones is available. One failure is a timeout. The other looks like a refused connection: the error returned by getsockopt is 111, even though strerror gives me "Operation now in progress", and the server address is valid. These are the log lines I get:

connect::socket error: Operation now in progress 
Or 
connect::error:111, Operation now in progress

Here is my code snippet:

bool connect(int sockfd, struct sockaddr *address, socklen_t address_len, int timeout) {
    int ret = 0;
    struct timeval tv;
    fd_set mask;

    // set socket non block
    int flags = fcntl(sockfd, F_GETFL, 0);
    fcntl(sockfd, F_SETFL, flags | O_NONBLOCK);
    // use select to check socket connection
    ret = connect(sockfd, address, address_len);
    if (-1 == ret) {
        if (errno != EINPROGRESS) {
            perror("connect");
            inetConnectFailCode = errno;
            LOG(TAG.c_str(), "connect::errno != EINPROGRESS: %s", strerror(errno));
            return false;
        }
        LOG(TAG.c_str(), "connecting...\n");

        FD_ZERO(&mask);
        FD_SET(sockfd, &mask);
        tv.tv_sec = timeout;
        tv.tv_usec = 0;

        if (select(sockfd + 1, NULL, &mask, NULL, &tv) > 0) {
            int error = 0;
            socklen_t tmpLen = sizeof(int);
            int retopt = getsockopt(sockfd, SOL_SOCKET, SO_ERROR, &error, &tmpLen);
            if (retopt != -1) {
                if (0 == error) {
                    LOG(TAG.c_str(), "has connect");
                    return true;
                } else {
                    //I get error here
                    LOG(TAG.c_str(), "connect::error:%d, %s", error, strerror(errno));
                    return false;
                }
            } else {
                LOG(TAG.c_str(), "connect::socket error:%d", error);
                return false;
            }
        } else {
            //timeout, and I get error here sometimes
            LOG(TAG.c_str(), "connect::socket error: %s", strerror(errno));
            return false;
        }
    }
    LOG(TAG.c_str(), "has connect");
    return true;
}

This problem has been bothering me for a long time. Can anybody help? Thanks in advance.

wqycsu

1 Answer

You are displaying the wrong error message.

If getsockopt(SO_ERROR) fails, you are outputting error even though it does not have a valid value. But, more importantly, if getsockopt(SO_ERROR) is successful, but error is not 0, you are passing errno to strerror() instead of passing error. errno is still EINPROGRESS from the initial failed connect() call, so your error message says "Operation now in progress". Error 111 is ECONNREFUSED, which would be "Connection refused" instead.
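
To make the distinction concrete, here is a minimal sketch that keeps the two error sources apart. The helper names `pendingSocketError` and `describeSocketError` are hypothetical and are not part of the question's code.

```cpp
#include <cerrno>
#include <cstring>
#include <string>
#include <sys/socket.h>

// Hypothetical helper: returns the error pending on the socket (0 if none).
// If getsockopt() itself fails, `error` holds no valid value, so the reason
// is taken from errno instead.
int pendingSocketError(int sockfd) {
    int error = 0;
    socklen_t len = sizeof(error);
    if (getsockopt(sockfd, SOL_SOCKET, SO_ERROR, &error, &len) == -1) {
        return errno;
    }
    return error;
}

// Hypothetical helper: formats the code it is given and never consults errno,
// so a stale EINPROGRESS cannot leak into the message.
std::string describeSocketError(int error) {
    return std::to_string(error) + ", " + std::strerror(error);
}
```

Logging `describeSocketError(pendingSocketError(sockfd))` after select() reports the socket as writable should then print the text that belongs to the reported number.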

Also, if select() returns <= 0, you are outputting strerror(errno) regardless of what select() actually returned. errno is only valid in that case if select() returns -1. If select() returns 0 instead, it is not guaranteed to update errno.

Try something more like this instead:

bool connect(int sockfd, struct sockaddr *address, socklen_t address_len, int timeout) {
    // set socket non block
    int flags = fcntl(sockfd, F_GETFL, 0);
    fcntl(sockfd, F_SETFL, flags | O_NONBLOCK);

    // use select to check socket connection
    int ret = connect(sockfd, address, address_len);
    if (-1 == ret) {
        int error = errno;
        if (EINPROGRESS == error) {
            LOG(TAG.c_str(), "connecting...\n");

            fd_set mask;
            struct timeval tv;

            FD_ZERO(&mask);
            FD_SET(sockfd, &mask);

            tv.tv_sec = timeout;
            tv.tv_usec = 0;

            ret = select(sockfd + 1, NULL, &mask, NULL, &tv);
            if (0 < ret) {
                socklen_t tmpLen = sizeof(int);
                ret = getsockopt(sockfd, SOL_SOCKET, SO_ERROR, &error, &tmpLen);
                if (-1 == ret) {
                    error = errno;
                }
            } else if (0 == ret) {
                error = ETIMEDOUT;
            } else {
                error = errno;
            }
        }

        if (0 != error) {
            inetConnectFailCode = error;
            LOG(TAG.c_str(), "connect::error:%d, %s", error, strerror(error));
            return false;
        }
    }

    LOG(TAG.c_str(), "has connect");
    return true;
}

With that said, you can get an ECONNREFUSED error if the server machine is reachable and receives your connection request, but actively refuses it, either because:

  • the requested port is not opened for listening.

  • the listening socket's backlog of pending connections is full.

There is no way to differentiate which is the case, so you will just have to re-attempt the connection at a later time, preferably after a short interval elapses (say, at least 5 seconds). Fail the connection if several attempts in succession all fail.
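
The retry policy described above could be sketched as follows. This is only an illustration: `connectWithRetry` and `isTransientConnectError` are hypothetical names, and `attempt` is assumed to be a callable that makes one connection attempt and returns 0 on success or an errno-style code on failure.

```cpp
#include <cerrno>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper: the two errors discussed here are treated as
// possibly temporary; any other code stops the retry loop immediately.
bool isTransientConnectError(int error) {
    return error == ECONNREFUSED || error == ETIMEDOUT;
}

// Calls attempt() up to maxAttempts times, sleeping `delay` between
// failed attempts. Returns 0 on success, otherwise the last error seen.
int connectWithRetry(const std::function<int()> &attempt, int maxAttempts,
                     std::chrono::milliseconds delay) {
    int error = EINVAL;  // returned unchanged if maxAttempts < 1
    for (int i = 0; i < maxAttempts; ++i) {
        error = attempt();
        if (error == 0 || !isTransientConnectError(error)) {
            return error;
        }
        if (i + 1 < maxAttempts) {
            std::this_thread::sleep_for(delay);  // no sleep after the last try
        }
    }
    return error;
}
```

A caller might pass `std::chrono::seconds(5)` as the delay and give up after three attempts. Since the state of a socket after a failed connect() is not something to rely on, it is safer for `attempt` to close the old socket and create a fresh one on each call.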

You can get an ETIMEDOUT error instead if the server machine is reachable and listening, but does not complete the 3-way handshake before your timeout period elapses.

Either error may also happen if a firewall blocks your connection.

Update: according to a comment on an earlier answer I posted, only Windows servers cause ECONNREFUSED if the backlog is full. *Nix servers cause ETIMEDOUT instead.
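
If you want to see the "port is not open for listening" case in isolation, the sketch below may help. It is not the code from the question: `connectWithTimeout` is a hypothetical variant that returns the error code instead of logging it, and `connectToNonListeningPort` is a hypothetical demo that targets a loopback port which is bound but never passed to listen().

```cpp
#include <arpa/inet.h>
#include <cerrno>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical variant: returns 0 on success, otherwise an errno-style code.
// Note that the socket is left in non-blocking mode.
int connectWithTimeout(int sockfd, const sockaddr *address,
                       socklen_t addressLen, int timeoutSec) {
    int flags = fcntl(sockfd, F_GETFL, 0);
    if (flags == -1 || fcntl(sockfd, F_SETFL, flags | O_NONBLOCK) == -1) {
        return errno;
    }
    if (connect(sockfd, address, addressLen) == 0) {
        return 0;  // connected immediately
    }
    if (errno != EINPROGRESS) {
        return errno;  // immediate failure, errno is valid here
    }

    fd_set writable;
    FD_ZERO(&writable);
    FD_SET(sockfd, &writable);
    timeval tv;
    tv.tv_sec = timeoutSec;
    tv.tv_usec = 0;

    int ready = select(sockfd + 1, nullptr, &writable, nullptr, &tv);
    if (ready == 0) return ETIMEDOUT;  // select() does not set errno on timeout
    if (ready < 0) return errno;

    int error = 0;
    socklen_t len = sizeof(error);
    if (getsockopt(sockfd, SOL_SOCKET, SO_ERROR, &error, &len) == -1) {
        return errno;
    }
    return error;  // 0 means the handshake completed
}

// Demo: connect to a loopback port that is bound but not listening.
// Returns the code from connectWithTimeout(), or errno if setup fails.
int connectToNonListeningPort() {
    int server = socket(AF_INET, SOCK_STREAM, 0);
    int client = socket(AF_INET, SOCK_STREAM, 0);
    int result = errno;
    if (server != -1 && client != -1) {
        sockaddr_in addr{};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        addr.sin_port = 0;  // let the kernel pick a free port
        socklen_t len = sizeof(addr);
        sockaddr *sa = reinterpret_cast<sockaddr *>(&addr);
        if (bind(server, sa, sizeof(addr)) == 0 &&
            getsockname(server, sa, &len) == 0) {
            result = connectWithTimeout(client, sa, sizeof(addr), 2);
        } else {
            result = errno;
        }
    }
    if (client != -1) close(client);
    if (server != -1) close(server);
    return result;
}
```

On Linux, `connectToNonListeningPort()` should return ECONNREFUSED, because nothing is listening on the chosen port and the kernel rejects the connection request. Reproducing ETIMEDOUT is harder to do locally, since it needs a peer that silently drops the request.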

Remy Lebeau
  • Thanks, but the causes you list may not apply here: the server we connect to is an IM server that runs continuously, and we have not found anything abnormal on it. Other network requests from this app (not to the IM server) work normally, which makes it more confusing. – wqycsu Dec 14 '18 at 01:16
  • @wqycsu You are getting an `ETIMEDOUT` (110) or `ECONNREFUSED` (111) error. There are only a few ways those errors can happen during `connect()`, regardless of the *type* of server you are connecting to. Either 1) you are connecting to the wrong IP/Port, or it is not listening for connections, 2) the server is too busy to accept the connection at that moment, or 3) the connection is being blocked by a firewall/router. – Remy Lebeau Dec 14 '18 at 01:21