non blocking tcp connect with epoll

Question

My linux application is performing non-blocking TCP connect syscall and then use epoll_wait to detect three way handshake completion. Sometimes epoll_wait returns with both POLLOUT & POLLERR events set for the same socket descriptor.

I would like to understand what's going on at TCP level. I'm not able to reproduce it on demand. My guess is that between two calls to epoll_wait inside my event loop we had a SYN+ACK/ACK/FIN sequence but again I'm not able to reproduce it.

score 6 · Answer 1 · answered May 21 '10 at 01:17

It is likely for this to happen if the connect has failed - for example with "connection timed out" (for sockets doing a non-blocking connect, POLLOUT becomes set when the connect operation has finished for both successful and unsuccessful outcomes).

When POLLOUT becomes set for the socket, use getsockopt(sock, SOL_SOCKET, SO_ERROR, ...) to check if the connect succeeded or not (the SO_ERROR socket option is 0 in this case, and otherwise indicates why the connect failed).

score 4 · Answer 2 · answered Nov 10 '10 at 23:25

Here is some good information on non-blocking tcp connect().

When a socket error is detected (i.e. connection closed/refused/timedout), epoll will return the registered interest events POLLIN/POLLOUT with POLLERR. So epoll_wait() will return POLLOUT|POLLERR if you registered POLLOUT, or POLLIN|POLLOUT|POLLERR if POLLIN|POLLOUT was registered.

Just because epoll returns POLLIN doesn't mean there will be data available to read, since recv() may just return the error from the non-blocking connect() call. I think epoll returns all the registered events with POLLERR to make sure the program calls send()/recv()/etc.. and gets the socket error. Some programs never check for POLLERR/POLLHUP and only catch socket errors on the next send()/recv() call.

non blocking tcp connect with epoll

2 Answers2

Linked