Using multiple threads with accept() on a nonblocking listener in each process

Question

The following strategies seem to work well:

Using a single thread/process with a nonblocking accept() call on the listener socket, regardless of how the program handles the accepted request.
Using multiple threads/processes with a blocking accept() call in each process. When a connection comes in, this wakes up exactly one accept().

What doesn't work well is watching the listener socket for EPOLLIN in each thread/process, with accept() in a callback. This wakes every thread/process up, though only one can succeed in actually accept()ing. This is just like the bad old days of blocking accept() causing a stampede whenever a connection came in.

Is there a way to only have a single thread/process wake up to accept() while still using EPOLLIN? Or should I rewrite to use blocking accept()s, just isolated using threads?

It's not an option to have only a single thread/process run accept() because I'm trying to manage the processes as a pool in a way where each process doesn't need to know whether it's the only daemon accept()ing on the listener socket.

score 2 · Answer 1 · answered Oct 21 '13 at 14:17


How about multiple sockets listening on the same proto+address+port? This can be accomplished with the Linux SO_REUSEPORT socket option: https://lwn.net/Articles/542629/. I have not tried it, but I think it should work even with epoll, since only one socket gets the actual event.

Caveat emptor: this is a non-portable, Linux-only solution. SO_REUSEPORT also suffers from some bugs/features that are detailed in the linked article.
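A minimal sketch of the idea, written in Python because its `socket` module exposes the same option as the C API. The helper name `make_reuseport_listener` and the loopback address are assumptions made for illustration, and the sketch assumes a Linux kernel with SO_REUSEPORT support. Each worker process would create its own listener this way instead of inheriting a shared one.

```python
import socket


def make_reuseport_listener(host: str, port: int) -> socket.socket:
    """Create a nonblocking TCP listener with SO_REUSEPORT set before bind().

    Hypothetical helper: every process in the pool would call this with the
    same host and port to get its own listening socket.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The option has to be set on every socket before bind(); otherwise the
    # second bind() to the same address fails with EADDRINUSE.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen(128)
    sock.setblocking(False)
    return sock


if __name__ == "__main__":
    # Port 0 lets the kernel pick a free port for the first listener.
    first = make_reuseport_listener("127.0.0.1", 0)
    port = first.getsockname()[1]
    # A second, independent socket bound to the very same address and port.
    second = make_reuseport_listener("127.0.0.1", port)
    print("two listeners bound to port", port)
    first.close()
    second.close()
```

Each listener can then be registered with that process's own epoll instance using plain EPOLLIN, since an incoming connection is queued on only one of the sockets.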


thuovila


This isn't a question about how to bind() multiple threads or processes to the same port or give them access to the same listener socket. It's how to coordinate among them for accept() once that's already been done. – David Timothy Strauss Oct 21 '13 at 22:58

@DavidTimothyStrauss As I've understood it, SO_REUSEPORT distributes the connections equally between the processes, so there would be no need for you to coordinate. – thuovila Oct 22 '13 at 06:29

Chris Dodd · Accepted Answer · 2013-10-21T16:46:33.900

You need to use EPOLLET or EPOLLONESHOT so that exactly one thread gets woken by the EPOLLIN event when a new connection comes in. The handling thread then needs to either call accept() in a loop until it fails with EAGAIN (for EPOLLET) or manually re-arm the descriptor with epoll_ctl() (for EPOLLONESHOT) so that further connections get handled.
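The EPOLLET half of this could look roughly like the sketch below. It uses Python's `select.epoll`, a thin wrapper over the same Linux calls, where EAGAIN from accept() surfaces as `BlockingIOError`. The function names `register_edge_triggered` and `drain_accept` are hypothetical, and the listener is assumed to be nonblocking.

```python
import select
import socket


def register_edge_triggered(ep: select.epoll, listener: socket.socket) -> None:
    """Watch the listener for readability in edge-triggered mode."""
    ep.register(listener.fileno(), select.EPOLLIN | select.EPOLLET)


def drain_accept(listener: socket.socket) -> list:
    """Call accept() until the backlog is empty; return the new sockets.

    The listener must be nonblocking, or the final accept() would block
    instead of failing with EAGAIN.
    """
    accepted = []
    while True:
        try:
            conn, _addr = listener.accept()
        except BlockingIOError:
            # EAGAIN/EWOULDBLOCK: nothing left to accept. With EPOLLET the
            # next notification only arrives when a new connection shows up.
            break
        accepted.append(conn)
    return accepted


if __name__ == "__main__":
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen(16)
    listener.setblocking(False)

    ep = select.epoll()
    register_edge_triggered(ep, listener)

    # Three loopback clients stand in for real incoming connections.
    clients = [socket.create_connection(listener.getsockname()) for _ in range(3)]

    conns = []
    while len(conns) < len(clients) and ep.poll(2.0):
        conns.extend(drain_accept(listener))
    print("accepted", len(conns), "connections")

    for s in conns + clients + [listener]:
        s.close()
    ep.close()
```

Stopping the loop before EAGAIN would leave connections sitting in the backlog with no further wakeup, which is why the drain matters in edge-triggered mode.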

In general, when using multiple threads with epoll, you want EPOLLET or EPOLLONESHOT. Otherwise, when an event happens, multiple threads will be woken to handle it and they may interfere with each other. At best, they'll just waste time figuring out that some other thread is handling the event before waiting again. At worst, they'll deadlock or corrupt shared state.
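For the EPOLLONESHOT variant, a rough sketch under the assumption that all worker threads share a single epoll instance might look like this. It is written with Python's `select.epoll`, where `ep.modify()` corresponds to epoll_ctl() with EPOLL_CTL_MOD. The names `ONESHOT_MASK`, `accept_one_and_rearm`, and `worker` are hypothetical.

```python
import select
import socket
import threading
import time

ONESHOT_MASK = select.EPOLLIN | select.EPOLLONESHOT


def accept_one_and_rearm(ep, listener):
    """Handle one wakeup: try a single accept(), then re-enable the listener.

    Returns the new connection, or None if accept() had nothing to return.
    """
    try:
        conn, _addr = listener.accept()
    except BlockingIOError:
        conn = None  # EAGAIN: the connection was already taken or went away
    finally:
        # After a oneshot event fires the descriptor stays disabled until it
        # is re-armed; pending connections then trigger a fresh event.
        ep.modify(listener.fileno(), ONESHOT_MASK)
    return conn


def worker(ep, listener, accepted, stop):
    """Worker loop: all workers wait on the same shared epoll instance."""
    while not stop.is_set():
        for _fd, _events in ep.poll(0.1):
            conn = accept_one_and_rearm(ep, listener)
            if conn is not None:
                accepted.append(conn)


if __name__ == "__main__":
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen(16)
    listener.setblocking(False)

    ep = select.epoll()
    ep.register(listener.fileno(), ONESHOT_MASK)

    accepted, stop = [], threading.Event()
    threads = [
        threading.Thread(target=worker, args=(ep, listener, accepted, stop))
        for _ in range(4)
    ]
    for t in threads:
        t.start()

    clients = [socket.create_connection(listener.getsockname()) for _ in range(5)]
    deadline = time.monotonic() + 2.0
    while len(accepted) < len(clients) and time.monotonic() < deadline:
        time.sleep(0.01)

    stop.set()
    for t in threads:
        t.join()
    print("accepted", len(accepted), "connections")

    for s in accepted + clients + [listener]:
        s.close()
    ep.close()
```

The sketch still treats EAGAIN from accept() as a normal outcome rather than an error, since a wakeup does not guarantee that a connection is still waiting by the time accept() runs.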

With `EPOLLONESHOT`, I still see `EAGAIN` from `accept()`. Is this expected? I don't see the stampede I did with just `EPOLLIN`, but should I still expect some processes to fail to `accept()`? — David Timothy Strauss, Oct 22 '13 at 01:20

