
The following strategies seem to work well:

What doesn't work well is having each thread/process watch the listener socket for EPOLLIN and call accept() in a callback. This wakes up every thread/process, though only one can actually succeed in accept()ing. It's just like the bad old days, when blocking accept() caused a stampede whenever a connection came in.

Is there a way to only have a single thread/process wake up to accept() while still using EPOLLIN? Or should I rewrite to use blocking accept()s, just isolated using threads?

It's not an option to have only a single thread/process run accept() because I'm trying to manage the processes as a pool in a way where each process doesn't need to know whether it's the only daemon accept()ing on the listener socket.

Community

2 Answers


How about multiple sockets listening on the same proto+address+port? This can be accomplished with the Linux SO_REUSEPORT socket option: https://lwn.net/Articles/542629/. I have not tried it, but I think it should work even with epoll, since only one socket gets the actual event.

Caveat emptor: this is a non-portable, Linux-only solution. SO_REUSEPORT also suffers from some bugs/features that are detailed in the linked article.

thuovila
  • This isn't a question about how to bind() multiple threads or processes to the same port or give them access to the same listener socket. It's how to coordinate among them for accept() once that's already been done. – David Timothy Strauss Oct 21 '13 at 22:58
  • @DavidTimothyStrauss As I've understood it, SO_REUSEPORT distributes the connections equally between the processes, so there would be no need for you to coordinate. – thuovila Oct 22 '13 at 06:29

You need to use EPOLLET or EPOLLONESHOT so that exactly one thread gets woken by the EPOLLIN event when a new connection comes in. The handling thread then needs to either call accept() in a loop until it fails with EAGAIN (EPOLLET) or rearm the descriptor with epoll_ctl() (EPOLLONESHOT) so that further connections get handled.

In general when using multiple threads and epoll, you want to use EPOLLET or EPOLLONESHOT. Otherwise when an event happens, multiple threads will be woken to handle it and they may interfere with each other. At best, they'll just waste time figuring out that some other thread is handling the event before waiting again. At worst they'll deadlock or corrupt stuff.

Chris Dodd
  • With `EPOLLONESHOT`, I still see `EAGAIN` from `accept()`. Is this expected? I don't see the stampede I did with just `EPOLLIN`, but should I still expect some processes to fail to `accept()`? – David Timothy Strauss Oct 22 '13 at 01:20