
The facilities available on Unix systems for asynchronous I/O notification, such as epoll on Linux, kqueue on the BSDs, and /dev/poll or event ports on Solaris, all let the user associate a pointer with the file descriptor for which they want to receive I/O alerts.

Usually this pointer refers to a structure that abstracts the file descriptor (a "Stream" structure or something similar), and the user allocates a new structure every time a new file descriptor is opened.

E.g. struct stream { int fd; int flags; callback_t on_read_fn; /* ... */ };
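
A minimal sketch of that association on Linux, assuming epoll and a pipe or socket descriptor. The helper names stream_register and stream_wait_one and the callback_t signature are made up for illustration; only epoll_ctl(), epoll_wait() and the data.ptr field are real API.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical callback type; the question does not define it. */
typedef void (*callback_t)(void *stream);

struct stream {
    int fd;
    int flags;
    callback_t on_read_fn;
};

/* Allocate a stream for fd and register it with epoll. The kernel stores
   the pointer in data.ptr and hands it back with every event for this fd. */
static struct stream *stream_register(int epfd, int fd, callback_t cb)
{
    assert(epfd >= 0 && fd >= 0);
    struct stream *s = calloc(1, sizeof *s);
    if (s == NULL)
        return NULL;
    s->fd = fd;
    s->on_read_fn = cb;

    struct epoll_event ev = { .events = EPOLLIN, .data.ptr = s };
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1) {
        free(s);
        return NULL;
    }
    return s;
}

/* Fetch at most one event and return the stream pointer attached to it,
   or NULL if nothing became ready within timeout_ms. */
static struct stream *stream_wait_one(int epfd, int timeout_ms)
{
    struct epoll_event ev;
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);
    return n == 1 ? ev.data.ptr : NULL;
}
```

The pointer returned by stream_wait_one() is exactly the one passed at registration time, which is why its lifetime matters in everything that follows.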

Now, my question is: how can this user-allocated structure be safely deallocated in a multithreaded environment?

I ask because of the nature of epoll/kqueue/etc.: you usually have a thread that "downloads" from the kernel a vector of events, each containing a file descriptor with some I/O readiness and the user pointer associated with that file descriptor.

Now, consider two threads: T1, which downloads those events and processes them (e.g. by calling stream->on_read_fn()), and T2, which simply runs user code, user events, and so on.

If T2 wants to close a file descriptor, it simply calls close(stream->fd); T1 will not receive any more I/O alerts for that fd, so it seems safe to deallocate the stream structure there.

But WHAT IF T1 has already downloaded that very same file descriptor in the vector of events it is processing right now, and hasn't processed it yet?

If T1 is scheduled BEFORE T2, everything is fine. But if T2 is scheduled BEFORE T1, it will close the file descriptor and deallocate the stream structure, so when T1 gets to that event it will hold a user pointer to an already deallocated structure! Of course this will crash badly.

My point is that T2 can never know WHETHER T1 has downloaded an I/O alert for that specific file descriptor, nor can T2 predict WHETHER T1 will ever download one at all!

This is very tricky, and it's making my head spin. Any thoughts? When is it safe to deallocate the user-specified pointer in this scenario?

NOTE: a friend of mine suggested removing the file descriptor from the epoll/kqueue queue BEFORE calling close(2) on it. This is right, and it is what I do right now, but it doesn't solve the problem: T2 can remove the file descriptor from the epoll/kqueue queue, but that doesn't guarantee that an I/O event for that file descriptor hasn't already been "downloaded" from the kernel, to be processed soon by T1.
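
The interleaving can be replayed deterministically in a single thread, which may make the problem easier to see. This is only a sketch, assuming Linux epoll and a pipe as the event source; the function name stale_event_after_removal is made up, and the "T1"/"T2" comments mark which thread would perform each step in the real scenario.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Replays the problematic interleaving in one thread.
   Returns 1 if the event fetched before removal still carries user_ptr,
   0 if it does not, -1 if setup fails. */
static int stale_event_after_removal(void *user_ptr)
{
    int p[2];
    assert(user_ptr != NULL);
    if (pipe(p) == -1)
        return -1;
    int epfd = epoll_create1(0);
    if (epfd == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    int result = -1;
    struct epoll_event ev = { .events = EPOLLIN, .data.ptr = user_ptr };
    struct epoll_event fetched[1];

    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
        write(p[1], "x", 1) == 1 &&
        /* "T1": download the ready events into a local vector. */
        epoll_wait(epfd, fetched, 1, 1000) == 1 &&
        /* "T2": unregister the descriptor (it would then close and free). */
        epoll_ctl(epfd, EPOLL_CTL_DEL, p[0], NULL) == 0) {
        /* "T1" resumes: its copy of the event is unaffected by the removal. */
        result = (fetched[0].data.ptr == user_ptr);
    }

    close(p[0]);
    close(p[1]);
    close(epfd);
    return result;
}
```

The vector filled by epoll_wait() lives in T1's memory, so nothing T2 does to the epoll set afterwards can retract an entry from it.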

Marco Pagliaricci

3 Answers

1

I had exactly the same problem. That is why, among the proposals for newer Linux kernels, someone (I can't remember the name) suggested implementing a DISABLED status for the fd, so that you can skip processing if it has been deallocated by another thread.

Personally, I moved from multithreaded epoll calls to a single thread that calls epoll on the fds and then dispatches the events to multiple threads. The objects themselves are reference counted internally and collected later by a garbage collector. Honestly, it works quite well, with no noticeable degradation compared to the multithreaded epoll solution.
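
A minimal sketch of the reference-counting part, assuming C11 atomics. The names stream_new, stream_ref and stream_unref are made up for illustration, and the sketch frees the object as soon as the last reference is dropped instead of deferring to a separate collector. It also assumes the dispatcher thread is the only one that reads pointers out of epoll events.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct stream {
    int fd;
    atomic_int refs;   /* one reference per holder of the pointer */
};

/* Create a stream holding one reference, owned by the epoll registration. */
static struct stream *stream_new(int fd)
{
    struct stream *s = calloc(1, sizeof *s);
    if (s == NULL)
        return NULL;
    s->fd = fd;
    atomic_init(&s->refs, 1);
    return s;
}

/* Dispatcher: take a reference before handing the stream to a worker. */
static void stream_ref(struct stream *s)
{
    assert(s != NULL);
    atomic_fetch_add(&s->refs, 1);
}

/* Drop a reference; frees the stream and returns 1 when it was the last. */
static int stream_unref(struct stream *s)
{
    assert(s != NULL);
    if (atomic_fetch_sub(&s->refs, 1) == 1) {
        free(s);
        return 1;
    }
    return 0;
}
```

With this arrangement a close request only drops the registration reference; the memory stays valid until every worker that was handed the stream has called stream_unref().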

* EDITED *

Also, I've investigated another way: closing the fd from the same thread that handles epoll, by creating a std::set protected by a mutex, which consumer threads fill in whenever an fd needs to be closed. This worked quite well too.
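
A rough C equivalent of that idea, assuming pthreads. A small fixed-size array with duplicate checking stands in for std::set here, and the names close_queue, close_queue_request and close_queue_drain are made up. Worker threads only request a close; the epoll thread drains the set between batches of events and is the only one that actually unregisters, closes and frees.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_PENDING 64

struct close_queue {
    pthread_mutex_t lock;
    int fds[MAX_PENDING];
    size_t count;
};

static void close_queue_init(struct close_queue *q)
{
    assert(q != NULL);
    pthread_mutex_init(&q->lock, NULL);
    q->count = 0;
}

/* Worker side: ask for fd to be closed. Duplicates are ignored (set
   semantics). Returns 0 on success, -1 if the set is full. */
static int close_queue_request(struct close_queue *q, int fd)
{
    int rc = 0;
    size_t i;
    pthread_mutex_lock(&q->lock);
    for (i = 0; i < q->count; i++)
        if (q->fds[i] == fd)
            break;
    if (i == q->count) {
        if (q->count < MAX_PENDING)
            q->fds[q->count++] = fd;
        else
            rc = -1;
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Epoll-thread side: copy the pending fds into out and empty the set.
   Returns how many fds were copied. The caller then unregisters, closes
   and frees each one, after it has finished its current event batch. */
static size_t close_queue_drain(struct close_queue *q, int out[MAX_PENDING])
{
    size_t n, i;
    pthread_mutex_lock(&q->lock);
    n = q->count;
    for (i = 0; i < n; i++)
        out[i] = q->fds[i];
    q->count = 0;
    pthread_mutex_unlock(&q->lock);
    return n;
}
```

Because the drain happens on the epoll thread after a batch has been fully processed, no already-fetched event can still refer to a structure that is about to be freed.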

Leonardo Bernardini
  • Another solution, if you absolutely want multithreaded epoll, is to split the fd queue across the threads: assign a different epoll set to each thread, and keep a reference to the epoll set in the object so you know which set you need to arm. – Leonardo Bernardini Feb 27 '15 at 13:07
  • Thanks for your reply. Yeah, I already tried all of those approaches, but I was wondering if it would be possible just to use multiple threads with the same epoll fd, without using std::set or similar "tricks". The DISABLED status for the fd is quite interesting, but it'd need one more syscall to be called. – Marco Pagliaricci Feb 27 '15 at 15:07
  • I've found the patch, take a look at this: http://lwn.net/Articles/520022/ It seems to be just an additional flag, no extra syscall. If you don't need compatibility with other kernels, this sounds like a very performant solution. – Leonardo Bernardini Feb 28 '15 at 16:17
  • That is very interesting, thanks Leonardo. I will try it. Unfortunately it doesn't solve the problem in BSD or Solaris. – Marco Pagliaricci Mar 03 '15 at 11:53
  • EPOLL_CTL_DISABLED is described here https://lwn.net/Articles/520012/ but it was reverted (a80a6b85b) before 3.7 was released. Oddly, the issue seems to have fallen through the cracks since then. – Ant Manelope Jun 01 '15 at 19:46
1

I solved this problem in my program by not freeing the struct, and instead marking it as "dead" and adding it to a list so it can be reused later. That way the pointer always stays valid, though it may have been reused.
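
A minimal sketch of that scheme, assuming a singly linked free list. The names stream_pool, pool_get and pool_retire are made up, and locking is left out: in a multithreaded program the caller would have to serialize access to the pool and to the dead flag.

```c
#include <assert.h>
#include <stdlib.h>

struct stream {
    int fd;
    int dead;                  /* set when the owner has closed the fd */
    struct stream *next_free;  /* link in the pool's free list */
};

struct stream_pool {
    struct stream *free_list;
};

/* Get a stream for fd, reusing a retired one when available. */
static struct stream *pool_get(struct stream_pool *p, int fd)
{
    assert(p != NULL);
    struct stream *s = p->free_list;
    if (s != NULL) {
        p->free_list = s->next_free;
    } else {
        s = calloc(1, sizeof *s);
        if (s == NULL)
            return NULL;
    }
    s->fd = fd;
    s->dead = 0;
    s->next_free = NULL;
    return s;
}

/* Retire a stream instead of freeing it: the memory stays valid, so a
   stale pointer in an already-fetched event can still be inspected. */
static void pool_retire(struct stream_pool *p, struct stream *s)
{
    assert(p != NULL && s != NULL);
    s->dead = 1;
    s->fd = -1;
    s->next_free = p->free_list;
    p->free_list = s;
}
```

An event handler would check s->dead before doing any work. As noted above, a retired structure may already have been handed out again for a different descriptor by the time a stale event reaches it.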

tbodt
0

I'd rather avoid sharing the same data structure between 2 threads.

In the past, I have used the "one-shot" trick, which appears to be portable across many systems. With one-shot behavior, once an event is signalled, the fd is temporarily "taken out" of the queue, i.e. no other thread will be notified when that fd becomes readable or writable.

Once you finish processing the event, you need to add it back to epoll/kqueue (as Linux doc puts it, "re-arm" the fd).

  • On Linux :

    Add to epoll : epoll_ctl()/EPOLL_CTL_ADD , flags EPOLLET|EPOLLONESHOT

    Re-arm : epoll_ctl()/EPOLL_CTL_MOD using the same event flags.

  • On BSD/OSX with kqueue

    Add to kqueue: EV_SET(...EV_ADD|EV_ONESHOT...);

    Re-arm : EV_SET(...EV_ADD|EV_ONESHOT...);

  • On Solaris

    just use port_associate(), to add and re-arm.
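
A minimal sketch of the Linux variant, assuming a descriptor that is already open. The helper names oneshot_add, oneshot_rearm and poll_once are made up; the flags and epoll_ctl() operations are the ones listed above.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Shared helper: op is EPOLL_CTL_ADD or EPOLL_CTL_MOD. */
static int oneshot_ctl(int epfd, int op, int fd, void *ptr)
{
    assert(epfd >= 0 && fd >= 0);
    struct epoll_event ev = {
        .events = EPOLLIN | EPOLLET | EPOLLONESHOT,
        .data.ptr = ptr,
    };
    return epoll_ctl(epfd, op, fd, &ev);
}

/* Register fd: after one event is delivered, the fd stays in the set but
   is disabled until it is re-armed. */
static int oneshot_add(int epfd, int fd, void *ptr)
{
    return oneshot_ctl(epfd, EPOLL_CTL_ADD, fd, ptr);
}

/* Re-arm fd with the same flags once its event has been fully handled. */
static int oneshot_rearm(int epfd, int fd, void *ptr)
{
    return oneshot_ctl(epfd, EPOLL_CTL_MOD, fd, ptr);
}

/* Non-blocking check: returns the number of ready events (0 or 1). */
static int poll_once(int epfd)
{
    struct epoll_event ev;
    return epoll_wait(epfd, &ev, 1, 0);
}
```

Between delivery and the call to oneshot_rearm(), no other thread waiting on the same epoll fd receives an event for that descriptor, so exactly one thread works on the associated structure at a time.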

Vladislav Vaintroub
  • It doesn't solve the problem: the case I've described above may happen even with ONESHOT enabled. T2 wants to close and deallocate even though T1 has already "downloaded" the fd into its vector, with ONESHOT enabled. – Marco Pagliaricci Mar 06 '15 at 12:32
  • How does this happen? T1 should never get a notification, because the previous event (that is being processed in T2) has not yet been processed. The whole reason behind ONESHOT is to fix concurrent access to the same event. – Vladislav Vaintroub Mar 06 '15 at 16:20
  • OK. As a general rule, you should only close/free when you are handling a socket read() or write() error in the event processing code. To force a socket event and error from a thread other than the event processing one, you need to call shutdown() on the socket, but not close() it. I do not recall 100%, but you might also need to have the EPOLLHUP flag set in this case. – Vladislav Vaintroub Mar 06 '15 at 16:45
  • EPOLLHUP is one of those flags epoll *always* sets by default; you can't even unset it. Btw, you're correct, you should only close the file descriptor from the read() or write() processing, but that amounts to always closing/deallocating from T1, while the question involves another thread, T2... – Marco Pagliaricci Mar 06 '15 at 18:25
  • Ok, but perhaps you get the general theme. a) Avoid accessing the same structs from different threads as much as you can; here ONESHOT is a big help. b) If you need to 'kill' a connection from outside its event processing, call shutdown() on the socket and nothing else. This will cause the connection to be signalled. c) Inside the event processing for the connection, close sockets and deallocate structs only when you run into a socket error (from epoll or read/write). This comes from my own experience of implementing "KILL CONNECTION" with thread pooling in MariaDB. – Vladislav Vaintroub Mar 06 '15 at 19:23
  • Thanks for sharing your experience man, I agree on those points. Unfortunately epoll doesn't seem to support multithreading very well, so multiple threads with multiple epoll fds (one thread per epoll fd) seems a much better, safer and easier option, and maybe a faster one too. – Marco Pagliaricci Mar 06 '15 at 22:24
  • Sad but true. epoll is indeed bad at multithreading compared to kqueue, Solaris ports or IOCP. Multiple epoll_wait() calls on the same epoll fd do not scale, and this adds a lot of complexity. – Vladislav Vaintroub Mar 07 '15 at 01:48
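
The shutdown()-based rule from points b) and c) in the comments above can be sketched as follows, assuming Linux epoll and a connected stream socket. The names stream_kill and handle_one_event are made up, and for brevity the sketch stores the descriptor in data.fd; with a data.ptr registration the structure would be freed at the same place where the socket is closed.

```c
#include <assert.h>
#include <errno.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* "T2" side: request the kill without closing or freeing anything.
   The socket becomes readable (EOF), so the event thread gets woken up. */
static int stream_kill(int fd)
{
    assert(fd >= 0);
    return shutdown(fd, SHUT_RDWR);
}

/* "T1" side: handle at most one pending event.
   Returns -1 if no event was pending, 0 if the connection is still alive,
   1 if it hit EOF or an error and was torn down here. */
static int handle_one_event(int epfd)
{
    struct epoll_event ev;
    if (epoll_wait(epfd, &ev, 1, 0) != 1)
        return -1;

    int fd = ev.data.fd;
    char buf[256];
    ssize_t n = read(fd, buf, sizeof buf);
    if (n > 0)
        return 0;   /* real data; a full handler would process it */
    if (n == -1 && (errno == EAGAIN || errno == EINTR))
        return 0;   /* transient condition, not a teardown */

    /* EOF or error: the event thread is the only place that closes. */
    epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
    close(fd);
    return 1;
}
```

Since teardown only ever happens inside event processing, no other thread frees memory that a fetched event might still point to.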