Questions tagged [epoll]

epoll is a Linux 2.6 readiness notification API for sockets, pipes, and special event, signal, and timer descriptors. It can operate in both level-triggered and edge-triggered mode, although presently only level-triggered behaviour is in accordance with the documentation. As opposed to poll or select, epoll scales O(1) with respect to the number of descriptors and O(N) with respect to the number of realized events.

The epoll API is built around 3 functions:

  • epoll_create creates a new epoll instance and returns a file descriptor that refers to it. This descriptor can be operated on with the other epoll functions and can itself be added to a different epoll instance.
  • epoll_ctl adds file descriptors (sockets, pipes, eventfd, timerfd, signalfd, and epoll) to an epoll instance's set of monitored descriptors, removes them from it, or modifies the flags of descriptors that are already registered.
  • epoll_wait returns up to maxevents queued events. If no events are available and the timeout is zero, it returns zero immediately. With a positive timeout (in milliseconds) it blocks until an event arrives or the timeout expires, returning zero in the latter case; a timeout of -1 means block indefinitely.
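The three calls above can be sketched together as follows. This is a minimal illustration, not a recommended design: `wait_readable` is a hypothetical helper name, it uses `epoll_create1(0)` (the newer variant of `epoll_create`), and it creates and destroys an epoll instance on every call purely to keep the example short.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: watch a single descriptor for readability.
 * Returns the number of ready events (0 if the timeout expired with
 * nothing ready) or -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    /* Step 1: create the epoll instance. */
    int epfd = epoll_create1(0);
    if (epfd == -1)
        return -1;

    /* Step 2: register fd for read readiness (level-triggered by default). */
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1) {
        close(epfd);
        return -1;
    }

    /* Step 3: collect up to 8 queued events, waiting at most timeout_ms. */
    struct epoll_event ready[8];
    int n = epoll_wait(epfd, ready, 8, timeout_ms);

    close(epfd);
    return n;
}
```

A real program would normally create the instance once, register its descriptors once, and then call epoll_wait in a loop, which is the separation the API is designed around. It would also typically retry when epoll_wait fails with EINTR.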

The conceptual idea behind the API is that applications usually have a certain set of descriptors that changes rarely if ever, but which needs to be observed for readiness many times. Also, typically a lot fewer descriptors are ready than open. epoll therefore separates copying the list of descriptors to watch from the actual watching and notifies registered listeners instead of iterating a list of descriptors.

The operation of level-triggered mode (the default) is simple, since it is identical to how poll/select work. As long as the resource is ready (e.g. as long as there remains data to be read), every call to epoll_wait will return an event.
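A small sketch of that behaviour, assuming a descriptor that already has unread data: the hypothetical helper `count_level_reports` registers the descriptor without EPOLLET, never reads from it, and counts how many consecutive epoll_wait calls report it.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: register fd in level-triggered mode, call
 * epoll_wait `calls` times with a zero timeout WITHOUT reading any data,
 * and return how many of those calls reported an event (-1 on error). */
int count_level_reports(int fd, int calls)
{
    int epfd = epoll_create1(0);
    if (epfd == -1)
        return -1;

    /* No EPOLLET flag, so this registration is level-triggered. */
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1) {
        close(epfd);
        return -1;
    }

    int reported = 0;
    for (int i = 0; i < calls; i++) {
        struct epoll_event out;
        int n = epoll_wait(epfd, &out, 1, 0);
        if (n == -1) {
            close(epfd);
            return -1;
        }
        reported += n;   /* n is 0 or 1 because maxevents is 1 */
    }

    close(epfd);
    return reported;
}
```

With one unread byte sitting in a pipe, every call should report the descriptor, so the count equals the number of calls.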

The operation of edge-triggered mode (EPOLLET flag) is more complicated, more error-prone, inconsistently documented, and inconsistently implemented. epoll(7) explains it in terms of partial reads: if only part of the available data is read, the next call to epoll_wait may block until new data arrives, even though unread data remains in the buffers. It is therefore recommended to use non-blocking descriptors and to keep reading until EAGAIN is returned.
According to The Linux Programming Interface, edge-triggered mode only reports events that happened since the last call to epoll_wait.
In reality, it does a mixture of both (i.e. both reads and epoll_wait reset the status to "not ready"), and it does not work as indicated in respect of several epoll instances listening to the same socket or several threads waiting on the same epoll instance (observed under kernel 2.6.38 with timerfd and eventfd). Although epoll is supposed to signal all waiters upon arrival of an event, in edge-triggered mode it only ever signals a single waiter.
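The usual defensive pattern for edge-triggered mode, as recommended in epoll(7), is to make the descriptor non-blocking and drain it completely after each notification. The sketch below shows one way to do that; `set_nonblocking`, `add_edge_triggered`, and `drain` are hypothetical helper names, and the 4096-byte buffer size is an arbitrary choice.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: register fd for edge-triggered read readiness. */
int add_edge_triggered(int epfd, int fd)
{
    struct epoll_event ev = { .events = EPOLLIN | EPOLLET, .data.fd = fd };
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Hypothetical helper: read until the descriptor reports EAGAIN (or EOF).
 * Returns the total number of bytes consumed, or -1 on a real error.
 * The data is discarded here; a real program would process buf. */
long drain(int fd)
{
    char buf[4096];
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;
            continue;
        }
        if (n == 0)
            return total;                     /* end of file */
        if (errno == EINTR)
            continue;                         /* interrupted, retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return total;                     /* fully drained */
        return -1;
    }
}
```

Calling `drain` after every event that epoll_wait reports for the descriptor avoids depending on exactly when the kernel resets the readiness state, which is the part the text above describes as inconsistent.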

792 questions
4
votes
5 answers

How to correctly read data when using epoll_wait

I am trying to port to Linux an existing Windows C++ code that uses IOCP. Having decided to use epoll_wait to achieve high concurrency, I am already faced with a theoretical issue of when we try to process received data. Imagine two threads calling…
charfeddine.ahmed
4
votes
1 answer

What happened if I don't set EPOLLOUT event and direct call write() function?

I have EPOLLIN event for read data only. Is it ok to direct write the data without setting EPOLLOUT event?
Oktaheta
4
votes
2 answers

epoll order of events from epoll_wait

I have ported a program over to epoll from select to increase the number of sockets we can handle. I have added the sockets to the epoll FD and can read and write happily. However, I am concerned about potential starvation of sockets even though I…
Mr. Rogers
4
votes
1 answer

epoll_wait() consume too much CPU

my epoll_wait() is consuming too much CPU, a simple strace shows that: strace -c -f -p 3655 Process 3655 attached with 5 threads ^CProcess 3655 detached Process 3656 detached Process 3657 detached Process 3658 detached …
Vince.Wu
4
votes
5 answers

epoll_wait() receives socket closed twice (read()/recv() returns 0)

We have an application that uses epoll to listen and process http-connections. Sometimes epoll_wait() receives close event on fd twice in a "row". Meaning: epoll_wait() returns connection fd on which read()/recv() returns 0. This is a problem, since…
4
votes
3 answers

Choosing a IPC solution for an event-driven application

I am currently working on a rather large single-threaded, event-based, application designed around epoll under Linux and comparable technologies under other platforms. Currently, whenever we wish two instances to communicate, they typically do it…
Yoric
4
votes
1 answer

Kqueue (edge-triggered): Does a short read mean that read-readiness was lost?

When working with Linux epoll in edge triggered mode (EPOLLET), and a read/write fails with EAGAIN/EWOULDBLOCK, it means that read/write-readiness was lost, and that a new readiness event is guaranteed to be made available via epoll_wait() as soon…
Kristian Spangsege
4
votes
1 answer

May `epoll_ctl` modify the `epoll_event` structure passed to it?

The Linux kernel manpages declare the epoll_ctl procedure as follows: int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event); As evident, the event parameter is declared as a pointer to the epoll_event struct. The significance of said…
Armen Michaeli
4
votes
1 answer

Is libuv under the hood use epoll or select(2) in unix

I have been reading around how nodejs uses libuv to perform asynchronous I/O. Reading more about it give me a feeling that it almost sound similar to how select(2) and epoll. So, my question if I'm using libuv(via node) is it true internally I using…
Noobie
4
votes
2 answers

Are there any major performance differences between epoll and kqueue?

My development machine is a MacBook (which of course has kqueue). However, in production we're running Linux (which of course uses epoll). Obviously, to know the performance characteristics of my code I need to run it using epoll. That said, is…
Jason Baker
4
votes
1 answer

Epoll with edge triggered and oneshot only reports once

I'm currently adding sockfds created from accept to an epoll instance with the following events: const int EVENTS = ( EPOLLET | EPOLLIN | EPOLLRDHUP | EPOLLONESHOT | EPOLLERR | EPOLLHUP); Once an event is triggered, I pass…
nathansizemore
4
votes
1 answer

How to build netty-transport-native-epoll-4.0.32.Final-linux-x86_64.jar?

I am using native epoll transport in netty and was able to download netty-transport-native-epoll-4.0.32.jar from the repository. However I also need netty-transport-native-epoll-4.0.32.Final-linux-x86_64.jar but not unable to find it anywhere.…
Dev G
4
votes
2 answers

Why does the read() block in this case?(linux epoll)

I am new to unix programming and today I am trying epoll but getting stuck in a problem. Under level-triggered mode, I think each new input event including Ctrl-D will cause epoll_wait to return. It works fine. But when I type somethings like aaa,…
Hayes Pan
4
votes
1 answer

How to use signalfd and epoll to get event when my child process exit?

I create a sigset_t and set it empty, then add SIGCHLD to it, then set it BLOCK: sigset_t sigmask; sigemptyset (&sigmask); sigprocmask (SIG_BLOCK, &sigmask, NULL); Then create a signalfd via signalfd4() int signalfd = signalfd4 (-1, &sigmask,…
ScorpioCPH
votes
1 answer

Epoll_wait returning events on closed file descriptor

I'm working with a multithreaded embedded application in which epoll is used for IO in one of the threads. I'm relying on a particular feature of epoll that specifies that closing a file descriptor automatically removes it from the epoll set…
duffsterlp