
We have an application that uses epoll to listen for and process HTTP connections. Sometimes epoll_wait() receives a close event on an fd twice in a "row", meaning: epoll_wait() returns a connection fd on which read()/recv() returns 0. This is a problem, since I have a malloc'ed pointer saved in the epoll_event struct (struct epoll_event.data.ptr) which is freed when the fd (socket) is detected as closed the first time. The second time, it crashes.

This problem occurs very rarely in real use (except on one site, which actually has around 500-1000 users per server). I can replicate the problem using http siege with >1000 simultaneous connections per second. In this case the application segfaults (because of the invalid pointer) very randomly, sometimes after a few seconds, usually after tens of minutes. I have been able to replicate the problem with fewer connections per second, but for that I have to run the application for a long time, many days, even weeks.

All new connection fds returned by accept() are set non-blocking and added to epoll as one-shot, edge-triggered, waiting for read() to become available. So why, when the server load is high, does epoll think that my application didn't get the close event and queue a new one?

epoll_wait() is running in its own thread and queues fd events to be handled elsewhere. I noticed that multiple closes were coming in by adding simple code that checks whether epoll delivers an event twice in a row for the same fd. It did happen, and both events were closes (recv(..., MSG_PEEK) told me this :)).
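A minimal sketch of that peek check (the one-byte buffer is illustrative; the fds are already non-blocking, so recv() won't block):

    char tmp;
    /* Peek without consuming data: recv() returning 0 here means the
       peer performed an orderly shutdown. */
    ssize_t n = recv(fd, &tmp, 1, MSG_PEEK);
    if (n == 0) {
        /* connection closed by peer */
    }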

epoll fd is created:

epoll_create(1024);

epoll_wait() is run as follows:

epoll_wait(epoll_fd, events, 256, 300);

new fd is set as non-blocking after accept():

int flags = fcntl(fd, F_GETFL, 0);
err = fcntl(fd, F_SETFL, flags | O_NONBLOCK);

new fd is added to epoll (client is malloc:ed struct pointer):

static struct epoll_event ev;
ev.events = EPOLLIN | EPOLLONESHOT | EPOLLET;
ev.data.ptr = client;
err = epoll_ctl(epoll_fd, EPOLL_CTL_ADD, client->fd, &ev);

And after receiving and handling data from an fd, it is re-armed (necessarily, since EPOLLONESHOT); see the sketch below. At first I wasn't using edge triggering and non-blocking I/O, but I tested them and got a nice performance boost; the problem existed before adding them, though. Btw, shutdown(fd, SHUT_RDWR) is used from other threads to trigger a proper close event through epoll when the server needs to close the fd because of some HTTP error etc. (I don't actually know if this is the right way to do it, but it has worked perfectly).
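For reference, a one-shot fd is re-armed with EPOLL_CTL_MOD; a minimal sketch using the same names as the registration code above (the actual re-arm code isn't shown in the question):

    /* Re-arm after the data has been handled; without this, an fd
       registered with EPOLLONESHOT reports no further events. */
    struct epoll_event ev;
    ev.events = EPOLLIN | EPOLLONESHOT | EPOLLET;
    ev.data.ptr = client;
    err = epoll_ctl(epoll_fd, EPOLL_CTL_MOD, client->fd, &ev);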


5 Answers


As soon as read() returns 0, the connection has been closed by the peer. Why does the kernel generate an EPOLLIN event for this case? Well, there's no other way to indicate the socket's closure when you're only subscribed to EPOLLIN. You can add EPOLLRDHUP, which is basically the same as checking for read() returning 0. However, make sure to test for this flag before you test for EPOLLIN.

  if (flag & EPOLLRDHUP) {
      /* Connection was closed by the peer. */
      deleteConnectionData(...);
      close(fd); /* Closing the fd unregisters it from epoll
                    (provided no duplicated references exist). */
      return;
  }

  if (flag & EPOLLIN) {
      readData(...);
  }

  if (flag & EPOLLOUT) {
      writeData(...);
  }

The order of these blocks is relevant, and the return after EPOLLRDHUP is important too, because deleteConnectionData() has likely destroyed internal structures. As EPOLLIN is set as well in the case of a closure, processing it afterwards could lead to problems. Ignoring EPOLLIN is safe, because it won't yield any data anyway; the same goes for EPOLLOUT, which is never sent in conjunction with EPOLLRDHUP!


epoll_wait() is running in its own thread and queues fd events to be handled elsewhere. ... So why, when the server load is high, does epoll think that my application didn't get the close event and queue a new one?

Assuming that EPOLLONESHOT is bug-free (I haven't searched for associated bugs, though), the fact that you are processing your epoll events in another thread, and that it crashes sporadically or under heavy load, may mean that there is a race condition somewhere in your application.

Maybe the object pointed to by epoll_event.data.ptr gets deallocated prematurely, before the epoll event is unregistered in another thread, when your server does an active close of the client connection.

My first try would be to run it under valgrind and see if it reports any errors.
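For example (memcheck is valgrind's default tool; the binary name is a placeholder):

    valgrind --leak-check=full --track-origins=yes ./your-server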

  • The problem with running it under valgrind is that with that huge number of connections valgrind simply takes too much CPU. And even with smaller connection counts, valgrind/gdb seems to have the effect that the problem simply does not occur anymore. I don't think the problem is premature deallocation. I get events fine from epoll (it never crashes there); the application only crashes when the same fd gets multiple close events in a row: the application tries to free the same resource twice. I think this should never happen, at least with one-shot edge-triggered sockets. – Antti Partanen Jan 18 '11 at 13:41

Register for EPOLLRDHUP (the 0x2000 bit) to be notified when the remote host closes the connection, e.g. ev.events = EPOLLIN | EPOLLONESHOT | EPOLLET | EPOLLRDHUP, and check if (flag & EPOLLRDHUP) to detect the remote host closing the connection.
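A minimal sketch of that registration; EPOLLRDHUP is the symbolic name for the 0x2000 bit, the client/epoll_fd names are taken from the question, and revents is a placeholder for the reported event mask:

    #include <sys/epoll.h>

    #ifndef EPOLLRDHUP
    #define EPOLLRDHUP 0x2000 /* fallback for older headers that lack it */
    #endif

    struct epoll_event ev;
    ev.events = EPOLLIN | EPOLLONESHOT | EPOLLET | EPOLLRDHUP;
    ev.data.ptr = client;
    epoll_ctl(epoll_fd, EPOLL_CTL_ADD, client->fd, &ev);

    /* in the event loop, when the event is reported: */
    if (revents & EPOLLRDHUP) {
        /* remote host closed the connection */
    }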


I would double-check myself against the following sections from epoll(7):

Q6
Will closing a file descriptor cause it to be removed from all epoll sets automatically?

and

o If using an event cache...

There are some good points there.
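The practical upshot of those sections, as a sketch (not the asker's code): explicitly remove the fd from the interest list before closing it, so that duplicated descriptors or a user-space event cache can't keep delivering events for it:

    /* Tear down in this order: no new events can be queued for the fd
       once it is removed from the interest list. (Kernels before 2.6.9
       require a non-NULL event pointer even for EPOLL_CTL_DEL.) */
    epoll_ctl(epoll_fd, EPOLL_CTL_DEL, client->fd, NULL);
    close(client->fd);
    /* free(client) is still only safe if no already-dequeued event
       holds this pointer -- see the delayed release in the answer below. */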


Removing EPOLLONESHOT made the problem disappear, after a few other changes. Unfortunately I'm not totally sure what caused it. Using EPOLLONESHOT with threads and adding the fd back into the epoll queue manually was quite certainly the problem. Also, the data pointer in the epoll struct is now released only after a delay. It works perfectly now.

  • I think that delay is the trick. When one thread waits for `epoll_wait` to return and another thread (nicely) 1) removes an fd from the epoll interest list with `epoll_ctl(..., EPOLL_CTL_DEL, ...)`, 2) closes the fd, and 3) deletes the resources that `data.ptr` of the epoll struct was pointing to, then you still have a problem: the first thread could have returned from `epoll_wait()` just before the second thread does all that; if the first thread then continues running, it processes an event with a deleted `data.ptr` (or a closed `data.fd`). – Carlo Wood Jul 10 '19 at 01:27
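A minimal sketch of such a delayed release (the graveyard list, the defer_free/reap_graveyard names, and the grace period are illustrative, not from the original code; a real version needs a lock around the list):

    #include <stdlib.h>
    #include <time.h>

    struct client; /* the malloc'ed per-connection struct from the question */

    struct dead_client {
        struct client      *ptr;
        time_t              died_at;
        struct dead_client *next;
    };

    static struct dead_client *graveyard;

    /* Instead of freeing the client right away, park the pointer on a
       "graveyard" list. */
    static void defer_free(struct client *c)
    {
        struct dead_client *d = malloc(sizeof *d);
        d->ptr = c;
        d->died_at = time(NULL);
        d->next = graveyard;
        graveyard = d;
    }

    /* Call periodically, e.g. whenever epoll_wait() times out; frees
       entries older than the grace period, by which time no event
       already dequeued by the epoll thread can still refer to them. */
    static void reap_graveyard(int grace_seconds)
    {
        time_t now = time(NULL);
        struct dead_client **pp = &graveyard;
        while (*pp) {
            if (now - (*pp)->died_at >= grace_seconds) {
                struct dead_client *d = *pp;
                *pp = d->next;
                free(d->ptr);
                free(d);
            } else {
                pp = &(*pp)->next;
            }
        }
    }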