I am currently implementing a multi-threaded network client application with epoll. My model is simple:
1. Get a client_fd and write the request to the remote server.
2. Set the fd to non-blocking and add it to epfd (EPOLLIN|EPOLLET|EPOLLONESHOT) to wait for the response.
3. On EPOLLIN for the fd, read the whole response and release the resources.
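To make the registration and read steps above concrete, here is a minimal sketch of how they might look. The `io_request` layout and the helper names `arm_oneshot_read` and `drain` are hypothetical and only for illustration; the real code may differ. It assumes the request pointer is stored in `data.ptr`, as the snippet further down reads it back from there.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

// Hypothetical request record; the real io_request likely has more fields.
struct io_request {
    int fd;
    std::string response;
};

// Step 2: make the fd non-blocking and register it for a single
// edge-triggered EPOLLIN notification. Returns 0 on success, -1 on error.
int arm_oneshot_read(int epoll_fd, io_request* req) {
    assert(req != nullptr);
    int flags = fcntl(req->fd, F_GETFL, 0);
    if (flags == -1 || fcntl(req->fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    epoll_event ev{};
    ev.events = EPOLLIN | EPOLLET | EPOLLONESHOT;
    ev.data.ptr = req;  // data is a union: use either ptr or fd, not both
    return epoll_ctl(epoll_fd, EPOLL_CTL_ADD, req->fd, &ev);
}

// Step 3: with EPOLLET the fd has to be read until EAGAIN (or EOF).
// Returns 1 on EOF, 0 when no more data is available for now, -1 on error.
int drain(io_request* req) {
    char buf[4096];
    for (;;) {
        ssize_t n = read(req->fd, buf, sizeof buf);
        if (n > 0) {
            req->response.append(buf, static_cast<size_t>(n));
            continue;
        }
        if (n == 0) return 1;                                  // peer closed
        if (errno == EINTR) continue;                          // retry
        if (errno == EAGAIN || errno == EWOULDBLOCK) return 0; // drained
        return -1;
    }
}
```

Because `epoll_data` is a union, whatever was stored at registration time (here `data.ptr`) is what `epoll_wait` hands back, so the same member has to be used on both sides.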
The problem is that I occasionally get multiple EPOLLIN events on the same fd, even though it is registered with EPOLLIN|EPOLLET|EPOLLONESHOT. Since I have already released all the resources (including the client_fd) on the first EPOLLIN event, the second event crashes my program.
Any suggestions would be greatly appreciated :)
Here is the code snippet:
// All threads wait on the semaphore, since only one thread (the leader)
// should be in epoll_wait at any time (leader-follower model)
sem_wait(wait_sem);
int nfds = epoll_wait(epoll_fd, evts, max_evt_cnt, wait_time_out);
// The leader got the fds to process
for (int i = 0; i < nfds; ++i) {
    io_request* req = (io_request*)evts[i].data.ptr;
    int sockfd = req->fd;
    if (evts[i].events & EPOLLIN) {
        // The event argument is ignored for EPOLL_CTL_DEL
        ev.data.fd = sockfd;
        if (0 != epoll_ctl(epoll_fd, EPOLL_CTL_DEL, sockfd, &ev)) {
            switch (errno) {
            case EBADF:
                // A repeated EPOLLIN on an already released fd makes EPOLL_CTL_DEL fail
                WARNING("delete fd failed for EBADF");
                break;
            default:
                WARNING("delete fd failed for %d", errno);
            }
        }
        else {
            // Current workaround: only fds that were deleted successfully
            // are processed; the ones that failed are ignored
            crt_idx.push_back(i);
        }
    }
}
if (crt_idx.size() != static_cast<size_t>(nfds)) // just warn when this case happens
    WARNING("crt_idx.size():%zu != nfds:%d there has been some error!!", crt_idx.size(), nfds);
// The current leader wakes up the next leader and becomes a follower
sem_post(wait_sem);
for (size_t i = 0; i < crt_idx.size(); ++i)
{
    io_request* req = (io_request*)evts[crt_idx[i]].data.ptr;
    // ...do business logic...
    // ...release the resources & release the client_fd
}