I've run into a strange problem where a huge number of messages from snmplib's snmp_synch_response() are managing to fill up a 60GB hard drive within about three hours. The messages are all "Use snmp_sess_select_info2() for processing large file descriptors", sometimes repeated over a hundred times per line. I'm still working with the customer to figure out how to reproduce this in-house, but I thought I'd ask here in case it was an old issue or, at least, seen by somebody else in some fashion.

Here's the basic system info: FreeBSD 8.1-RELEASE-p2 on i386, with Net-SNMP 5.5.

Below is a simplified snippet of the key parts of my code. The code first builds a list of tasks with initialized, but not yet open, sessions. Elsewhere, a child is forked for each task, up to a small limit (64 in this case), and each child opens its SNMP session sockets with snmp_open(), and so on. I've scoured set(), get(), and getnext(), and am sure that they all call snmp_close() appropriately (there aren't any early returns or other jumps over those calls), so I don't think I'm explicitly leaking any sockets, but descriptors must be hanging around for some reason. Does this ring any bells for anybody?

for(…){
    …
    snmp_sess_init(&task->sess_info);
    addtask(taskList, task);
    …
}

…

for(task = taskList; task && nkids < maxkids; task = task->next){
    if(fork() == 0){
        set(task);
        get(task);
        getnext(task);
        …
    }
    nkids++;
}

void set(Task *task){
    …
    sess = snmp_open(&task->sess_info);
    …
    pdu = snmp_pdu_create(SNMP_MSG_SET);
    …
    status = snmp_synch_response(sess, pdu, &resp);
    // check return, retr
    snmp_close(sess);
}

void get(Task *task){
    …
    sess = snmp_open(&task->sess_info);
    …
    pdu = snmp_pdu_create(SNMP_MSG_GET);
    …
    status = snmp_synch_response(sess, pdu, &resp);
    // check return, read variables
    snmp_close(sess);
}

void getnext(Task *task){
    …
    sess = snmp_open(&task->sess_info);
    for(obj = task->objs; obj; obj = obj->next){
        …
        pdu = snmp_pdu_create(SNMP_MSG_GETNEXT);
        …
        status = snmp_synch_response(sess, pdu, &resp);
        // check return, read variables
    }
    snmp_close(sess);
}
Steve M

2 Answers

In case anybody runs into something similar: this (unsurprisingly) turned out to have nothing to do with Net-SNMP. Each child process communicates back to the parent over its own socket. Because fork() duplicates all of the parent's open descriptors, every child inherited the parent's entire list of sockets; the solution was simply to close the sockets in that list in the child code.
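
A minimal sketch of that fix, not my actual code: it assumes the parent keeps its ends of the per-child channels in a plain array. The names `NCHANNELS`, `close_inherited()`, `count_open()`, and `stale_fds_in_child()` are made up for illustration. The child closes every inherited parent-side socket right after fork() and then reports how many of them are still open.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NCHANNELS 4 /* hypothetical number of channels to earlier children */

/* Close every valid descriptor in fds[]; returns how many were closed. */
int close_inherited(const int *fds, size_t n)
{
    int closed = 0;
    for (size_t i = 0; i < n; i++)
        if (fds[i] >= 0 && close(fds[i]) == 0)
            closed++;
    return closed;
}

/* Count how many descriptors in fds[] are still open in this process. */
int count_open(const int *fds, size_t n)
{
    int open_fds = 0;
    for (size_t i = 0; i < n; i++)
        if (fds[i] >= 0 && fcntl(fds[i], F_GETFD) != -1)
            open_fds++;
    return open_fds;
}

/*
 * Fork one child while the parent already holds NCHANNELS sockets.
 * Returns the number of those sockets still open inside the child
 * (or -1 on error). With do_cleanup != 0 the child closes them first.
 */
int stale_fds_in_child(int do_cleanup)
{
    int parent_end[NCHANNELS];
    int sv[2];
    int status = 0, result = -1;

    for (int i = 0; i < NCHANNELS; i++)
        parent_end[i] = -1;

    /* Simulate the parent's ends of channels to earlier children. */
    for (int i = 0; i < NCHANNELS; i++) {
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
            close_inherited(parent_end, NCHANNELS);
            return -1;
        }
        parent_end[i] = sv[0];
        close(sv[1]); /* the earlier child's end is not needed here */
    }

    /* Channel for the new child: sv[0] stays in the parent, sv[1] in the child. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
        close_inherited(parent_end, NCHANNELS);
        return -1;
    }

    pid_t pid = fork();
    if (pid == 0) {
        close(sv[0]); /* the parent's end of our own channel */
        if (do_cleanup)
            close_inherited(parent_end, NCHANNELS);
        _exit(count_open(parent_end, NCHANNELS));
    }

    close(sv[1]);
    if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = WEXITSTATUS(status);

    close(sv[0]);
    close_inherited(parent_end, NCHANNELS);
    return result;
}
```

Calling `stale_fds_in_child(0)` shows the leak (the child still holds every parent-side socket), while `stale_fds_in_child(1)` shows the child with none of them left open.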

Steve M

For those who end up here after googling the same error message: the problem in my code was that creating new sessions while the old ones had not been properly closed (snmp_close() may fail, and I did not check for this) may trigger this error on the new sessions.

I solved this by using snmp_close_sessions().
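
If you want to confirm that leaked descriptors are the cause, a rough diagnostic can help. My understanding (an assumption, not something I verified in the Net-SNMP source) is that the message shows up once a session's socket number no longer fits in a plain `fd_set`, i.e. it is `FD_SETSIZE` or higher. The helpers below are hypothetical names, not part of Net-SNMP; they only use POSIX calls to count open descriptors and to check whether a given descriptor would still fit.

```c
#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>

/* Returns 1 if fd can be stored in a plain fd_set, 0 otherwise. */
int fits_in_fd_set(int fd)
{
    return fd >= 0 && fd < FD_SETSIZE;
}

/* Count the open descriptors in the range [0, limit). */
int count_open_fds(int limit)
{
    int n = 0;
    for (int fd = 0; fd < limit; fd++)
        if (fcntl(fd, F_GETFD) != -1)
            n++;
    return n;
}
```

Logging `count_open_fds(FD_SETSIZE)` before each new session is opened should show whether the count keeps climbing between sessions, which would point at descriptors that were never closed.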

arynaq