
I am moderately experienced in Python and C but new to writing Python modules as wrappers around C functions. For a project I needed one function, named "score", to run much faster than I could manage in Python, so I coded it in C and simply want to call it from Python. It takes a Python list of integers; I want the C function to receive an array of integers and the length of that array, and to return an integer back to Python. Here is my current (working) solution.

static PyObject *module_score(PyObject *self, PyObject *args) {
    int i, size, value, *gene;
    PyObject *seq, *data;

    /* Parse the input tuple */
    if (!PyArg_ParseTuple(args, "O", &data))
        return NULL;
    seq = PySequence_Fast(data, "expected a sequence");
    size = PySequence_Size(seq);

    gene = (int*) PyMem_Malloc(size * sizeof(int));
    for (i = 0; i < size; i++)
        gene[i] = PyInt_AsLong(PySequence_Fast_GET_ITEM(seq, i));

    /* Call the external C function*/
    value = score(gene, size);

    PyMem_Free(gene);

    /* Build the output tuple */
    PyObject *ret = Py_BuildValue("i", value);
    return ret;
}

This works but seems to leak memory and at a rate I can't ignore. I made sure that the leak is happening in the shown function by temporarily making the score function just return 0 and still saw the leaking behavior. I had thought that the call to PyMem_Free should take care of the PyMem_Malloc'ed storage but my current guess is that something in this function is getting allocated and retained on each call since the leaking behavior is proportional to the number of calls to this function. Am I not doing the sequence to array conversion correctly or am I possibly returning the ending value inefficiently? Any help is appreciated.

hackartist
  • I think Python has a memory pool, and calling `PyMem_Free` will not free the pointer immediately. It will free it internally so Python can reuse it without allocating it again. But I am not sure. – Iharob Al Asimi Dec 18 '14 at 23:02
  • Are you on Linux? And how did you determine that there was a leak? – Iharob Al Asimi Dec 18 '14 at 23:03
  • I am on Windows using Cygwin, which is a Linux-like environment. I used Windows Task Manager and top to watch the python process eat more and more memory, then put in random waits with prints to confirm that memory only grew while this function, and not other Python functions, was being called. – hackartist Dec 18 '14 at 23:08
  • I would recommend a memory debugging tool. For Linux there is a great one called `valgrind`; maybe you can search the web for a similar tool for Windows. The function you posted doesn't look wrong at all, at least to me. – Iharob Al Asimi Dec 18 '14 at 23:10
  • Note that you do have one problem, i.e. you are allocating your memory as `sizeof(int)*size` but you are filling it with `longs` rather than `int`s. – Steve Barnes Dec 18 '14 at 23:14
  • @SteveBarnes I had searched around for PyInt_AsInt but this doesn't exist since apparently python ints are C longs? How should I do this instead so that I am actually using C ints? – hackartist Dec 18 '14 at 23:17
  • @hackartist There is no `PyInt_AsInt` for `Python 3`, all integers are `long`s. – Iharob Al Asimi Dec 19 '14 at 01:08
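For the `int` versus `long` question raised in the comments above, one option is to keep converting with `PyInt_AsLong` and then range-check the result before narrowing it. A minimal sketch, assuming a hypothetical helper named `long_to_int_checked` (it is not part of the Python C API):

```c
#include <limits.h>

/* Hypothetical helper, not part of the Python C API.
 * Stores value in *out and returns 1 when it fits in a C int;
 * otherwise leaves *out untouched and returns 0. */
static int long_to_int_checked(long value, int *out)
{
    if (value < INT_MIN || value > INT_MAX)
        return 0;
    *out = (int)value;
    return 1;
}
```

In the wrapper, a zero return would be the point to raise an error such as `OverflowError` instead of storing a truncated value. On platforms where `long` and `int` have the same width, the check always succeeds.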

1 Answer


`seq` is a new Python object (`PySequence_Fast` returns a new reference), so you need to release it when you are done with it. You should also check whether `seq` is NULL.

Something like (untested):

static PyObject *module_score(PyObject *self, PyObject *args) {
    int i, size, value, *gene;
    long temp;
    PyObject *seq, *data;

    /* Parse the input tuple */
    if (!PyArg_ParseTuple(args, "O", &data))
        return NULL;
    /* PySequence_Fast returns a new reference that must be released */
    if (!(seq = PySequence_Fast(data, "expected a sequence")))
        return NULL;

    size = (int) PySequence_Fast_GET_SIZE(seq);

    gene = (int*) PyMem_Malloc(size * sizeof(int));
    if (gene == NULL) {
        Py_DECREF(seq);
        return PyErr_NoMemory();
    }
    for (i = 0; i < size; i++) {
        temp = PyInt_AsLong(PySequence_Fast_GET_ITEM(seq, i));
        if (temp == -1 && PyErr_Occurred()) {
            /* Release both the buffer and the sequence on the error path */
            PyMem_Free(gene);
            Py_DECREF(seq);
            PyErr_SetString(PyExc_ValueError, "an integer value is required");
            return NULL;
        }
        /* Do whatever you need to verify temp will fit in an int */
        gene[i] = (int)temp;
    }

    /* Call the external C function */
    value = score(gene, size);

    PyMem_Free(gene);
    Py_DECREF(seq);

    /* Build the return value */
    PyObject *ret = Py_BuildValue("i", value);
    return ret;
}
Steve Barnes
casevh
  • Sorry I'm new at this... how do I delete the seq object? I don't see anywhere in your example that you delete it. It shouldn't be freed unless it was created with calloc or malloc right? – hackartist Dec 18 '14 at 23:20
  • I updated the answer. To delete a Python object, use `Py_DECREF()`. I also included a check for the return value of `PyInt_AsLong()` and added a place to verify the size of value before casting to an int. – casevh Dec 18 '14 at 23:30
  • Thank you, that was it! No more leaking memory now. – hackartist Dec 19 '14 at 00:48
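The ownership rule behind the fix can be illustrated without the Python headers. The sketch below is a toy model only: `ToyObject`, `toy_new`, and `toy_decref` are hypothetical stand-ins for a reference-counted object, a call that returns a new reference (as `PySequence_Fast` does), and `Py_DECREF`. It is not how CPython is implemented, but it shows why a missing release leaks one object per call.

```c
#include <stdlib.h>

/* Toy stand-in for a reference-counted object; not the real PyObject. */
typedef struct {
    int refcount;
} ToyObject;

/* Number of toy objects currently allocated. */
static int live_objects = 0;

/* Like a call that returns a "new reference": the caller owns the result. */
static ToyObject *toy_new(void)
{
    ToyObject *obj = malloc(sizeof *obj);
    if (obj == NULL)
        return NULL;
    obj->refcount = 1;
    live_objects++;
    return obj;
}

/* Like Py_DECREF: drop one reference and free the object when none remain. */
static void toy_decref(ToyObject *obj)
{
    if (--obj->refcount == 0) {
        free(obj);
        live_objects--;
    }
}

/* Never releases its reference, so one object leaks per call. */
static void leaky_call(void)
{
    ToyObject *seq = toy_new();
    (void)seq;
}

/* Releases the reference it owns, so nothing accumulates. */
static void balanced_call(void)
{
    ToyObject *seq = toy_new();
    if (seq != NULL)
        toy_decref(seq);
}
```

Calling `balanced_call` repeatedly leaves `live_objects` at zero, while each `leaky_call` raises it by one. That matches the growth proportional to the number of calls described in the question.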