0

Whenever I call this function, memory usage is increases a lot per call, so I think there is some memory leak here.

PyObject *pScript, *pModule, *pFunc, *pValue;
PyObject *pArgs = NULL;
long ret = 1;

// Initialize python, set system path and load the module
pScript = SetPyObjectString(PYTHON_SCRIPT_NAME);
PyRun_SimpleString("import sys");
PyRun_SimpleString("sys.path.append('"PYTHON_SCRIPT_PATH"')");
pModule = PyImport_Import(pScript);
Py_XDECREF(pScript);

if (pModule != NULL) {
    // Get function object from python module
    pFunc = PyObject_GetAttrString(pModule, operation.c_str());

    if (pFunc && PyCallable_Check(pFunc)) {
        // Create argument(s) as Python tuples
        if (operation == UPDATE_KEY) {
            // If operation is Update key, create two arguments - key and value
            pArgs = PyTuple_New(2);
        }
        else {
            pArgs = PyTuple_New(1);
        }

        pValue = SetPyObjectString(key.c_str());
        // Set argument(s) with key/value strings
        PyTuple_SetItem(pArgs, 0, pValue);
        if (operation == UPDATE_KEY) {
            // If operation is Update key, set two arguments - key and value
            pValue = SetPyObjectString(value.c_str());
            PyTuple_SetItem(pArgs, 1, pValue);
        }

        // Call the function using function object and arguments
        pValue = PyObject_CallObject(pFunc, pArgs);
        Py_XDECREF(pArgs);

        if (pValue != NULL) {
            // Parse the return values
            ret = PyLong_AsLong(PyList_GetItem(pValue, 0));
            value = GetPyObjectString(PyList_GetItem(pValue, 1));
        }
        else {
            ERROR("Function call to %s failed", operation.c_str());
        }

        Py_XDECREF(pValue);
        Py_XDECREF(pFunc);
    }
    else {
        ERROR("Cannot find function in python module");
    }
    Py_XDECREF(pModule);
}
else {
    ERROR("Failed to load python module");
}

I am leaking some memory when this C++ snippet in my code calls the python script and I want to know why. I think I am doing something wrong with my Py_DECREFs. Any help would be much appreciated.

Venkat Krishnan
  • 107
  • 1
  • 2
  • 8
  • 1
    You're missing at least one decref—what if `pFunc` is not null, but it's not callable? And you reassign a new value to `pValue` without releasing the old one. You might be missing more, but you really should use a memory debugger rather than trying to guess by inspection. Or, even better—if you're using C++, not C, why not use PyCXX or boost::python or some other C++ wrapper that uses RAII to take care of the refcounting for you? – abarnert Apr 26 '18 at 21:42
  • @abarnert I cannot use any other wrapper because this is a small part of a big project that has a lot of dependencies. It is difficult to use anything other than the default Python C API. Can you take look at the PyObjects pArgs and pValue? I don't understand which function needs a DECREF and which one needs a INCREF – Venkat Krishnan Apr 26 '18 at 21:46
  • You can use PyCXX locally in a big project, You may not be able to use some of the higher-level features like the extension-builder facility or the Python<->STL conversions, but you can easily use just the smart pointers. Or, if not, you can write your own smart pointer. But really, if you can't understand what needs a DECREF you really shouldn't be writing this code. Getting the refcounting right is no something you can do through copying and pasting code you don't understand and shotgun-debugging it until it seems to work. – abarnert Apr 26 '18 at 21:51
  • Okay I will look into PyCXX or boost::python. Thanks! – Venkat Krishnan Apr 26 '18 at 21:59
  • @VenkatKrishnan If you are looking for wrappers, I recommend [pybind11](https://pybind11.readthedocs.org/). – Henri Menke Apr 26 '18 at 23:04
  • @VenkatKrishnan you should not program the Python C API without the documentation in front of you. For any call you make, unless the documentation says it returns a "borrowed reference" then you MUST decref it. If the reference is borrowed, then you must NOT decref it. – jwm Apr 26 '18 at 23:08

1 Answers1

0

I spotted one missing decref from a quick glance:


pFunc = PyObject_GetAttrString(pModule, operation.c_str());

if (pFunc && PyCallable_Check(pFunc)) {
    // ...
    Py_XDECREF(pFunc);
}

This will leak any non-callable attribute matching operation.


The two reassignments of pValue… I think that's OK, because PyTuple_SetItem steals the reference to each of the original values.


For this line that you asked about:

value = GetPyObjectString(PyList_GetItem(pValue, 1));

The PyList_GetItem returns a borrowed reference, so the fact that you don't decref it is correct.

But I don't see the declaration for value anywhere, or GetPyObjectString, so I have no idea what that part is doing. Maybe it's just getting a borrowed buffer out of a PyUnicodeObject * and copying it into some C++ wstring or UTF-32 string type, or maybe it's leaking a Python object or a copied buffer, or returning a raw C buffer that you just leak here, or… who knows?


But I certainly wouldn't trust that some guy on the internet found all of them on a quick scan. Learn to use a memory debugger.

Or: You're using C++. RAII is almost the whole point of using C++—in other words, instead of using raw PyObject * values, you can use a smart pointer that decrefs things for you automatically. Or, even better, use a ready-made library like PyCXX.

abarnert
  • 354,177
  • 51
  • 601
  • 671
  • Okay thanks. This is what I was looking for. I was confused because of this https://docs.python.org/3/extending/extending.html#ownership-rules. It says PyTuple_SetItem() takes over ownership. I will use two different PyObjects for this now. Also, `ret = PyLong_AsLong(PyList_GetItem(pValue, 0));` in this line, should I DECREF anything? – Venkat Krishnan Apr 26 '18 at 21:53
  • @VenkatKrishnan Oops, I may have missed that. I’ll check and update the answer if I got it wrong. I haven’t written raw C API extensions for anything nontrivial in a long time—if I can’t use Cython, or PyCXX, or RustPy, etc., I usually just create a plain C API and write the bindings in Python with cffi or ctypes. In part because this stuff is a pain to debug. But you’re not even trying to debug it, just trying to get it right statically, which makes it even more painful. – abarnert Apr 26 '18 at 22:02