So you have a number of issues that all need to be corrected. I've listed them all under separate headings so you can go through them one at a time.
Always returning listObj
When you get an error in your for loop, you would goto
the error label, which was still returning the list. By returning this list you hide that there was an error in your function. You must always return NULL
when you expect your function to raise an exception.
Does not increment listObj
ref count on return
When your function is invoked you are given a borrowed reference to your arguments. When you return one of those arguments you are creating a new reference to your list, and so must increment its reference count. Otherwise the interpreter will have a reference count that is one lower than the number of actual references to the object. This will end up with a bug where the interpreter deallocates your list when there is only 1 reference rather than 0! This could result in a seg fault, or it could in the worst case scenario result in random parts of the program access the that has since been deallocated and allocated for some other object.
Uses PyObject_SetItem
with primitive
PyObject_SetItem
can be used with dicts and other class that implements obj[key] = val
. So you cannot supply it with an argument of type Py_ssize_t
. Instead, use PyList_SetItem
which only accepts Py_ssize_t
as its index argument.
Bad memory handling of item
and incremented_item
PyObject_SetItem
and PyList_SetItem
both handle decreasing the reference count of the object that was already at the position that was being set. So we don't need to worry about managing the reference count of item
as we are only working with a reference borrowed from the list. These pair of functions also steal a reference to incremented_item
, and so we don't need to worry about managing its reference count either.
Memory leak on incorrect arguments
For example, when you call your function with an int. You will create a new reference to the 100 int object, but because you return NULL
rather than goto error
, this reference will be lost. As such you need to handle such scenarios differently. In my solution, I move the PyLong_FromLong
call to after the arg and type checking. In this way we are only create this new* object once we are guaranteed it will be used.
Working code
Side note: I removed the goto statements as there was only one left, and so it made more sense to do the error handling at that point rather than later.
static PyObject *
testadd_update_list(PyObject *self, PyObject *args)
{
PyObject *listObj = NULL;
PyObject *item = NULL;
PyObject *mult = NULL;
PyObject *incremented_item = NULL;
Py_ssize_t numLines;
if (!PyArg_ParseTuple(args, "O:update_list", &listObj))
{
return NULL;
}
if (!PyList_Check(listObj)) {
PyErr_BadArgument();
return NULL;
}
/* get the number of lines passed to us */
// Don't want to rely on the error checking of this function as it gives a weird stack trace.
// Instead, we use Py_ListCheck() and PyErr_BadArgument() as above. Since list is definitely
// a list now, then PyList_Size will never throw an error, and so we could use
// PyList_GET_SIZE(listObj) instead.
numLines = PyList_Size(listObj);
// only initialise mult here, otherwise the above returns would create a memory leak
mult = PyLong_FromLong(100);
if (mult == NULL) {
return NULL;
}
for (Py_ssize_t i=0; i<numLines; i++) {
// pick the item
// It is possible for this line to raise an error, but our invariants should
// ensure no error is ever raised. `list` is always of type list and `i` is always
// in bounds.
item = PyList_GetItem(listObj, i);
// increment it, and check for type errors or memory errors
incremented_item = PyNumber_Add(item, mult);
if (incremented_item == NULL) {
// ERROR!
Py_DECREF(mult);
return NULL;
}
// update the list item
// We definitely have a list, and our index is in bounds, so we should never see an error
// here.
PyList_SetItem(listObj, i, incremented_item);
// PyList_SetItem steals our reference to incremented_item, and so we must be careful in
// how we handle incremented_item now. Either incremented_item will not be our
// responsibility any more or it is NULL. As such, we can just remove our Py_XDECREF call
}
// success!
// We are returning a *new reference* to listObj. We must increment its ref count as a result!
Py_INCREF(listObj);
Py_DECREF(mult);
return listObj;
}
Footnote:
* PyLong_FromLong(100)
doesn't actually create a new object, but rather returns a new reference to an existing object. Integers with low values (0 <= i < 128
I think) are all cached and this same object is returned when needed. This is an implementation detail that is meant to avoid high levels of allocating and deallocating integers for small values, and so improve the performance of Python.