Python C-API: segmentation fault on PyDict_GetItem, possible reference problem?

Question

I have a reference in C to a Python list of dictionaries. I am writing a function to calculate the dot product between two members of the list:

PyObject *handle; // reference to a list of dictionaries
virtual float dot_product () (unsigned i, unsigned j) const {
    // dot product of handle[i] and handle[j]
    PyObject *a = (PyObject*)PyList_GetItem(handle, (Py_ssize_t)i);
    PyObject *b = (PyObject*)PyList_GetItem(handle, (Py_ssize_t)j);
    PyObject *key, *a_value;
    Py_ssize_t pos = 0;
    double dot_product = 0;
    while (PyDict_Next(a, &pos, &key, &a_value)) {
        PyObject* b_value = PyDict_GetItem(b, key);
        if (b_value != NULL){
            dot_product += PyFloat_AsDouble(a_value) * PyFloat_AsDouble(b_value);
        }
    }
    return dot_product;
}

This results in a segmentation fault. Using gdb to debug, it appears that the segmentation fault is being caused by PyDict_GetItem(b, key). This makes me suspect there's something wrong with my reference counts.

After reading the documentation on Reference Counts, it seems that all of the references in the above code are borrowed, so I assumed there was no need to use Py_INCREF or Py_DECREF...but I could easily be wrong. Is there a place in the above code that I need to use Py_INCREF or Py_DECREF?

Edit: I should note that I have already done checks to make sure a and b are not null, and also checks to ensure i and j do not exceed the size of the list. I removed these checks from the code in my question to keep it simpler. –

Are there any threads, at all, involved in this program? Are the keys of the `dict`s Python built-ins, or some custom class? By not holding owned references, it's possible the `__hash__` or `__eq__` methods of your keys either directly change the `list`s or `dict`s you're working with, or invoke Python code that allows for GIL handoff, allowing some other thread to begin running and do the same thing. — ShadowRanger, Dec 18 '18 at 01:24
Beyond that, you aren't checking return values; `a` and `b` could easily be `NULL` if `j` is greater than the length of the `list` stored in `handle`, and if you don't check the return values, you'll end up trying to deference null pointers. `PyDict_GetItem` suppresses the exception for a missing key, but it doesn't try to guard against a `NULL` value for the `dict` itself. — ShadowRanger, Dec 18 '18 at 01:27
The keys of the dict are integers, so Python built-ins. You might be onto something with threads: the encompassing program is multi-threaded, and so dot_product may be called simultaneously. I will try a single-threaded version and see if that fixes things. — Agargara, Dec 18 '18 at 01:56
It also depends on whether your `dot_product()` call is wrapped directly under a Python call or not. Python function calls increment the ref count implicitly for you, but if you are actively calling your dot_product from C++ you need to do that by yourself. — YiFei, Dec 18 '18 at 04:04
Are you using Python 3.10 or later? In earlier versions, `PyDict_GetItem` does not check for the GIL, which can cause problems. — o11c, Dec 09 '21 at 02:57

score 0 · Answer 1 · answered Dec 18 '18 at 01:38

Check your return values. Both a and b can be NULL if i and j respectively exceed the length of the list referenced by handle. PyDict_GetItem assumes the dict passed is not a NULL pointer, dereferencing it without confirming that assumption, which would cause an immediate segfault.

You main problem is determining how to report an error. A C++ exception would work for C++, but Python won't understand it unless you catch it and convert it to a Python level exception. In any event, until you figure that out, returns NaN to indicate failure:

#include <cmath>

PyObject *handle; // reference to a list of dictionaries
virtual float dot_product () (unsigned i, unsigned j) const {
    // dot product of handle[i] and handle[j]
    PyObject *key, *a_value;
    Py_ssize_t pos = 0;
    double dot_product = 0;

    // Check both indices are valid
    PyObject *a = (PyObject*)PyList_GetItem(handle, (Py_ssize_t)i);
    if (!a) return NAN;

    PyObject *b = (PyObject*)PyList_GetItem(handle, (Py_ssize_t)j);
    if (!b) return NAN;

    // Test if you actually got dicts
    if (!PyDict_Check(a) || !PyDict_Check(b)) return NAN;

    while (PyDict_Next(a, &pos, &key, &a_value)) {
        PyObject* b_value = PyDict_GetItem(b, key);
        if (b_value != NULL){
            // Check that both values are really Python floats and extract C double
            double a_val = PyFloat_AsDouble(a_value);
            if (a_val == -1.0 && PyErr_Occurred()) return NAN;

            double b_val = PyFloat_AsDouble(b_value);
            if (b_val == -1.0 && PyErr_Occurred()) return NAN;

            dot_product += a_val * b_val;
        }
    }
    return dot_product;
}

I should have said this in my question, but I've already done null checking and making sure i and j do not exceed the size of the list. I removed the checking from the code in my question to keep it simpler. — Agargara, Dec 18 '18 at 01:48

Python C-API: segmentation fault on PyDict_GetItem, possible reference problem?

1 Answers1