I am getting a segmentation fault in the Python interpreter when I access a variable returned by my own OpenMP-enabled C++ extension.
All the solutions I have found use either ctypes or Cython, neither of which I can use. http://snatverk.blogspot.de/2012/03/c-parallel-modules-for-python.html shows a small example of an OpenMP-enabled Python extension. Although I tried to implement my for loops as in that example, it still does not work.
The function in my extension looks like this:
static PyObject *
matcher_match(PyObject *self, PyObject *args)
{
    if(PyTuple_Size(args) != 2)
    {
        return NULL;
    }
    PyObject *names = PyTuple_GetItem(args, 0);
    Py_ssize_t namesSize = PyList_Size(names);
    PyObject *namesB = PyTuple_GetItem(args, 1);
    Py_ssize_t namesBSize = PyList_Size(namesB);
    PyObject *matchIdcs = PyList_New(namesSize);
    Py_BEGIN_ALLOW_THREADS;
    int i, j;
    #pragma omp parallel for private(i, j)
    for(i = 0; i < namesSize; i++)
    {
        for(j = 0; j < namesBSize; j++)
        {
            float a = PyFloat_AsDouble(PyList_GetItem(names, i));
            float b = PyFloat_AsDouble(PyList_GetItem(namesB, j));
            // test_pair_ij is a pure C function without callbacks into python
            // it only uses the C++ STL like std::vector
            bool res = test_pair_ij(a, b);
            PyObject *matchVal;
            if(res)
            {
                matchVal = Py_BuildValue("i", j);
            }
            else
            {
                matchVal = Py_BuildValue("i", -1);
            }
            PyList_SetItem(matchIdcs, i, matchVal);
        }
    }
    Py_END_ALLOW_THREADS;
    return matchIdcs;
}
The function matcher_match() receives two lists, names and namesB. For every combination of an element of names and an element of namesB (their float values), I check a specific condition, which is implemented in test_pair_ij(). That function is a pure C(++) implementation which does not call back into Python.
The C extension is called with:
from matcher import match
# some random lists for this example
names = ['123', '231', ...]
namesB = ['342', ...]
matchResult = match(names, namesB)
import pandas as pd
mr = pd.Series(matchResult)
mr.to_csv('matchResult.csv')
When the lists names and namesB are rather small, the code runs fine. With larger lists, however, I can no longer access matchResult in the Python code: when I try to, I get a segmentation fault (inside the Python interpreter, I guess). I recompiled the C extension without OpenMP and it ran fine again, even with the larger lists.
My guess is that the memory of the Python variables I access from my extension gets corrupted. This may have to do with the GIL, although I am releasing and re-acquiring it. Do I need to make any more variables private in this case? Any other ideas?
EDIT: fixed calling arguments of function test_pair_ij.
EDIT 2: fixed code of storing matchIdcs
ANSWER:
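The code releases the GIL with Py_BEGIN_ALLOW_THREADS, but the call to PyList_SetItem(matchIdcs, i, matchVal); inside the parallel loop modifies a Python structure, which is not allowed while the GIL is released (see http://docs.cython.org/src/userguide/external_C_code.html#releasing-the-gil). The same restriction applies to the other Python C API calls in the loop: PyList_GetItem, PyFloat_AsDouble and Py_BuildValue.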
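One possible way to restructure the function, as a rough sketch rather than the definitive fix: copy the float values out of the lists into std::vector while the GIL is still held, run the OpenMP loop on plain C++ data only, and build the result list after the GIL has been re-acquired. The kernel below is written without Python.h so it can be compiled and checked on its own. The name match_indices and the body of test_pair_ij are placeholders I made up for illustration, and I am assuming that the first matching j per i is the value wanted (the code in the question overwrites the entry on every j).

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Placeholder for the real test_pair_ij (assumption: two values "match"
// when they are nearly equal). It must not touch any Python object.
static bool test_pair_ij(double a, double b)
{
    return std::fabs(a - b) < 1e-9;
}

// Hypothetical pure C++ kernel. It only reads and writes std::vector
// storage, so it can run while the GIL is released. Iteration i writes
// only result[i], so threads never write to the same element.
static std::vector<long> match_indices(const std::vector<double> &a,
                                       const std::vector<double> &b)
{
    std::vector<long> result(a.size(), -1);   // -1 means "no match"
    const long n = static_cast<long>(a.size());
    const long m = static_cast<long>(b.size());

    // i is the parallel loop variable and j is declared inside the loop
    // body, so both are private to each thread without extra clauses.
    #pragma omp parallel for
    for (long i = 0; i < n; i++)
    {
        for (long j = 0; j < m; j++)
        {
            if (test_pair_ij(a[i], b[j]))
            {
                result[i] = j;   // keep the first matching index
                break;           // leaving the inner (serial) loop is fine
            }
        }
    }
    return result;
}

// Sketch of the calling side inside matcher_match (needs <Python.h>,
// error checking omitted):
//
//   std::vector<double> a(namesSize), b(namesBSize);
//   for (Py_ssize_t i = 0; i < namesSize; i++)        // GIL held
//       a[i] = PyFloat_AsDouble(PyList_GetItem(names, i));
//   for (Py_ssize_t j = 0; j < namesBSize; j++)       // GIL held
//       b[j] = PyFloat_AsDouble(PyList_GetItem(namesB, j));
//
//   std::vector<long> result;
//   Py_BEGIN_ALLOW_THREADS
//   result = match_indices(a, b);                     // GIL released
//   Py_END_ALLOW_THREADS
//
//   for (Py_ssize_t i = 0; i < namesSize; i++)        // GIL held again
//       PyList_SetItem(matchIdcs, i, PyLong_FromLong(result[i]));
```

With this split, no Python object is read, created or modified between Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS. The price is one extra copy of each input list, which is usually small compared to the pairwise comparison.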