
I'm working on my first C++ extension for a Python program. I have been trying to debug this particular piece of code for hours and I am out of ideas.

The segfault seems to have something to do with the PyArrayObject old_simplices_array that is getting passed to the C++ code. That object is a 2d numpy array of type uint32.

This code was modified directly from what scipy.weave puts together. Everything works fine when the code is formatted for and used by scipy.weave.inline. This seems to eliminate the Python portion of my program and the algorithm itself as possible culprits.

That just leaves the syntax and types. Does anyone see any incorrect syntax or type casting in the code?

static PyObject* exterior(PyObject* self,
                          PyArrayObject* old_simplices_array)
{
    const short unsigned int step = old_simplices_array->dimensions[1];
    const short unsigned int j_max = step - 1;
    const long unsigned int col_max = 
        old_simplices_array->dimensions[0] * step;
    short unsigned int j, k, face_index;
    long unsigned int col;
    unsigned int num_simplices = 0;

    PyObject* indices = PyList_New(0);
    PyObject* indptr =  PyList_New(0);
    PyObject* data =  PyList_New(0);
    PyObject* simplices = PyList_New(0);
    PyList_Append(indptr, PyLong_FromLong(0));
    PyObject* simplex_to_index = PyDict_New();

    for(col = 0; col < col_max; col+=step)
    {
        for(j = 0; j <= j_max; j++)
        {
            face_index = 0;
            PyObject* face = PyTuple_New(j_max);
            for(k = 0; k <= j_max; k++)
            {
                if(j != k)
                {
                    PyTuple_SetItem(face, face_index, 
                        PyLong_FromLong(old_simplices_array->data[col + k]));
                    face_index++;
                }
            }

            if(PyDict_Contains(simplex_to_index, face))
            {
                PyList_Append(indices, 
                    PyDict_GetItem(simplex_to_index, face));
            }
            else
            {
                PyDict_SetItem(simplex_to_index, face, 
                    PyLong_FromLong(num_simplices));
                PyList_Append(simplices, face);
                num_simplices++;
            }
            PyList_Append(data, PyLong_FromLong(1 - 2 * (j % 2)));
        }
        PyList_Append(indptr, PyLong_FromLong(col + j));
    }
    return PyTuple_Pack(3, PyTuple_Pack(3, data, indices, indptr), simplices,
        simplex_to_index);
}                                

------UPDATE------

gdb indicates

const short unsigned int step = old_simplices_array->dimensions[1];

causes a segfault. Did I misuse types?

------UPDATE------

Despite GDB telling me,

const short unsigned int step = old_simplices_array->dimensions[1];

causes the segfault, if I return from the function just before the for loop, I get no segfault (just an error on the Python side complaining about returning a NoneType).

This is the full backtrace:

Program received signal SIGSEGV, Segmentation fault.
exterior (self=<optimized out>, old_simplices_array=0xec0a50)
    at src/_alto.cpp:39
warning: Source file is more recent than executable.
39      const short unsigned int step = old_simplices_array->dimensions[1];
(gdb) bt
exterior (self=<optimized out>, old_simplices_array=0xec0a50)
    at src/_alto.cpp:39
0x00007ffff7aedad2 in PyEval_EvalFrameEx ()
   from /usr/lib/sagemath/local/lib/libpython2.7.so.1.0
0x00007ffff7aeddc9 in PyEval_EvalFrameEx ()
   from /usr/lib/sagemath/local/lib/libpython2.7.so.1.0
0x00007ffff7aee902 in PyEval_EvalCodeEx ()
   from /usr/lib/sagemath/local/lib/libpython2.7.so.1.0
0x00007ffff7a70ad6 in ?? ()
   from /usr/lib/sagemath/local/lib/libpython2.7.so.1.0
0x00007ffff7a4565e in PyObject_Call ()
   from /usr/lib/sagemath/local/lib/libpython2.7.so.1.0
0x00007ffff7a53b80 in ?? ()
   from /usr/lib/sagemath/local/lib/libpython2.7.so.1.0
0x00007ffff7a4565e in PyObject_Call ()
   from /usr/lib/sagemath/local/lib/libpython2.7.so.1.0
0x00007ffff7aaaea0 in ?? ()
   from /usr/lib/sagemath/local/lib/libpython2.7.so.1.0
0x00007ffff7aa68bc in ?? ()
   from /usr/lib/sagemath/local/lib/libpython2.7.so.1.0
0x00007ffff7a4565e in PyObject_Call ()
   from /usr/lib/sagemath/local/lib/libpython2.7.so.1.0
0x00007ffff7ae9bce in PyEval_EvalFrameEx ()
   from /usr/lib/sagemath/local/lib/libpython2.7.so.1.0
0x00007ffff7aee902 in PyEval_EvalCodeEx ()
   from /usr/lib/sagemath/local/lib/libpython2.7.so.1.0
0x00007ffff7aeea32 in PyEval_EvalCode ()
   from /usr/lib/sagemath/local/lib/libpython2.7.so.1.0
0x00007ffff7b103fa in PyRun_FileExFlags ()
   from /usr/lib/sagemath/local/lib/libpython2.7.so.1.0
0x00007ffff7b10e3d in PyRun_SimpleFileExFlags ()
   from /usr/lib/sagemath/local/lib/libpython2.7.so.1.0
0x00007ffff7b26972 in Py_Main ()
   from /usr/lib/sagemath/local/lib/libpython2.7.so.1.0
0x00007ffff6d29ea5 in __libc_start_main ()
   from /lib/x86_64-linux-gnu/libc.so.6
0x00000000004006d1 in _start ()
UpAndAdam
Alex Eftimiades
  • Did you try a debugger? On which line do you get the segfault? – hidrargyro Sep 12 '13 at 18:46
  • @hidrargyro thanks for the suggestion. That pointed me to one of the first lines: const short unsigned int step = old_simplices_array->dimensions[1]; – Alex Eftimiades Sep 12 '13 at 18:56
  • What happens if you break there and `p old_simplices_array` first? The fact that it's happening immediately means that the problem isn't in this function, it's in the caller; either `old_simplices_array` is null or otherwise broken, or (less likely) `old_simplices_array->dimensions` is of length less than 2. – Danica Sep 12 '13 at 19:31
  • @Feynman According to the docs, it looks like `PyArrayObject.nd` tells you how big the dimensions array that gets passed in is. Can you check that `assert(old_simplices_array->nd >= 2);` doesn't fail? – greatwolf Sep 12 '13 at 19:45
  • @Dougal I can confirm that the data structure getting passed to the C++ function "exterior" is a 2 dimensional numpy array. Unless it is passed as a PyObject*, I have no idea why old_simplices_array would be broken. – Alex Eftimiades Sep 12 '13 at 19:46
  • @greatwolf, I tried your suggestion and the assertion holds fine. – Alex Eftimiades Sep 12 '13 at 19:49
  • Can you also add the backtrace to your question that you get when it segfaults? – greatwolf Sep 12 '13 at 19:53
  • Note, you can get a full backtrace using `bt` in gdb. – greatwolf Sep 12 '13 at 20:00
  • The `dimensions` member of the `PyArrayObject` struct is of type `npy_intp`, see [here](http://docs.scipy.org/doc/numpy/reference/c-api.types-and-structures.html#pyarray-type). `npy_intp` is defined in `npy_common.h` to be an alias for `Py_intptr_t`. `Py_intptr_t` is defined in `pyport.h` and depends on the platform, but will typically be an `int` or a `long`, not a `short`. If you think the type is what's causing the segfault, try typing it as `npy_intp` or `Py_intptr_t`, or at least drop the `short`. – Jaime Sep 12 '13 at 21:07
  • @Jaime nice try, but I still get a segfault--even when I change all my numbers to type npy_intp. – Alex Eftimiades Sep 12 '13 at 21:13
  • Try printing the old_simplices_array->dimensions values, just to check the integrity of the object. – hidrargyro Sep 12 '13 at 21:14
  • @hidrargyro printf("%ld",old_simplices_array->dimensions[0]); results in a segfault for both dimension 0 and dimension 1. – Alex Eftimiades Sep 12 '13 at 21:23
  • I'm almost certain your problems are coming from the way you are accessing `old_simplices_array->data`. Is your array contiguous? What dtype is it? – Jaime Sep 12 '13 at 21:27
  • @Jaime, originally my array was not contiguous, but I tried changing it and it made no difference (I have since stuck with the contiguous version of the python code anyway). I can comment out the part of the code that accesses the data anyway and I will still get the segfault. The dtype is int32. – Alex Eftimiades Sep 12 '13 at 21:32
  • The point is that `old_simplices_array->data` is of type `char *`; it's just a stream of bytes. To access the data you need to cast it, something like `*(npy_uint32 *)(old_simplices_array->data + data_offset)`, and the steps you add to `data_offset` have to be multiplied by `sizeof(npy_uint32)`. You can get the actual strides to advance in the array from `old_simplices_array->strides`, similarly to `old_simplices_array->dimensions`. But if your array is contiguous this should not send you out of the owned memory of the array... – Jaime Sep 12 '13 at 21:43
  • @Jaime If you posted this as an answer, this question would come off the unanswered list, which it has been on for the longest time. – Marcin Sep 18 '13 at 16:49
  • @Marcin I think what I found are other bugs, different from the one causing the segfault. They would produce weird data values, but not access unallocated memory. So it isn't really an answer to the question... – Jaime Sep 18 '13 at 17:48
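
Jaime's point about `data` being a `char *` can be illustrated without NumPy at all. The sketch below is only a model: `ArrayView`, `element_at` and `make_view` are hypothetical names standing in for the `data`, `dimensions` and `strides` fields of a real `PyArrayObject`, and it assumes 32-bit unsigned elements. It shows element (i, j) being located by byte strides rather than by indexing the byte buffer directly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the PyArrayObject fields discussed above;
// the real struct comes from NumPy's headers.
struct ArrayView {
    const char* data;              // raw bytes, like PyArrayObject::data
    std::ptrdiff_t dimensions[2];  // number of rows and columns
    std::ptrdiff_t strides[2];     // distance between elements, in bytes
};

// Read element (i, j) as a 32-bit unsigned integer by stepping through
// the byte buffer with the strides, then copying the bytes out.
std::uint32_t element_at(const ArrayView& a, std::ptrdiff_t i, std::ptrdiff_t j)
{
    assert(i >= 0 && i < a.dimensions[0]);
    assert(j >= 0 && j < a.dimensions[1]);
    const char* p = a.data + i * a.strides[0] + j * a.strides[1];
    std::uint32_t value;
    std::memcpy(&value, p, sizeof value);  // sidesteps alignment and aliasing issues
    return value;
}

// Hypothetical helper: describe a C-contiguous rows x cols buffer.
ArrayView make_view(const std::vector<std::uint32_t>& buf,
                    std::ptrdiff_t rows, std::ptrdiff_t cols)
{
    assert(static_cast<std::ptrdiff_t>(buf.size()) == rows * cols);
    const std::ptrdiff_t item = static_cast<std::ptrdiff_t>(sizeof(std::uint32_t));
    ArrayView view;
    view.data = reinterpret_cast<const char*>(buf.data());
    view.dimensions[0] = rows;
    view.dimensions[1] = cols;
    view.strides[0] = cols * item;  // one full row, in bytes
    view.strides[1] = item;         // one element, in bytes
    return view;
}
```

With a 2 x 3 buffer holding 10, 20, 30, 40, 50, 60, `element_at(view, 1, 2)` reads the last element, 60. Indexing `data[col + k]` as in the question would instead return a single byte of the buffer.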

1 Answer


In general, the signature of a method in a C module is PyObject* f(PyObject* self, PyObject* args), where args is intended to be parsed by PyArg_ParseTuple. You can see this in the code scipy.weave generates (http://docs.scipy.org/doc/scipy/reference/tutorial/weave.html#a-quick-look-at-the-code). Unless there's some wrapper function you haven't posted that calls PyArg_ParseTuple for you, your exterior method must call it to extract the PyArrayObject from the generic PyObject* args. Otherwise old_simplices_array actually points at the argument tuple, and reading ->dimensions from it interprets the tuple's memory as an array struct, which would be consistent with the crash on that line.
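
A minimal sketch of that parsing step, assuming the function is registered with METH_VARARGS and that the module's init function (not shown) calls import_array() before any NumPy API is used. The "O!" format with &PyArray_Type and the PyArray_NDIM / PyArray_DIM accessors are part of the Python and NumPy C APIs; the name `module_methods` and the returned shape tuple are placeholders for illustration, not the original algorithm.

```cpp
#include <Python.h>
#include <numpy/arrayobject.h>

// Sketch only: shows how to unpack the argument tuple, not the full algorithm.
static PyObject* exterior(PyObject* self, PyObject* args)
{
    PyArrayObject* old_simplices_array = NULL;

    // "O!" verifies that the single argument is a NumPy array and stores a
    // borrowed reference to it. On failure a Python exception is already set.
    if (!PyArg_ParseTuple(args, "O!", &PyArray_Type, &old_simplices_array))
        return NULL;

    if (PyArray_NDIM(old_simplices_array) != 2) {
        PyErr_SetString(PyExc_ValueError, "expected a 2-dimensional array");
        return NULL;
    }

    // The dimensions are npy_intp values, so keep that type.
    const npy_intp rows = PyArray_DIM(old_simplices_array, 0);
    const npy_intp step = PyArray_DIM(old_simplices_array, 1);

    // Placeholder result: return the shape instead of running the real loop.
    return Py_BuildValue("(nn)", (Py_ssize_t)rows, (Py_ssize_t)step);
}

// Hypothetical method table; METH_VARARGS is what makes `args` a tuple.
static PyMethodDef module_methods[] = {
    {"exterior", exterior, METH_VARARGS, "Sketch of argument parsing."},
    {NULL, NULL, 0, NULL}
};
```

Returning NULL after a failed parse lets Python raise a TypeError for a wrong argument instead of crashing the interpreter.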

Ben Darnell