I'm having trouble figuring out what's wrong with my code. I'm trying to call the Python function anderson_ksamp from the scipy.stats module through the Python and NumPy C APIs in a program written in C++. The program crashes with one of the errors described below, and I can't figure out why.

I've managed to write a small piece of code that reproduces the error:

// Test.cpp
#include <Python.h>
#include <numpy/arrayobject.h>
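#include <algorithm> // std::min
#include <cstdio>    // printf
#include <cstdlib>   // rand, malloc, free
#include <iostream>
#include <vector> // std::vector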

int limited_rand(int limit) {
    int r, d = RAND_MAX / limit;
    limit *= d;
    do { r = rand(); } while (r >= limit);
    return r / d;
}

void initialize_c_2d_array(int*& c_array, unsigned long row_length_c_array, std::vector<int> &row1, std::vector<int> &row2) {
    for (unsigned int i = 0; i < row_length_c_array; i++) {
        c_array[i] = row1[i];
        c_array[row_length_c_array + i] = row2[i];
    }
}

int main(int argc, const char * argv[]) {

    std::vector<int> left_sample;
    std::vector<int> right_sample;

    for (int i = 0; i < 250; i++) {
        if (i < 200) {
            left_sample.push_back(limited_rand(200));
        }
        right_sample.push_back(limited_rand(200));
    }

    Py_Initialize();

    PyObject* scipy_stats_module = PyImport_ImportModule("scipy.stats"); // importing "scipy.stats" module

    if (scipy_stats_module) {
        import_array();

        while (true) {
            unsigned long k = std::min(left_sample.size(), right_sample.size());
            int* both_samples = (int*) (malloc(2 * k));
            initialize_c_2d_array(both_samples, k, left_sample, right_sample);
            npy_intp dim3[] = {2, (npy_intp) (k)};
            PyObject* both_samples_nparray = PyArray_SimpleNewFromData(2, dim3, NPY_INT, both_samples);

            PyObject* anderson_ksamp = PyObject_GetAttrString(scipy_stats_module, "anderson_ksamp");

            if (anderson_ksamp && PyCallable_Check(anderson_ksamp)) {
                // v------- Getting EXC_BAD_ACCESS on this line ---------v
                PyObject* anderson_2samp_return_val = PyObject_CallFunctionObjArgs(anderson_ksamp, both_samples_nparray, NULL);                   
                Py_DecRef(both_samples_nparray);
                free(both_samples); // <---------- Getting SIGABRT here
                Py_DecRef(anderson_ksamp);

                if (anderson_2samp_return_val) {
                    double p_value = PyFloat_AsDouble(PyTuple_GetItem(anderson_2samp_return_val, 2));
                    Py_DecRef(anderson_2samp_return_val);
                } else {
                    Py_DecRef(anderson_2samp_return_val);
                    printf("Call to scipy.stats.anderson_ksamp failed.\n");
                    PyErr_Print();
                }
            } else {
                Py_DecRef(both_samples_nparray);
                free(both_samples);
                Py_XDECREF(anderson_ksamp);
                std::cout << "Failed to import function scipy.stats.anderson_ksamp.\n";
                PyErr_Print();
            }

        }
    } else {
        Py_XDECREF(scipy_stats_module);
        printf("Failed to import scipy.stats module.\n");
        PyErr_Print();
    }

    Py_Finalize();

    return 0;

}

Every time I execute this piece of code I get:

  • either a SIGABRT error on free(both_samples) with the following error message: Test(4473,0x1003713c0) malloc: *** error for object 0x10076bbb0: incorrect checksum for freed object - object was probably modified after being freed. *** set a breakpoint in malloc_error_break to debug

  • or an EXC_BAD_ACCESS on PyObject* anderson_2samp_return_val = PyObject_CallFunctionObjArgs(anderson_ksamp, both_samples_nparray, NULL);.

I think the problem might be in how I initialize the array both_samples. In my main program I also call the scipy function ks_2samp with exactly the same code as here; the only difference is that ks_2samp takes two arrays as arguments, so there I don't need to build a 2-D array as I do here.

One other odd thing that I noticed is that the reference counter of anderson_ksamp after the call to PyObject_GetAttrString(scipy_stats_module, "anderson_ksamp"); jumps directly to 3 instead of going up just by 1.

I'm using Python 3.6 and the latest versions of scipy (0.19.0) and numpy (1.13.0).

Thank you in advance for any kind of help.

Edit: this is the output that valgrind gives me when I run valgrind --leak-check=yes Test:

==24810== Memcheck, a memory error detector
==24810== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==24810== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==24810== Command: Test
==24810== 
==24810== Syscall param msg->desc.port.name points to uninitialised byte(s)
==24810==    at 0x1003AD34A: mach_msg_trap (in /usr/lib/system/libsystem_kernel.dylib)
==24810==    by 0x1003AC796: mach_msg (in /usr/lib/system/libsystem_kernel.dylib)
==24810==    by 0x1003A6485: task_set_special_port (in /usr/lib/system/libsystem_kernel.dylib)
==24810==    by 0x10054210E: _os_trace_create_debug_control_port (in /usr/lib/system/libsystem_trace.dylib)
==24810==    by 0x100542458: _libtrace_init (in /usr/lib/system/libsystem_trace.dylib)
==24810==    by 0x1000AB9DF: libSystem_initializer (in /usr/lib/libSystem.B.dylib)
==24810==    by 0x10001BA1A: ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) (in /usr/lib/dyld)
==24810==    by 0x10001BC1D: ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) (in /usr/lib/dyld)
==24810==    by 0x1000174A9: ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==24810==    by 0x100017440: ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==24810==    by 0x100016523: ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==24810==    by 0x1000165B8: ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) (in /usr/lib/dyld)
==24810==  Address 0x10488ee0c is on thread 1's stack
==24810==  in frame #2, created by task_set_special_port (???:)
==24810== 
==24810== 
==24810== HEAP SUMMARY:
==24810==     in use at exit: 17,836 bytes in 157 blocks
==24810==   total heap usage: 173 allocs, 16 frees, 23,980 bytes allocated
==24810== 
==24810== 72 bytes in 3 blocks are possibly lost in loss record 26 of 41
==24810==    at 0x10009A232: calloc (vg_replace_malloc.c:714)
==24810==    by 0x1005B6846: map_images_nolock (in /usr/lib/libobjc.A.dylib)
==24810==    by 0x1005C9FE8: objc_object::sidetable_retainCount() (in /usr/lib/libobjc.A.dylib)
==24810==    by 0x10000B03B: dyld::notifyBatchPartial(dyld_image_states, bool, char const* (*)(dyld_image_states, unsigned int, dyld_image_info const*), bool, bool) (in /usr/lib/dyld)
==24810==    by 0x10000B255: dyld::registerObjCNotifiers(void (*)(unsigned int, char const* const*, mach_header const* const*), void (*)(char const*, mach_header const*), void (*)(char const*, mach_header const*)) (in /usr/lib/dyld)
==24810==    by 0x10020400A: _dyld_objc_notify_register (in /usr/lib/system/libdyld.dylib)
==24810==    by 0x1005B6074: _objc_init (in /usr/lib/libobjc.A.dylib)
==24810==    by 0x10019768D: _os_object_init (in /usr/lib/system/libdispatch.dylib)
==24810==    by 0x10019763A: libdispatch_init (in /usr/lib/system/libdispatch.dylib)
==24810==    by 0x1000AB9D5: libSystem_initializer (in /usr/lib/libSystem.B.dylib)
==24810==    by 0x10001BA1A: ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) (in /usr/lib/dyld)
==24810==    by 0x10001BC1D: ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) (in /usr/lib/dyld)
==24810== 
==24810== LEAK SUMMARY:
==24810==    definitely lost: 0 bytes in 0 blocks
==24810==    indirectly lost: 0 bytes in 0 blocks
==24810==      possibly lost: 72 bytes in 3 blocks
==24810==    still reachable: 200 bytes in 6 blocks
==24810==         suppressed: 17,564 bytes in 148 blocks
==24810== Reachable blocks (those to which a pointer was found) are not shown.
==24810== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==24810== 
==24810== For counts of detected and suppressed errors, rerun with: -v
==24810== Use --track-origins=yes to see where uninitialised values come from
==24810== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 12 from 12)
jackscorrow

1 Answer

OK, this question is probably dead, but I'll answer it anyway. After some days I've figured out my mistake, and of course it was something very silly.

I simply wasn't allocating enough space for my array when calling malloc. I wrote:

unsigned long k = std::min(left_sample.size(), right_sample.size());
int* both_samples = (int*) (malloc(2 * k));

but I forgot that malloc takes a size in bytes, not a number of elements. Since my array holds int values, the right amount of memory is 2 * k * sizeof(int).

As I said, very stupid...

I fixed that and now everything works just fine.
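
For anyone hitting the same thing, here is a minimal sketch of the size calculation. The helper names pack_two_rows and bytes_for_two_rows are made up for this illustration and are not part of the original program. In the test program k is 200 (the smaller of the two sample sizes), so malloc(2 * k) requests 400 bytes, while on a platform where sizeof(int) is 4 the array needs 1600. Using a std::vector<int> sized in elements sidesteps the byte arithmetic altogether:

```cpp
#include <cassert>   // assert
#include <cstddef>   // std::size_t
#include <vector>    // std::vector

// Hypothetical helper: bytes that malloc must be asked for to hold a
// 2 x k array of int. malloc counts bytes, not elements.
std::size_t bytes_for_two_rows(std::size_t k) {
    return 2 * k * sizeof(int);
}

// Hypothetical helper: copies the first k elements of each row into one
// contiguous, row-major 2 x k buffer. The vector is sized in elements,
// so no sizeof arithmetic is needed.
std::vector<int> pack_two_rows(const std::vector<int>& row1,
                               const std::vector<int>& row2,
                               std::size_t k) {
    assert(row1.size() >= k && row2.size() >= k);
    std::vector<int> buffer(2 * k);
    for (std::size_t i = 0; i < k; ++i) {
        buffer[i] = row1[i];      // first row occupies [0, k)
        buffer[k + i] = row2[i];  // second row occupies [k, 2k)
    }
    return buffer;
}
```

If a buffer like this is handed to PyArray_SimpleNewFromData through buffer.data(), it has to stay alive for as long as the NumPy array is in use, because that function wraps the memory without copying it.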

jackscorrow