
Summary of the problem

I am calling my own C extension for Python from Django; it is in charge of some long computations. The extension works fine, and the user can even navigate through the Django app while the extension is computing, since I have implemented Global Interpreter Lock (GIL) management inside the extension. However, when a second user tries to execute the extension while it is still running for the first user, Django crashes (the process is killed) without any error message.

Code Example

The code below shows the Django view that calls the C extension ftaCalculate in response to a POST request.

import ftaCalculate
from django.shortcuts import render
from django.views.generic.base import TemplateView

# Computation
class ComputationView(TemplateView):
    template_name = 'Generic/computation_page.html'

    @staticmethod
    def get_results(request):
        if request.method == "POST":  # When button is pressed
            # Long computations
            cs_min = ftaCalculate.mcs(N, tree)
        else:
            cs_min = None
        return render(request, ComputationView.template_name, {'cs_min': cs_min})

Django crashes when two users run the function ftaCalculate.mcs in parallel. The main function of the C code follows. The calls to Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS are inside the function comb.

// Compute the Minimal cutsets
PyObject* Cmcs(int n, PyObject* tree)
{
    // Initialize cs_all
    cs_all = NULL;

    // Get the leaves of the FTA (pointer)
    PyObject *leaf_seq = PyObject_GetAttrString(tree, "leafs");
    int n_comb = PyObject_Length(leaf_seq);

    // All possible combinations of boolean values
    int *indeces = (int *)malloc(n_comb * sizeof(int));

    // Initialize mcs_index
    ArrayInt num_mcs_index;
    initArrayInt(&num_mcs_index, 1);
    ArrayArray mcs_index;
    initArrayArray(&mcs_index, 1, n_comb);

    // String buffer for 0s/1s
    int* str_buff = (int*) malloc(n_comb * sizeof(int));
    int ind_str = 0;

    // Get the combination
    comb(n_comb, n, str_buff, ind_str, n_comb, 0, leaf_seq, indeces, &num_mcs_index, &mcs_index, tree);
    free(str_buff);
    str_buff=NULL;
    free(indeces);
    indeces=NULL;

    // Order the cutsets by their count
    HASH_SORT(cs_all, val_sort);

    // Make array without redundant cutsets
    PyObject *cs_all_sorted_readable = PyList_New(0);
    
    struct my_struct *current_user, *tmp = NULL;
    const char *cutset_name = NULL;
    HASH_ITER(hh, cs_all, current_user, tmp) {
        cutset_name = current_user->name;
        PyObject *result_array= PyList_New(0);
        for(int i = 0; cutset_name[i] != '\0'; i++) {
            if(cutset_name[i] == '1'){
                PyObject * str_int = PyObject_Str(PyList_GetItem(leaf_seq, i));
                PyList_Append(result_array, str_int);
                Py_XDECREF(str_int);
            }    
        }
        PyList_Append(cs_all_sorted_readable, result_array);
        
        // Free allocated memory space
        Py_CLEAR(result_array);
    }
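    // cutset_name points into a hash entry that is still in cs_all,
    // so it must not be freed here; delete_all() below releases the entries
    cutset_name = NULL;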
    /*free(current_user);
    free(tmp);*/

    // Free allocated memory space
    delete_all(cs_all);  /* free any structures */
    freeArrayInt(&num_mcs_index);
    freeArrayArray(&mcs_index);
    Py_CLEAR(leaf_seq);
    
    return cs_all_sorted_readable;
}

// Our Python binding to our C function
// This takes two positional arguments: the cutset order (int) and the tree object
static PyObject* mcs(PyObject* self, PyObject* args)
{
    // instantiate the expected arguments
    int n; // Order of cutset
    PyObject *cb; // Tree object

    // Parse the arguments
    if(!PyArg_ParseTuple(args, "iO", &n, &cb))
        return NULL;

    // Determine whether the object has the expected structure
    if (!PyObject_HasAttrString(cb, "pk")) {
        PyErr_SetString(PyExc_TypeError, "mcs: the object structure is not the expected one");
        return NULL;
    }

    // Function to return the result back to Python
    PyObject* result_cms = Cmcs(n, cb);

    if(result_cms==NULL){
        PyErr_SetString(PyExc_TypeError, "mcs: the returned object is null");
        return NULL;
    }

    return result_cms;
}



// Our Module's Function Definition struct
// We require this `NULL` to signal the end of our method
// definition
static PyMethodDef myMethods[] = {
    { "mcs", mcs, METH_VARARGS, "Computes Minimal Cutsets" },  // mcs takes (self, args) only, so METH_KEYWORDS does not apply
    { NULL, NULL, 0, NULL }
};

// Our Module Definition struct
static struct PyModuleDef ftaCalculate = {
    PyModuleDef_HEAD_INIT,
    "ftaCalculate",
    "Test Module",
    -1,
    myMethods
};

// Initializes our module using our above struct
PyMODINIT_FUNC PyInit_ftaCalculate(void)
{
    return PyModule_Create(&ftaCalculate);
}

Question

Is it normal that this behavior happens or am I missing something that needs to be implemented in the C extension?

David Duran
  • It sounds like your extension is not thread safe. There is more to that than just correct GIL management, and it is mostly on the C side, for which you haven't presented any code. – John Bollinger Oct 03 '21 at 12:24
  • Hi @JohnBollinger, thanks for your comment. How do I make it "thread safe"? Is there anything specific to do? I have not shown the C code since it is really long, but I can add any requested part you need... – David Duran Oct 03 '21 at 12:30
  • You need to protect every critical section of your code to prevent race conditions, and then avoid deadlocks. It is impossible to tell you what to do without further information. Write a complete [mcve]. – jlandercy Oct 03 '21 at 12:43
  • Whole books have been written on the topic of multithreaded programming, @DavidDuran. There are *lots* of things you might have to do, depending on the details of your function's behavior. However, one thing that might take you a long way would be to ensure that none of the functions in your extension manipulate any shared data, other than Python objects manipulated while holding the GIL. That is, ensure that your code does not modify any global or static variables, and avoid library functions that maintain internal static state (`rand()` and `strtok()` for example). – John Bollinger Oct 03 '21 at 12:49
  • Thanks for your comments. I have added part of the C code. Anyway, I will look into what you are saying to make sure that I am doing everything right. – David Duran Oct 03 '21 at 12:52
  • I also take it from your response that you are not familiar with the term "thread safe". This is widely used terminology that is essential for anyone writing multithreaded code to know and understand well. And you *are* writing multithreaded code. Django hides many of the details from you, but how else do you think Django can respond to other requests while it is also working on your long-running computation? – John Bollinger Oct 03 '21 at 12:55
  • You said you implemented GIL management in the extension, but I don't see that. – John Bollinger Oct 03 '21 at 12:59
  • I have implemented GIL management inside the function `comb`, since that is where I work with C-only variables. – David Duran Oct 03 '21 at 13:12
  • This is not a [mre], e.g. what is cs_all? It could be your problem, by the way. – ead Oct 03 '21 at 13:23
  • Yes, sorry @ead. I will try to make it a simple example. Anyway, to answer your question: `struct my_struct *cs_all = NULL;` – David Duran Oct 03 '21 at 13:32
  • But how is it guarded against being set to NULL by one thread while it is being used for a calculation in another? – ead Oct 03 '21 at 13:37
  • I see, ok. I will carefully look into all possible sources of problems when using multi-threading. Thanks. – David Duran Oct 03 '21 at 13:52
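
The advice in the comments above (keep no shared global or static state) could be sketched roughly as follows. This is only an illustration under assumptions, not the extension's actual code: the names `McsContext`, `Cutset`, `ctx_init`, `ctx_add` and `ctx_free` are hypothetical, and a plain linked list stands in for the hash table behind `cs_all`. The idea is that each call owns its own state, so a second call that starts in the middle of the first one cannot reset or free the first call's data.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry type; a simple list stands in for the real hash table. */
typedef struct Cutset {
    char name[32];
    struct Cutset *next;
} Cutset;

/* Hypothetical per-call state that replaces a file-scope `cs_all` pointer. */
typedef struct {
    Cutset *head;   /* plays the role of cs_all, but belongs to one call */
    int count;      /* number of entries stored so far */
} McsContext;

/* Start a call with empty state; only this call's context is touched. */
static void ctx_init(McsContext *ctx)
{
    ctx->head = NULL;
    ctx->count = 0;
}

/* Store one cutset name in the given context.
 * Returns 0 on success, -1 if the allocation fails. */
static int ctx_add(McsContext *ctx, const char *name)
{
    assert(ctx != NULL && name != NULL);
    Cutset *c = malloc(sizeof *c);
    if (c == NULL)
        return -1;
    strncpy(c->name, name, sizeof c->name - 1);
    c->name[sizeof c->name - 1] = '\0';
    c->next = ctx->head;   /* push at the front of this call's list */
    ctx->head = c;
    ctx->count++;
    return 0;
}

/* Release everything owned by this context, and nothing else. */
static void ctx_free(McsContext *ctx)
{
    while (ctx->head != NULL) {
        Cutset *next = ctx->head->next;
        free(ctx->head);
        ctx->head = next;
    }
    ctx->count = 0;
}
```

With this layout, a function such as `Cmcs` would declare one `McsContext` as a local variable and pass its address down to the helpers that currently read or write the global. If shared state cannot be avoided, the alternative mentioned in the comments is to guard every access to it with a lock.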
