
I have a C application that embeds the Python 2.7 interpreter. At some point in my program, a potentially large string (char*) is generated and needs to be processed by some Python code. I use PyObject_CallFunction to call the Python function and pass the string as an argument. This Python function then uses the multiprocessing library to analyze the data in a separate process.

Passing the string to the Python function will create a copy of the data in a Python str object. I tried to avoid this extra copy by passing a buffer object to the Python function. Unfortunately, this generates an error in the multiprocessing process during unpickling:

TypeError: buffer() takes at least 1 argument (0 given)

It seems as though buffer objects can be pickled, but not unpickled.

Any suggestions on passing the char* from C to the multiprocessing function without making an extra copy?

flashk
  • I'm not sure exactly how this could be implemented in your situation, but try using `multiprocessing.sharedctypes` somehow. I used it to pass data between Python processes without copying. – dmytro Mar 29 '12 at 19:33
  • Would it be preferable to write it to a pipe or other file resource and pass a handle to it? – Steve Mayne Mar 29 '12 at 20:38

1 Answer


An approach that worked for me:

Before you create your big C string, allocate memory for it using Python:

PyObject *pystr = PyString_FromStringAndSize(NULL, size);
if (pystr == NULL) {
    /* handle the allocation failure */
}
char *str = PyString_AS_STRING(pystr);
/* now fill <str> with <size> bytes */

This way, when the time comes to pass it to Python, you don't have to create a copy:

PyObject *result = PyObject_CallFunctionObjArgs(callable, pystr, NULL);
/* or PyObject_CallFunction(callable, "O", pystr) if you prefer */
Py_DECREF(pystr); /* release our reference once we're done with it */

Note that you shouldn't modify the string once it has been handed to Python code: str objects are expected to be immutable, so code that has already seen the value (or cached its hash) can misbehave if the bytes change.
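On the Python side, the function called from C can then hand the string to a worker process as usual, since str (unlike buffer) pickles and unpickles cleanly. A minimal sketch, with made-up names (`process_data`, `analyze`) standing in for the real entry point and analysis:

```python
# Hypothetical Python-side receiver: the str passed in from C pickles
# cleanly, so it can be shipped to a worker process as-is.
from multiprocessing import Process, Queue

def analyze(data, out):
    out.put(len(data))  # stand-in for the real analysis

def process_data(data):
    out = Queue()
    worker = Process(target=analyze, args=(data, out))
    worker.start()
    result = out.get()
    worker.join()
    return result
```

This still pickles one copy of the data into the worker; the allocation trick above only removes the extra C-to-Python copy.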

yak