
I'm seeing very strange behavior from my Python extension built with the Boost.Python library. Namely, in this piece of code:

import my_ext

j = 0
while j<5:
    print j
    my_ext.do_something(j)
    j = j + 1

I do not see j being printed out, while the extension code (my_ext.do_something(j)) is doing some work for different values of j (let's say it prints the j-th file). Moreover, it only prints two files, for j = 0 and j = 1, and then the whole script finishes without errors or other notifications.

All this makes me think the code is being executed in parallel (multi-threading), but without proper handling of such parallelism. I guess this may be related to the fact that the Boost.Python library I have built is made by default with the --threading=multi option. However, rebuilding with the --threading=single option has no effect: the result is still a multi-threaded library. This post http://mail.python.org/pipermail/cplusplus-sig/2010-October/015771.html reports a similar build problem, but it is unanswered.

So my question is how to build the Boost libraries, and Boost.Python in particular, to be single-threaded. Alternatively, the problem may be related to something other than the single/multi-threading of the Boost.Python libraries.

Additional info: I'm using Cygwin, boost_1.50.0, and Python 2.6; my OS is Windows 7 with a multi-core CPU and an NVIDIA graphics card (both of the latter hardware may favor multi-threaded execution of my extension without letting me know).

user938720

1 Answer


Boost.Python tends to be a special case in the Boost build system because it is coupled to the version and configuration of the Python installation against which it is built. For example, if Python was built without debug support, then Boost.Python will build without debug, even if Boost was explicitly told to build a debug variant. I believe the same holds true for the threading property, as Boost.Python indirectly includes pyconfig.h by including Python.h.

Python defaults to running programs in a single thread, regardless of whether Python was built with thread support, and Boost.Python does not change this behavior. As a general rule of thumb, threading only becomes a factor with Boost.Python when one desires concurrency. For example, if my_ext.do_something() were going to read a large file into memory, then it may be optimal to perform the read without holding a lock on the interpreter.
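The benefit of not holding the interpreter lock during long blocking work can be illustrated from the Python side. Below is a minimal, hypothetical sketch (Python 3 syntax, not from the original post): time.sleep releases the interpreter lock internally, much like a C extension that wraps long I/O in a lock-release block, so two threads can overlap instead of running back to back:

```python
import threading
import time

def blocking_work():
    # time.sleep releases the interpreter lock while it blocks, analogous
    # to a C extension releasing the lock around a long file read.
    time.sleep(0.5)

start = time.time()
threads = [threading.Thread(target=blocking_work) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start

# Two 0.5 s sleeps overlap: total elapsed time is well under 1.0 s.
print(elapsed < 0.8)  # prints True
```

If the lock were held for the whole operation, the two threads would serialize and the total time would be roughly the sum of the sleeps.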

Consider starting with a simplified implementation of the extension, then expanding upon it. For example, when I build my_ext as:

#include <iostream>
#include <boost/python.hpp>

void do_something(unsigned int j)
{
  std::cout << "do_something(): " << j << std::endl;
}

BOOST_PYTHON_MODULE(my_ext)
{
  namespace python = boost::python;
  python::def("do_something", &do_something);
}

My test script of:

import my_ext

for j in xrange(5):
    print j
    my_ext.do_something(j)

produces:

0
do_something(): 0
1
do_something(): 1
2
do_something(): 2
3
do_something(): 3
4
do_something(): 4

An alternative is to build debug versions of Python and Boost.Python, then step through the program with a debugger. More information on debug builds can be found here.
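For reference, a hedged sketch of how such debug builds are commonly configured (exact paths, versions, and options depend on your setup; treat these commands as assumptions rather than a verified recipe):

```shell
# CPython: configure a debug interpreter (defines Py_DEBUG, enables assertions)
./configure --with-pydebug
make

# Boost.Python: from the Boost root, build a debug variant against that interpreter
./b2 --with-python variant=debug python-debugging=on
```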

Tanner Sansbury
  • Thank you, twsansbury, for the explanation. In my case the actual do_something function was quite complex and was causing some problems, so once I solved those problems the whole combination started working properly. Yet, even when the do_something function executes properly, one might expect multi-threading support to interfere and cause additional problems. But, if I understand you correctly, in most cases this should not be a problem unless Python is explicitly configured to run extensions with multiple threads. – user938720 Mar 25 '13 at 23:58
  • @user938720: Aye. Even if Python is built with thread support and multiple threads have been created in Python, it has limited concurrency. Basically, when executing Python code, the interpreter does cooperative yielding, causing the active thread to sleep and a pending thread to go active. When calling into C-extension code, the extension must explicitly yield (release the interpreter lock) if it wants to support concurrency, then reacquire the lock when returning to Python. – Tanner Sansbury Mar 26 '13 at 00:09
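The cooperative yielding described in the comment above can be sketched in pure Python (a hypothetical Python 3 example, not from the original discussion): the interpreter periodically switches between threads for us, so both workers make progress even though only one executes bytecode at a time:

```python
import threading

log = []

def worker(name):
    # Pure-Python code: the interpreter holds the lock while a thread runs
    # and switches threads cooperatively, so no explicit yielding is needed.
    for i in range(3):
        log.append((name, i))

threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Both threads contributed all of their entries.
print(sorted(log))  # [('a', 0), ('a', 1), ('a', 2), ('b', 0), ('b', 1), ('b', 2)]
```

A C extension gets no such automatic switching: as the comment says, it must release the lock itself if it wants other threads to run while it works.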