I am creating Python bindings for a C library.

In C the code to use the functions would look like this:

Ihandle *foo;
foo = MethFunc();
SetAttribute(foo, 's');

I am trying to get this into Python, so that MethFunc() and SetAttribute() can be used from my Python code like this:

import mymodule
foo = mymodule.MethFunc()
mymodule.SetAttribute(foo)

So far my C code to return the function looks like this:

static PyObject * _MethFunc(PyObject *self, PyObject *args) {
    return Py_BuildValue("O", MethFunc());
}

But that crashes, with no error message.

I have also tried return MethFunc(); but that failed.

How can I return the function foo (or if what I am trying to achieve is completely wrong, how should I go about passing MethFunc() to SetAttribute())?

Xantium
  • You may want to look into generating bindings with [SWIG](http://www.swig.org/). It's probably a lot more convenient than writing up an extension module and defining a bunch of extension types yourself, or going through ctypes for everything. – user2357112 Jul 10 '18 at 21:05
  • Ctypes is definitely not an option. I would prefer to use the standard API if possible, but thank you for the advice. – Xantium Jul 10 '18 at 21:09
  • It's also worth looking into lower-level libraries like PyCxx, boost::python, Pyd, rust-python, etc. Instead of writing code to drive an interface generator like SWIG, you write code in a general-purpose language like C++ or D or Rust that can talk directly to C, and there are templates/macros/etc. that generate Python types, manage Python refcounting, etc. so you don't have to deal with all of that boilerplate (and all of the annoying bugs that you're going to face in doing this). – abarnert Jul 10 '18 at 21:23

1 Answer

The problem here is that MethFunc() returns an IHandle *, but you're telling Python to treat it as a PyObject *. Presumably those are completely unrelated types.

A PyObject * (or any struct you or Python defines that starts with an appropriate HEAD macro) begins with a refcount and a pointer to a type, and the first thing Python is going to do with any object you hand it is touch those fields (Py_BuildValue("O", ...) itself starts by incrementing the refcount). So, if you give it an object that instead starts with, say, two ints, Python is going to end up trying to access a type at 0x00020001 or similar, which is almost certain to segfault.

If you need to pass around a pointer to some C object, you have to wrap it up in a Python object. There are three ways to do this, from hackiest to most solid.


First, you can just cast the IHandle * to a size_t, then PyLong_FromSize_t it.

This is dead simple to implement. But it means these objects are going to look exactly like numbers from the Python side, because that's all they are.

Obviously you can't attach a method to this number; instead, your API has to be a free function that takes a number, then casts that number back to an IHandle* and calls a method.

It's more like, e.g., C's stdio, where you have to keep passing stdin or f as an argument to fread, instead of Python's io, where you call methods on sys.stdin or f.

But it's even worse, because there's no type checking, static or dynamic, to protect you from some Python code accidentally passing you the number 42, which you'll then cast to an IHandle * and try to dereference, leading to a segfault…

And if you were hoping Python's garbage collector would help you know when the object is still referenced, you're out of luck. You need to make your users manually keep track of the number and call some CloseHandle function when they're done with it.

Really, this isn't that much better than accessing your code from ctypes, so hopefully that inspires you to keep reading.


A better solution is to cast the IHandle * to a void *, then PyCapsule_New it.

If you haven't read about capsules, you need to at least skim the main chapter. But the basic idea is that it wraps up a void* as a Python object.

So, it's almost as simple as passing around numbers, but solves most of the problems. Capsules are opaque values which your Python users can't accidentally do arithmetic on; they can't send you 42 in place of a capsule; you can attach a function that gets called when the last reference to a capsule goes away; you can even give it a nice name to show up in the repr.

But you still can't attach any behavior to capsules.

So your API will still have to be a free function like mymodule.SetAttribute(foo) instead of a method like foo.SetAttribute() if foo is a capsule, just as if it's an int. (Except now it's type-safe.)


Finally, you can build a new Python extension type for a struct that contains an IHandle *.

This is a lot more work. And if you haven't read the tutorial on Defining Extension Types, you need to go thoroughly read through that whole chapter.

But it means that you have an actual Python type, with everything that goes with it.

You can give it a SetAttribute method, and Python code can just call that method. You can give it whatever __str__ and __repr__ you want. You can give it a __doc__. Python code can do isinstance(foo, MyMeth). And so on.


If you're willing to use C++, or D, or Rust instead of C, there are some great libraries (PyCxx, boost::python, Pyd, rust-python, etc.) that can do most of the boilerplate for you. You just declare that you want a Python class and how you want its attributes and methods bound to your C attributes and methods and you get something you can use like a C++ class, except that it's actually a PyObject * under the covers. (And it'll even take care of all the refcounting cruft for you via RAII, which will save you endless weekends debugging segfaults and memory leaks…)

Or you can use Cython, which lets you write C extension modules in a language that's basically Python, but extended to interface with C code. So your wrapper class is just a class, but with a special private cdef attribute that holds the IHandle *, and your SetAttribute(self, s) can just call the C SetAttribute function with that private attribute.

Or, as suggested by user2357112, you can also use SWIG to generate the C bindings for you. For simple cases, it's pretty trivial: just feed it your C API, and it gives you back the code to build your Python .so. For less simple cases, I personally find it a lot more painful than something like PyCxx, but it definitely has a lower learning curve if you don't already know C++.

abarnert
  • `long` and `PyLong_FromLong` should probably be `size_t` and `PyLong_FromSize_t`, since `long` might not be big enough. (That option is still a bad idea even with `size_t`, though.) – user2357112 Jul 10 '18 at 21:25
  • @user2357112 Fixed. But anyway, the point of that section is really to scare people into understanding why they want to do all the hard work of writing a real extension type (or, better, letting PyCxx do it for them). Anyone who looks at that and says, "Yes, that's what I want, Python objects that can't be inspected, and that regularly cause segfaults" is probably beyond help… – abarnert Jul 10 '18 at 21:32
  • "D, or Rust" are not supported, sadly, and I don't know C++, so I think I am stuck with C for the foreseeable future. – Xantium Jul 10 '18 at 21:35
  • @Simon One more option to consider: can you use Cython? It lets you write C extension modules in a language that’s basically Python, except that it can directly interface with C. So your wrapper class looks more like a Python class with a special private member variable than like a bunch of functions using `PyArg_ParseTuple` calls and a couple of structs to tie them together. – abarnert Jul 10 '18 at 21:45
  • Can it deal with things like header files and runtimes? I'm stuck with a bunch of header files and runtimes (the reason I am using C in the first place). I cannot interact with the runtimes directly, only with the help of the header files. – Xantium Jul 10 '18 at 21:54
  • I don't know what you mean by "runtimes". Do you mean .so/.dylib/.dll files? If so, then yes, Cython can deal with those. It may make the `setup.py` slightly more complicated, and of course you need to distribute those shared libs together with your module, but otherwise, it doesn't really matter whether the C code you're linking to is in `.c` files, `.o`/`.obj` files, or shared libs. – abarnert Jul 10 '18 at 21:57
  • I meant `.dll`s. I'll look into an alternative as you suggest. Cython or SWIG both seem feasible. Thank you so much for your detailed answer and help. I really do appreciate it. – Xantium Jul 10 '18 at 22:01
  • @Simon It's definitely worth learning how to write a C extension module, including at least one real extension type, from scratch. But once you've learned it, it's often worth doing whatever you can (Cython, PyCxx, ctypes/cffi, SWIG, …) to avoid doing it again. :) – abarnert Jul 10 '18 at 22:22