I'm working with a legacy C library that I've wrapped in a Python C extension. The C library has a recursive data structure, Foo, with an API similar to the following:
Foo *Foo_create(void);                            /* Allocate a new Foo */
int Foo_push(Foo *parent, int field, Foo *child); /* Add a child Foo to a parent Foo */
int Foo_destroy(Foo *foo);                        /* Library frees foo and all of its children; caller must not use the children afterwards */
Foo *Foo_pop(Foo *parent, int field);             /* Caller is responsible for calling Foo_destroy on the popped Foo */
I have a PyFoo struct that wraps Foo, something like:
typedef struct {
    PyObject_HEAD
    Foo *foo;
    PyObject *parent;
} PyFoo;
As well as other functions that wrap the Foo_* functions and incref/decref appropriately.
The problem I have come across is that it's possible for two independent PyFoo objects, with independent refcounts, to point to the same Foo *. If one of the PyFoo objects goes out of scope, it will call Foo_destroy, but the user may access the second PyFoo object and cause a segmentation fault.
I'm trying to prevent the user of my library from doing the following in Python:
parent = Foo() # Foo_create(); parent's refcount is 1
a = Foo() # Foo_create(); a's refcount is 1
parent[1] = a # Foo_push(parent, 1, a); parent's refcount is 2; a's refcount is 1
b = parent.pop(1) # Foo_pop(parent, 1);
# parent's refcount is 1; a's refcount is 1; b's refcount is 1
# a and b are now independent PyFoo objects, each with a refcount of 1
# HOWEVER both of the *foo pointers point to the same memory
# Delete a, dropping reference count to 0, which calls Foo_destroy
del a # parent's refcount is 1; a's refcount is 0; b's refcount is 1
# Access b, which may segfault, since Foo_destroy was called by the previous statement.
print(b)
In other words, a and b both point to the same Foo memory. However, they are independent Python objects with independent refcounts. Once a goes out of scope, it will destroy the memory that b points to. Accessing b will probably segfault.
This seems like it would be a common problem when writing Python extensions.
I suppose what I want is some way to base the reference counting on the Foo pointer. For example, a and b should actually have the same identity in the above example. Or perhaps what I need is some data structure that counts the number of PyFoos that share the same Foo pointer, so that Foo_destroy is only called when the count for that Foo pointer drops to 0.
What is the idiomatic way to solve this problem?
Here is the corresponding scenario in C:
Foo *parent = Foo_create();
Foo *a = Foo_create();
Foo_push(parent, 1, a);
Foo *b = Foo_pop(parent, 1);
/* a and b both point to same memory */
Foo_destroy(a);
/* better not access b after this */
a = NULL;
b = NULL;