I came across some funny memory behavior working with numpy + cython while trying to get the data of a numpy array as a C array, to use in a GIL-free function. I've taken a look at both the Cython docs and NumPy's array API but haven't found an explanation. So consider the following lines of code:
cimport numpy as np
import numpy as np

cdef np.float32_t *a1 = <np.float32_t *>np.PyArray_DATA(np.empty(2, dtype="float32"))
print "{0:x}".format(<unsigned int>a1)
cdef np.float32_t *a2 = <np.float32_t *>np.PyArray_DATA(np.empty(2, dtype="float32"))
print "{0:x}".format(<unsigned int>a2)
I allocate two numpy arrays with numpy's empty function and want to retrieve a pointer to the data buffer of each of them. You would expect the two pointers to hold two different heap addresses, possibly 2*4 bytes apart. But no, I get the same memory address twice, e.g.
>>>96a7aec0
>>>96a7aec0
How come? I managed to work around this by declaring my numpy arrays outside of the PyArray_DATA call, in which case I get what I expect (see the sketch below).
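To be concrete, the workaround looks roughly like this (arr1/arr2 are just illustrative names):

cimport numpy as np
import numpy as np

cdef np.ndarray arr1 = np.empty(2, dtype="float32")
cdef np.ndarray arr2 = np.empty(2, dtype="float32")
# arr1 and arr2 keep references to the arrays alive, so neither buffer
# can be freed and the two pointers come out different
cdef np.float32_t *a1 = <np.float32_t *>np.PyArray_DATA(arr1)
cdef np.float32_t *a2 = <np.float32_t *>np.PyArray_DATA(arr2)
print "{0:x}".format(<unsigned int>a1)
print "{0:x}".format(<unsigned int>a2)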
The only explanation I can think of is that no Python object referencing the array exists outside the scope of the PyArray_DATA call, and calling this function doesn't increment the array's reference count. Python therefore reclaims the memory right after the call, and the next array is allocated at the now-free address. Could somebody more Cython-savvy than me confirm that, or give another explanation?
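For what it's worth, a similar address reuse is visible from plain Python; here id() reports the address of the array object rather than of its data buffer, but the reasoning is the same (this is only an illustration of the hypothesis above):

import numpy as np

# no reference is kept to either temporary, so the first array is freed
# before the second one is created; the allocator often hands back the
# same block, in which case the two ids come out identical
print id(np.empty(2, dtype="float32"))
print id(np.empty(2, dtype="float32"))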