
I'd like to use the NumPy C-API to write a datatype that's a tree structure, with pointers to children. I initially thought this was possible without a flexible datatype, since I don't need a variable number of direct children, and I'm fine with copying them. However, I'm not sure there's a way around calling `malloc` to allocate them, because the number of indirect children isn't fixed. This strikes me as counter to the philosophy of the existing datatype API; I've used it a few times, and I don't believe `PyArray_ArrFuncs` has any descr function that is called on deallocation. Is it even possible to write a recursive datatype with variable indirect children in the current API? How about the experimental one, with parametric datatypes?

I wrote a small proof-of-concept that makes the `malloc` calls without ever freeing the memory. It works fine, but it leaks, so of course it can't scale. Are there any workarounds?

bockyboh
  • Seems like a bit of an odd thing for numpy, but you can store a tree as a flat array of nodes, and use indices instead of pointers... – Dan Mašek Nov 04 '22 at 18:48
  • If I recall correctly, there is a field that you use to define the children; see "Supporting cyclic garbage collection" in the documentation: https://docs.python.org/3/extending/newtypes_tutorial.html#supporting-cyclic-garbage-collection – Ahmed AEK Nov 04 '22 at 19:24
  • I haven't worked with the c-api level, but I don't see how this can be done. There are 3 types of `dtype`. 1) standard numeric ones and strings, 2) object dtype, 3) compound dtype. All of these have a determined `itemsize`, so the storage of array is fixed - by shape and `itemsize`. `object` dtype arrays are a lot like lists, with similar processing speed. As with lists the objects can themselves be arrays (or lists) of objects. Compound dtypes can be nested, but not in any sort of flexible or recursive sense. – hpaulj Nov 04 '22 at 19:53
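
The flat-array idea from the first comment can be sketched in plain C. This is only an illustration, not NumPy API: `FlatNode`, `FlatTree`, `MAX_CHILDREN`, `NO_CHILD` and the `flat_tree_*` helpers are hypothetical names, and it assumes at most two direct children per node and a caller-provided buffer. Each node is a fixed-size record that refers to its children by index into the same array, so no per-node `malloc` is needed and nothing has to be freed node by node.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CHILDREN 2     /* assumed fixed number of direct children */
#define NO_CHILD     (-1)  /* sentinel index meaning "no child in this slot" */

/* One tree node: fixed size and pointer-free, so it survives a plain copy. */
typedef struct {
    double  value;
    int32_t children[MAX_CHILDREN];  /* indices into the same node array */
} FlatNode;

/* A whole tree stored in one caller-provided buffer of FlatNode records. */
typedef struct {
    FlatNode *nodes;
    int32_t   count;
    int32_t   capacity;
} FlatTree;

static void flat_tree_init(FlatTree *t, FlatNode *storage, int32_t capacity) {
    t->nodes = storage;
    t->count = 0;
    t->capacity = capacity;
}

/* Append a childless node; returns its index, or NO_CHILD if the buffer is full. */
static int32_t flat_tree_add(FlatTree *t, double value) {
    if (t->count >= t->capacity) {
        return NO_CHILD;
    }
    FlatNode *n = &t->nodes[t->count];
    n->value = value;
    for (int i = 0; i < MAX_CHILDREN; i++) {
        n->children[i] = NO_CHILD;
    }
    return t->count++;
}

/* Store `child` in slot `slot` of `parent`; returns 0 on success, -1 on bad arguments. */
static int flat_tree_link(FlatTree *t, int32_t parent, int slot, int32_t child) {
    if (parent < 0 || parent >= t->count) return -1;
    if (child < 0 || child >= t->count) return -1;
    if (slot < 0 || slot >= MAX_CHILDREN) return -1;
    t->nodes[parent].children[slot] = child;
    return 0;
}

/* Recursively sum the values in the subtree rooted at index `root`. */
static double flat_tree_sum(const FlatTree *t, int32_t root) {
    if (root == NO_CHILD) {
        return 0.0;
    }
    assert(root >= 0 && root < t->count);
    const FlatNode *n = &t->nodes[root];
    double total = n->value;
    for (int i = 0; i < MAX_CHILDREN; i++) {
        total += flat_tree_sum(t, n->children[i]);
    }
    return total;
}
```

Because every record has the same size, a layout like this fits the fixed-`itemsize` constraint described in the last comment: the node buffer could plausibly live in an ordinary array with a compound dtype, and the indices remain valid when the array is copied. The trade-off is that a tree is then a whole array rather than a single array element.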

0 Answers