I am using the following code to try to work with C++ vectors:

from libcpp.vector cimport vector

cdef struct StartEnd:
    long start, end 

cdef vector[StartEnd] vect
print(type(vect))
cdef int i
cdef StartEnd j
k = {}
k['hi'] = vect
for i in range(10):
    j.start = i 
    j.end = i + 2 
    k['hi'].push_back(j)
for i in range(10):
    print(k['hi'][i])

The exact functionality here isn't important; this is just a dummy program. The issue is that running it raises the error `AttributeError: 'list' object has no attribute 'push_back'`. The code works if there is no dictionary, but I think the dictionary is necessary for my use case. Is there a way to make this work?

I do not want to be copying vectors back and forth as these vectors will get to be tens of millions of entries long. Maybe I can store pointers to the vector instead?

Mike D

1 Answer

The C++ vector is automatically converted to a Python list at the Cython/Python boundary (hence the error message you see). A Python dict stores Python objects, not C++ vectors. Create a cdef class that holds a C++ vector and put that in the dict instead:

from libcpp.vector cimport vector

# same struct as in the question, repeated so the snippet stands alone
cdef struct StartEnd:
    long start, end

cdef class VecHolder:
    cdef vector[StartEnd] wrapped_vector

    # the easiest thing to do is add short wrappers for the methods you need
    def push_back(self, obj):
        self.wrapped_vector.push_back(obj)

cdef int i
cdef StartEnd j
k = {}
k['hi'] = VecHolder()
for i in range(10):
    j.start = i
    j.end = i + 2
    # this calls the Python-level wrapper method, which then calls the C++ push_back
    k['hi'].push_back(j)
DavidW
  • Great answer, thank you. I got the lookup portion to work by making the wrapped_vector variable public and then doing `print(k['hi'].wrapped_vector[i])` – Mike D Dec 15 '15 at 22:03
  • Bear in mind that this copies the vector to a temporary list each time you do it (which may be slow!). Define a `__getitem__(self, idx)` on `VecHolder` to avoid that (although you still need to think about how you print `StartEnd`...) – DavidW Dec 15 '15 at 22:06
  • Thanks, I did that and it worked. One thing is strange, though: when I work directly with the StartEnd struct I access the components with a dot (e.g. `j.start = i`), but when I get the structure back from the vector it is a dictionary, and I have to access it like this: `k['hi'][i]['start']`, not `k['hi'][i].start`. I am not sure yet whether this means something is going wrong (i.e. the structure has been converted into a Python object instead of being preserved as a C structure). Any thoughts? – Mike D Dec 15 '15 at 23:02
  • Yes, it's being converted into a Python dictionary matching the structure. I didn't realise it did that! The big problem there is that `k['hi'][i]['start']=5` won't actually change anything. I think there are two solutions: 1) you work using `wrapped_vector` directly (you may have to do it as two lines with a typed variable: `cdef VecHolder vh = k['hi']; vh.wrapped_vector[i].start`) – DavidW Dec 16 '15 at 07:39
  • Or 2) you have `__getitem__` return a small wrapper around `StartEnd` which stores a reference to the `VecHolder` and the index, and uses properties to access the underlying `start` and `end` (see http://stackoverflow.com/a/33679481/4657412 for an implementation: look at `PyPeak2` and ignore the "move" stuff). Doing this kind of stuff well always ends up being more trouble than it seems! – DavidW Dec 16 '15 at 07:47
  • @DavidW Doesn't this create insane amounts of overhead? My `__getitem__` and `__setitem__` become all yellow, even with bounds checking turned off and type annotations added. Asked a Q here: https://stackoverflow.com/questions/52706022/removing-python-overhead-when-wrapping-c-vectors – The Unfun Cat Oct 08 '18 at 15:51
  • @TheUnfunCat The `__getitem__` and `__setitem__` are really just a Python interface. For Cython you access the vector directly. They definitely have overhead (that you can avoid in Cython) – DavidW Oct 08 '18 at 17:20
  • @DavidW I get the error `Vector object does not support indexing` when I try `vec[i] = 5` – The Unfun Cat Oct 08 '18 at 17:30
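
The second option from the comments (having `__getitem__` return a small wrapper that stores a reference to the holder plus an index) can be sketched in plain Python. This is only a model of the idea, not Cython code: a flat `array('q')` stands in for `vector[StartEnd]`, and the names `StartEndRef` and `_data`, as well as the two-argument `push_back(start, end)`, are assumptions made for illustration. In real Cython the property bodies would read and write `wrapped_vector[idx].start` and `.end` instead.

```python
from array import array


class StartEndRef:
    """Lightweight view of one (start, end) record inside a VecHolder.

    Hypothetical helper: it stores only the holder and an index, so reads
    and writes go to the shared storage rather than to a copy.
    """

    __slots__ = ("_holder", "_idx")

    def __init__(self, holder, idx):
        self._holder = holder
        self._idx = idx

    @property
    def start(self):
        return self._holder._data[2 * self._idx]

    @start.setter
    def start(self, value):
        self._holder._data[2 * self._idx] = value

    @property
    def end(self):
        return self._holder._data[2 * self._idx + 1]

    @end.setter
    def end(self, value):
        self._holder._data[2 * self._idx + 1] = value


class VecHolder:
    """Pure-Python stand-in for the cdef class from the answer."""

    def __init__(self):
        # Assumption: a flat array of signed 64-bit ints models vector[StartEnd],
        # laid out as start0, end0, start1, end1, ...
        self._data = array("q")

    def push_back(self, start, end):
        self._data.extend((start, end))

    def __len__(self):
        return len(self._data) // 2

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError("VecHolder index out of range")
        # Return a view, not a copy of the record
        return StartEndRef(self, idx)


k = {"hi": VecHolder()}
for i in range(10):
    k["hi"].push_back(i, i + 2)

k["hi"][3].start = 5  # writes through to the shared storage
print(k["hi"][3].start, k["hi"][3].end)
```

Because the wrapper holds no data of its own, `k['hi'][i].start = 5` actually changes the stored record, unlike the dict copy discussed above. Each lookup still creates a small Python object, so as noted in the later comments this is an interface for Python callers; tight loops in Cython should go through a typed `VecHolder` variable and the vector directly.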