3

I have the following codes in Python:

class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0 for x in range(n)]

    def find(self, v):
        if v != self.parent[v]:
            self.parent[v] = self.find(self.parent[v])
        return self.parent[v]

The rest of the codes are similar in terms of the 'code complexity' and are not included here.

So I want to translate the above code into cython code (I know a little bit of C++ and I manage to translate all my code into C++, but I wanna try cython and see how it compares with c++ and python). I have something like:

disjointset.pyx:

# distutils: language=c++
from libcpp.vector cimport vector

cdef class DisjointSet:
    cpdef public vector[int] parent, rank

    def __init__(self, int n):
        for i in range(n):
            self.parent.push_back(i)
            self.rank.push_back(0)

    def find(self, int v):
        if v != self.parent[v]:
            self.parent[v] = self.find(self.parent[v])
        return self.parent[v]

setup.py:

from distutils.core import setup
from Cython.Build import cythonize

setup(
    ext_modules = cythonize("cPercolation.pyx", annotate=True)
)

And I run python setup.py build_ext --inplace in the Windows powershell to compile the code. However, when I import the code and try it in Python, it sometimes gives error (process doesn't return 0) and sometimes gives RecursionError when I call the find method. So what's the correct way to translate the above code? I have read the official documentation and I'm still not sure about things like cdef, cpdef.

Edit: I've added the for loop to fix the problem, but still how should I improve the cython code? When I look at the html file generated there are still lots of yellow highlights (python interaction). Specifically, I want to ask how I should use cdef, cpdef to make the class methods (DisjointSet.find) more like C++ code.

Physicist
  • 2,848
  • 8
  • 33
  • 62
  • And for which input does it crash? – ead Aug 21 '18 at 03:32
  • Your python and cython code do different things: after the initialization there is only one element in self.parent and not n, as it is case for python version – ead Aug 21 '18 at 03:34
  • I have spotted my mistakes and fixed it by adding for loop, but I still don't know how I should optimize my code to make it more like c++ using cython. – Physicist Aug 21 '18 at 03:53

1 Answers1

6

C++ vector operator [] does not check bounds, out of bound access gives a random value, which resulting segment fault in subsequent vector access, you will notice a non-zero exit code.

Instead, use .at(), which has bounds checking, cython will translate std::out_of_range exception into IndexError:

 def find(self, int v):
     try:
         pv = self.parent.at(v)
     except IndexError:
         return None
     ...
georgexsh
  • 15,984
  • 2
  • 37
  • 62