
I'm new to Cython, and I'm trying to write an algorithm that needs to repeatedly sort partially sorted arrays. It appears that Python's standard sort (timsort?) is quite good for this, but I haven't figured out how to call this function from inside a cythonized function.

Namely, I want to do something like:

cdef void myfunc(double* y) nogil:
    double *y_sort = sort(y)

Any pointers on how to do this would be appreciated.

firdaus
  • Where is this array coming from? Is it a NumPy array, or a standard library `array.array`, or is it coming from a library written in C, or what? At the very least, you're going to need a length as well as a pointer into the array. Also, Python's built-in sort requires a Python iterable of Python objects, so you can't make it sort C doubles without a lot of wrapper overhead. – user2357112 Jan 29 '16 at 23:33
  • I'm trying to write a class that matches the interface exposed by scikit-learn's RegressionCriterion class in sklearn.tree._criterion. The original array is a numpy array, but somewhere higher up in scikit-learn's tree module (namely sklearn.tree._splitter) it gets converted into a double* type. Preferably, I would like a highly-optimized library sort, rather than writing one myself. – firdaus Jan 30 '16 at 00:03
  • In accordance with the http://stackoverflow.com/users/1945981/pfnuesel request, here are my doubts: (1) Do you really need `double *` instead of `double[:]`? Because you will need to know the vector length. (2) Do you need a new vector/array/memoryview, or can you sort the original one? – Arĥimedeς ℳontegasppα ℭacilhας Jan 30 '16 at 10:10

1 Answer


The standard C library provides qsort, which sorts the array in place and can be called without the GIL:

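# from libc.stdlib cimport qsort
# ... declaring "const void *" type seems problematic

# https://stackoverflow.com/questions/8353076/how-do-i-pass-a-pointer-to-a-c-fun$
cdef extern from "stdlib.h":
    ctypedef void const_void "const void"
    # nmemb and size are size_t in the C prototype
    void qsort(void *base, size_t nmemb, size_t size,
               int(*compar)(const_void *, const_void *)) nogil

# Comparator for qsort: negative, zero or positive for a < b, a == b, a > b.
# It touches only C data, so it can run without the GIL.
# (noexcept is needed by recent Cython versions; drop it if yours rejects it.)
cdef int mycmp(const_void * pa, const_void * pb) noexcept nogil:
    cdef double a = (<double *>pa)[0]
    cdef double b = (<double *>pb)[0]
    if a < b:
        return -1
    elif a > b:
        return 1
    else:
        return 0

# Sorts the l doubles starting at y in place.
cdef void myfunc(double * y, ssize_t l) noexcept nogil:
    qsort(y, l, sizeof(double), mycmp)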

If the array is "almost-sorted", then insertion sort might be better: https://stackoverflow.com/a/2726841/5781248
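For illustration, here is a minimal sketch of such an insertion sort in plain C. It is not taken from the linked answer, and the name `insertion_sort_doubles` is hypothetical. It sorts the array in place in ascending order, and its running time grows with the number of out-of-order pairs, which is why it tends to suit nearly sorted input.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the original answer):
 * sorts y[0..n-1] in place, in ascending order. */
void insertion_sort_doubles(double *y, size_t n)
{
    for (size_t i = 1; i < n; ++i) {
        double key = y[i];
        size_t j = i;
        /* Shift every element larger than key one slot to the right. */
        while (j > 0 && y[j - 1] > key) {
            y[j] = y[j - 1];
            --j;
        }
        /* Drop key into the gap that the shifting opened. */
        y[j] = key;
    }
}
```

Assuming the function is compiled into the extension, it could be declared to Cython in a `cdef extern` block with `nogil`, the same way qsort is declared above, and then called as `insertion_sort_doubles(y, l)`.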

J.J. Hakala