
I'm trying to compile this kind of code:

def my_func(double c, int m):
    cdef double f[m][m]

    f = [[c for x in range(m)] for y in range(m)]
    ...

which raises:

Error compiling Cython file:
------------------------------------------------------------
def grow(double alpha, double beta, double gamma, int m, int s):
    cdef double f[m][m]
                     ^
------------------------------------------------------------
test.pyx:6:22: Not allowed in a constant expression

So I assume I can't use a variable at the indicated place, and I try a numeric value instead:

def my_func(double c, int m):
    cdef double f[500][500]

    f = [[c for x in range(500)] for y in range(500)]
    ...

but then I get:

Error compiling Cython file:
------------------------------------------------------------
    f = [[beta for x in range(500)] for y in range(500)]
     ^
------------------------------------------------------------
test.pyx:13:6: Assignment to non-lvalue 'f'

So, I'm wondering how to declare and create a 2D list in Cython code. I couldn't find this kind of example in the documentation or by googling for "cython 2D list".

theta
  • Well, if I leave declaration out, I get compiled code out, so I guess my declaration is wrong – theta Jan 02 '13 at 09:11
  • Do you actually want a list of lists, or a 2d C array? –  Jan 02 '13 at 09:26
  • Yes, it's like it's written. I'm trying to speedup very slow Python code that loops over each element of this (and two more) lists. Just imagine how slow that is. – theta Jan 02 '13 at 09:29
  • That was an either-or question. You declared a 2d C array, but use Python lists to initialize `f`, so I'm trying to find out whether you confuse the two (or aren't even aware of the difference, as your language indicates) or want a specific one and are just using wrong syntax. –  Jan 02 '13 at 09:31
  • Yes learning Cython :) In documentation I saw object, that appeared to me as Python list, declared as `p[1000]` so I thought I should declare list like that. Should I try to declare or lists doesn't need to be declared? I saw the example here: http://docs.cython.org/src/userguide/tutorial.html#primes – theta Jan 02 '13 at 09:34

2 Answers


Do not use a list comprehension here. It gives no speedup in Cython, because it still creates a regular Python list. The wiki says that you should use dynamic allocation in Cython, as follows:

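from libc.stdlib cimport malloc, free

def my_func(double c, int m):
    cdef int x
    cdef int y
    cdef double *my_array = <double *>malloc(m * m * sizeof(double))
    # malloc returns NULL on failure, so check before touching the buffer
    if not my_array:
        raise MemoryError()

    try:
        for y in range(m):
            for x in range(m):
                # Row-major array access: row y starts at offset y * m
                my_array[x + y * m] = c

        # Do something with my_array

    finally:
        # Always release the buffer, even if the code above raises
        free(my_array)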

But if you need a Python object for a 2D array, it's recommended to use NumPy.
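The row-major indexing used above (`x + y * m`) can be tried out in plain Python before moving to `malloc`. This is only a sketch: it uses the standard library's `array` module as a stand-in for the C buffer, and the helper names `make_flat_grid`, `get_item` and `set_item` are made up for illustration, not part of Cython or of the answer's code.

```python
from array import array


def make_flat_grid(c, m):
    """Return one flat buffer of m * m doubles, every element set to c."""
    return array('d', [c]) * (m * m)


def get_item(grid, m, x, y):
    """Read column x of row y using row-major indexing."""
    return grid[x + y * m]


def set_item(grid, m, x, y, value):
    """Write column x of row y using row-major indexing."""
    grid[x + y * m] = value


m = 4
grid = make_flat_grid(1.5, m)
set_item(grid, m, 2, 3, 9.0)   # column 2 of row 3
print(get_item(grid, m, 2, 3))
print(len(grid))
```

The point is that a "2D" array is a single block of `m * m` values, and the two indices are folded into one offset by hand.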

Arpegius
  • Thanks for your snippet, but I'm just afraid when I see `malloc` and similar terms. I'm not there yet, and it's easier for me to use f2py, but wanted to start with Cython for a long time. As I replied I'll try now with Cython and Numpy arrays instead Cython and Python lists. Thanks – theta Jan 02 '13 at 10:18
cdef double f[500][500]

This declares a C array of 500 C arrays of 500 doubles. That's 500 * 500 packed double values (stored on the stack in this case, unless Cython does something funky) without any indirection, which aids performance and cache utilization, but obviously adds severe restrictions. Maybe you want this, but you should learn enough C to know what that means first. By the way, one restriction is that the size must be a compile-time constant, which is what the first error message is about. (Whether C itself allows a variable size depends on the C version: C99 has variable-length arrays, and C11 makes them optional.)

If you do use arrays, you don't initialize f the way you did, because that doesn't make any sense. f is already 500x500 double variables, and arrays as a whole can't be assigned to (which is what the latter error message is trying to tell you). In particular, a list comprehension creates a full-blown Python list object (which you could also use from Cython, see below) containing full-blown "boxed" Python objects (float objects, in this case). A list is not compatible with a C array. Use a nested for loop with item assignment for the initialization. Lastly, such an array takes 500 * 500 * 8 bytes, which is almost 2 MiB. On some systems, that's larger than the entire stack, and on all other systems, it's such a large portion of the stack that it's a bad idea. You should heap-allocate that array.
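To make the size arithmetic and the packed-versus-boxed distinction concrete, here is a small plain-Python sketch. It assumes a platform where a C double is 8 bytes (true on common systems) and uses the standard library's `array` module to stand in for a packed C buffer. The variable names are illustrative only.

```python
from array import array

n = 500

# Packed storage: n * n raw doubles in one contiguous block, no per-item objects.
packed = array('d', [0.0]) * (n * n)
packed_bytes = packed.itemsize * len(packed)
packed_mib = packed_bytes / 2**20

# Boxed storage: a list of lists whose items are full Python float objects.
boxed = [[0.0 for x in range(n)] for y in range(n)]

print(packed_bytes, "bytes, about", round(packed_mib, 2), "MiB")
print(type(packed[0]), type(boxed[0][0]))
```

Reading an item from either container gives a Python `float`, but only the list stores a reference to a separate object per element. The packed buffer holds just the raw 8-byte values.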

If you use a Python list, be warned that you won't get much improvement in performance and memory use (assuming your code will mostly be manipulating that list), though you may gain back some convenience in return. You could just leave off the cdef, or use list as the type (object should work too, but you gain nothing from it, so you might as well omit it).
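A minimal sketch of the "leave off the cdef" option: with no declaration, `f` is an ordinary Python list of lists, and the original comprehension works unchanged. This is plain Python that should also compile as Cython. The function name `make_grid` is made up for illustration.

```python
def make_grid(c, m):
    """Build an m x m list of lists filled with c. No cdef array declaration."""
    f = [[c for x in range(m)] for y in range(m)]
    return f


f = make_grid(0.5, 3)
f[1][2] = 7.0   # item assignment works as on any Python list
print(f)
```

Because `f` is a normal Python object, it can be returned to calling Python code directly, at the cost of the boxing overhead described above.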

A NumPy array may be faster, more memory-efficient, and more convenient to use. If you can implement the performance-critical parts of your algorithm in terms of NumPy's operations, you may gain the desired speedup without using Cython at all.

  • Thanks for your explanation. I guess I should look for a C book a side, and same advice floats just by browsing Cython documentation. So until I learn more about C types and operations, I just use list in this example. C arrays should be more efficient I assume, but then they can't be returned from Cython function to Python code, but just use inside Cython code? Speedup is just 3x for now, but I'll try to go further, first with Numpy and Cython, as using just Numpy arrays instead lists does not improve performance for this example code if it's for believing. – theta Jan 02 '13 at 10:14