3

I want to use a typed memoryview for optimizing a function, but I don't what would be the argument type. It could be an numpy array or even a scalar. How should I use typed memoryview then?

AstroCB
  • 12,337
  • 20
  • 57
  • 73
bewithaman
  • 768
  • 8
  • 16
  • 2
    this question sounds a little like: `I want to use static typing, but I don't know the type at compile time` – cel Apr 07 '15 at 07:34
  • @cel Yes. That sounds the same. – bewithaman Apr 07 '15 at 07:49
  • 1
    I think what cel may have been hinting is that your two requirements are mutually contradictory. However, one approach would be: 1) define an "implementation" function, that operates on a 1D memoryview. 2) Define a wrapper function that operates on any python object. a) If it's passed a 1D memoryview, call the implementation function; b) if it's passed a scalar, make a 1x1 array and call the implementation function; c) If it's passed a multi-D array then either flatten it for the implementation function, or iterate over the rows, calling the implementation function for each row. – DavidW Apr 07 '15 at 17:03
  • 2
    Alternatively, you could look up how to make numpy "ufuncs" in Cython, which probably do what you want. However, it is potentially quite tricky. – DavidW Apr 07 '15 at 17:05
  • @DavidW for me you could add your comment as an answer... that pretty much solves the OP's needs – Saullo G. P. Castro May 03 '15 at 12:31
  • @SaulloCastro I didn't post it as an answer at the time because I didn't want to write the code to illustrate it (and I think it probably does need code to be a proper answer). I've posted it now though. – DavidW May 05 '15 at 13:51

1 Answers1

2

The issue with this sort of problem is that Python on dynamically typed, so you're always going to lose speed when picking which code-path to take. However, in principle you can make the individual code-paths pretty quick. An approach which might get you good results would be:

  1. define an "implementation" function, that operates on a 1D memoryview.
  2. Define a wrapper function that operates on any python object.
    1. If it's passed a 1D memoryview, call the implementation function;
    2. if it's passed a scalar, make a 1x1 array and call the implementation function;
    3. If it's passed a multi-D array then either flatten it for the implementation function, or iterate over the rows, calling the implementation function for each row.

A quick implementation follows. This assumes that you want a function applied to every element of the input array (and want an output array the same size). The illustrative function I've chosen just adds 1 to each value. It also uses numpy in places (rather than just typed memoryviews) which I think is reasonable:

cimport cython
import numpy as np
import numbers

@cython.boundscheck(False)
cdef double[:] _plus_one_impl(double[:] x):
  cdef int n
  cdef double[:] output

  output = x.copy()
  for n in range(x.shape[0]):
    output[n] = x[n]+1
  return output

def plus_one(x):
  if isinstance(x,numbers.Real): # check if it's a number
    return _plus_one_impl(np.array([x]))[0]
  else:
    try:
      return _plus_one_impl(x)
    except ValueError: # this gets thrown if conversion fails
      if len(x.shape)<2:
        raise ValueError('x could not be converted to double [:]')
      output = np.empty_like(x) # output is all numpy, whatever the input is
      for n in range(x.shape[0]): # this loop isn't typed, so is likely to be pretty slow
        output[n,...] = plus_one(x[n,...])
      return output

This code is likely to end up somewhat slow in some cases (i.e. a 2D array with a short second dimension).

However, my real recommendation is to look into numpy ufuncs, which provide an interface for achieving this kind of thing efficiently. (See http://docs.scipy.org/doc/numpy-dev/user/c-info.ufunc-tutorial.html). Unfortunately they are a more complicated than Cython.

DavidW
  • 29,336
  • 6
  • 55
  • 86