
Is there a Python or NumPy approach similar to MATLAB's "cellfun"? I want to apply a function to an object which is a MATLAB cell array with ~300k cells of different lengths.

A very simple example:

>>> import numpy as np
>>> xx = [(4,2), (1,2,3)]
>>> yy = np.exp(xx)
Traceback (most recent call last):
  File "<pyshell#47>", line 1, in <module>
    yy = np.exp(xx)
AttributeError: 'tuple' object has no attribute 'exp'
rakbar

2 Answers


The most readable/maintainable approach will probably be to use a list comprehension:

yy = [ np.exp(xxi) for xxi in xx ]

That relies on numpy.exp implicitly converting each tuple into a numpy.ndarray, which in turn means that you'll get back a list of numpy.ndarrays rather than a list of tuples. That's probably OK for nearly all purposes, but if you absolutely have to have tuples, that's also easy enough to arrange:

yy = [ tuple(np.exp(xxi)) for xxi in xx ]

For some purposes (e.g. to avoid memory bottlenecks) you may prefer to use a generator expression rather than a list comprehension (round brackets instead of square).
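As a minimal sketch of the generator-expression variant, assuming the same xx as in the question (the names yy_gen and sums are only illustrative), the per-cell arrays are produced one at a time and can be reduced without holding all of them in memory:

```python
import numpy as np

xx = [(4, 2), (1, 2, 3)]  # ragged input, as in the question

# Round brackets make this a generator: nothing is computed yet.
yy_gen = (np.exp(xxi) for xxi in xx)

# Consume the generator lazily, here reducing each cell to a single number
# so that only one intermediate array exists at a time.
sums = [float(arr.sum()) for arr in yy_gen]
```

Note that a generator can be consumed only once; if the results are needed again, the list comprehension is the simpler choice.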

jez
  • may also want to consider using math.exp rather than np.exp, as there may be little to gain from np.exp – Lewis Fogden Oct 12 '16 at 20:09
  • @LewisFogden could do, but I think then you'd have to explicitly iterate over the inner tuples as well as the outer list. – jez Oct 13 '16 at 01:18
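For illustration, the math.exp variant discussed in these comments might look like the sketch below. It iterates explicitly over the inner tuples as well as the outer list, and returns plain tuples of Python floats rather than NumPy arrays:

```python
import math

xx = [(4, 2), (1, 2, 3)]  # ragged input, as in the question

# math.exp only accepts scalars, so both levels need an explicit loop:
# the outer comprehension walks the cells, the inner one walks each tuple.
yy = [tuple(math.exp(v) for v in xxi) for xxi in xx]
```

Whether this is faster than np.exp likely depends on how long the individual cells are, so it is worth timing on representative data.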

MATLAB cells were its attempt to handle general lists like a real language, but, being MATLAB, they have to be 2-D. In general, Python uses lists where MATLAB uses cells. NumPy arrays with dtype=object behave similarly, and add support for multiple dimensions.

Taking the object array route, I can use frompyfunc to apply this function to elements of a list or array:

In [231]: np.frompyfunc(np.exp,1,1)([(4,2),(1,2,3)])
Out[231]: 
array([array([ 54.59815003,   7.3890561 ]),
       array([  2.71828183,   7.3890561 ,  20.08553692])], dtype=object)
In [232]: np.frompyfunc(np.exp,1,1)([(4,2),(1,2)])
Out[232]: 
array([[54.598150033144236, 7.3890560989306504],
       [2.7182818284590451, 7.3890560989306504]], dtype=object)

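In the second case the result has shape (2,2); in the first it has shape (2,). That's because of how np.array([...]) handles those two inputs: tuples of equal length are stacked into a 2-D array, so the function is applied to each scalar, while tuples of different lengths become a 1-D object array, so the function is applied to each tuple as a whole.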
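The shape difference can be checked directly on the inputs. This is a small sketch, not part of the original session; it passes dtype=object explicitly, which is an assumption added here because newer NumPy versions may refuse to build an array from ragged input without it:

```python
import numpy as np

# Tuples of different lengths: NumPy cannot stack them, so the result
# is a 1-D object array with one element per tuple.
ragged = np.array([(4, 2), (1, 2, 3)], dtype=object)

# Tuples of equal length: NumPy stacks them into a 2-D array, so each
# scalar becomes its own element.
uniform = np.array([(4, 2), (1, 2)], dtype=object)

print(ragged.shape, uniform.shape)
```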

List comprehensions are just as fast and probably give better control, or at least more predictable results.
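If an object array is still wanted as the container, one way to keep the shape predictable is to allocate it first and fill it in a loop. The helper name cellfun_like below is hypothetical, a sketch rather than an existing NumPy function:

```python
import numpy as np

def cellfun_like(func, cells):
    """Hypothetical helper: apply func to each cell, return a 1-D object array."""
    # Pre-allocating with dtype=object fixes the shape at (len(cells),),
    # regardless of whether the cells happen to have equal lengths.
    out = np.empty(len(cells), dtype=object)
    for i, cell in enumerate(cells):
        out[i] = func(cell)
    return out

ragged = cellfun_like(np.exp, [(4, 2), (1, 2, 3)])
uniform = cellfun_like(np.exp, [(4, 2), (1, 2)])
```

Both results have shape (2,), with each element holding the array returned by np.exp for that cell.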

hpaulj