2

How can I write a wrapper class that makes this work?

def foo(a, b):
    print(a)

data = np.empty(20, dtype=[('a', np.float32), ('b', np.float32)])

data = my_magic_ndarray_subclass(data)

foo(**data[0])

Some more background:

I had a pair of functions like this that I wanted to vectorize:

def start_the_work(some_arg):
    some_calculation = ...
    something_else = ...
    some_other_calculation = ...

    cost = some_calculation * something_else

    return cost, dict(
        some_calculation=some_calculation,
        some_other_calculation=some_other_calculation
    )

def finish_the_work(some_arg, some_calculation, some_other_calculation):
    ...

The intent is that start_the_work is called with a bunch of different arguments, and then the lowest-cost item is completed. Both functions use many of the same calculations, so a dictionary and kwarg-splatting are used to pass those results on:

def run():
    best, best_cost, continuation = min(
        ((some_arg,) + start_the_work(some_arg)
         for some_arg in [1, 2, 3, 4]),
        key=lambda t: t[1]  # cost
    )
    return finish_the_work(best, **continuation)

One way I can vectorize them is as follows:

def start_the_work(some_arg):
    some_calculation = ...
    something_else = ...
    some_other_calculation = ...

    cost = some_calculation * something_else

    continuation = np.empty(cost.shape, dtype=[
        ('some_calculation', np.float32),
        ('some_other_calculation', np.float32)
    ])
    continuation['some_calculation'] = some_calculation
    continuation['some_other_calculation'] = some_other_calculation

    return cost, continuation

But despite looking like a dictionary, continuation cannot be kwarg-splatted.

Eric
  • `foo(*data[0])` works because a record of a structured array behaves (for iteration purposes) like a tuple. – hpaulj Apr 15 '16 at 22:34
  • What behaviour does a class need to have to work as a `**kwarg`? – Eric Apr 16 '16 at 02:45
  • Seems the answer to that question is `keys()` and `__getitem__` – Eric Apr 16 '16 at 03:07
  • `np.ma` might give you ideas on how to implement the `__getitem__`. `np.lib.index_tricks` also has classes that implement `__getitem__` (though they don't subclass array). – hpaulj Apr 16 '16 at 04:36
  • @hpaulj: `__getitem__` is already implemented on structured arrays? All that's needed is `keys()` – Eric Apr 16 '16 at 05:18
  • The array itself implements it; I'm just wondering if you need to do something extra. `**data[0]` presumably is indexing with `0` and with each key, so `{k:data[k] for k in data.keys()}` works, but does `{k:data[0][k] for k in data[0].keys()}`? – hpaulj Apr 16 '16 at 05:42
  • The problem is that `data[0]` does not have a keys method, because it's a scalar type not an array type – Eric Apr 16 '16 at 17:32

3 Answers

2

It may not be exactly what you want, but wrapping the array in a pandas DataFrame allows something like this:

import numpy as np
import pandas as pd

def foo(a, b):
    print(a)

data = np.empty(20, dtype=[('a', np.float32), ('b', np.float32)])

data = pd.DataFrame(data).T

foo(**data[0])
# e.g. 0.0 (np.empty leaves the values uninitialized)

Note that the dataframe is transposed, because pandas' primary index is the column rather than the row.

jakevdp
  • That's worth knowing, thanks! I don't think pandas is really the right tool for most of what I'm doing though – Eric Apr 15 '16 at 04:49
1

Are you thinking that, because the fields of a structured array can be accessed by name, they might pass as the items of a dictionary?

In [26]: x=np.ones((3,),dtype='i,f,i')

In [27]: x
Out[27]: 
array([(1, 1.0, 1), (1, 1.0, 1), (1, 1.0, 1)], 
      dtype=[('f0', '<i4'), ('f1', '<f4'), ('f2', '<i4')])

In [28]: x['f0']
Out[28]: array([1, 1, 1])

Converting it to a dictionary works:

In [29]: dd={'f0':x['f0'], 'f1':x['f1'], 'f2':x['f2']}

In [30]: def foo(**kwargs):
    ...:     print(kwargs)
    ...:     

In [31]: foo(**dd)
{'f0': array([1, 1, 1]), 'f1': array([ 1.,  1.,  1.], dtype=float32), 'f2': array([1, 1, 1])}

In [32]: foo(**x)  # the array itself won't work
...
TypeError: foo() argument after ** must be a mapping, not numpy.ndarray 

Or using a dictionary comprehension:

In [34]: foo(**{name:x[name] for name in x.dtype.names})
{'f0': array([1, 1, 1]), 'f1': array([ 1.,  1.,  1.], dtype=float32), 'f2': array([1, 1, 1])}

** unpacking requires the object to provide a .keys() method as well as __getitem__. An array has __getitem__ but no .keys().
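
To make that requirement concrete, here is a minimal sketch in plain Python. The class name Splattable and the helper show are made up for illustration; the point is only that an object that is not a dict is accepted by ** once it offers keys() and __getitem__.

```python
class Splattable:
    """Hypothetical minimal class: not a dict, but accepted by ``**``."""

    def __init__(self, **fields):
        self._fields = fields

    def keys(self):
        # ``**`` calls this to discover the argument names.
        return list(self._fields)

    def __getitem__(self, name):
        # ``**`` then looks up each name returned by keys().
        return self._fields[name]


def show(**kwargs):
    return kwargs


result = show(**Splattable(a=1, b=2))
print(result)
```

Removing the keys method from the class should bring back the "must be a mapping" TypeError shown above.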


The element of a structured array is np.void:

In [163]: a=np.array([(1,2),(3,4)],dtype='i,i')

In [164]: a[0]
Out[164]: (1, 2)

In [165]: type(a[0])
Out[165]: numpy.void

It has a dtype and names:

In [166]: a[0].dtype.names
Out[166]: ('f0', 'f1')

In [167]: [{k:b[k] for k in b.dtype.names} for b in a]
Out[167]: [{'f0': 1, 'f1': 2}, {'f0': 3, 'f1': 4}]

With an array subclass like yours, a view gains a keys method:

class spArray(np.ndarray):
    def keys(self):
       return self.dtype.names

In [171]: asp=a.view(spArray)

In [172]: asp
Out[172]: 
spArray([(1, 2), (3, 4)], 
      dtype=[('f0', '<i4'), ('f1', '<i4')])

In [173]: asp.keys()
Out[173]: ('f0', 'f1')

Other ways of constructing this class (e.g. calling it directly) don't work; that's part of the complexity of subclassing ndarray.

def foo(**kwargs):
    print(kwargs)

In [175]: foo(**asp)
{'f0': spArray([1, 3]), 'f1': spArray([2, 4])}

In [176]: foo(**asp[0])
 ...
TypeError: foo() argument after ** must be a mapping, not numpy.void 

In [177]: foo(**asp[[0]])
{'f0': spArray([1]), 'f1': spArray([2])}

Splatting the array, or a one-element array extracted from it, works, but splatting a single element (here an np.void) does not, because np.void has no keys method.

I tried subclassing np.void the same way you subclassed ndarray. Python accepts the class definition, but I can't find a way of creating an instance of it.
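
One way around that, sketched here rather than offered as a finished solution, is to leave np.void alone and wrap the record in a small read-only mapping. RecordMapping is a hypothetical name, and the sample values are made up so the result is predictable.

```python
from collections.abc import Mapping

import numpy as np


class RecordMapping(Mapping):
    """Hypothetical read-only mapping over one structured-array record."""

    def __init__(self, record):
        self._record = record

    def __getitem__(self, name):
        # Raise KeyError (not NumPy's own error) for unknown field names,
        # which is what the Mapping protocol expects.
        if name not in self._record.dtype.names:
            raise KeyError(name)
        return self._record[name]

    def __iter__(self):
        return iter(self._record.dtype.names)

    def __len__(self):
        return len(self._record.dtype.names)


def foo(a, b):
    return a * b


# Illustrative data: zeros instead of np.empty so the values are defined.
data = np.zeros(3, dtype=[('a', np.float32), ('b', np.float32)])
data['a'] = [1, 2, 3]
data['b'] = [10, 20, 30]

product = foo(**RecordMapping(data[1]))
print(product)
```

The cost is that the caller has to wrap each record explicitly, so this does not by itself make data[0] splattable.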

hpaulj
  • The issue here is that I have a base class API that expects a kwarg-splattable type, and that code needs to keep working when a dictionary is returned. I'd prefer not to special case numpy arrays there, and instead am wondering if I can add kwargs-splatting with `data.view(my_class)` – Eric Apr 15 '16 at 18:42
  • It's awkward to modify the `ndarray` class with a new or modified method. But you can implement this array to dictionary conversion in other ways - as a standalone function, a method of wrapper class, etc. – hpaulj Apr 15 '16 at 19:29
0

This almost works:

import numpy as np

class SplattableArray(np.ndarray):
    def keys(self):
        return self.dtype.names

data = np.empty(20, dtype=[('a', np.float32), ('b', np.float32)])
data_splat = data.view(SplattableArray)

def foo(a, b):
    return a*b

foo(**data_splat)  # works!
foo(**data_splat[0])  # doesn't work :(
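
A possible way to close that gap without touching NumPy's own types is to override __getitem__ as well, so that a single record comes back as a 0-d array of the subclass instead of an np.void. This is a sketch under that assumption; the class name SplattableRecords and the sample values are made up for illustration.

```python
import numpy as np


class SplattableRecords(np.ndarray):
    """Hypothetical subclass whose single records are also splattable."""

    def keys(self):
        return self.dtype.names

    def __getitem__(self, index):
        result = super().__getitem__(index)
        if isinstance(result, np.void):
            # Re-wrap the scalar record as a 0-d array of this subclass so
            # it keeps keys(). Treat the result as read-only: sharing memory
            # with the original array is not relied on here.
            result = np.asarray(result).view(type(self))
        return result


def foo(a, b):
    return a * b


# Illustrative data: zeros instead of np.empty so the values are defined.
data = np.zeros(3, dtype=[('a', np.float32), ('b', np.float32)])
data['a'] = [1, 2, 3]
data['b'] = [10, 20, 30]
data_splat = data.view(SplattableRecords)

whole = foo(**data_splat)      # splats the whole array, as before
single = foo(**data_splat[1])  # splats one record
print(whole, single)
```

The values passed to foo for a single record are 0-d arrays rather than NumPy scalars, which matters only if the callee checks types strictly.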

If we're willing to be terrible people, then this works:

from forbiddenfruit import curse
import numpy as np

def keys(obj):
    return obj.dtype.names

curse(np.void, 'keys', keys)
curse(np.ndarray, 'keys', keys)

data = np.empty(10, dtype='i,i')
def foo(**kwargs):
    return kwargs

foo(**data[0])
foo(**data)
Eric