
I am trying to implement a subclass of a numpy recarray (recsub) and assign instances of it to an ndarray of dtype 'object' (ndarr). It works well, but I have a problem when the subclassed recarray is instantiated with an empty shape, (). This is the code for the subclassed recarray:

import numpy

class recsub(numpy.recarray):
    """subclassed recarray"""

    def __new__(cls, *args, **kwargs):
        obj = numpy.recarray.__new__(cls, *args, **kwargs)
        return obj

    def __init__(self, *arg, **kwargs):
        self.x = -1

    def new_method(self):
        print('new_method() : fooooooooooooo')

I create the ndarray as:

ndarr = numpy.ndarray(5, 'object')

Now I create two instances of recsub:

ndarr[0] = recsub(2, [('a','f8')])
ndarr[1] = recsub((), [('a','f8')])

Here is the weird part. The output of:

print(type(ndarr[0]))
print(type(ndarr[1]))

is:

>>> <class '__main__.recsub'>
>>> <class 'numpy.core.records.record'>

So I cannot access ndarr[1].x.

This used to work in numpy 1.7, but no longer works in numpy 1.8! It seems something goes missing when the recarray is instantiated with shape () as opposed to (n).

Any suggestions are welcome.

Thanks in advance,

mher
  • `()` shaped, scalar arrays can behave rather differently, but I suggest you fix your code, `numpy.ndarray(5)` and assigning to that? That *can't* be right. – seberg Dec 05 '13 at 22:56
  • Oh, sorry, that was meant to be `ndarr = numpy.ndarray(5, 'object')`; I just copied it wrong. I'll fix it. – mher Dec 06 '13 at 03:01
  • Hmm, it kinda looks like the item getting calls one layer of scalar conversion code too much for object arrays, but I didn't manage to pinpoint the change responsible. This should be a numpy issue not a SO question IMO... – seberg Dec 06 '13 at 12:16
  • Ah, I found the bug, but I don't like fixing issues based on SO questions :P – seberg Dec 06 '13 at 12:50

1 Answer

I get similar behavior in the 1.9 development version with simpler arrays:

import numpy as np
ndarr = np.ndarray(2, dtype=object)
x = np.array([1,2])
ndarr[0] = x
y = np.array(3)
ndarr[1] = y
type(ndarr[0])
# numpy.ndarray
type(ndarr[1])
# numpy.int32
ndarr
# array([array([1, 2]), 3], dtype=object)

So the array with shape () gets inserted into ndarr as a scalar.

I don't know whether this is a bug, feature, or intended consequence of some change between 1.7 and 1.8. I guess the first place to look is the release notes for 1.8.

This issue may be relevant: https://github.com/numpy/numpy/issues/1679

array([array([]), array(0, object)])
array([array([], dtype=float64), 0], dtype=object)

With the bug fix, https://github.com/numpy/numpy/pull/4109, items that were stored as arrays are returned as arrays (instead of as scalars).

type(ndarr[1])
# <type 'numpy.ndarray'>
ndarr
# [array([1, 2]) array(3)]
# [array([], dtype=float64) array(0, dtype=object)]
# [array([], dtype=float64) 0]

And the OP example runs as expected.
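
For anyone who wants to check this on their own installation, here is a minimal sketch of the question's example in a form that also runs on Python 3. It assumes a NumPy build that includes the fix from the pull request above. Using `np.empty(5, dtype=object)` in place of `numpy.ndarray(5, 'object')` is my substitution, not the original code.

```python
import numpy as np


class recsub(np.recarray):
    """Subclassed recarray that carries an extra attribute, as in the question."""

    def __new__(cls, *args, **kwargs):
        return np.recarray.__new__(cls, *args, **kwargs)

    def __init__(self, *args, **kwargs):
        self.x = -1


# Object array whose slots hold arbitrary Python objects.
ndarr = np.empty(5, dtype=object)
ndarr[0] = recsub(2, [('a', 'f8')])    # shape (2,)
ndarr[1] = recsub((), [('a', 'f8')])   # shape (), zero-dimensional

# Report the type, shape and extra attribute of what comes back out.
for i in (0, 1):
    item = ndarr[i]
    print(type(item).__name__, item.shape, item.x)
```

If the second line reports `record` instead of `recsub` (and the access to `.x` fails), the installed NumPy is presumably one of the affected versions.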

hpaulj
  • Don't let yourself be fooled by the printing, to inspect this here, try the `.item()` method instead. It actually gets inserted just fine ;), not that it matters... – seberg Dec 06 '13 at 13:02
  • It looks like a bug to me, because this kind of behavior defeats the purpose of having an object dtype. – mher Dec 06 '13 at 17:05
  • So the `numpy` bug is in fetching (`ndarr[1]` for print or use), not in the storing. – hpaulj Dec 06 '13 at 21:56
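
The distinction drawn in these comments, storing versus fetching, can be checked with `.item()` as seberg suggests. The following is a small sketch, not a definitive diagnosis. The names `stored` and `fetched` are mine, and it reuses the simpler arrays from the answer.

```python
import numpy as np

ndarr = np.empty(2, dtype=object)
ndarr[0] = np.array([1, 2])
ndarr[1] = np.array(3)      # zero-dimensional array

# .item() hands back the Python object held in the slot directly.
stored = ndarr.item(1)

# Plain indexing goes through NumPy's item-getting code, which is where
# the thread locates the bug on affected versions.
fetched = ndarr[1]

print(type(stored))
print(type(fetched))
```

If the two printed types differ, the array was stored intact and only the fetch through indexing converts it to a scalar, which matches the last comment above.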