Imagine a NumPy array of shape N x M. Each cell contains a structured array with X elements, each carrying an x_label.

I would like to access a specific x_label so it returns a N x M array only containing the value of the label of interest.

Is there a way to do so without having to use a for loop (or a np.map() function) and creating a new array?

Example:

import numpy as np
arr = np.array([[[],[]],
                [[],[]]])

# Each cell contains:
np.array([('par1', 'par2', 'par3')], dtype=[('label_1', 'U10'), ('label_2', 'U10'), ('label3', 'U10')])

How can I get a 2x2 np.array returned with the par1 values only? I have tried unsuccessfully:

arr['label_1']
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

Thank you!

1 Answer

I'm assuming your outer array has object dtype; otherwise indexing by field name would just work:

>>> x = np.array([('par1', 'par2', 'par3')], dtype=[('label_1', 'U10'), ('label_2', 'U10'), ('label3', 'U10')])
>>> Y = np.array(4*[x]+[None], dtype=object)[:-1].reshape(2,2)
>>> Y
array([[array([('par1', 'par2', 'par3')],
      dtype=[('label_1', '<U10'), ('label_2', '<U10'), ('label3', '<U10')]),
        array([('par1', 'par2', 'par3')],
      dtype=[('label_1', '<U10'), ('label_2', '<U10'), ('label3', '<U10')])],
       [array([('par1', 'par2', 'par3')],
      dtype=[('label_1', '<U10'), ('label_2', '<U10'), ('label3', '<U10')]),
        array([('par1', 'par2', 'par3')],
      dtype=[('label_1', '<U10'), ('label_2', '<U10'), ('label3', '<U10')])]],
      dtype=object)

(Note how I have to jump through hoops to even create such a thing.)

Make your life easy by converting to a proper structured array:

>>> Z = np.concatenate(Y.ravel()).reshape(Y.shape)
>>> Z
array([[('par1', 'par2', 'par3'), ('par1', 'par2', 'par3')],
       [('par1', 'par2', 'par3'), ('par1', 'par2', 'par3')]],
      dtype=[('label_1', '<U10'), ('label_2', '<U10'), ('label3', '<U10')])

Now, you can simply index by label:

>>> Z['label_1']
array([['par1', 'par1'],
       ['par1', 'par1']], dtype='<U10')
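
For reference, the steps above can be put together as a small self-contained script. This is only a sketch: it builds the object array with np.empty and a loop rather than the list trick, and the per-cell values (par1_00 and so on) are made up so the cells can be told apart.

```python
import numpy as np

dt = np.dtype([('label_1', 'U10'), ('label_2', 'U10'), ('label3', 'U10')])

# Build a 2x2 object array whose cells each hold a length-1 structured array.
# The cell values are invented for illustration only.
Y = np.empty((2, 2), dtype=object)
for i, j in np.ndindex(Y.shape):
    Y[i, j] = np.array([(f'par1_{i}{j}', f'par2_{i}{j}', f'par3_{i}{j}')], dtype=dt)

# Flatten, join the length-1 arrays into one structured array, restore the shape.
Z = np.concatenate(Y.ravel()).reshape(Y.shape)

# Field access now works on the whole array at once.
label_1 = Z['label_1']
print(label_1)
```

Note that this relies on every cell holding exactly one record; otherwise the reshape no longer matches the original shape.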
Paul Panzer
  • Thank you @paul! I did not create the array myself, it is what a function from a Python package returns... I will convert the array to a proper structured array :-) – Jordi Ferrer Apr 08 '20 at 11:16
  • Another question on this: Imagine one of the values in the array (the parameters called 'parX') is empty. Then `np.concatenate()` will change the size of the initial array and `.reshape()` will give the error `ValueError: cannot reshape array of size ### into shape (####)`. Any easy fix for that @paul? – Jordi Ferrer Apr 08 '20 at 15:18
  • `numpy.lib.recfunctions.stack_arrays(arr.ravel().tolist(),usemask=False).reshape(arr.shape)` might work. – Paul Panzer Apr 08 '20 at 20:42
  • Unfortunately it is not working. It gives the same `ValueError`... – Jordi Ferrer Apr 09 '20 at 09:32
  • Hm, sorry, this is now getting too complex to work out in comments. Perhaps if the field is empty you could delete it completely (there is a function for that in `np.lib.recfunctions`) and then try again. In any case I suggest you make a new question with an example that shows your problem. – Paul Panzer Apr 09 '20 at 12:04
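
One possible way around the reshape error from the comments, sketched under a strong assumption: that "empty" means a cell holding a zero-length structured array (or None). That reading is a guess and not confirmed anywhere in the thread. Instead of concatenating, the sketch preallocates the result and copies records cell by cell, so an empty cell cannot shrink the output. It does use a loop, which the question hoped to avoid.

```python
import numpy as np

dt = np.dtype([('label_1', 'U10'), ('label_2', 'U10'), ('label3', 'U10')])
x = np.array([('par1', 'par2', 'par3')], dtype=dt)

# Object array as in the answer, but with one cell replaced by a
# zero-length array (an assumed reading of "empty").
Y = np.empty((2, 2), dtype=object)
for idx in np.ndindex(Y.shape):
    Y[idx] = x.copy()
Y[0, 1] = np.empty(0, dtype=dt)

# Preallocate the result; zero-initialised 'U10' fields are empty strings.
Z = np.zeros(Y.shape, dtype=dt)
for idx in np.ndindex(Y.shape):
    cell = Y[idx]
    if cell is not None and len(cell) > 0:
        Z[idx] = cell[0]  # copy the single record; empty cells keep the filler

print(Z['label_1'])
```

Here the empty cell shows up as '' in every field, and Z keeps the 2x2 shape of Y. If a different filler is needed, the preallocated array can be initialised with it instead of zeros.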