I have a numpy structured array a and create a view b on it:
import numpy as np
a = np.zeros(3, dtype={'names':['A','B','C'], 'formats':['int','int','float']})
b = a[['A', 'C']]
The descr component of the data type of b indicates that the data are stored in a somewhat "scattered" way.
>>> b.dtype.descr
[('A', '<i4'), ('', '|V4'), ('C', '<f8')]
(After reading the documentation, I believe that the component ('', '|V4') indicates a "gap" in the data, as b is just a view on a.)
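A way to check that the gap is real and not just a display quirk is to look at the field offsets and the record size directly. This is only a minimal sketch; it uses explicit 'i4' and 'f8' formats (my assumption, so that the layout matches the output above on any platform) instead of 'int' and 'float'.

```python
import numpy as np

# Explicit sizes (assumption) so the layout matches the descr shown above.
a = np.zeros(3, dtype={'names': ['A', 'B', 'C'], 'formats': ['i4', 'i4', 'f8']})
b = a[['A', 'C']]

# dtype.fields maps each name to (dtype, byte offset); keep only the offsets.
offsets = {name: b.dtype.fields[name][1] for name in b.dtype.names}
print(offsets)                 # byte offset of each field within one record
print(b.dtype.itemsize)        # bytes per record, including any gap
print(np.shares_memory(a, b))  # whether b still points into a's buffer
```

If the offsets do not add up to a contiguous layout and the record size equals that of a, the view really does skip over the bytes of field B.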
If this bothers me, I can copy the data:
import numpy.lib.recfunctions as rf
c = rf.repack_fields(b)
and
>>> c.dtype.descr
[('A', '<i4'), ('C', '<f8')]
as desired.
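To see what the repacking changes, the record sizes before and after can be compared. Again a sketch with explicit 'i4' and 'f8' formats (my assumption, for a platform-independent layout):

```python
import numpy as np
import numpy.lib.recfunctions as rf

# Explicit sizes (assumption) so the record layout is the same everywhere.
a = np.zeros(3, dtype={'names': ['A', 'B', 'C'], 'formats': ['i4', 'i4', 'f8']})
b = a[['A', 'C']]
c = rf.repack_fields(b)

print(b.dtype.itemsize, c.dtype.itemsize)  # record size of the view vs. the repacked copy
print(np.shares_memory(a, c))              # whether c still refers to a's buffer
```

A smaller record size for c together with no shared memory would confirm that repack_fields produced a new, gap-free array.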
This step requires me to copy the data. Sometimes, however, I would like to apply an operation directly to the view. Often, such an operation returns a copy of the array anyway. For example,
d = np.concatenate((b,b))
returns a copy of the data in b and a. Nonetheless,
>>> d.dtype.descr
[('A', '<i4'), ('', '|V4'), ('C', '<f8')]
indicates that the data are not stored efficiently.
So is there a way to work with views without producing "scattered" results? Do I always have to create a copy beforehand? Or is there no efficiency issue at all, and this is only an odd way in which descr describes the data type? (If so, how can I avoid that?)
This question becomes particularly relevant if I want to skip intermediate steps:
d = np.concatenate((a[['A', 'C']], a[['A', 'C']]))
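For comparison, the copy-first variant that I would like to avoid can be sketched like this. It repacks the view once and concatenates the packed copies; the explicit 'i4' and 'f8' formats and the name packed are my own choices for illustration.

```python
import numpy as np
import numpy.lib.recfunctions as rf

# Explicit sizes (assumption) so the record layout is the same everywhere.
a = np.zeros(3, dtype={'names': ['A', 'B', 'C'], 'formats': ['i4', 'i4', 'f8']})

# Extra copy: drop the gap left by field B before concatenating.
packed = rf.repack_fields(a[['A', 'C']])
d = np.concatenate((packed, packed))

print(d.dtype.descr)     # field description of the concatenated result
print(d.dtype.itemsize)  # bytes per record in the result
```

This costs one additional copy of the selected fields on top of the copy that concatenate makes anyway.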
I am working with NumPy 1.16 and Python 3.7.