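import numpy as np

# pre-allocate the data cube, filled with NaN
# (time_ix, data_ix and id_ix are the index sequences of the three axes)
cube = np.full((len(time_ix), len(data_ix), len(id_ix)), np.nan)
# filling of cube
tix = 3
idx = 5
# all columns for one time index and one id
data = cube[tix, :, idx]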

Here `data` holds the values of roughly 20 columns for that day and that id.

I am building the cube so that I can slice my data more easily afterwards. Unfortunately, with this statement the second dimension can only hold a single data type (float64), which turned out to be quite expensive in terms of storage. I couldn't find a way to declare the cube as a rec array so that the data_ix dimension could hold heterogeneous data types.

Alternatively, is there a way to represent a 3D array (cube) with pandas DataFrames, using two indexes (time and id) that are easy to slice to get the corresponding dataset?
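One possible reading of the pandas alternative is a DataFrame with a two-level `MultiIndex` on (time, id), where each column keeps its own dtype. This is only a sketch: the dates, ids, and column names (`price`, `volume`, `flag`) are hypothetical placeholders, since the actual columns are not listed here.

```python
import numpy as np
import pandas as pd

# Hypothetical axes standing in for time_ix and id_ix.
time_ix = pd.date_range("2021-01-01", periods=4, freq="D")
id_ix = [10, 11, 12]

# One row per (time, id) pair; each column keeps its own dtype.
index = pd.MultiIndex.from_product([time_ix, id_ix], names=["time", "id"])
df = pd.DataFrame(
    {
        "price": np.full(len(index), np.nan, dtype="float32"),
        "volume": np.zeros(len(index), dtype="int32"),
        "flag": np.zeros(len(index), dtype=bool),
    },
    index=index,
)

# Fill one (time, id) row column by column.
df.loc[(time_ix[3], 12), "price"] = 1.5
df.loc[(time_ix[3], 12), "volume"] = 7

# Analogue of cube[tix, :, idx]: all columns for one time and one id.
row = df.loc[(time_ix[3], 12)]

# All ids at a single time, as a smaller DataFrame indexed by id.
per_time = df.xs(time_ix[3], level="time")
```

Selecting a single (time, id) pair returns a Series with one entry per column, while `xs` keeps the per-column dtypes because the result is still a DataFrame.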

Guido
  • A `pandas` dataframe is a 2d object (in the simplest case with uniform column `dtype`). It may have columns with names like "x, y, z", representing point coordinates in 3d space. But that's very different from a 3d numpy array. Be careful when equating numpy 3d with a "cube". – hpaulj Mar 18 '21 at 20:51
  • Sure @hpaulj - I think you got my idea though, if you feel like I should change something in my question to make it clear please go ahead and I will approve the changes – Guido Mar 18 '21 at 21:17

1 Answer


IIUC:

Use a 2-D rec array (a structured array with named fields)

x = np.array([[(1.0, 2), (3.0, 4)], [(0.0, 1), (7.0, 5)]], dtype=[('x', '<f8'), ('y', '<i8')])

Then

x['x']

array([[1., 3.],
       [0., 7.]])

Gives a float array while

x['y']

array([[2, 4],
       [1, 5]], dtype=int64)

gives an integer array
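The same idea can be applied to pre-allocation: instead of a float64 cube, allocate a (time, id) grid whose elements are records. The sketch below assumes three hypothetical fields (`price`, `volume`, `flag`) and made-up sizes, and compares the memory footprint against a float64 cube holding the same three columns.

```python
import numpy as np

# Hypothetical sizes and field names; the real columns are not given in the question.
n_time, n_id = 4, 6
record = np.dtype([("price", "<f4"), ("volume", "<i4"), ("flag", "?")])

# Pre-allocate a (time, id) grid of records instead of a float64 cube.
grid = np.zeros((n_time, n_id), dtype=record)
grid["price"] = np.nan  # only the float field can hold NaN

# Fill one (time, id) cell with a tuple matching the field order.
grid[3, 5] = (1.5, 7, True)

# Float64 cube with the same three columns, for a storage comparison.
cube = np.full((n_time, 3, n_id), np.nan)
print(grid.itemsize, grid.nbytes, cube.nbytes)
```

Integer and boolean fields have no NaN, so missing values in those fields need a sentinel value or a separate mask.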

piRSquared
  • Thanks for your answer. I am not sure this fits my needs, so please feel free to suggest a better explanation in case I was not clear. With the `cube` I mentioned before, I could access my dataset by selecting `cube[1,:,5]`, as a dummy example: how could I do the same with a 2D recarray? Basically, once I have selected the time index and the id index, I would retrieve the values of 18 columns, which could even be a dataframe if that can hold different data types – Guido Mar 18 '21 at 21:22
  • Instead of `cube[1, :, 5]` you slice just two dimensions and specify the field. In my example `x[1, :]['y']`. The first two dimensions are as you expect. The 3rd dimension is just the record itself which has several fields. – piRSquared Mar 18 '21 at 21:34
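As a rough illustration of the slicing described in the last comment, the sketch below reuses the array `x` from the answer. The variable names `rec`, `y_row`, and `frame` are introduced here for illustration only, and converting a slice with `pd.DataFrame.from_records` is one possible way to get a dataframe, not something the answer itself proposes.

```python
import numpy as np
import pandas as pd

x = np.array([[(1.0, 2), (3.0, 4)], [(0.0, 1), (7.0, 5)]],
             dtype=[('x', '<f8'), ('y', '<i8')])

# One record: every field for time index 1 and id index 0.
rec = x[1, 0]

# A single field for all ids at time index 1.
y_row = x[1, :]['y']

# A 1-D structured slice converts to a DataFrame with one column per field.
frame = pd.DataFrame.from_records(x[1, :])
```

Indexing with both the time and the id position returns the whole record, so the fields play the role that the middle axis played in `cube[1, :, 5]`.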