How to actually delete a column from numpy structured array (so that it won't show up in binary file)

Question

I have a structured array which is loaded from a binary file.

In [85]: dx = np.dtype([('op', '<u8'), ('me', '<u8'), ('gw', '<u8'), ('md', '<u8'), ('tt', '<u8'), ('bb', '<u8'), ('en', '<u8'), ('ab', '<u8'), ('st', 'u1')])

In [86]: s = np.fromfile("somefile.bin", dtype=dx)

In [87]: s
Out[87]:
array([(1574647200000000000, 1574647200000000000, 1574647200000000000, 1574647200000000000, 1574647200000000000, 1574647200000000000, 19374, 9223372036854775808, 0)],
      dtype=[('op', '<u8'), ('me', '<u8'), ('gw', '<u8'), ('md', '<u8'), ('tt', '<u8'), ('bb', '<u8'), ('en', '<u8'), ('ab', '<u8'), ('st', 'u1')])

Now I need to remove some of those columns, and save the data in binary format which would be loadable from C.

In numpy v1.13.3, the following code works fine:

In [88]: x = s[['op', 'st']]

In [89]: x
Out[89]:
array([(1574647200000000000, 0)],
      dtype={'names':['op','st'], 'formats':['<u8','u1'], 'offsets':[0,64], 'itemsize':65})

In [90]: x.tofile("updated.bin")

Meaning, if I now open updated.bin in a hex editor or from C code, it only has the 8-byte and 1-byte uint values.

After switching to numpy v1.17.1 or 1.18.x, that code no longer works: the binary file contains all the data from the first file! It seems that x = s[...] now returns a view, and writing x to a file writes the whole record, not just the selected fields.

I've tried np.delete(), np.copy(), and ndarray.copy() with no luck.

Can anyone help me?

Use `repack_fields` to generate a copy with just the desired fields: https://docs.scipy.org/doc/numpy/user/basics.rec.html?highlight=s#numpy.lib.recfunctions.repack_fields. As of 1.16, multi-field indexing produces a `view` which retains the source layout. The change was in the works, on and off, over several releases. — hpaulj, Jan 23 '20 at 16:27
@hpaulj thanks. that was the solution. It would be good if you turn your comment into an answer! for future people; — Mehrdad, Jan 24 '20 at 14:08

score 1 · Accepted Answer · answered Jan 25 '20 at 01:23

Use repack_fields (from numpy.lib.recfunctions) to generate a packed copy with just the desired fields: https://docs.scipy.org/doc/numpy/user/basics.rec.html?highlight=s#numpy.lib.recfunctions.repack_fields.
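A minimal sketch of that fix, assuming numpy 1.16 or later. The array below is a zero-filled stand-in for the data loaded from somefile.bin, and the temporary output path is only for illustration:

```python
import os
import tempfile

import numpy as np
from numpy.lib import recfunctions as rfn

dx = np.dtype([('op', '<u8'), ('me', '<u8'), ('gw', '<u8'), ('md', '<u8'),
               ('tt', '<u8'), ('bb', '<u8'), ('en', '<u8'), ('ab', '<u8'),
               ('st', 'u1')])

# Stand-in for np.fromfile("somefile.bin", dtype=dx): three records.
s = np.zeros(3, dtype=dx)
s['op'] = [1, 2, 3]
s['st'] = [7, 8, 9]

# Select the fields to keep, then repack them into a new array whose
# records contain only those fields, with no gaps between them.
packed = rfn.repack_fields(s[['op', 'st']])

# Illustrative output location (a temp directory, not the real target).
out_path = os.path.join(tempfile.mkdtemp(), "updated.bin")
packed.tofile(out_path)

file_size = os.path.getsize(out_path)
print(packed.dtype.itemsize, file_size)
```

Each record in the written file is now 9 bytes (8 for op, 1 for st), so a C reader should use a packed struct with that layout.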

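As of 1.16, multi-field indexing produces a view which retains the source layout, so the itemsize is still that of the full record (65 bytes in the question) and tofile writes all of it. The change was in the works, on and off, over several releases.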
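To see the difference between the view and the repacked copy, one can inspect the dtypes directly. This is a small sketch assuming numpy 1.16 or later, again with a zero-filled stand-in array, not the asker's data:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

dx = np.dtype([('op', '<u8'), ('me', '<u8'), ('gw', '<u8'), ('md', '<u8'),
               ('tt', '<u8'), ('bb', '<u8'), ('en', '<u8'), ('ab', '<u8'),
               ('st', 'u1')])
s = np.zeros(2, dtype=dx)

# Multi-field indexing: a view onto s that keeps the original offsets.
view = s[['op', 'st']]
view_itemsize = view.dtype.itemsize        # full record size
st_offset = view.dtype.fields['st'][1]     # 'st' stays at its old offset

# Writing through the view changes the source array.
view['op'][0] = 42
source_changed = (s['op'][0] == 42)

# repack_fields returns an independent, packed copy.
packed = rfn.repack_fields(view)
packed_itemsize = packed.dtype.itemsize
packed['op'][0] = 99
source_untouched = (s['op'][0] == 42)

print(view_itemsize, st_offset, packed_itemsize)
```

The view reports the same offsets and itemsize shown in the question's Out[89], while the repacked copy has only the bytes of the two selected fields.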
